Advanced Calculus Based Application

This document provides information about the proceedings from the 5th European Conference on Numerical Mathematics and Advanced Applications held in Prague, Czech Republic in August 2003. It contains invited plenary lectures and contributed papers on various topics in numerical mathematics, such as finite element methods, parallel solvers, boundary conditions for hyperbolic equations, fictitious domain methods, domain decomposition, nonlinear elliptic equations, lattice Boltzmann models, and more. The proceedings were edited by Miloslav Feistauer, Vit Dolejsi, Petr Knobloch, and Karel Najzar from the Charles University in Prague.

Numerical Mathematics and Advanced Applications

M. Feistauer • V. Dolejší • P. Knobloch • K. Najzar


Editors

Numerical Mathematics
and
Advanced Applications
Proceedings of ENUMATH 2003
the 5th European Conference
on Numerical Mathematics and
Advanced Applications
Prague, August 2003

Springer
Editors
Miloslav Feistauer
Vít Dolejší
Petr Knobloch
Karel Najzar
Charles University Prague
Faculty of Mathematics and Physics
Department of Numerical Mathematics
Sokolovska 83, 186 75 Praha 8
Czech Republic

email:
feist, dolejsi, knobloch, [email protected]

Library of Congress Control Number: 2004107015


Mathematics Subject Classification (2000): 65-XX, 74-XX, 76-XX, 78-XX

ISBN 978-3-642-62288-5 ISBN 978-3-642-18775-9 (eBook)


DOI 10.1007/978-3-642-18775-9
This work is subject to copyright. All rights are reserved, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilm or in any other way, and storage in
data banks. Duplication of this publication or parts thereof is permitted only under the
provisions of the German Copyright Law of September 9, 1965, in its current version, and
permission for use must always be obtained from Springer-Verlag. Violations are liable for
prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 2004


Originally published by Springer-Verlag Berlin Heidelberg New York in 2004
Softcover reprint of the hardcover 1st edition 2004
The use of designations, trademarks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
Cover Design: design & production GmbH, Heidelberg
Typesetting: Computer to film by author's data
Printed on acid-free paper 40f3142XT 543210
Preface

These proceedings collect the major part of the lectures given at ENU-
MATH2003, the European Conference on Numerical Mathematics and Ad-
vanced Applications, held in Prague, Czech Republic, from 18 August to 22
August, 2003.
The importance of numerical and computational mathematics and scientific
computing is steadily growing. There is an increasing number of different
research areas where numerical simulation is necessary. Let us mention
fluid dynamics, continuum mechanics, electromagnetism, phase transition,
cosmology, medicine, economics, finance, etc. The success of applications
of numerical methods depends on adapting their basic instruments and
looking for new appropriate techniques suited to new problems as well as
new computer architectures.
The ENUMATH conferences were established in order to provide a fo-
rum for discussion of current topics of numerical mathematics. They seek to
convene leading experts and young scientists with special emphasis on con-
tributions from Europe. Recent results and new trends are discussed in the
analysis of numerical algorithms as well as in their applications to challenging
scientific and industrial problems.
The first ENUMATH conference was organized in Paris in 1995; the series
then continued with the conferences in Heidelberg 1997, Jyväskylä 1999
and Ischia Porto 2001. It was a great pleasure and honour for the Czech
numerical community that it was decided at Ischia Porto to organize
ENUMATH2003 in Prague. It was the first time that this conference crossed
the former Iron Curtain and was organized in a post-socialist country.
The ENUMATH2003 was organized by the Faculty of Mathematics and
Physics of the Charles University in cooperation with the Department of
Mathematics of the Institute of Chemical Technology in Prague. The Charles
University, the oldest university in Central Europe, was founded in 1348.
In the middle ages, mathematics was studied in Prague at the Artistic Fac-
ulty, later at the Philosophical Faculty and in the 20th century it belonged to
the Faculty of Natural Sciences till 1952, when the Faculty of Mathematics
and Physics was founded. As follows from historical sources, already in the
15th century the students of the Charles University had the opportunity to
be trained in "Computational Mathematics". Křišťan of Prachatice, who
is well known in Czech history as a Church reformer (he was a friend
of the famous reformer Jan Hus) and as a personal physician of the
Czech and German-Roman King Wenceslas IV, was a professor of astronomy
and mathematics at the Charles University. He wrote lecture notes with the
title "Algoritmus Prosaicus", in which he describes, among other things,
approximate methods for carrying out various mathematical operations. The
contemporary Czech school of numerical and applied mathematics was born much
later, in the second half of the 20th century. It is connected mainly with
the names of Ivo Babuška, who can be considered the father of Czech
numerical mathematics, Karel Rektorys, Milan Práger, Emil Vitásek, Miloš
Zlámal, Alexander Ženíšek, Ivan Hlaváček, the famous PDE specialist
Jindřich Nečas (who influenced a number of Czech numerical analysts) and Jan
Polášek (who founded the Prague school of CFD).
These proceedings contain a selection of invited plenary lectures, papers
presented in minisymposia and works communicated within the sessions. All
contributions of these proceedings have been reviewed by members of the
Scientific Committee. On this occasion we want to thank the members of
the Program Committee (F. Brezzi, M. Feistauer, R. Glowinski, R. Jeltsch,
Yu. Kuznetsov, J. Periaux, R. Rannacher) and the members of the Scientific
Committee (O. Axelsson, C. Bernardi, C. Canuto, M. Griebel, R. Hoppe,
G. Kobelkov, M. Křížek, P. Neittaanmäki, O. Pironneau, A. Quarteroni,
C. Schwab, E. Süli, W. Wendland) for their scientific support. We are grateful
to the plenary speakers R. Blaheta, A. Bermúdez, T. Gallouët, J. Haslinger,
R. Hiptmair, T. Hughes, J. Rappaz, A. Russo, A. Tveito and V. Schulz for
coming to Prague and richly contributing to the success of the conference.
We are very much obliged to A. Klíč, the head of the Department of
Mathematics of the Institute of Chemical Technology, and our colleagues J.
Felcman, J. Segethová and E. Plandorová from the local organizing committee
for their cooperation in organizing the conference. We are also
gratefully indebted to O. Ulrych for the TeX editing and the preparation
of the camera-ready manuscript of the proceedings. Lastly, we thank all
participants for coming and animating the meeting.
We believe that this volume will be an invaluable instrument for obtaining
an overview of the latest results and aspects of numerical
mathematics and scientific computing and their applications.

M. Feistauer
V. Dolejší
P. Knobloch
K. Najzar
editors
Table of Contents

Part I Plenary Lectures

Numerical Analysis of Finite Element Methods for Eddy Current Problems. Applications to Electrode Simulation
Alfredo Bermúdez, Rodolfo Rodríguez, Pilar Salgado

Space Decomposition Preconditioners and Parallel Solvers
Radim Blaheta

Boundary Conditions for Hyperbolic Equations or Systems
Thierry Gallouët

Fictitious Domain Methods in Shape Optimization with Applications in Free-Boundary Problems
Jaroslav Haslinger, Tomáš Kozubek, Karl Kunisch, Gunter Peichl

Part II Contributed Papers

Domain Decomposition Method for a Class of Non-Linear Elliptic Equation with Arbitrary Growth Nonlinearity and Data Measure
Nour Eddine Alaa, Jean Rodolphe Roche

Variants of Relaxation Schemes and the Lattice Boltzmann Model Relaxation Systems
Mapundi Kondwani Banda

A Time Semi-Implicit Relaxation Scheme for Two-Phase Flows in Pipelines
Michael Baudin, Frederic Coquel, Quang-Huy Tran

Computational Study of Field Scale BTEX Transport and Biodegradation in the Subsurface
Markus Bause

A Two-Level Stabilization Scheme for the Navier-Stokes Equations
Roland Becker, Malte Braack

A Posteriori Error Estimates for Parameter Identification
Roland Becker, Boris Vexler

On a Phase-Field Model with Advection
Michal Beneš

Fast Evaluation of Eddy Current Integral Operators
Steffen Börm

Adaptive Computation of Reactive Flows with Local Mesh Refinement and Model Adaptation
Malte Braack, Alexandre Ern

An Alternative to the Least-Squares Mixed Finite Element Method for Elliptic Problems
Jan Brandts, Yanping Chen

Limit Analysis Method in Electrostatics
Igor A. Brigadnov

Finite Element Mesh Adjusted to Singularities Applied to Axisymmetric and Plane Flow
Pavel Burda, Jaroslav Novotný, Bedřich Sousedík, Jakub Šístek

The Edge Stabilization Method for Finite Elements in CFD
Erik Burman, Peter Hansbo

Analysis and Computation of Dendritic Growth in Binary Alloys Using a Phase-Field Model
Eric Burman, Marco Picasso, Jacques Rappaz

Discontinuous Galerkin Methods for Timoshenko Beams
Fatih Celiker, Bernardo Cockburn, Sukru Güzey, Ramdev Kanapady, Sew-Chew Soon, Henrik K. Stolarski, Kumar Tamma

Numerical Algorithms for Solving Elliptic-Parabolic Problems
Raimondas Čiegis

Stochastic Relaxation of Variational Integrals with Non-attainable Infima
Dennis D. Cox, Petr Klouček, Daniel R. Reynolds, Pavel Šolín

A Pressure-Weighted Upwind Scheme in Unstructured Finite-Element Grids
Masoud Darbandi, Kiumars Mazaheri-Body, Shidvash Vakilipour

Discontinuous Galerkin Finite Element Method for the Numerical Solution of Viscous Compressible Flows
Vít Dolejší

A Finite Volume Scheme on General Meshes for the Steady Navier-Stokes Equations in Two Space Dimensions
Robert Eymard, Raphaèle Herbin

Existence and Uniqueness of a Weak Solution to a Stratigraphic Model
Robert Eymard, Thierry Gallouët, Véronique Gervais, Roland Masson

Combined Nonconforming/Mixed-hybrid Finite Element-Finite Volume Scheme for Degenerate Parabolic Problems
Robert Eymard, Danielle Hilhorst, Martin Vohralík

Discrete Maximum Principle for Galerkin Finite Element Solutions to Parabolic Problems on Rectangular Meshes
István Faragó, Róbert Horváth, Sergey Korotov

Cubature-Differences Method for Singular Integro-differential Equations
Alexander I. Fedotov

Nonconforming Discretization Techniques for Overlapping Domain Decompositions
Bernd Flemisch, Michael Mair, Barbara Wohlmuth

On the Use of Implicit Updates in Minimum Curvature Multi-step Quasi-Newton Methods
John A. Ford, Issam A. Moghrabi

A Boundary Movement Identification Method for a Parabolic Partial Differential Equation
Tom P. Fredman

On Computational Properties of a Posteriori Error Estimates Based upon the Method of Duality Error Majorants
Maxim Frolov, Pekka Neittaanmäki, Sergey Repin

Efficient Algorithm for Local-Bound-Preserving Remapping in ALE Methods
Rao Garimella, Milan Kuchařík, Mikhail Shashkov

Mimetic Finite Difference Methods for Diffusion Equations on Unstructured Triangular Grid
Victor Ganzha, Richard Liska, Mikhail Shashkov, Christoph Zenger

On Computational Glaciology: FE-Simulation of Ice Sheet Dynamics
Gunter Godert, Franz-Theo Suttmeier

Nonreflecting Boundary Conditions for Multiple Domain Wave Scattering in Unbounded Media
Marcus J. Grote, Christoph Kirsch, Patrick Meury

On the Choice of the Regularization Parameter in the Case of the Approximately Given Noise Level of Data
Uno Hämarik, Toomas Raus

Adaptive Discontinuous Galerkin Finite Element Methods with Interior Penalty for the Compressible Navier-Stokes Equations
Ralf Hartmann, Paul Houston

On a Novel Technique for Parallel Unstructured Mesh Generation in 3D
Jan Haskovec, Pavel Šolín

Adaptive Finite Element Methods for Turbulent Flow
Johan Hoffman, Claes Johnson

Numerical Solution of a Nonlinear Evolution Equation Describing Amorphous Surface Growth of Thin Films
Ronald H. W. Hoppe, Eva Nash

Constrained Mountain Pass Algorithm for the Numerical Solution of Semilinear Elliptic Problems
Jiří Horák

Optimal Shape Design of Diesel Intake Ports with Evolutionary Algorithm
Andras Horvath, Zoltan Horvath

Numerical Simulation of Compressible Fluids with Moving Boundaries: An Effective Method with Applications
Zoltan Horvath, Andras Horvath

Discontinuous Galerkin Methods for the Time-Harmonic Maxwell Equations
Paul Houston, Ilaria Perugia, Anna Schneebeli, Dominik Schötzau

Mixed hp-Discontinuous Galerkin Finite Element Methods for the Stokes Problem in Polygons
Paul Houston, Dominik Schötzau, Thomas P. Wihler

A Postprocessing of Hopf Bifurcation Points
Dáša Janovská, Vladimír Janovský

Givens' Reduction of Quaternion-Valued Matrices to Upper Hessenberg Form
Drahoslava Janovská, Gerhard Opfer

Model of Compressible Flow and Transport in a Time-Dependent Domain
Pavel Jiránek, Jiří Maryška, Jan Šembera

Numerical Study of Convection of Multi-Component Fluid in Porous Medium
Olga Kantur, Vyacheslav Tsybulin

Multi-yield Elastoplastic Continuum-Modeling and Computations
Johanna Kienesberger, Jan Valdman

Celebrating Fifty Years of David M. Young's Successive Overrelaxation Method
David R. Kincaid

On the Relational Database Style Parallel Numerical Programming
Bela Kiss, Anna Krebsz

A Dynamical System Describing Evolution of the Implicit Surfaces in Incompressible Viscous Liquids
Petr Klouček, Michel V. Romerio, Jennifer L. Wightman

Discrete Maximum Principles in Finite Element Modelling
Sergey Korotov, Michal Křížek

A Posteriori Error Estimation in Terms of Linear Functionals for Boundary Value Problems of Elliptic Type
Sergey Korotov, Pekka Neittaanmäki, Sergey Repin

Numerical Solution of Flow in Backward Facing Step
Karel Kozel, Petr Louda, Petr Sváček

Periodicity Properties of Solutions to a Hysteresis Model in Micromagnetics
Martin Kružík

Mixed Finite Element Method on Polygonal and Polyhedral Meshes
Yuri Kuznetsov, Sergey Repin

Semi-discrete Schemes for Hamilton-Jacobi Equations on Unstructured Grids
Doron Levy, Suhas Nayak

Numerical Simulation of Dislocation Dynamics
Vojtěch Minárik, Jan Kratochvíl, Karol Mikula, Michal Beneš

Implicit FEM-FCT algorithm for compressible flows
Matthias Möller, Dmitri Kuzmin, Stefan Turek

A Singular Limit Method for the Stefan Problems
Hideki Murakawa, Tatsuyuki Nakaki

Higher-Order Split-Step Schemes for the Generalized Nonlinear Schrödinger Equation
Gulcin M. Muslu, Husnu A. Erbay

Numerical Methods and Simulation Techniques for Flow with Shear and Pressure Dependent Viscosity
Abderrahim Ouazzi, Stefan Turek

Piecewise Polynomial Approximations for Linear Volterra Integro-Differential Equations with Nonsmooth Kernels
Arvet Pedas

On a Discontinuous Galerkin Method for Radiation-Diffusion Problems
Ilaria Perugia, Dominik Schötzau, James Warsa

Modeling of Multi-Phase Flows with a Level-Set Method
Sander P. van der Pijl, A. Segal, C. Vuik

Numerical Modeling of Bypass Flow
Vladimir Prokop, Karel Kozel

A Posteriori Estimation of Dimension Reduction Errors
Sergey Repin, Stefan Sauter, Anton Smolianski

Analysis of a Multi-Numerics/Multi-Physics Problem
Beatrice Riviere

The Discontinuous Galerkin Method for Singularly Perturbed Problems
Hans-Görg Roos, Helena Zarin

A Finite-Volume Mass- and Vorticity-Conserving Shallow-Water Model using Penta-/Hexagonal Grids
William Sawyer, Rolf Jeltsch

Application of Parallel Computing Techniques for Problems of Degenerated Diffusion
Milan Šenkýř, Jiří Mikyška, Michal Beneš

The Finite Element Analysis of an Elliptic Problem with a Nonlinear Newton Boundary Condition
Veronika Sobotíková

Automatic Goal-Oriented hp-Adaptivity Without Error Estimates
Pavel Šolín, Leszek Demkowicz

A Compression Method for the Helmholtz Equation
Mirjam Stolper, Sergej Rjasanow

Application of a Stabilized FEM to Problems of Aeroelasticity
Petr Sváček, Miloslav Feistauer

A Numerical Approach to the Dynamical Behavior of Initiated Pulses in Some Nonlinear Diffusion Equations
Kenji Tomoeda

Fully Two-dimensional HLLEC Riemann Solver and Associated Difference Schemes
Pavel Váchal, Richard Liska, Burton Wendroff

Deflation Accelerated Parallel Preconditioned Conjugate Gradient Method in Finite Element Problems
Fred I. Vermolen, Kees Vuik, Guus Segal

Advantages of Binomial Checkpointing for Memory-reduced Adjoint Calculations
Andrea Walther, Andreas Griewank

An Efficient Multigrid FEM Solution Technique for Incompressible Flow with Moving Rigid Bodies
Decheng Wan, Stefan Turek, Liudmila S. Rivkind

Higher-Order FEM for a System of Nonlinear Parabolic PDE's in 2D with A-Posteriori Error Estimates
Martin Zítka, Karel Segeth, Pavel Šolín

Part I

Plenary Lectures
Numerical Analysis of Finite Element Methods
for Eddy Current Problems. Applications to
Electrode Simulation

Alfredo Bermúdez¹, Rodolfo Rodríguez² and Pilar Salgado³

¹ Departamento de Matemática Aplicada, Universidade de Santiago de
  Compostela, Spain, [email protected]
² GI²MA, Departamento de Ingeniería Matemática, Universidad de Concepción,
  Chile, [email protected]
³ Departamento de Matemática Aplicada, Universidade de Santiago de
  Compostela, Spain, mpilar@usc.es

Summary. The objective of this work is to introduce and numerically solve a 3D-
mathematical model for steady thermoelectrical behavior of electrodes in a metal-
lurgical electric furnace. The mathematical model couples the time-harmonic eddy
current model with the heat transfer equations in a bounded 3D-domain. An impor-
tant part of the paper deals with the analysis and numerical solution of the eddy
current model in a bounded domain.

1 Introduction

Silicon is produced industrially by reduction of silicon dioxide with carbon by
a reaction which can be written in a simple way as follows:

SiO₂ + 2C = Si + 2CO.

This reaction takes place in submerged arc furnaces which use three-phase
alternating current. A simple sketch of the furnace can be seen in Figure 1. It
consists of a cylindrical pot containing charge materials and three electrodes
arranged at the vertices of an equilateral triangle.
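
As a quick sanity check on the overall reaction SiO2 + 2C = Si + 2CO, the following Python sketch verifies that mass is conserved and estimates the stoichiometric carbon demand per kilogram of silicon. The atomic masses are rounded standard values and the function names are hypothetical; none of this is taken from the paper, and a real furnace consumes more carbon than the ideal balance suggests.

```python
# Rounded standard atomic masses in g/mol (assumed values, not from the paper).
ATOMIC_MASS = {"Si": 28.086, "O": 15.999, "C": 12.011}


def molar_mass(composition):
    """Molar mass of a species given as {element: atom count}."""
    return sum(ATOMIC_MASS[el] * count for el, count in composition.items())


def side_mass(side):
    """Total mass of one side of the reaction, given (coefficient, species) pairs."""
    return sum(coef * molar_mass(species) for coef, species in side)


# SiO2 + 2C = Si + 2CO written as (stoichiometric coefficient, composition).
REACTANTS = [(1, {"Si": 1, "O": 2}), (2, {"C": 1})]
PRODUCTS = [(1, {"Si": 1}), (2, {"C": 1, "O": 1})]


def carbon_per_kg_silicon():
    """Ideal (stoichiometric) kg of carbon consumed per kg of silicon produced."""
    return 2 * ATOMIC_MASS["C"] / ATOMIC_MASS["Si"]


if __name__ == "__main__":
    print("reactant mass per mole of Si:", side_mass(REACTANTS))
    print("product mass per mole of Si: ", side_mass(PRODUCTS))
    print("kg C per kg Si (ideal):      ", carbon_per_kg_silicon())
```

Under these assumed masses, both sides weigh about 84.1 g per mole of silicon, and the ideal carbon demand is a little under 0.86 kg per kg of silicon.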
Electrodes are the main components of reduction furnaces. Their purpose is
to conduct the electric current, which enters the electrode through the
"contact clamps" (see Figure 1). The current flows down the part of the
column between the contact clamps and the lower end of the column,
generating heat by the Joule effect. At the tip of the electrode an electric
arc is produced, reaching temperatures of about 2500 °C, which are needed for
the chemical reduction reactions to take place.
Classical electrodes extensively used in industry include pure graphite,
prebaked and Søderberg electrodes. The latter are the most widely used in the
ferro-silicon industry. They are composed of a paste, consisting of a carbon
aggregate and a tar binder, which is fed into a steel casing; the casing has
steel fins attached to its inner part, which are placed radially in the
cylinder. The great amount of heat generated by the Joule effect is partially
employed to bake the paste; this is a crucial process during which the
initially soft/liquid non-conductive paste at the top of the electrode becomes
a solid conductor. The advantages of Søderberg electrodes with respect to pure
graphite or prebaked electrodes are that they are built in larger sizes and
cost less. However, as the electrode is consumed, it has to be slipped, and the
steel casing moves with the carbon body, so it melts and pollutes the silicon.
This is why they cannot be used to obtain silicon metal or silicon of
metallurgical quality, which is used as an alloying element for other metals
such as aluminum. Thus prebaked electrodes have for many years been the only
alternative for commercial silicon metal production.

Fig. 1. A reduction furnace (label: contact clamps, current entrance)

In the early nineties, the Spanish company Ferroatlantica S.L. built a new
compound electrode named ELSA ([14]) which serves for the production of
silicon metal. It seems to be the solution for all silicon furnaces because its
cost can be up to one third the price of a prebaked electrode.
The ELSA electrode consists of a central column of baked carbonaceous
material, graphite or similar, surrounded by a Søderberg-like paste (see
Figure 2). There is a steel casing without fins that contains the paste until
it is baked at the contact clamps zone. Two different slipping systems exist,
one for the casing and another one for the central column; the combination of
both systems is necessary so as to slip the casing as little as possible and
also to carry out the correct extrusion of the carbon electrode. Then, unlike
in the case of Søderberg electrodes, the casing is not consumed and it is
possible to produce silicon of metallurgical quality. The result is that the
furnace operation is similar to that of prebaked electrodes, but the compound
electrode is less expensive. The disadvantage is that the slipping velocity is
not free as in prebaked electrodes, because the paste has to be baked before
leaving the casing, so a minimum period of time is necessary between
slippages. Thus, the baking of the paste is a crucial point in the working of
this type of electrode.

Fig. 2. Sketch of ELSA electrode (labels: nipple, support system)

In general, the design and control parameters of electrodes are very complex,
and numerical simulation plays an important role at this point. Modeling the
phenomena involved on a computer allows us to analyze the influence of
changing a parameter without the need for expensive and difficult tests. Thus,
during the last 20 years, a considerable number of mathematical models and
computer programs have been developed in order to simulate the
thermoelectrical behavior of classical electrodes (see for instance
[15, 17, 18]). In particular, mathematical models based on cylindrical symmetry
have been the most extensively used. However, the ELSA electrode works in a
different manner from the classical electrodes. While a classical electrode
has only one constitutive material, the compound electrode combines a good
electric current conductor, such as graphite, with a paste which becomes a good
conductor only at high temperatures. The graphite core is important not only
in the movement of the column but also in the distribution of current inside
the electrode. Moreover, unlike in Søderberg electrodes, the absence of fins
gives the electrode geometrical axisymmetry (see Figure 3).
This is why we first developed a finite element method based on cylindrical
symmetry to compute the electric current and temperature distribution in
a radial section of the electrode [5]. While the axisymmetric model has given
valuable information on important electrode parameters, the assumption of
cylindrical symmetry makes it necessary to neglect the following facts:

• The electromagnetic effect caused on one electrode by the two others, that
  is, the so-called "proximity effect". This arises because the magnetic field
  generated by each electrode induces eddy currents in the two others.
• Thermal boundary conditions are not axisymmetric. Indeed, the temperature
  of the air around the electrode is greater on the surfaces oriented toward
  the furnace center.
• The current entrance in the electrode through the contact clamps is not
  axisymmetric. The current is transferred to the contact clamps through
  copper bus tubes, which in turn are connected to three transformers with
  different phases. Then, in each electrode, half of the clamps receive
  current from one transformer while the other ones are connected to a second
  transformer.

Fig. 3. Cross section of ELSA and Søderberg electrode (labels: steel casing, graphite, paste; steel casing, steel fins, paste)

These points can only be considered by using a fully three-dimensional model.
Moreover, 3D models are always needed to simulate Søderberg electrodes
because the presence of fins breaks cylindrical symmetry (see Figure 3). Thus,
we have developed a three-dimensional thermoelectrical model which is general
enough to model any kind of electrode and even the complete furnace. In this
paper, we describe two different mathematical models and analyze them from
mathematical and numerical points of view.
The electromagnetic problem is obtained from the time-harmonic Maxwell
equations, assuming the frequency is low enough to neglect the term involving
the displacement current in Ampère's law. This is the so-called eddy
current model. Because of its many interesting applications in electrical
engineering, the numerical simulation of eddy current problems has led to a
great number of publications in recent years (see for instance
[1, 2, 3, 10, 11, 12, 13]). We notice that the Maxwell equations concern the
whole space, but we are interested in solving the problem in a bounded domain,
so we have to define suitable boundary conditions; this need is the main
difficulty in studying the problem in a bounded domain. Thus, we start by
introducing the eddy current problem in the whole furnace, including the
electrodes and the air around them, and defining natural and essential
boundary conditions. In a second step we change this model, by introducing
realistic boundary conditions, to compute the electromagnetic fields in only
one electrode. Finally, we couple the electromagnetic model with a thermal one.
The coupling between the Maxwell and heat transfer equations is due to the
Joule effect, which is the source term in the heat equation, and to the fact
that the thermoelectrical parameters depend on temperature.
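
The low-frequency assumption can be checked with orders of magnitude. Inside a conductor, the displacement current iωεE is negligible against the conduction current σE when ωε/σ is much smaller than 1; in the air, the usual argument is that the wavelength c/f is far larger than the device. The Python sketch below evaluates both quantities. The 50 Hz supply frequency, the conductivity of 1e5 S/m and the use of the vacuum permittivity are illustrative assumptions, not values from the paper, and the function names are hypothetical.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m
SPEED_OF_LIGHT = 299_792_458.0  # in m/s


def displacement_to_conduction_ratio(frequency_hz, sigma, epsilon=EPSILON_0):
    """Ratio |i*omega*eps*E| / |sigma*E| of displacement to conduction current."""
    omega = 2.0 * math.pi * frequency_hz
    return omega * epsilon / sigma


def free_space_wavelength(frequency_hz):
    """Wavelength c/f of an electromagnetic wave in vacuum, in metres."""
    return SPEED_OF_LIGHT / frequency_hz


# Assumed illustrative values: 50 Hz supply, conductor with sigma = 1e5 S/m.
ratio = displacement_to_conduction_ratio(50.0, 1.0e5)
wavelength = free_space_wavelength(50.0)

if __name__ == "__main__":
    print(f"omega*eps/sigma = {ratio:.3e}")
    print(f"wavelength      = {wavelength / 1000.0:.0f} km")
```

With these assumed values the ratio is of order 1e-14 and the wavelength is several thousand kilometres, so dropping the displacement current is harmless at this scale.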
The outline of the paper is as follows. In Section 2 we deal with the
mathematical and numerical analysis of the electromagnetic problem in a
bounded 3D domain which includes conductors and dielectrics. We introduce a
weak formulation which involves the magnetic field in the conductor domain and
a scalar magnetic potential in the dielectric one. This hybrid formulation is
discretized by using Nédélec edge finite elements for the magnetic field and
standard piecewise linear continuous elements for the magnetic potential. The
resulting discrete problems are studied and error estimates are obtained under
mild smoothness assumptions on the solution. Section 3 is devoted to proposing
and analyzing a finite element method to solve the electromagnetic problem in
only one electrode. We introduce a weak formulation of the problem in terms of
the magnetic field and deal with boundary conditions directly related to
the intensities which enter the domain. Lagrange multipliers are introduced
to impose these "non standard" boundary conditions and the resulting mixed
formulations are studied following classical techniques. In Section 4, we
couple the electromagnetic problem with the thermal one and give a result
concerning the existence of a solution. We end the paper by reporting, in
Section 5, some numerical results obtained for ELSA and Søderberg electrodes.

2 The electromagnetic problem in the whole furnace

In order to consider all the facts which are neglected in the axisymmetric
models, we start by proposing a model to solve the eddy current problem in
a bounded domain like the one presented in Figure 4, which includes not
only conductors (the electrodes and the wires supplying the electric current)
but dielectrics as well (the air).

2.1 The eddy current problem

Eddy currents are usually modeled by the low-frequency harmonic Maxwell
equations. Assuming an alternating electric current of angular frequency $\omega$, they
are

$$\operatorname{curl} H = J, \tag{1}$$
$$i\omega\mu H + \operatorname{curl} E = 0, \tag{2}$$
$$\operatorname{div} B = 0, \tag{3}$$
$$\operatorname{div} D = \rho, \tag{4}$$

with

$$B = \mu H, \quad D = \varepsilon E, \quad J = \sigma E, \tag{5}$$

where $H$, $J$, $B$, $E$, and $D$ are the complex amplitudes associated with the
magnetic field, the current density, the magnetic induction, the electric field
and the electric displacement, respectively; $\rho$ is the electric charge density, $\mu$
is the magnetic permeability, $\varepsilon$ is the electric permittivity and $\sigma$ is the electric
conductivity.
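
As a short worked step (a sketch added for illustration, assuming $\sigma > 0$ in the region considered), the electric field can be eliminated from (1), (2) and (5), which leaves a single equation for the magnetic field:

```latex
% In a conductor (\sigma > 0), Ohm's law in (5) combined with (1) gives
E = \frac{1}{\sigma} J = \frac{1}{\sigma} \operatorname{curl} H ,
% and substituting this expression into (2) leaves an equation for H alone:
i \omega \mu H + \operatorname{curl}\Big( \frac{1}{\sigma} \operatorname{curl} H \Big) = 0 .
```

This elimination is only possible where the conductivity does not vanish, which is why conductors and dielectrics have to be treated differently.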
Fig. 4. Sketch of the furnace

Fig. 5. Sketch of a general domain

We will solve these equations in a bounded domain $\Omega$, which consists of
two parts, $\Omega_C$ and $\Omega_D$, occupied by conductors and dielectrics, respectively
(see Figure 5). The boundary of the domain $\Omega$ also splits into two parts:
$\Gamma_C := \partial\Omega_C \cap \partial\Omega$ and $\Gamma_D := \partial\Omega_D \cap \partial\Omega$. Finally, we denote by
$\Gamma_I := \partial\Omega_C \cap \partial\Omega_D$ the
interface between dielectrics and conductors. The boundary conditions added
to the eddy current model are

$$E \times n = g \quad \text{on } \Gamma_C, \tag{6}$$
$$H \times n = f \quad \text{on } \Gamma_D, \tag{7}$$

with $g$ and $f$ being given tangential vector fields (i.e., satisfying $g \cdot n = 0$ on
$\Gamma_C$ and $f \cdot n = 0$ on $\Gamma_D$) and $n$ the outward unit normal vector to $\partial\Omega$.
We remark that (6) is the natural condition for the conducting part of the
boundary, while (7) is imposed on the dielectric part and allows taking into
account all of the electromagnetic effects outside the domain.
We will introduce and analyze a finite element method to solve this problem
in domains of general topology. To attain this goal, we will consider a formulation
introduced by Bossavit and Verite [13], which involves the magnetic field
in the conductor domain and a scalar magnetic potential in the dielectric one.
Then, as a first step, we start analyzing a weak formulation of the problem in
terms of the magnetic field.

2.2 A magnetic field formulation of the eddy current problem

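Let us assume that $\Omega$ is simply connected, with a Lipschitz-continuous connected
boundary. The subdomains $\Omega_C$ and $\Omega_D$ are also assumed to have
Lipschitz-continuous boundaries, although not necessarily connected. Finally,
$\Gamma_C$, $\Gamma_D$, and $\Gamma_I$ are also assumed to have Lipschitz-continuous
boundaries.
Let us consider the following closed subspaces of $H(\operatorname{curl}, \Omega)$,

$$V = \{ G \in H(\operatorname{curl}, \Omega) : \operatorname{curl} G = 0 \text{ in } \Omega_D \},$$
$$V_0 = \{ G \in V : G \times n = 0 \text{ in } H_{00}^{-1/2}(\Gamma_D)^3 \},$$

where $H_{00}^{-1/2}(\Gamma_D)^3$ denotes the dual space of $H_{00}^{1/2}(\Gamma_D)^3$ which, in its turn, is
the space of functions defined on $\Gamma_D$ that, extended by $0$ to $\partial\Omega \setminus \Gamma_D$, belong to
$H^{1/2}(\partial\Omega)^3$. We assume that $\mu, \varepsilon, \sigma \in L^\infty(\Omega)$, and that there exist constants
$\underline{\mu}$, $\underline{\varepsilon}$, and $\underline{\sigma}$ such that

$$\mu(x) \ge \underline{\mu} > 0, \quad \varepsilon(x) \ge \underline{\varepsilon} > 0, \quad \text{a.e. in } \Omega,$$
$$\sigma(x) \ge \underline{\sigma} > 0 \ \text{ a.e. in } \Omega_C, \qquad \sigma(x) = 0 \ \text{ in } \Omega_D.$$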

We suppose that the boundary data $g$ satisfies $g \times n \in H_{00}^{1/2}(\Gamma_C)^3$. On the
other hand, concerning the boundary data $f$, we suppose there exists a field
$H_f \in V$ such that

$$H_f \times n = f \quad \text{in } H_{00}^{-1/2}(\Gamma_D)^3.$$

Then, multiplying equation (2) by a test function of the space $V_0$,
integrating in $\Omega$, and using Green's formula, (1), (6), and (7), we obtain the
following weak formulation in terms of the magnetic field $H$.

Problem MP.- To find $H \in V$ such that

$$H \times n = f \quad \text{in } H_{00}^{-1/2}(\Gamma_D)^3, \tag{8}$$
$$i\omega \int_{\Omega} \mu H \cdot \bar{G} + \int_{\Omega_C} \frac{1}{\sigma} \operatorname{curl} H \cdot \operatorname{curl} \bar{G} = \langle g \times n, G \times n \rangle_{\Gamma_C} \quad \forall G \in V_0. \tag{9}$$

Theorem 1. If there exists $H_f \in V$ such that $H_f \times n = f$ in $H_{00}^{-1/2}(\Gamma_D)^3$, then
problem MP has a unique solution.

Once the magnetic field $H$ is known, the current density $J$ and the electric
field $E$ can be computed in conductors, namely, $J = \operatorname{curl} H$ and $E = \frac{1}{\sigma} J$ in $\Omega_C$.
These are the quantities actually needed in most applications, and they satisfy the
Maxwell equations (1)-(5) and boundary conditions (6)-(7) (see Theorem 3.2
in [7]).

2.3 Introducing a magnetic potential

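In this section we show how problem MP can be transformed by replacing the
magnetic field in the dielectric domain $\Omega_D$ by a (scalar) magnetic potential.
Let $\Omega_C = \bigcup_{j=0}^{J} \Omega_C^j$, with $\Omega_C^0$ being the union of all the connected components
of $\Omega_C$ such that $\Omega \setminus \Omega_C^0$ is simply connected, and $\Omega_C^j$, $j = 1, \ldots, J$, the
remaining connected components of $\Omega_C$ (see Figure 5).
We assume that for each $\Omega_C^j$, $j = 1, \ldots, J$, there exists an open "cut"
surface $\Sigma_j \subset \Omega_D$ such that $\partial\Sigma_j \subset \partial\Omega_D$ and
$\tilde{\Omega}_D := \Omega_D \setminus \bigcup_{j=1}^{J} \bar{\Sigma}_j$ is pseudo-Lipschitz
and simply connected (see Figure 5). We also assume that each of
these surfaces $\Sigma_j$ is connected, and $\bar{\Sigma}_j \cap \bar{\Sigma}_k = \emptyset$ for $j \ne k$ (see, for instance,
[4]).
For any function $\tilde{\Psi} \in H^1(\tilde{\Omega}_D)$, we denote by $[\tilde{\Psi}]_{\Sigma_j}$ the jump of $\tilde{\Psi}$ through
$\Sigma_j$. The gradient of $\tilde{\Psi}$ in $\mathcal{D}'(\tilde{\Omega}_D)$ can be extended to $L^2(\Omega_D)^3$ and will be
denoted by $\operatorname{grad} \tilde{\Psi}$.
Let $\Theta$ be the linear subspace of $H^1(\tilde{\Omega}_D)$ defined by

$$\Theta = \{ \tilde{\Psi} \in H^1(\tilde{\Omega}_D) : [\tilde{\Psi}]_{\Sigma_j} = \text{constant}, \ j = 1, \ldots, J \}.$$

Then, for $\tilde{\Psi} \in H^1(\tilde{\Omega}_D)$, we have that $\operatorname{grad} \tilde{\Psi} \in H(\operatorname{curl}, \Omega_D)$ if and only if
$\tilde{\Psi} \in \Theta$, in which case $\operatorname{curl}(\operatorname{grad} \tilde{\Psi}) = 0$ (see Lemma 3.11 in [4]). Then, for
all $G \in V$ there exists $\tilde{\Psi} \in \Theta$ such that $G|_{\Omega_D} = \operatorname{grad} \tilde{\Psi}$.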
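We introduce the following notation: for $G_C \in L^2(\Omega_C)^3$ and $G_D \in L^2(\Omega_D)^3$,
we denote by $(G_C | G_D)$ the field $G \in L^2(\Omega)^3$ defined a.e. by

$$G(x) := \begin{cases} G_C(x) & \text{if } x \in \Omega_C, \\ G_D(x) & \text{if } x \in \Omega_D. \end{cases}$$

Let us denote by $W$ the linear space given by

$$W := \{ (G, \tilde{\Psi}) \in H(\operatorname{curl}, \Omega_C) \times (\Theta/\mathbb{C}) : (G | \operatorname{grad} \tilde{\Psi}) \in H(\operatorname{curl}, \Omega) \}.$$

Similarly, we define the closed subspace of $W$

$$W_0 := \{ (G, \tilde{\Psi}) \in W : \operatorname{grad} \tilde{\Psi} \times n = 0 \text{ in } H_{00}^{-1/2}(\Gamma_D)^3 \}.$$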
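
By using this notation we can define the following problem:

Problem HP.- To find $(H, \tilde{\Phi}) \in W$ such that

$$\operatorname{grad} \tilde{\Phi} \times n = f \quad \text{in } H_{00}^{-1/2}(\Gamma_D)^3,$$
$$i\omega \int_{\Omega_C} \mu H \cdot \bar{G} + \int_{\Omega_C} \frac{1}{\sigma} \operatorname{curl} H \cdot \operatorname{curl} \bar{G} + i\omega \int_{\Omega_D} \mu \operatorname{grad} \tilde{\Phi} \cdot \operatorname{grad} \bar{\tilde{\Psi}} = \langle g \times n, G \times n \rangle_{\Gamma_C} \quad \forall (G, \tilde{\Psi}) \in W_0.$$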

This is the well-known magnetic field/magnetic potential hybrid formulation
of the eddy current problem introduced by Bossavit and Verite [13].
The main advantage with respect to formulation (8)-(9) lies in the fact that
a vector field is replaced by a scalar one in the dielectric domain.

Theorem 2. Under the assumptions of Theorem 1, problem HP has a unique
solution $(H, \tilde{\Phi})$, with $(H | \operatorname{grad} \tilde{\Phi})$ being the unique solution of problem MP.

2.4 Numerical solution

In this section we first introduce a discretization of problem MP and then we
obtain a discrete version of problem HP equivalent to the previous one.
Let us assume $\Omega$, $\Omega_C$, and $\Omega_D$ are Lipschitz polyhedra, and consider a family
of regular tetrahedral meshes $\{\mathcal{T}_h\}$ of $\Omega$ such that, for every mesh $\mathcal{T}_h$, each
element $K \in \mathcal{T}_h$ is contained either in $\Omega_C$ or in $\Omega_D$ ($h$ stands as usual for the
corresponding mesh-size).
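The magnetic field is discretized by using Nedelec edge finite elements (see
[19]). In particular, $H$ is approximated in each tetrahedron $K$ by a polynomial
vector field in the space

$$\mathcal{N}(K) := \{ G_h \in \mathcal{P}_1(K)^3 : G_h(x) = a \times x + b, \ a, b \in \mathbb{C}^3 \}.$$

Then, fields in $H(\operatorname{curl}, \Omega)$ will be approximated in the following
finite-dimensional space:

$$\mathcal{N}_h(\Omega) := \{ G_h \in H(\operatorname{curl}, \Omega) : G_h|_K \in \mathcal{N}(K) \ \forall K \in \mathcal{T}_h \}.$$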

In order to use these elements to discretize problem MP, we have to use an
approximation $f_I$ of the boundary data $f$ such that a discrete version of equation
(8) can hold true. To attain this goal, we will use the two-dimensional Nedelec
interpolant of $n \times f$ on the triangular mesh induced by $\mathcal{T}_h$ on the polyhedral
surface $\Gamma_D$. This interpolant and several of its properties are described in detail
in [7].
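
The edge elements mentioned above can be illustrated with a small sketch. It assumes the standard lowest-order construction $w_{ij} = \lambda_i \nabla\lambda_j - \lambda_j \nabla\lambda_i$ from barycentric coordinates $\lambda_i$ on a single tetrahedron; the function names and the sample vertices are hypothetical and are not taken from the paper or from the code described in [7].

```python
import numpy as np

EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def barycentric(verts):
    """Return (c, g) with lambda_i(x) = c[i] + g[i] @ x on the tetrahedron."""
    M = np.hstack([np.ones((4, 1)), np.asarray(verts, dtype=float)])
    coeffs = np.linalg.inv(M)          # column i holds [c_i, g_i]
    return coeffs[0], coeffs[1:].T

def edge_function(verts, i, j):
    """Edge basis function w_ij = lambda_i grad(lambda_j) - lambda_j grad(lambda_i)."""
    c, g = barycentric(verts)
    def w(x):
        x = np.asarray(x, dtype=float)
        return (c[i] + g[i] @ x) * g[j] - (c[j] + g[j] @ x) * g[i]
    return w

def tangential_integral(w, a, b):
    """Integral of w . t along the edge from a to b (midpoint rule, exact for affine w)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return w(0.5 * (a + b)) @ (b - a)

# A sample (non-degenerate) tetrahedron, chosen arbitrarily for the illustration.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.2, 0.0], [0.1, 1.3, 0.0], [0.3, 0.2, 0.9]])

# dofs[p, q] = tangential integral of the p-th basis function along the q-th edge.
dofs = np.array([[tangential_integral(edge_function(verts, i, j), verts[k], verts[l])
                  for (k, l) in EDGES] for (i, j) in EDGES])
```

Each basis function has unit tangential circulation along its own edge and zero circulation along the other five, which is what makes the tangential components along edges natural degrees of freedom. Each $w_{ij}$ can also be written as $a \times x + b$ with $a = \nabla\lambda_i \times \nabla\lambda_j$.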
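Then, in order to discretize problem MP, we introduce the following
finite-dimensional spaces,

$$V_h := \{ G_h \in \mathcal{N}_h(\Omega) : \operatorname{curl} G_h = 0 \text{ in } \Omega_D \},$$
$$V_h^0 := \{ G_h \in V_h : G_h \times n = 0 \text{ on } \Gamma_D \},$$

and obtain the following discrete magnetic problem:

Problem DMP.- Find $H_h \in V_h$ such that

$$H_h \times n = f_I \quad \text{on } \Gamma_D,$$
$$i\omega \int_{\Omega} \mu H_h \cdot \bar{G}_h + \int_{\Omega_C} \frac{1}{\sigma} \operatorname{curl} H_h \cdot \operatorname{curl} \bar{G}_h = \int_{\Gamma_C} g \times n \cdot \bar{G}_h \times n \quad \forall G_h \in V_h^0.$$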

Theorem 3. Let us assume that the solution $H$ of problem MP satisfies
$H|_{\Omega_C} \in H^r(\operatorname{curl}, \Omega_C)$ and $H|_{\Omega_D} \in H^r(\Omega_D)^3$, with $r \in (\frac{1}{2}, 1]$. Then, $f_I$ is well
defined by the 2D Nedelec interpolant of $n \times f$, problem DMP has a unique
solution $H_h$ and the following error estimate holds:

However, problem DMP is actually just a "theoretical" method, in that its
solution requires imposing the curl-free condition in the definition
of $V_h$ on trial and test functions. We therefore handle this curl-free condition
by introducing a discrete multiple-valued magnetic potential in the dielectric
domain.
We assume that the cut surfaces $\Sigma_j$ are polyhedral and that the meshes are
compatible with them, in the sense that each $\Sigma_j$ is a union of faces of tetrahedra
$K \in \mathcal{T}_h$, for each mesh $\mathcal{T}_h$. Therefore, $\mathcal{T}_h^{\tilde{\Omega}_D} := \{ K \in \mathcal{T}_h : K \subset \bar{\Omega}_D \}$ can also
be seen as a mesh of $\tilde{\Omega}_D$.
In order to introduce an approximation of the space $\Theta$, let us denote

Then, we consider the family of finite dimensional subspaces of $\Theta$ given by

The following lemma shows that each curl-free vector field in $\mathcal{N}_h(\Omega_D)$ admits
a multiple-valued potential in $\Theta_h$ (see [7]).

Lemma 1. Let $G_h \in L^2(\Omega_D)^3$. Then $G_h \in \mathcal{N}_h(\Omega_D)$ with $\operatorname{curl} G_h = 0$ in $\Omega_D$
if and only if there exists $\tilde{\Psi}_h \in \Theta_h$ such that $G_h = \operatorname{grad} \tilde{\Psi}_h$ in $\Omega_D$. Such $\tilde{\Psi}_h$ is
unique up to an additive constant.

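Let us introduce the following families of finite-dimensional approximations
of $W$ and $W_0$, respectively:

$$W_h := \{ (G_h, \tilde{\Psi}_h) \in \mathcal{N}_h(\Omega_C) \times (\Theta_h/\mathbb{C}) : (G_h | \operatorname{grad} \tilde{\Psi}_h) \in H(\operatorname{curl}, \Omega) \},$$
$$W_h^0 := \{ (G_h, \tilde{\Psi}_h) \in W_h : \operatorname{grad} \tilde{\Psi}_h \times n = 0 \text{ on } \Gamma_D \}.$$

Thus, we define the following discrete problem, which is equivalent to problem
DMP:

Problem DHP.- To find $(H_h, \tilde{\Phi}_h) \in W_h$ such that

$$\operatorname{grad} \tilde{\Phi}_h \times n = f_I \quad \text{on } \Gamma_D,$$
$$i\omega \int_{\Omega_C} \mu H_h \cdot \bar{G}_h + \int_{\Omega_C} \frac{1}{\sigma} \operatorname{curl} H_h \cdot \operatorname{curl} \bar{G}_h + i\omega \int_{\Omega_D} \mu \operatorname{grad} \tilde{\Phi}_h \cdot \operatorname{grad} \bar{\tilde{\Psi}}_h = \int_{\Gamma_C} g \times n \cdot \bar{G}_h \times n \quad \forall (G_h, \tilde{\Psi}_h) \in W_h^0.$$

Theorem 4. Let us assume that the solution $(H, \tilde{\Phi})$ of problem HP satisfies
$H \in H^r(\operatorname{curl}, \Omega_C)$ and $\operatorname{grad} \tilde{\Phi} \in H^r(\Omega_D)^3$, with $r \in (\frac{1}{2}, 1]$. Then, problem
DHP is well posed, it has a unique solution $(H_h, \tilde{\Phi}_h)$, and

$$\| H - H_h \|_{H(\operatorname{curl}, \Omega_C)} + \| \operatorname{grad} \tilde{\Phi} - \operatorname{grad} \tilde{\Phi}_h \|_{L^2(\Omega_D)^3} \le C h^r \left[ \| H \|_{H^r(\operatorname{curl}, \Omega_C)} + \| \operatorname{grad} \tilde{\Phi} \|_{H^r(\Omega_D)^3} \right].$$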

Effective procedures to solve numerically the problem DMP are described in
[7]. In particular, numerical techniques to impose the following constraints are
studied:
1. $(G_h | \operatorname{grad} \tilde{\Psi}_h) \in H(\operatorname{curl}, \Omega)$, which arises in the definition of $W_h$.
2. $[\tilde{\Psi}_h]_{\Sigma_j} = \text{constant}$, which arises in the definition of $\Theta_h$.
3. The boundary condition $\operatorname{grad} \tilde{\Phi}_h \times n = f_I$ on $\Gamma_D$.
The first constraint is imposed by eliminating the degrees of freedom of $G_h$
associated with the edges $\ell \subset \Gamma_I$ in terms of those of $\tilde{\Psi}_h$ corresponding to the
vertices of the mesh on this interface.
The second constraint is handled by distinguishing the degrees of freedom
of $\tilde{\Psi}_h$ on one side of the surface $\Sigma_j$ from those on the other side, and by
eliminating some of them in terms of the others and the current intensities
through each conductor $\Omega_C^j$.
The third constraint is imposed by means of a Lagrange multiplier, in-
creasing in this way the number of unknowns but with the advantage that the
computer implementation is quite straightforward.
We have developed a MATLAB code which implements the method de-
scribed above. To validate the computer code and to test the performance and
convergence properties of the method, we have solved a problem with known
analytical solution (see [7] for further details).

3 The electromagnetic problem in one electrode

3.1 Statement of the problem

The model described in the previous section presents some drawbacks. First, it
is highly complex and its numerical solution takes a lot of time. On the other
hand, it is difficult to obtain the boundary data f from realistic data such as
intensities or potentials, which usually are the only data we know. Then, we are
going to propose an alternative approach which consists in solving the eddy
current problem in one electrode which is a particular bounded conducting
domain. We are going to analyze a weak formulation of this problem in terms
of the magnetic field, considering realistic boundary conditions from the point
of view of applications. In particular, following Bossavit [12], we will consider
boundary conditions directly related to the input current intensities which
enter the electrode. We will impose these boundary conditions by means of
Lagrange multipliers and study the resulting mixed formulations.
Since we only consider the conducting domain, we will get an important
saving in computer time when compared with the model of the whole furnace,
and we will still be able to consider some important effects which are not taken
into account by the axisymmetric models, although not the proximity effect.
We consider a bounded conducting domain $\Omega$ having a Lipschitz-continuous
and connected boundary. However, it is not necessary that $\Omega$ be simply
connected. Let $\partial\Omega$ be the boundary of the domain $\Omega$, which splits into two
parts: $\partial\Omega = \bar{\Gamma}_E \cup \bar{\Gamma}_J$. The surface $\Gamma_E$ corresponds to the tip of the electrode,
where the electric arc arises. In its turn, the rest of the electrode boundary
splits as follows:

$$\bar{\Gamma}_J = \bar{\Gamma}_J^0 \cup \bar{\Gamma}_J^1 \cup \cdots \cup \bar{\Gamma}_J^N,$$

where $\Gamma_J^n$, $n = 1, \ldots, N$, are the parts of the boundary connected to the wires
supplying electric current to the electrode, and $\Gamma_J^0 = \Gamma_J \setminus (\bar{\Gamma}_J^1 \cup \cdots \cup \bar{\Gamma}_J^N)$ is the
remaining part (see Figure 6). We also assume $\bar{\Gamma}_J^n \cap \bar{\Gamma}_E = \emptyset$ and $\bar{\Gamma}_J^n \cap \bar{\Gamma}_J^m = \emptyset$,
$m, n = 1, \ldots, N$, $m \ne n$.
Our goal is to solve the eddy current equations (1)-(5) subject to the
following boundary conditions:

$$E \times n = 0 \quad \text{on } \Gamma_E, \tag{10}$$
$$\int_{\Gamma_J^n} \operatorname{curl} H \cdot n = I_n, \quad n = 1, \ldots, N, \tag{11}$$
$$E \times n = 0 \quad \text{on } \Gamma_J^n, \quad n = 1, \ldots, N, \tag{12}$$
$$\operatorname{curl} H \cdot n = 0 \quad \text{on } \Gamma_J^0, \tag{13}$$
$$\mu H \cdot n = 0 \quad \text{on } \partial\Omega, \tag{14}$$

where the only data $I_n$, $n = 1, \ldots, N$, are the current intensities through each
wire.

Fig. 6. Example of domain


Condition (10) is the natural one to model the free current exit on the
electrode tip. Conditions (11) and (13) take into account the input intensities
and the fact that there is no current flow through $\Gamma_J^0$, respectively. Conditions
(12) and (14) have been proposed by Bossavit [12] in a more general setting.
They will appear as natural boundary conditions of the weak formulation of our
problem. The former implies the assumption that the electric current is normal
to the surface on the current entrance, whereas the latter means that the
magnetic field is tangential to the conductor surface. Of course, condition (14)
is not always fulfilled, but it is a good approximation in our model problem.
Next, we analyze a weak formulation of this problem in terms of the mag-
netic field and propose a finite element method for its numerical solution.

3.2 Analysis of the weak formulation of the problem

To obtain a weak formulation of the eddy current problem (1)-(5) with boundary
conditions (10)-(14) in terms of the magnetic field, we notice that the
boundary condition (14) implies that the tangential component of $E$ on the
boundary of $\Omega$ is a gradient. In particular, we obtain that $E \times n = -\nabla\phi \times n$
on $\partial\Omega$ for some scalar function $\phi$ with $\phi = 0$ on $\Gamma_E$, because of (10).
Moreover, because of (12), $\phi|_{\Gamma_J^n}$ must be constant. Then, multiplying
equation (2) by a test function $G$ such that $\operatorname{curl} G \cdot n = 0$ on $\Gamma_J^0$ and
$\int_{\Gamma_J^n} \operatorname{curl} G \cdot n = 0$, $n = 1, \ldots, N$, using Green's formula, and taking into
account that $E = \frac{1}{\sigma} \operatorname{curl} H$, we obtain

$$i\omega \int_{\Omega} \mu H \cdot \bar{G} + \int_{\Omega} \frac{1}{\sigma} \operatorname{curl} H \cdot \operatorname{curl} \bar{G} = 0.$$
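
The following formal computation spells out why the boundary term produced by Green's formula vanishes for such test functions. It is only a sketch: it assumes enough smoothness for all integrals to make sense and a potential $\phi$ extended to the closure of the domain, and it is not the rigorous argument of [9]. Here $\Gamma_E$ denotes the electrode tip, $\Gamma_J^n$ the wire contacts and $\Gamma_J^0$ the remaining part of the boundary.

```latex
% Green's formula for the curl, applied to equation (2) tested with \bar{G}:
\int_\Omega \operatorname{curl} E \cdot \bar{G}
  = \int_\Omega E \cdot \operatorname{curl} \bar{G}
  + \int_{\partial\Omega} (n \times E) \cdot \bar{G} .
% Since E \times n = -\nabla\phi \times n on \partial\Omega, integrating by parts
% on the closed surface \partial\Omega gives
\int_{\partial\Omega} (n \times E) \cdot \bar{G}
  = \int_{\partial\Omega} \phi \, \operatorname{curl} \bar{G} \cdot n
  = \sum_{n=1}^{N} \phi|_{\Gamma_J^n} \int_{\Gamma_J^n} \operatorname{curl} \bar{G} \cdot n
  + \int_{\Gamma_J^0} \phi \, \operatorname{curl} \bar{G} \cdot n
  = 0 ,
% because \phi = 0 on \Gamma_E, \phi is constant on each \Gamma_J^n, and the test
% function satisfies curl G . n = 0 on \Gamma_J^0 and has zero flux through each \Gamma_J^n.
```

Only the volume terms therefore survive once $E$ is replaced by $\frac{1}{\sigma} \operatorname{curl} H$.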

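Let $X := H(\operatorname{curl}, \Omega)$ and $a: X \times X \to \mathbb{C}$ be the sesquilinear continuous
and elliptic form defined by

$$a(H, G) := i\omega \int_{\Omega} \mu H \cdot \bar{G} + \int_{\Omega} \frac{1}{\sigma} \operatorname{curl} H \cdot \operatorname{curl} \bar{G}.$$

Let $\mathcal{L}$ be the following closed subspace of $H_{00}^{1/2}(\Gamma_J)$:

$$\mathcal{L} := \{ \nu \in H_{00}^{1/2}(\Gamma_J) : \nu|_{\Gamma_J^n} = \text{constant}, \ n = 1, \ldots, N \}.$$

Given $I = (I_1, \ldots, I_N) \in \mathbb{C}^N$, let us consider the closed linear manifold of
$X$,

$$W(I) := \Big\{ G \in X : \langle \operatorname{curl} G \cdot n, \nu \rangle_{\Gamma_J} = \sum_{n=1}^{N} I_n \, \bar{\nu}|_{\Gamma_J^n} \quad \forall \nu \in \mathcal{L} \Big\},$$

and its associated subspace

$$W(0) = \{ G \in X : \langle \operatorname{curl} G \cdot n, \nu \rangle_{\Gamma_J} = 0 \quad \forall \nu \in \mathcal{L} \}.$$

We introduce the following problem:

Problem PI.- For any $I \in \mathbb{C}^N$, find $H \in W(I)$ such that

$$a(H, G) = 0 \quad \forall G \in W(0).$$

Theorem 5. Given $I \in \mathbb{C}^N$, problem PI has a unique solution $H$.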


To avoid dealing with functions that satisfy the constraints involved in
$W(I)$ and $W(0)$, we consider a mixed formulation of the problem. It consists in
handling the boundary conditions (11) and (13) in a weak sense by introducing
a Lagrange multiplier defined on $\Gamma_J$.
Let $b$ be the sesquilinear form defined in $X \times \mathcal{L}$ by

$$b(G, \nu) := \langle \operatorname{curl} G \cdot n, \nu \rangle_{\Gamma_J}.$$

The mixed problem associated with problem PI is the following:

Problem MPI.- Given $I \in \mathbb{C}^N$, find $H \in X$ and $\lambda \in \mathcal{L}$ such that

$$a(H, G) + b(G, \lambda) = 0 \quad \forall G \in X,$$
$$b(H, \nu) = \sum_{n=1}^{N} I_n \, \bar{\nu}|_{\Gamma_J^n} \quad \forall \nu \in \mathcal{L}.$$

Theorem 6. Given $I \in \mathbb{C}^N$, let $H \in X$ be the solution of problem PI. Then,
there exists a unique $\lambda \in \mathcal{L}$ such that $(H, \lambda)$ is the only solution of problem
MPI. Furthermore, the following estimate holds:

The proof is based on the classical Babuska-Brezzi theory. In particular, we
prove the inf-sup condition for the sesquilinear form $b$ by using results concerning
vector potentials in $\mathbb{R}^3$ (see [9]).
Theorem 3.5 in [9] shows that the solution of problem MPI, together with
$E = \frac{1}{\sigma} \operatorname{curl} H$ and $J = \operatorname{curl} H$, satisfies the Maxwell equations (1)-(5) and
the boundary conditions (10)-(14) in a suitable weak sense. Moreover, from
that theorem, we also have that the Lagrange multiplier is an electric surface
potential on $\Gamma_J$, namely,

$$n \times (E \times n) = -n \times (\nabla\lambda^* \times n) =: -\operatorname{grad}_\Gamma \lambda^* \quad \text{on } \Gamma_J,$$

$\lambda^*$ being a lifting of $\lambda$ to $\Omega$ such that $\lambda^* \in H^1(\Omega)$ and $\lambda^*|_{\Gamma_E} = 0$.
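
The algebraic structure of a mixed problem of this kind can be sketched on a small dense system. Everything in the block is an illustrative assumption rather than the authors' implementation: the matrices `M`, `S` and `B` are random stand-ins for the mass term, the curl-curl term and the constraint form $b$, and `solve_mixed` is a hypothetical helper name.

```python
import numpy as np

def solve_mixed(A, B, rhs_constraint):
    """Solve the saddle-point system [[A, B^H], [B, 0]] [h; lam] = [0; rhs_constraint]."""
    n, m = A.shape[0], B.shape[0]
    K = np.zeros((n + m, n + m), dtype=complex)
    K[:n, :n] = A
    K[:n, n:] = B.conj().T
    K[n:, :n] = B
    rhs = np.concatenate([np.zeros(n, dtype=complex), rhs_constraint])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]

rng = np.random.default_rng(0)
n, m, omega = 8, 2, 50.0
C = rng.standard_normal((n, n))
M = C @ C.T + n * np.eye(n)            # symmetric positive definite stand-in for the mass term
D = rng.standard_normal((n - 3, n))
S = D.T @ D                            # positive semidefinite (singular) stand-in for curl-curl
A = 1j * omega * M + S                 # discrete counterpart of the form a
B = rng.standard_normal((m, n))        # discrete counterpart of the constraint form b
I = np.array([1.0 + 0.0j, -0.5 + 0.0j])    # prescribed "intensities"

h, lam = solve_mixed(A, B, I)
```

The multiplier adds `m` unknowns but leaves the space for `h` unconstrained, which is the trade-off described in the text. In this sketch the system is uniquely solvable because the imaginary part of `A` is definite and `B` has full row rank.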

3.3 Finite element discretization

In this section we introduce a discretization of the mixed problem MPI and
study its convergence properties. To this end, we assume that $\Omega$ is a Lipschitz
polyhedron and that $\Gamma_J^n$ are polyhedral surfaces for all $n = 0, \ldots, N$. Consequently,
$\Gamma_J$ is also a polyhedral surface. We also assume that $\sigma$ is piecewise
smooth (e.g., $C^2$) on a polyhedral partition of $\Omega$.
We consider a family of shape-regular tetrahedral meshes $\{\mathcal{T}_h\}$ of $\Omega$. We
assume that the meshes are compatible with the splitting of the boundary of
the domain in the sense that, $\forall K \in \mathcal{T}_h$ with a face $T$ lying on $\partial\Omega$,
- either $T \subset \Gamma_E$ or $T \subset \Gamma_J^n$ for some $n = 0, \ldots, N$;
- $\sigma|_T$ is smooth.

The magnetic field, which is a function of $X = H(\operatorname{curl}, \Omega)$, is discretized by
the lowest-order Nedelec edge finite elements described in Section 2, i.e. we
define $X_h = \mathcal{N}_h(\Omega)$ as an approximation of $X$.
Let $\mathcal{T}_h^{\Gamma_J}$ be the triangular mesh induced by $\mathcal{T}_h$ on the polyhedral surface $\Gamma_J$
and consider the following finite-dimensional space:

The Lagrange multiplier will be approximated in the finite dimensional space

$$\mathcal{L}_h := \{ \nu_h \in Q_h(\Gamma_J) : \nu_h|_{\Gamma_J^n} = \text{constant}, \ n = 1, \ldots, N \}.$$
We define the following discrete problem:

Problem DMPI.- Given $I \in \mathbb{C}^N$, find $H_h \in X_h$ and $\lambda_h \in \mathcal{L}_h$ such that

$$a(H_h, G_h) + b(G_h, \lambda_h) = 0 \quad \forall G_h \in X_h,$$
$$b(H_h, \nu_h) = \sum_{n=1}^{N} I_n \, \bar{\nu}_h|_{\Gamma_J^n} \quad \forall \nu_h \in \mathcal{L}_h.$$

Theorem 7. Given $I \in \mathbb{C}^N$, problem DMPI has a unique solution $(H_h, \lambda_h)$.
Furthermore, if the solution $(H, \lambda)$ of problem MPI satisfies $H \in H^r(\operatorname{curl}, \Omega)$
with $1/2 < r \le 1$, then the following error estimate holds true:

References

1. Alonso, A., Valli, A. (1997): A domain decomposition approach for heterogeneous time-harmonic Maxwell equations, Comput. Methods Appl. Mech. Engrg., 143, 97-112.
2. Alonso, A., Valli, A. (1999): An optimal domain decomposition preconditioner for low-frequency time-harmonic Maxwell equations, Math. Comp., 68, 607-631.
3. Ammari, H., Buffa, A., Nedelec, J.-C. (2000): A justification of eddy currents model for the Maxwell equations, SIAM J. Appl. Math., 60, 1805-1823.
4. Amrouche, C., Bernardi, C., Dauge, M., Girault, V. (1998): Vector potentials in three-dimensional non-smooth domains, Math. Methods Appl. Sci., 21, 823-864.
5. Bermudez, A., Bullon, J., Pena, F., Salgado, P. (2003): A numerical method for transient simulation of metallurgical compound electrodes, Finite Elem. Anal. Des., 39, 283-299.
6. Bermudez, A., Munoz, R. (1999): Existence of solution of a coupled problem arising in the thermoelectrical simulation of an electrode, Quart. of Appl. Math., 57, 621-636.
7. Bermudez, A., Rodriguez, R., Salgado, P. (2002): A finite element method with Lagrange multipliers for low-frequency harmonic Maxwell equations, SIAM J. Numer. Anal., 40, 1823-1849.
8. Bermudez, A., Rodriguez, R., Salgado, P. (2003): Numerical analysis of the electric field formulation of an eddy currents problem, C. R. Acad. Sci. Paris, Serie I, 337, 359-364.
9. Bermudez, A., Rodriguez, R., Salgado, P.: Numerical treatment of realistic boundary conditions for the eddy current problem in an electrode via Lagrange multipliers, Math. Comp. (to appear).
10. Bossavit, A. (1991): The computation of eddy-currents in dimension 3 by using mixed finite elements and boundary elements in association, Math. Comput. Modelling, 15, 33-42.
11. Bossavit, A. (1999): "Hybrid" electric-magnetic methods in eddy-current problems, Comput. Methods Appl. Mech. Engrg., 178, 383-391.
12. Bossavit, A. (2000): Most general "non-local" boundary conditions for the Maxwell equation in a bounded region, COMPEL, 19, 239-245.
13. Bossavit, A., Verite, J.C. (1982): A mixed FEM-BIEM method to solve 3-D eddy current problems, IEEE Trans. Mag., 18, 431-435.
14. Bullon, J., Gallego, V. (1994): The use of a compound electrode for the production of silicon metal. Electric Furnace Conference Proceedings, Vol. 52, Iron & Steel Society, Warrendale, PA, 371-374.
15. D'Ambrosio, P., Letizia, I. (1980): Temperature and internal stress distribution of carbon electrodes used in an electric arc furnace for the production of silicon metal. Proceedings of Carbon'80, Baden-Baden, 526-529.
16. Howison, S.D., Rodrigues, J.F., Shillor, M. (1993): Stationary solutions to the thermistor problem, J. Math. Anal. Appl., 174, 573-588.
17. Innvær, R., Fidje, K., Sira, T. (1987): 3-dimensional calculations on smelting electrodes. Reprinted in MIC-Model. Identif. Control, 8, 103-115.
18. Innvær, R., Olsen, L. (1980): Practical use of mathematical models for Soderberg electrodes. In: Electric Furnace Conference Proceedings, Vol. 38, Iron & Steel Society, Warrendale, PA, 40-47.
19. Nedelec, J.-C. (1980): Mixed finite elements in $\mathbb{R}^3$, Numer. Math., 35, 315-341.
20. Salgado, P. (2002): Mathematical and numerical analysis of some electromagnetic problems. Application to the simulation of metallurgical electrodes. Ph.D. Thesis, Universidade de Santiago de Compostela, Spain.
Space Decomposition Preconditioners
and Parallel Solvers

Radim Blaheta

Institute of Geonics AS CR
Studentská 1768, 70800 Ostrava-Poruba, Czech Republic
[email protected]. cz

Summary. This paper uses the general framework of space decomposition - subspace
correction to provide an overview of Schwarz-type preconditioners. The
considered preconditioners are one-level and two-level Schwarz methods based on an
overlapping domain decomposition, a two-level method with the coarse grid space
created by aggregation and a new two-level method with interfaces in the coarse
grid space. Besides the description and analysis, we also discuss some implementation
details, the use of inexact subproblem solvers, etc. The efficiency of the preconditioners
is illustrated by numerical examples.

1 Introduction

This paper concerns the numerical solution of large scale linear systems arising
from the finite element (FE) solution of boundary value problems. In particular,
attention is paid to those numerical methods which are suitable for
implementation on high performance parallel computers.
This motivates the interest in space decomposition (SD) preconditioners
described in Section 2. These preconditioners provide a general framework,
which has many specific applications. This paper concentrates on Schwarz-type
preconditioners, which are usually based on the overlapping domain decompo-
sition (DD). This class itself involves many variants of the basic technique and
some of them will be reported in this paper. There are also many other use-
ful decompositions, which can be used for the construction of preconditioners.
Let us mention two examples: the hierarchical decomposition of FE spaces (cf.
[13, 20, 10]) or displacement decomposition for elasticity problems (cf. e.g. [4]
and the references therein).
An overview of the standard one-level and two-level overlapping DD meth-
ods can be found in Sections 3 and 4. These methods are also called Schwarz
methods according to the pioneering work of H.A. Schwarz from 1870, cf. [19].
But the real and rapid development of these methods, motivated by the interest
in parallel computing, started in the second half of the 1980s.
In Section 5, we describe a less standard two-level Schwarz method with
the auxiliary global problem created by aggregation of unknowns. This method
is useful in many cases, when application of the standard two-level methods
requires a lot of extra work involved in creating an auxiliary coarse grid and
relating this grid to the original one.
Section 6 is devoted to the description of a new two-level method, which
uses non-overlapping DD and interaction of subdomain problems only via
a special coarse grid space with interfaces.
Some aspects of the implementation of the preconditioners on parallel computers,
the use of inexact subproblem solvers and the use of nonsymmetric multiplicative
or hybrid preconditioners are discussed in Section 7. The preconditioners
which are not symmetric positive definite can be efficiently implemented
within the generalized preconditioned conjugate gradient method (GPCG), cf.
[4] and the references therein. In Section 8, we provide some numerical results
illustrating the efficiency of the described methods and corresponding solvers.
In the final section, we summarize the results and mention some topics not
covered in the paper.

2 Abstract SD preconditioners

Our aim is the solution of an abstract discrete symmetric elliptic problem in the
following form:

$$\text{find } u \in V: \quad a(u, v) = l(v) \quad \forall v \in V, \tag{1}$$

where $V$ is a finite dimensional subspace of a Hilbert space $\mathcal{V} = \mathcal{V}(\Omega)$ of
functions defined in a domain $\Omega \subset R^d$ ($d = 2, 3$), $a$ is a bounded symmetric
positive definite (SPD) bilinear form on $V$ and $l$ is a bounded linear functional
on $V$.
To be more specific, we shall consider elliptic boundary value problems
with the bilinear form

$$a(u, v) = \int_{\Omega} \sum_{i,j=1}^{d} k_{ij} \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} \, dx \tag{2}$$

defined in the Sobolev space $H^1(\Omega)$ equipped with the seminorm $|\cdot|_{H^1(\Omega)}$ and
the norm $\|\cdot\|_{H^1(\Omega)}$. We assume that $K = (k_{ij})$ is a symmetric positive definite
$d \times d$ matrix, which guarantees the existence of positive constants $\kappa_1$, $\kappa_2$ such
that
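$$\kappa_1 |v|_{H^1(\Omega)}^2 \le a(v, v) \le \kappa_2 |v|_{H^1(\Omega)}^2 \quad \forall v \in H^1(\Omega). \tag{3}$$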
Sometimes, we shall assume that $V \subset H^1(\Omega)$ is such that

$$\kappa_0 \|v\|_{H^1(\Omega)}^2 \le a(v, v) \quad \forall v \in V. \tag{4}$$

Let us note that most of the presented results are valid also for other elliptic
problems, e.g. problems of linear elasticity.
Using an inner product $(\cdot,\cdot)$ in $V$, the problem (1) can be rewritten in the
following operator form:

$$\text{find } u \in V: \quad Au = b, \tag{5}$$

where $A: V \to V$ and $b \in V$ are determined by the identities

$$(Au, v) = a(u, v), \quad (b, v) = l(v) \quad \forall u, v \in V.$$

Further, we shall assume that there is a decomposition of the space $V$,

$$V = V_1 + \cdots + V_m, \tag{6}$$

where $V_k$ ($k = 1, \ldots, m$) are subspaces of $V$, which are not necessarily linearly
independent. For each subspace $V_k$, we introduce
- the operator $A_k: V_k \to V_k$ defined by $(A_k u, v) = a(u, v)$ $\forall u, v \in V_k$,
- the prolongation operator $I_k: V_k \to V$ given by the inclusion $V_k \subset V$,
- the restriction operator $R_k: V \to V_k$ given by the orthogonal projection onto $V_k$, i.e.
  $\forall v \in V$: $R_k v \in V_k$ and $(v - R_k v, w) = 0$ $\forall w \in V_k$.
Note that $I_k$ and $R_k$ are adjoint with respect to $(\cdot,\cdot)$ and $A_k = R_k A I_k$.
The decomposition (6) allows us to introduce space decomposition (SD) preconditioners.
An SD preconditioner is an operator $G$, which can be used for
a cheap computation of the pseudoresidual $g = Gr$ from a given residual $r$.
The pseudoresidual should provide an approximation of the error $A^{-1} r$ or at
least its direction. The computation of pseudoresiduals $g$ is realized by the
following SD algorithm,

g = 0
for k = 1, ..., m do
    g <- g + I_k A_k^{-1} R_k z_k
end
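
The loop above can be sketched in a few lines of NumPy. The sketch makes simplifying assumptions that are not part of the paper: $V = R^n$ with the Euclidean inner product, and each subspace $V_k$ is spanned by a subset of the coordinate vectors, so that $R_k$ picks entries, $I_k$ pads with zeros and $A_k$ is a principal submatrix. The name `sd_apply` is hypothetical. Two choices of $z_k$ are shown: the given residual $r$, and the updated residual $r - Ag$.

```python
import numpy as np

def sd_apply(A, subspaces, r, multiplicative=False):
    """Pseudoresidual g = G r for coordinate subspaces given as index arrays."""
    g = np.zeros(len(r))
    for idx in subspaces:
        z = r - A @ g if multiplicative else r      # choice of z_k
        A_k = A[np.ix_(idx, idx)]                   # A_k = R_k A I_k
        g[idx] += np.linalg.solve(A_k, z[idx])      # g <- g + I_k A_k^{-1} R_k z_k
    return g

n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # 1D Laplacian, SPD
subspaces = [np.arange(0, 5), np.arange(3, 8)]             # two overlapping index blocks
r = np.arange(1.0, n + 1.0)

g_add = sd_apply(A, subspaces, r)                          # z_k = r
g_mul = sd_apply(A, subspaces, r, multiplicative=True)     # z_k = r - A g
```

With non-overlapping blocks and the updated residual, this loop reduces to one block Gauss-Seidel sweep started from zero, which is a convenient sanity check for an implementation.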

This SD algorithm represents several types of preconditioners:

1. The additive preconditioner $G_A: r \to g$ arises if $z_k = r$. In this case,

$$G_A = \sum_{k=1}^{m} B_k, \qquad B_k = I_k A_k^{-1} R_k, \tag{7}$$

is symmetric and positive definite with respect to $(\cdot,\cdot)$. Moreover,

$$G_A A = \sum_{k=1}^{m} P_k, \qquad P_k = B_k A, \tag{8}$$

where $P_k$ are projections $V \to V_k$, which are orthogonal with respect to
the inner product induced by $A$, $(u, v)_A = (Au, v)$.

2. The multiplicative preconditioner $G_M: r \to g$ arises if $z_k = r - Ag$.
In this case, it is easy to verify by induction that

$$G_M = [I - (I - P_m) \cdots (I - P_1)] A^{-1} = A^{-1} [I - (I - AB_m) \cdots (I - AB_1)]. \tag{9}$$

This preconditioner is not symmetric. To obtain a symmetric multiplicative
operator, it is possible to continue the subspace corrections in the reverse
order. It gives

$$G_{SM} = [I - (I - P_1) \cdots (I - P_m)(I - P_m) \cdots (I - P_1)] A^{-1}. \tag{10}$$

For this preconditioner, it is easy to show that $G_{SM} A$ is symmetric in $(\cdot,\cdot)_A$,
which implies that $G_{SM}$ is symmetric in $(\cdot,\cdot)$. Similarly, it is easy to show
that $G_{SM} A$ and $G_{SM}$ are positive semidefinite. The positive definiteness
of $G_{SM}$ is equivalent to the fact that $I - G_M A$ is convergent (see Theorem
2).
3. The hybrid preconditioner $G_H: r \to g$ arises if some residuals are
updated and others are kept. For example, we can update only the residual
after the first subspace correction, which gives

$$G_H = B_1 + \sum_{k=2}^{m} B_k (I - AB_1) = \Big[ P_1 + \sum_{k=2}^{m} P_k (I - P_1) \Big] A^{-1}. \tag{11}$$

This preconditioner can again be symmetrized to the form

$$G_{SH} = \Big[ P_1 + (I - P_1) \sum_{k=2}^{m} P_k (I - P_1) \Big] A^{-1}. \tag{12}$$

This shows that the preconditioned operator $G_{SH} A$ is decomposed,

$$G_{SH} A = P_1 + (I - P_1) \sum_{k=1}^{m} P_k (I - P_1), \tag{13}$$

i.e. $G_{SH} A$ is equal to the identity on the range $R(P_1)$ and to the additively
preconditioned operator $G_A A$ on the range $R(I - P_1)$. Thus, we can expect that
the hybrid preconditioner will be more efficient than the additive one, cf.
also [17].
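
The operator identities in this list can be checked numerically on a small example. The sketch below again assumes coordinate subspaces of $R^n$ with the Euclidean inner product (an assumption made for illustration only) and forms the $A$-orthogonal projections as dense matrices. The symmetrized multiplicative operator is built as a forward sweep followed by a sweep in reverse order.

```python
import numpy as np

def projections(A, subspaces):
    """A-orthogonal projections P_k = I_k A_k^{-1} R_k A onto coordinate subspaces."""
    n = A.shape[0]
    P = []
    for idx in subspaces:
        B_k = np.zeros((n, n))
        B_k[np.ix_(idx, idx)] = np.linalg.inv(A[np.ix_(idx, idx)])
        P.append(B_k @ A)
    return P

n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # 1D Laplacian, SPD
subspaces = [np.arange(0, 4), np.arange(2, 6), np.arange(4, 8)]
P = projections(A, subspaces)
I = np.eye(n)
Ainv = np.linalg.inv(A)

E = I.copy()
for P_k in P:                        # E = (I - P_m) ... (I - P_1)
    E = (I - P_k) @ E
E_rev = I.copy()
for P_k in reversed(P):              # E_rev = (I - P_1) ... (I - P_m)
    E_rev = (I - P_k) @ E_rev

G_A = sum(P) @ Ainv                  # additive preconditioner
G_M = (I - E) @ Ainv                 # multiplicative preconditioner
G_SM = (I - E_rev @ E) @ Ainv        # symmetrized multiplicative preconditioner
```

Forming dense projections is only sensible for such toy sizes; in practice the operators are applied through subspace solves, never assembled.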
Further in this paper, we shall exploit the following two assumptions, which
characterize the considered decomposition:

(A1) There exists a constant $K_0$ such that

$$\forall v \in V \ \exists v_k \in V_k: \quad v = v_1 + \cdots + v_m, \quad \sum_k \|v_k\|_A^2 \le K_0 \|v\|_A^2. \tag{14}$$

(A2) There exists a constant $K_1$ such that

$$\forall v \in V \ \forall v_k \in V_k, \ v = v_1 + \cdots + v_m: \quad \|v\|_A^2 \le K_1 \sum_k \|v_k\|_A^2. \tag{15}$$

Note that $\|v\|_A = \sqrt{(v, v)_A}$. A trivial upper bound is $K_1 = m$. But we are
interested in $m$-independent bounds for $K_1$, which can be found e.g. by
considering the angles between the subspaces $V_k$. If

$$\varepsilon_{kl} = \cos(V_k, V_l) = \sup\{ (v_k, v_l)_A : v_k \in V_k, \ v_l \in V_l, \ \|v_k\|_A = 1, \ \|v_l\|_A = 1 \},$$

$\mathcal{E} = (\varepsilon_{kl})$ and $\rho(\mathcal{E})$ is the spectral radius of $\mathcal{E}$, then

$$K_1 \le \rho(\mathcal{E}) \le \max_k \sum_l \varepsilon_{kl}. \tag{16}$$
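
The quantities in (16) can be computed explicitly for a small example. The sketch below assumes coordinate subspaces of $R^n$ (an illustrative assumption, as are the function names) and evaluates each cosine as the largest singular value of $Q_k^T A Q_l$, where the columns of $Q_k$ form an $A$-orthonormal basis of $V_k$.

```python
import numpy as np

def a_orthonormal_basis(A, idx):
    """Columns form an A-orthonormal basis of the coordinate subspace with indices idx."""
    n = A.shape[0]
    L = np.linalg.cholesky(A[np.ix_(idx, idx)])    # A_k = L L^T
    Q = np.zeros((n, len(idx)))
    Q[idx, :] = np.linalg.inv(L).T                 # then Q^T A Q is the identity
    return Q

def cosine_matrix(A, subspaces):
    """eps[k, l] = cos(V_k, V_l) measured in the A-inner product."""
    Q = [a_orthonormal_basis(A, idx) for idx in subspaces]
    m = len(Q)
    eps = np.zeros((m, m))
    for k in range(m):
        for l in range(m):
            eps[k, l] = np.linalg.svd(Q[k].T @ A @ Q[l], compute_uv=False)[0]
    return eps

n = 12
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # 1D Laplacian, SPD
subspaces = [np.arange(0, 4), np.arange(3, 7), np.arange(6, 10), np.arange(9, 12)]

eps = cosine_matrix(A, subspaces)
rho = np.max(np.abs(np.linalg.eigvals(eps)))       # spectral radius of the cosine matrix
row_sum_bound = eps.sum(axis=1).max()              # max_k sum_l eps[k, l]
```

In this example only neighbouring index blocks interact, so most entries of `eps` vanish and the bound stays below the trivial value $m = 4$ regardless of how many blocks are chained together.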

Sometimes, one of the subspaces, say $V_1$, plays an exceptional role. Then
it may be useful to consider $\mathcal{E}_1 = (\varepsilon_{kl}, \ k, l \ne 1)$ and the estimate

(17)

Theorem 1. Let $\lambda_{\min}(G_A A)$, $\lambda_{\max}(G_A A)$ and $\operatorname{cond}(G_A A)$ denote respectively
the smallest and largest eigenvalue and the condition number of the additively
preconditioned operator $A$. Then

Remark 1. The estimate of $\lambda_{\min}(G_A A)$ is well known as Lions' lemma, see [16].
The proof of Theorem 1 as well as some historical remarks can be found e.g.
in books [13, 20, 18].

Theorem 2. For the multiplicative preconditioner, we get

$$\|I - G_{SM} A\|_A = \|I - G_M A\|_A^2 \le 1 - [K_0 (1 + K_1)^2]^{-1}, \tag{19}$$

$$\operatorname{cond}(G_{SM} A) \le [1 - \|I - G_{SM} A\|_A]^{-1}. \tag{20}$$

Remark 2. The proof of the estimate (19) is more technical, see [7]. The com-
plete proof can be found e.g. in [20, 18]. The estimate (20) easily follows from
(19).

3 Schwarz preconditioners based on overlapping DD

A suitable space decomposition (6) can be constructed via an overlapping
decomposition of the computational domain $\Omega$. Now, we describe and analyse
a typical model situation. We solve the boundary value problem (1) in $\Omega \subset R^d$
by the finite element method with linear triangular or tetrahedral elements.
Let $\mathcal{T}_h$ be a regular finite element division of $\Omega$, $h \sim \max\{\operatorname{diam}(e), \ e \in \mathcal{T}_h\}$.
Moreover, let us assume that there is a division of $\Omega$ into $m$ non-overlapping
subdomains $\Omega_1^0, \ldots, \Omega_m^0$. Let $\Omega_k^\delta$ be subdomains of $\Omega$ aligned with the division
$\mathcal{T}_h$ and such that $\Omega_k^\delta \supset \{x \in \Omega : \operatorname{dist}(x, \Omega_k^0) \le \delta/2\}$. Let us also denote
$H \sim \max\{\operatorname{diam}(\Omega_k^\delta), \ k = 1, \ldots, m\}$ and assume that each point of $\Omega$ belongs
to at most $m_c$ subdomains, see Fig. 1.

Fig. 1. Decomposition of $\Omega$, subdomains $\Omega_k^0$, $\Omega_k^\delta$, fine triangulation $\mathcal{T}_h$, $m_c = 4$

The overlapping domain decomposition

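$$\Omega = \Omega_1^\delta \cup \dots \cup \Omega_m^\delta \qquad (21)$$

now induces a decomposition (6) of the finite element space $V = V_h$ into the
subspaces $V_1, \dots, V_m$,

$$V_k = \{ v \in V : v = 0 \text{ in } \Omega \setminus \Omega_k^\delta \} . \qquad (22)$$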

Moreover, we get the following characterization of this decomposition.

Theorem 3. Under the above assumptions, the decomposition $V = V_1 + \dots + V_m$
with the local FE spaces (22) fulfils the assumptions (A1), (A2) with the
constants
(23)

Remark 3. The proof of Theorem 3 can be found in the works of M. Dryja


and O. Widlund, see the books [13, 20] and the references given there. We
sketch the proof here, because it contains ideas important for understanding
the further development of the method in the next sections.

Sketch of the proof.


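- For the overlapping domain decomposition, there is a partition of unity
$\theta_1, \dots, \theta_m \in C^\infty(\mathbb{R}^d)$, $0 \le \theta_k \le 1$,

$$\theta_k = 0 \text{ in } \mathbb{R}^d \setminus \Omega_k^\delta , \qquad \sum_{k=1}^m \theta_k = 1 , \qquad \|\mathrm{grad}\, \theta_k\|_{L^\infty(\Omega)} = O(\delta^{-1}) .$$

- Using this partition of unity and the piecewise linear interpolation
$I_h : C(\bar\Omega) \to V_h$, we can decompose any element $v \in V_h$ as follows,

$$v = \sum_k v_k , \qquad v_k = I_h(\theta_k v) .$$

- Property (A1):

$$\sum_k \|v_k\|_A^2 \le \kappa_2 \sum_k |v_k|^2_{H^1(\Omega_k^\delta)} \le \kappa_2 \sum_k \sum_{T \subset \Omega_k^\delta,\ T \in \mathcal{T}_h} |I_h(\theta_k v)|^2_{H^1(T)} .$$

On the element level, it holds that

see e.g. [10], Lemma 7.4.17 or [12]. Moreover, $T \in \mathcal{T}_h$ can belong to at most
$m_c$ subdomains $\Omega_k^\delta$. Thus,

$$\begin{aligned}
\sum_k \|v_k\|_A^2 &\le \kappa_2 \mu m_c \sum_{T \in \mathcal{T}_h} \left[ \delta^{-2} \|v\|^2_{L^2(T)} + |v|^2_{H^1(T)} \right] \\
&\le \kappa_2 \mu m_c (\delta^{-2} + 1) \|v\|^2_{H^1(\Omega)} \\
&\le \kappa_2 \mu m_c \kappa_0^{-1} (\delta^{-2} + 1) \|v\|_A^2 .
\end{aligned}$$

Note that $\kappa_0$, $\kappa_1$, $\kappa_2$ are constants from (3), (4).
- Property (A2):

$$\begin{aligned}
\Big\| \sum_k v_k \Big\|_A^2 &= \sum_{ij} a(v_i, v_j) = \sum_{ij} \sum_{T \in \mathcal{T}_h} a_T(v_i, v_j) = \sum_{T \in \mathcal{T}_h} \sum_{ij} a_T(v_i, v_j) \\
&\le \sum_{T \in \mathcal{T}_h} \sum_{ij} \sqrt{a_T(v_i, v_i)} \sqrt{a_T(v_j, v_j)} \\
&= \sum_{T \in \mathcal{T}_h} \sum_i \sqrt{a_T(v_i, v_i)} \sum_j \sqrt{a_T(v_j, v_j)} ,
\end{aligned}$$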

where $a_T(\cdot, \cdot)$ denotes the restriction of the bilinear form $a$ to $T$. Each
$T \in \mathcal{T}_h$ belongs to at most $m_c$ subdomains and therefore $\sum_i \sqrt{a_T(v_i, v_i)}$
has at most $m_c$ nonzero terms. The Cauchy-Bunyakowski-Schwarz (C.B.S.)
inequality thus gives

Hence,

□
Normally, the overlap $\delta$ is kept proportional to the size of subdomain $H_s$, i.e.
$\delta = \beta H_s$, where $\beta$ is the proportionality constant. In this case, a combination
of Theorems 1, 2, 3 says that $\mathrm{cond}(G_A A)$ and $\mathrm{cond}(G_{SM} A)$ deteriorate with
the increasing number of subdomains ($H_s \to 0$, $\delta \to 0$). A remedy can be found
in adding an auxiliary coarse grid FE space $V_0$ to the space decomposition with
local FE spaces (22), see the next section.
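
The deterioration with the number of subdomains can be observed on a very small example. The following Python sketch is not taken from the paper: it assumes a 1D Poisson model problem with linear elements, contiguous index blocks as subdomains, exact subdomain solves, and an overlap of about a quarter of the block size. All function names are illustrative.

```python
import numpy as np

def poisson_1d(n):
    """Stiffness matrix of -u'' on (0,1) with n interior nodes (scaled by h)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def overlapping_blocks(n, m, overlap):
    """Index sets of m contiguous blocks, each extended by `overlap` nodes per side."""
    edges = np.rint(np.linspace(0, n, m + 1)).astype(int)
    return [np.arange(max(edges[k] - overlap, 0), min(edges[k + 1] + overlap, n))
            for k in range(m)]

def additive_schwarz(A, blocks):
    """Dense one-level additive preconditioner: sum of extended local inverses."""
    G = np.zeros_like(A)
    for idx in blocks:
        G[np.ix_(idx, idx)] += np.linalg.inv(A[np.ix_(idx, idx)])
    return G

def condition_number(A, G):
    """cond(G A) computed from the symmetric matrix L^T G L, where A = L L^T."""
    L = np.linalg.cholesky(A)
    ev = np.linalg.eigvalsh(L.T @ G @ L)
    return ev[-1] / ev[0]

n = 240
A = poisson_1d(n)
for m in (4, 8, 16):
    H = n // m                                   # subdomain size in nodes
    blocks = overlapping_blocks(n, m, overlap=H // 4)
    print(m, round(condition_number(A, additive_schwarz(A, blocks)), 1))
```

In this setting each node lies in at most two blocks, so the largest eigenvalue stays below 2, while the smallest eigenvalue decreases as the number of blocks grows.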

4 Two-level Schwarz preconditioners

Let us consider the extended decomposition

$$V = V_0 + V_1 + \dots + V_m , \qquad (24)$$

where $V = V_h$ is the FE space corresponding to the FE division $\mathcal{T}_h$, $V_1, \dots, V_m$
are the local FE spaces (22) and $V_0 = V_H$ is the FE space corresponding to a
coarser FE division $\mathcal{T}_H$ of the domain $\Omega$. We assume that $\mathcal{T}_h$ is a refinement
of $\mathcal{T}_H$, which guarantees that $V_H \subset V_h$. Moreover, we assume that $H \le H_s$.
In a more general case of non-nested grids, we could use a prolongation
(interpolation) $I_H^h : V_H \to V_h$ ($I_H^h = I_h$) and choose $V_0 = I_H^h V_H$, see [12].
The subproblem operator $A_0$ corresponding to $V_0$ is determined by restriction
of the variational formulation to $V_0$. There is also the relation $A_0 = (I_H^h)^* A I_H^h$.

Theorem 4. The decomposition (24) fulfils the assumptions (A1), (A2) with
the constants

(25)

Sketch of the proof.


- For the analysis, we shall use the fact that there is a mapping $Q : V_h \to V_0$ and
constants $\sigma_1$, $\sigma_2$ such that

$$|Qv|_{H^1(\Omega)} \le \sigma_1 |v|_{H^1(\Omega)} , \qquad (26)$$

$$\|v - Qv\|_{L^2(\Omega)} \le \sigma_2 H |v|_{H^1(\Omega)} . \qquad (27)$$

The above properties can be proved e.g. if $Q$ is the $L^2$-orthogonal projection
onto $V_0$, see [8].
- For any $v \in V$, we take a decomposition $v = v_0 + v_1 + \dots + v_m$, $v_0 = Qv$,
$v_k = I_h(\theta_k (v - v_0))$ for $k = 1, \dots, m$.
- For this decomposition,

- Further,

$$\begin{aligned}
\sum_{k=1}^m \|v_k\|_A^2 &\le \kappa_2 \sum_{k=1}^m |v_k|^2_{H^1(\Omega_k^\delta)} \\
&\le \kappa_2 \mu m_c \left[ \delta^{-2} \|v - v_0\|^2_{L^2(\Omega)} + |v - v_0|^2_{H^1(\Omega)} \right] \\
&\le \kappa_2 \mu m_c \left[ \sigma_2^2 \delta^{-2} H^2 |v|^2_{H^1(\Omega)} + 2 |v|^2_{H^1(\Omega)} + 2 |v_0|^2_{H^1(\Omega)} \right] \\
&\le \kappa_2 \mu m_c \left[ 2 + 2 \sigma_1^2 + \sigma_2^2 \delta^{-2} H^2 \right] |v|^2_{H^1(\Omega)} \\
&\le C \left[ 1 + \delta^{-2} H^2 \right] \|v\|_A^2 .
\end{aligned}$$

- The above estimates show that $K_0 = C(1 + \delta^{-2} H^2)$.
- The estimate of $K_1$ follows easily from (17) and the estimate (23). □
Remark 4. The proof of Theorem 4 can be found again in the works of M. Dryja
and O. Widlund, see also [13, 20, 10]. The estimates (25) can be strengthened.
In [14], it is proved that $K_0 = C(1 + \delta^{-1} H)$. This estimate is sharp, cf. [9].
Remark 5. The constant $C$ depends on $\kappa_1$, $\kappa_2$, which indicates some dependence
on the anisotropy and jumps in coefficients of the bilinear form, cf. [2].
In contrast to Theorem 3, the assumption (4) was not used here.
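
The effect of the coarse space in the decomposition (24) can be illustrated in the same spirit. The sketch below is only an illustration under stated assumptions: a 1D Poisson problem, a nested coarse grid with one interior coarse node per subdomain interface, linear interpolation as the prolongation, and the coarse operator formed as $P^T A P$. The helper names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Stiffness matrix of -u'' on (0,1) with n interior nodes (scaled by h)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def overlapping_blocks(n, m, overlap):
    """Index sets of m contiguous blocks, each extended by `overlap` nodes per side."""
    edges = np.rint(np.linspace(0, n, m + 1)).astype(int)
    return [np.arange(max(edges[k] - overlap, 0), min(edges[k + 1] + overlap, n))
            for k in range(m)]

def coarse_prolongation(n, n_coarse):
    """Linear interpolation from n_coarse interior coarse nodes to n fine nodes."""
    x = np.arange(1, n + 1) / (n + 1)                 # fine nodes
    X = np.arange(1, n_coarse + 1) / (n_coarse + 1)   # coarse nodes
    H = 1.0 / (n_coarse + 1)
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - X[None, :]) / H)

def schwarz(A, blocks, P=None):
    """Additive preconditioner; with P given, the coarse solve P (P^T A P)^{-1} P^T is added."""
    G = np.zeros_like(A)
    for idx in blocks:
        G[np.ix_(idx, idx)] += np.linalg.inv(A[np.ix_(idx, idx)])
    if P is not None:
        G += P @ np.linalg.solve(P.T @ A @ P, P.T)
    return G

def condition_number(A, G):
    """cond(G A) computed from the symmetric matrix L^T G L, where A = L L^T."""
    L = np.linalg.cholesky(A)
    ev = np.linalg.eigvalsh(L.T @ G @ L)
    return ev[-1] / ev[0]

n = 239                                   # n + 1 = 240 is divisible by m below
A = poisson_1d(n)
for m in (4, 8, 16):
    blocks = overlapping_blocks(n, m, overlap=(n // m) // 4)
    P = coarse_prolongation(n, m - 1)     # coarse mesh size H equal to 1/m
    print(m,
          round(condition_number(A, schwarz(A, blocks)), 1),
          round(condition_number(A, schwarz(A, blocks, P)), 1))
```

The second printed column is the one-level condition number and the third the two-level one, so the two can be compared directly for each number of subdomains.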

If $\mathcal{T}_h$ arises as a refinement of a coarser FE division $\mathcal{T}_H$, then the use of
$V_0 = V_H$ is fully natural. In other cases, it may be impractical and costly to
construct an extra division $\mathcal{T}_H$ together with the interpolation $I_H^h$ and coarse
grid operator $A_0 = A_H$. For these cases, it may be advantageous to use another
construction of the auxiliary global space $V_0$ by aggregation. This construction
will be described in the next section.

5 Two-level Schwarz preconditioners with aggregations

Let $V = V_h = \mathrm{span}\{\varphi_1^h, \dots, \varphi_n^h\}$ and let the index set $\{1, \dots, n\}$ be decomposed
into groups $G_1, \dots, G_N$. Then it is possible to define aggregated basis functions
$\psi_k$ and the space $V_0 \subset V$ as follows,

(28)

We shall assume that the aggregations are regular, i.e. there is a constant $\beta$
such that each $\mathrm{supp}\, \psi_k$ contains a ball with diameter $\beta H$, where

$$H \sim \max_k \mathrm{diam}(\mathrm{supp}\, \psi_k) \le H_s .$$

As a consequence, there are positive constants $C_1$, $C_2$ such that
$C_1 H^d \le |\mathrm{supp}\, \psi_k| \le C_2 H^d$. The space $V_0$ from (28) together with the local FE spaces
$V_1, \dots, V_m$ from (22) create a new decomposition

$$V = V_0 + V_1 + \dots + V_m . \qquad (29)$$

Theorem 5. The decomposition (29) fulfils the assumptions (A1), (A2) with
the constants

(30)

Sketch of the proof. We can follow the same procedure as in the proof of Theorem 4,
but instead of the $L^2$-projection to $V_h$, we shall consider an averaging
operator $Q$.
- Let us consider the mapping $Q : V_h \to V_0$ defined as follows:

$$\text{for } v \in V_h , \quad Qv = \sum_{k=1}^N \alpha_k \psi_k , \qquad \alpha_k = \frac{1}{|\mathrm{supp}\, \psi_k|} \int_{\mathrm{supp}\, \psi_k} v(x)\, dx . \qquad (31)$$

For this operator, it holds that

$$|Qv|_{H^1(\Omega)} \le \bar\sigma_1 \sqrt{\frac{H}{h}}\, |v|_{H^1(\Omega)} , \qquad (32)$$

$$\|v - Qv\|_{L^2(\Omega)} \le \bar\sigma_2 H |v|_{H^1(\Omega)} , \qquad (33)$$

see [11, 15] for the details.


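- For any $v \in V$, we take the decomposition

$$v = v_0 + v_1 + \dots + v_m , \quad v_0 = Qv , \quad v_k = I_h(\theta_k (v - v_0)) , \quad k = 1, \dots, m .$$

- Then

- Further

$$\begin{aligned}
\sum_{k=1}^m \|v_k\|_A^2 &\le \kappa_2 \sum_{k=1}^m |v_k|^2_{H^1(\Omega_k^\delta)} \\
&\le \kappa_2 \mu m_c \left[ \delta^{-2} \|v - v_0\|^2_{L^2(\Omega)} + |v - v_0|^2_{H^1(\Omega)} \right] \\
&\le \kappa_2 \mu m_c \left[ \bar\sigma_2^2 \delta^{-2} H^2 + 2 + 2 \bar\sigma_1^2 \frac{H}{h} \right] |v|^2_{H^1(\Omega)} \\
&\le C \left[ 1 + h^{-1} H + \delta^{-2} H^2 \right] \|v\|_A^2 .
\end{aligned}$$

- The above estimates give (30). The estimate of $K_1$ is the same as in Theorem 4. □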

The space $V_0$ created by aggregation was first introduced in the multigrid
context, see e.g. [3] and the references therein. The properties of the basis
functions $\{\psi_k\}$ can be improved by smoothing and the smoothed aggregations
can be again used in two-level Schwarz preconditioners, see [11].
Note that the paper [15] describes the use of massive aggregation, which
means aggregation of all degrees of freedom in each subdomain. According to
our experience, less massive aggregation is more efficient and better balanced
with the other subproblems for smaller numbers of subdomains.
For structured grids, the aggregations can be done by regular clustering.
Some algorithms for creating aggregations for unstructured grids are described
e.g. in [11].
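
For structured grids, the aggregation can be written down in a few lines. The following sketch is an illustration only and rests on two assumptions that are not stated explicitly above: each aggregated basis function is taken as the sum of the fine basis functions of its group, $\psi_k = \sum_{i \in G_k} \varphi_i^h$, and the coarse operator is formed as $A_0 = P^T A P$ with the corresponding 0/1 matrix $P$. The function names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Stiffness matrix of -u'' on (0,1) with n interior nodes (scaled by h)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def aggregation_matrix(n, size):
    """P[i, k] = 1 if fine index i belongs to group G_k of `size` consecutive indices."""
    groups = np.arange(n) // size            # regular clustering of the index set
    P = np.zeros((n, groups[-1] + 1))
    P[np.arange(n), groups] = 1.0
    return P

n, size = 12, 3
A = poisson_1d(n)
P = aggregation_matrix(n, size)
A0 = P.T @ A @ P                             # aggregated (coarse) operator
print(A0)
```

Entry $(k, l)$ of `A0` is the sum of the entries $A_{ij}$ with $i \in G_k$, $j \in G_l$, so no coarse mesh or interpolation has to be constructed. For this 1D matrix the result is again tridiagonal.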

6 Two-level Schwarz preconditioners with interfaces

The considered two-level Schwarz preconditioners involve two ways of interaction
between the subproblems: via the overlap and via the coarse grid. In
this section, we would like to show that the interaction via the coarse grid is
sufficient if the coarse grid involves the interfaces between the subdomains.
Thus, let $\Omega_1^0, \dots, \Omega_m^0$ be a non-overlapping decomposition of the domain
$\Omega$ and each $\Omega_k^0$ be aligned with a fine FE division $\mathcal{T}_h$ of the domain $\Omega$. Let

Then $W \subset V$, but $W \ne V$. If there is a coarse grid space $V_0$ such that

$$V = V_0 + V_1 + \dots + V_m = V_0 + W , \qquad (35)$$

then $V_0$ should contain all degrees of freedom corresponding to the interface
$\Gamma = \bigcup \Gamma_{kl}$, where $\Gamma_{kl} = (\partial\Omega_k \cap \partial\Omega_l) \setminus \partial\Omega$.
There are several ways to construct a proper space $V_0$. The first
possibility is to use a coarse grid $\mathcal{T}_H$ providing enough nodes on the interface
$\Gamma$, i.e. $\mathrm{node}(\mathcal{T}_H) \cap \Gamma = \mathrm{node}(\mathcal{T}_h) \cap \Gamma$. The coarse grid $\mathcal{T}_H$ should also be
compatible with the subdomains $\Omega_k^0$ and $\mathcal{T}_h$ should be a refinement of $\mathcal{T}_H$.
Such a situation can be seen in Fig. 2.

Fig. 2. A coarse grid with interfaces (left) and its refinement with the domain
decomposition (right)

Another, more flexible way to create a proper space $V_0$ is the
construction by aggregation. This construction can be similar to that in the previous
section; the only difference is that we restrict the aggregation to those degrees
of freedom which do not correspond to the nodes on the interfaces between
the subdomains, see Fig. 3.

Fig. 3. Fine grid and domain decomposition (left) and aggregation out of the
interfaces between the subdomains (right)

Theorem 6. The decomposition (35) with subspaces $V_1, \dots, V_m$ corresponding
to non-overlapping subdomains and the coarse grid space $V_0$ with interfaces
fulfils the assumptions (A1), (A2) with the constants

$$K_0 = 1/(1 - \gamma) , \qquad K_1 = 2 , \qquad (36)$$

where $\gamma = \cos(V_0, W_0) < 1$ for any $W_0 \subset W$ such that $V = V_0 \oplus W_0$.

Proof. Any $v \in V$ can be written as $v = v_0 + w_0$, $v_0 \in V_0$, $w_0 \in W_0 \subset W$.
Moreover, $w_0 = v_1 + \dots + v_m$, where $v_k \in V_k$ for $k = 1, \dots, m$. The
subspaces $V_1, \dots, V_m$ corresponding to the non-overlapping subdomains are
$(\cdot, \cdot)_A$ orthogonal, thus

$$\sum_{k=0}^m \|v_k\|_A^2 = \|v_0\|_A^2 + \|w_0\|_A^2 \le \frac{1}{1 - \gamma} \|v_0 + w_0\|_A^2 = \frac{1}{1 - \gamma} \|v\|_A^2 .$$

On the other hand, for any $v \in V$, $v = v_0 + v_1 + \dots + v_m$, where $v_k \in V_k$
for $k = 0, 1, \dots, m$, we have

$$\|v\|_A^2 \le 2 \|v_0\|_A^2 + 2 \|v_1 + \dots + v_m\|_A^2 = 2 \sum_{k=0}^m \|v_k\|_A^2 .$$

□
Note that the constant $\gamma < 1$ appeared also in the analysis of the hybrid
preconditioner $G_H$. We simply get

because $\sum_{k=1}^m P_k$ is now again a $(\cdot, \cdot)_A$ orthogonal projection. Note also that
$G_H = G_M$ in this case. More details can be found in [5].
As a consequence of Theorem 6, the investigation of convergence of the
two-level Schwarz method with interfaces can be reduced to investigation of
the constant $\gamma = \cos(V_0, W_0)$. This investigation of the strengthened C.B.S.
inequality can be performed locally on macroelements. It will be illustrated for
a simple 2D problem from Fig. 2 in the following theorem. More results will
be given in a forthcoming paper [5].

Theorem 7. Let us consider the problem (1) with orthotropic bilinear form (2)
on the domain $\Omega \subset \mathbb{R}^2$ and the situation similar to Fig. 2. It means that there
is a coarse grid $\mathcal{T}_H$ with coarse right-angled triangles inside the subdomains and
special rectangular macroelements and fine triangles along the boundaries of the
subdomains, see Fig. 2, 4. There is also a refined triangulation $\mathcal{T}_h$, which consists
of congruent right-angled isosceles triangles with two axiparallel sides.
These triangles arise from $\mathcal{T}_H$ by division of each coarse triangle and each
boundary macroelement into four congruent triangles, see Fig. 4.
Let $V_0$ be the FE space corresponding to $\mathcal{T}_H$ and $W$ be the union of subspaces
$V_k$ defined in (34). Let $W_0$ contain the functions from $W$ which are zero in
the nodes belonging to $\mathcal{T}_H$. The nonzero coefficients of the bilinear form ($k_{11}$
and $k_{22}$) are assumed to be constant on the macroelements from $\mathcal{T}_H$.
Then

$$\gamma = \cos(V_0, W_0) \le \max\left( \sqrt{\frac{1}{2}} ,\ \sqrt{\frac{k_{11}}{k_{11} + k_{22}}} ,\ \sqrt{\frac{k_{22}}{k_{11} + k_{22}}} \right) . \qquad (38)$$
Fig. 4. The coarse triangles with refinement (a) and boundary macroelements with
3 triangles and their refinement into 4 triangles (b), (c)

Proof. The C.B.S. constant can be investigated locally on the inner and boundary
macroelements. If

$$a_E(v, w) = \int_E \sum_i k_{ii} \frac{\partial v}{\partial x_i} \frac{\partial w}{\partial x_i}\, dx$$

and

$$a_E(v, w) \le \gamma_E \sqrt{a_E(v, v)} \sqrt{a_E(w, w)} ,$$

for each macroelement $E$ and functions $v$ and $w$, which are restrictions of
$v \in V_0$ and $w \in W_0$ to the macroelement $E$, then

$$\gamma \le \max_E \gamma_E .$$

For the inner macroelements (coarse triangles), it is possible to show that
$\gamma_E \le \sqrt{1/2}$, see e.g. [1] (the estimate is a simple generalization of Remark 3.1.).
Now consider an interface macroelement $E$ of the type shown in Fig. 4(b).
For $v \in V_0$, we get

For $w \in W_0$, we get $\partial w / \partial x_1 = \partial w / \partial x_2 = 0$ in $T_3$ and $T_4$. Further,

$$\frac{\partial w}{\partial x_2}\Big|_{T_1} = -\frac{\partial w}{\partial x_2}\Big|_{T_2} , \qquad \frac{\partial w}{\partial x_1}\Big|_{T_1} = \frac{\partial w}{\partial x_1}\Big|_{T_2} ,$$

i.e. $\gamma_E = \sqrt{k_{11} / (k_{11} + k_{22})}$.

A similar estimate $\gamma_E = \sqrt{k_{22} / (k_{11} + k_{22})}$ can be proved for the interface
macroelement $E$ of the type shown in Fig. 4(c). □

Note that for isotropic bilinear form, we get $\gamma = \sqrt{1/2}$, which indicates very
good efficiency of the two-level Schwarz preconditioners with interfaces. The
efficiency is not deteriorated by jumps in coefficients if these jumps do not occur
within the macroelements. On the other hand, the efficiency deteriorates
if the subdomain boundaries cut the main direction of a strong anisotropy.
Therefore, for strongly anisotropic problems, it may be useful to use domain
decompositions cutting only the weak direction of the anisotropy. The numerical
(grid) anisotropy has of course the same effect as the physical anisotropy.
The idea of using a Schwarz preconditioner with no overlap but a suitable
coarse grid appeared also in the paper [2], but in a context with a different
construction of the coarse grid problem.
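
The influence of the anisotropy on the constants can be made concrete with a few numbers. The following sketch assumes that the bound (38) has the form $\gamma \le \max\{\sqrt{1/2}, \sqrt{k_{11}/(k_{11}+k_{22})}, \sqrt{k_{22}/(k_{11}+k_{22})}\}$ and combines it with $K_0 = 1/(1-\gamma)$ from (36). The function name is hypothetical, and the coefficient pairs are the ones used for the model problem in Section 8.

```python
import math

def cbs_bound(k11, k22):
    """Upper bound for gamma, assuming (38) is a maximum over three macroelement types."""
    s = k11 + k22
    return max(math.sqrt(0.5), math.sqrt(k11 / s), math.sqrt(k22 / s))

for k11, k22 in [(1.0, 1.0), (1.0, 0.01), (1.0, 100.0)]:
    gamma = cbs_bound(k11, k22)
    K0 = 1.0 / (1.0 - gamma)          # constant K_0 from (36)
    print(k11, k22, round(gamma, 4), round(K0, 1))
```

The bound is a worst case over both types of interface macroelements. For a decomposition that cuts only one coordinate direction, only one of the two anisotropy terms is relevant.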

7 Parallel implementation of the preconditioners


Let us consider the solution of a symmetric elliptic boundary value problem by
the FE method. Then we have to assemble and solve the linear algebraic system

$$Au = b , \quad u, b \in \mathbb{R}^n , \qquad (39)$$

with a symmetric positive definite $n \times n$ matrix $A$. This system is an algebraic
representation of the problem (5). The system (39) is mostly solved by the
preconditioned conjugate gradient (CG) method.
The solution of large-scale systems requires the use of powerful parallel
computers, and the domain decomposition can then be exploited for parallel
implementation of all main operations:
- assembling of the system,
- matrix-by-vector multiplication,
- computation of vector updates and scalar products,
- construction of preconditioners.
This parallel implementation uses decomposition of data (vectors, matrices)
into blocks corresponding to the subdomains. The overlapping DD induces
decomposition of data to overlapping blocks $v \equiv \{v_k\}$ etc. The matrix-by-vector
multiplication can be implemented with both overlapping and non-overlapping
blocks. The first case may be advantageous if the extended diagonal blocks
$\{A_{kk}\}$, $k = 1, \dots, m$, cover all nonzero elements of $A$.
Blocks of data are mapped to the processors of the parallel computer and
the domain decomposition, or more precisely the decomposition of the FE grid,
should be done with special care for the computational load of the processors.
This is not difficult for structured grids. For a suitable decomposition of
unstructured grids, it is possible to use several graph algorithms, see e.g. [18]. The
decomposition should also respect anisotropies as mentioned in Section 6 and
further in Section 8. For two-level preconditioners, it is additionally necessary to
balance the size of the auxiliary global problem.
The space decomposition preconditioners require the solution of subproblems
of the type $A_k w_k = v_k$. Although these subproblems are smaller than the
original problem, it may still be too expensive or inefficient to solve these
systems exactly by direct solvers. The exact solution of the subproblems
can then be avoided by different means.
First, we can use an approximation $\tilde A_k$ to $A_k$ for computing approximate
values of $A_k^{-1} v_k$. This approximation can be given by an incomplete factorization
of $A_k$ or by applying a fixed number of iterations of some linear stationary
iterative method such as SSOR or multigrid. The use of approximate operators $\tilde A_k$
is covered by the theory described e.g. in [20].
Second, we can use inner CG iterations for solving $A_k w_k = v_k$ approximately
up to some low relative inner accuracy $\|A_k w_k - v_k\| \le \varepsilon_0 \|v_k\|$, where
$\varepsilon_0$ is, say, $10^{-1}$. This choice gives SD preconditioners which are no longer linear
operators; we can speak about nonlinear preconditioners. These preconditioners
can be implemented within the standard preconditioned CG method, but
the resulting inner-outer iterative method can fail or converge slowly.
The reason for such difficulties is the loss of orthogonality and sometimes
also loss of linear independence of the search directions in the CG method with
nonlinear preconditioning.
A remedy is to store (some) search directions and orthogonalize the new
one to the stored ones. Such flexible or generalized preconditioned CG method
is described e.g. in [4] and the references therein. In the next section, we shall
use the generalized preconditioned GPCG[s] method with orthogonalization of
the search direction to s previous ones.
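
The idea of orthogonalizing each new search direction against the stored ones can be sketched as follows. This is one common formulation of a flexible (generalized) preconditioned CG iteration and is not claimed to reproduce the algorithm of [4] in detail. The 1D test matrix, the block layout, the inner tolerance and all function names are assumptions made for the illustration.

```python
import numpy as np

def cg(A, b, rtol, maxit=200):
    """Plain CG from a zero initial guess, stopped when ||b - A x|| <= rtol ||b||."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    stop = (rtol * np.linalg.norm(b)) ** 2
    for _ in range(maxit):
        if rr <= stop:
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

def inexact_additive_preconditioner(A, blocks, inner_rtol=0.1):
    """Nonlinear additive preconditioner: each subproblem is solved by inner CG."""
    subs = [(idx, A[np.ix_(idx, idx)]) for idx in blocks]
    def precond(r):
        g = np.zeros_like(r)
        for idx, Ak in subs:
            g[idx] += cg(Ak, r[idx], inner_rtol)
        return g
    return precond

def gpcg(A, b, precond, s=1, rtol=1e-6, maxit=500):
    """CG-type iteration; each new direction is A-orthogonalized to the s previous ones."""
    u = np.zeros_like(b)
    r = b.copy()
    bnorm = np.linalg.norm(b)
    stored = []                                  # tuples (d, A d, d^T A d)
    for it in range(1, maxit + 1):
        d = precond(r)
        for dj, Adj, dAd_j in stored:
            d = d - ((d @ Adj) / dAd_j) * dj     # Gram-Schmidt step in the A-inner product
        Ad = A @ d
        dAd = d @ Ad
        alpha = (d @ r) / dAd
        u += alpha * d
        r -= alpha * Ad
        if np.linalg.norm(r) <= rtol * bnorm:
            return u, it
        stored = (stored + [(d, Ad, dAd)])[-s:] if s > 0 else []
    return u, maxit

n = 120
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
blocks = [np.arange(max(30 * k - 5, 0), min(30 * (k + 1) + 5, n)) for k in range(4)]
u, iterations = gpcg(A, b, inexact_additive_preconditioner(A, blocks), s=1)
print(iterations, np.linalg.norm(b - A @ u) / np.linalg.norm(b))
```

With a linear symmetric positive definite preconditioner and `s=1`, this iteration coincides with the standard preconditioned CG method in exact arithmetic. Larger values of `s` cost extra storage and inner products per iteration.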
The GPCG[s] method also allows the use of nonsymmetric multiplicative or
nonsymmetric hybrid preconditioners. These nonsymmetric preconditioners
are substantially cheaper than the symmetrized variants, which require at least
one extra update of the residual, and they frequently lead to similar convergence
rates as the symmetrized variants.
The convergence of GPCG[s] with nonlinear approximate additive SD
preconditioners can be proved with the aid of the convergence theory, which can
be found e.g. in [4]. The convergence of GPCG[s] with a nonlinear approximate
multiplicative or hybrid preconditioner $G = G_M, G_H$ can be proved if
$\|I - \omega G A\|_A < 1$ for some $\omega \in (0, 1)$, see again [4].

8 Numerical results
The efficiency of various SD preconditioners described in this paper can be
compared by solving a simple model problem

$$k_{11} \frac{\partial^2 u}{\partial x_1^2} + k_{22} \frac{\partial^2 u}{\partial x_2^2} = f \ \text{ in } \Omega = (0, 2) \times (0, 3) , \qquad u = 0 \ \text{ on } \partial\Omega ,$$

discretized by linear triangular FE. We use the uniform grid with the mesh
size $h = 1/30$, subdomains $\Omega_k = (0, 2) \times (x_k, x_{k+1})$ and overlap $\delta = 2h$. The
subproblems are solved exactly. The required numbers of iterations for the
accuracy $\varepsilon = 10^{-3}$ and various additive (AP) and hybrid (HP) SD preconditioners
can be seen in Table 1. The hybrid preconditioners are used in nonsymmetric
form in combination with GPCG[1]. The coarse grid is either a nested
coarse triangular grid, aggregation with clustering 2 x 2 square macroelements
or the same aggregation with interface. For each number of subdomains, the
first column shows the numbers of iterations for $k_{11} = k_{22} = 1$, the second one
for $k_{11} = 1$, $k_{22} = 1/100$ and the last one for $k_{11} = 1$, $k_{22} = 100$.

Table 1. The numbers of iterations for the relative accuracy $\varepsilon = 10^{-3}$. AP denotes
additive preconditioner, HP denotes nonsymmetric hybrid preconditioner +
GPCG[1].

                                        Number of subdomains
Type  Coarse grid space        4           8           12          16          24
AP    one level             16  4 26    22  4 38    23  5 51    31  5 61    37  6 78
AP    nested, H = 3h         7  7 16     8  7 20     8  7 23     7  7 26     8  7 31
HP    nested, H = 3h         6  6 13     6  7 16     6  7 20     6  7 22     6  6 27
AP    regular agg's 2h      13  8 19    15  8 23    16  9 27    17  9 30    17  9 36
HP    regular agg's 2h      10  7 15    11  7 19    11  7 22    11  8 25    11  8 29
AP    agg's 2h + int.       14  8 22    14  8 30    14  8 36    14  9 40    14  9 46
HP    agg's 2h + int.        8  4 14     7  5 16     8  5 19     8  5 22     8  5 25

To demonstrate the efficiency of parallel solvers based on the CG method and
additive Schwarz preconditioners, we give in Table 2 the numbers of iterations
and computing times from solving a large-scale linear system with the dimension
of about 4 million arising from the FE discretization of a 3D elasticity problem,
which has a practical application in geomechanics. More details can be found
e.g. in [6]. The adopted discretization uses linear tetrahedral FE.
The computations use one-directional decomposition up to 8 subdomains
and a coarse grid problem created by regular aggregation of 2x2x2 or 5x5x5
hexahedral macroelements. The computations were performed on a small Linux
cluster consisting of 8 PCs with AMD Athlon/1400 processors and 768 MB
memories. The computers are interconnected via a standard Fast Ethernet 100
Mbit/s network.

Table 2. Results from the solution of the large-scale 3D elasticity problem with the
relative accuracy $\varepsilon = 10^{-4}$. #P denotes the number of exploited processors, m is the
number of subdomains.

       No coarse grid      Aggregation 2 x 2 x 2    Aggregation 5 x 5 x 5
 m    #P   #It   T [s]      #P   #It   T [s]         #P   #It   T [s]
 2     2   104    428        3    45    268           3    56    243
 3     3   113    321        4    47    242           4    59    176
 4     4   121    268        5    51    240           5    62    144
 5     5   128    234        6    53    236           6    65    127
 6     6   133    211        7    55    244           7    67    114
 7     7   136    193        8    57    265           8    70    105

9 Concluding remarks

We used the space decomposition framework for providing an overview of the
Schwarz-type domain decomposition preconditioners. We described a large variety
of these preconditioners, still not including some newly developed variants.
Our experience shows that the Schwarz preconditioners can be used for the
development of efficient and scalable parallel solvers, at least when working
on smaller parallel computing systems. These methods are flexible enough for
balancing the computational load of the processors, adopting inexact solvers
etc. On the other hand, we showed that special care should be devoted to
physical or numerical anisotropy and similar difficulties.
The Schwarz technique can also be used for solving other problems such as
nonsymmetric problems of convection-diffusion type, parabolic and nonlinear
problems and many others. The general SD framework is also suitable for
developing preconditioners based on hierarchical or physical decompositions,
see e.g. [13, 20, 21, 4].
Acknowledgement: Many thanks are due to P. Byczanski and J. Stary
for preparing the numerical results. The work was supported by the grant
S3086102 of the Academy of Sciences of the Czech Republic.

References
1. Axelsson, O., Blaheta, R. (2001): Two simple derivations of universal bounds for
the C.B.S. inequality constant. Report 0133, Dept. Math., University Nijmegen
2001. To appear in Appl. Math.
2. Bjørstad, P.E., Dryja, M., Vainikko, E. (1997): Additive Schwarz methods without
subdomain overlap and with new coarse spaces. In: Glowinski, R. et al.
(eds) Domain Decomposition Methods in Sciences and Engineering. Proc. 8th
Int. Conf. on Domain Decomposition Methods. J. Wiley
3. Blaheta, R. (1988): A multilevel method with overcorrection by aggregation for
solving discrete elliptic problems. J. Comp. Appl. Math., 24, 227-239

4. Blaheta, R. (2002): GPCG - generalized preconditioned CG method and its use
with non-linear and non-symmetric displacement decomposition preconditioners.
Numer. Linear Algebra Appl., 9, 527-550
5. Blaheta, R.: Two-level Schwarz preconditioners with interfaces in the coarse grid
space. In preparation
6. Blaheta, R., Jakl, O., Krecmer, K., Starý, J. (2002): Large-scale modelling in
geomechanics with parallel computing on clusters of PC's. In: Mestat, P. (ed)
NUMGE'02 5th European Conference Numerical Methods in Geotechnical Engineering.
Presses de l'Ecole Nationale des Ponts et Chaussées, Paris, pp. 315-320
7. Bramble, J.H., Pasciak, J.E., Wang, J., Xu, J. (1991): Convergence estimates
for product iterative methods with applications to domain decomposition. Math.
Comp., 57, 1-21
8. Bramble, J.H., Xu, J. (1991): Some estimates for a weighted L2 projection, Math.
Comput., 56, 463-476
9. Brenner, S.C. (2000): Lower bounds for two-level additive Schwarz precondition-
ers with small overlap, SIAM J. Sci. Comput., 21, 1657-1669
10. Brenner, S.C., Scott, L.R. (2002): The Mathematical Theory of Finite Element
Methods, Second edition. Springer-Verlag, New York
11. Brezina, M. (1997): Robust iterative methods on unstructured meshes, PhD.
thesis, University of Colorado at Denver
12. Cai, X.C. (1995): The use of pointwise interpolation in domain decomposition
methods with non-nested meshes. SIAM J. Sci. Comput., 16, 250-256
13. Chan, T.F., Mathew, T.P. (1994): Domain Decomposition Algorithms. Acta Nu-
merica, 61-143
14. Dryja, M., Widlund, O.B. (1994): Domain decomposition algorithms with small
overlap, SIAM J. Sci. Comp., 15, 604-620
15. Jenkins, E.W., Kelley, C.T., Miller C.T., Kees, C.E. (2001): An aggregation-
based domain decomposition preconditioner for groundwater flow. SIAM J. Sci.
Comput., 23, 430-441
16. Lions, P.L. (1988): On the Schwarz alternating method I. In: Chan, T.F.,
Glowinski, R., Periaux, J., Widlund, O.B. (eds): First International Symposium
on Domain Decomposition Methods for PDE, SIAM, Philadelphia, 1-42
17. Mandel, J. (1994): Hybrid domain decomposition with unstructured sub domains.
In: 6th Int. Conf. on Domain Decomposition, AMS Contemporary Mathematics
vol. 157, 103-112
18. Saad, Y. (2003): Iterative methods for sparse linear systems. SIAM, Philadelphia
19. Schwarz, H.A. (1870): Gesammelte Mathematische Abhandlungen, volume 2,
pages 133-143. Springer, Berlin 1890. First published in 1870
20. Smith, B.F., Bjørstad, P.E., Gropp, W.D. (1996): Domain Decomposition. Parallel
Multilevel Methods for Elliptic Partial Differential Equations. Cambridge
University Press, Cambridge
21. Wohlmuth, B.I. (2001): Discretization Methods and Iterative Solvers Based on
Domain Decomposition, LNCSE 17, Springer Berlin
Boundary Conditions for Hyperbolic Equations
or Systems

Thierry Gallouet

LATP, CMI, 39 rue Joliot-Curie, 13453 Marseille cedex 13

Summary. Different types of boundary conditions in industrial numerical simulators
involving the discretization of hyperbolic systems are presented. For some of
them, one may determine the problem to which the limit of approximate solutions
(as the discretization parameters tend to 0) is the unique solution. In turn, this
convergence result may suggest other ways to take into account the boundary conditions.

1 Introduction
In the industrial context, efficient numerical simulators are often developed
after a long "trial and error" procedure. The efficiency of the simulators may
be evaluated, for instance, by the fact that the solution satisfies some natural
constraints and that it is in agreement with experimental data. In some
cases, estimates on the approximate solutions make it possible to obtain the convergence
of some sequences of approximate solutions as the discretization size tends
to 0. However, it is not easy to give the answer to the following question:
"What problem has a unique solution which is the limit of the approximate
solutions?".
This paper will focus on the problem of boundary conditions needed in the
discretization of nonlinear hyperbolic equations or systems of equations; this
problem is not yet clearly understood in many cases. Two different cases will
be presented: a two phase flow in a pipeline and a two phase flow in a porous
medium.

2 A two phase flow in a pipeline


2.1 Description of the system

A "simple" model for a two phase flow in a pipeline (see [8], for instance) leads
to a $3 \times 3$ system of conservation laws. The unknown $w$ is a function from
$(0, 1) \times \mathbb{R}_+$ to $\mathbb{R}^3$, solution of the following system:

$$w_t + (F(w))_x = 0 , \quad x \in (0, 1) ,\ t \in \mathbb{R}_+ , \qquad (1)$$

where $\cdot_t$ and $\cdot_x$ denote the derivatives with respect to the $t$ and $x$ variables.
The first two equations of (1) give the mass conservation of the 2 phases (gas
and liquid) and the third one is the momentum equation for the mixture. The
expression of the given function $F : \mathbb{R}^3 \to \mathbb{R}^3$ is quite complicated. It takes
into account thermodynamical laws and a hydrodynamical law. System (1)
is hyperbolic: for any $w \in \mathbb{R}^3$, the Jacobian matrix $DF(w)$ is diagonalizable
in $\mathbb{R}$. The three eigenvalues can be ordered: $\lambda_1(w) < \lambda_2(w) < \lambda_3(w)$. In
real situations, the first eigenvalue, $\lambda_1(w)$, is negative and the third, $\lambda_3(w)$,
is positive (they correspond to some "pressure waves" which are related to
a "sound velocity"). The second eigenvalue, $\lambda_2(w)$, corresponds to some mean
velocity between the two phases and can change sign. One can also note that
the field related to this second eigenvalue is quite complicated because it is
not, in general, a genuinely nonlinear field or a linearly degenerate field. In
petroleum engineering, the wave associated to this second eigenvalue is a "void
fraction wave"; engineers require a good representation of this wave in the
numerical simulations.

Remark 1. In real situations, the function F in System (1) also depends on x,


in order to take into account, for instance, the variation in the slope of the
pipeline. Moreover, some source terms have to be added to the system, in order
to take into account, for instance, some friction terms.

In order to complete System (1), an initial condition is prescribed:

$$w(x, 0) = w_0(x) , \quad x \in (0, 1) , \qquad (2)$$


and it is also necessary to give some boundary conditions. This appears to
be not so easy. Indeed, classically, a general principle is that the number of
boundary conditions needs to be equal to the number of positive eigenvalues of
the Jacobian matrix at x = 0 and to the number of negative eigenvalues of the
Jacobian matrix at x = 1 (and these boundary conditions have to satisfy some
compatibility conditions). However, this principle is not so easy to understand
when an eigenvalue changes sign during the simulation (or in the case of a null
eigenvalue). A very interesting case is the so called "severe slugging" case in
a pipeline. For this case, there are always two positive eigenvalues at x = 0
and two natural boundary conditions are prescribed at x = 0, namely the
fluxes of gas and liquid; these boundary conditions can be taken constant in
time. At x = 1, there is one natural boundary condition, namely the pressure
(which is the same for the two phases, in this model), to be prescribed. It can
also be constant in time. The true physical solution, which is measured by
experiments (and the aim is to model these experiments), is periodic in
time and it appears that, at $x = 1$, the first eigenvalue is always negative and
the third one is always positive but the second eigenvalue changes sign during
the simulation. In the sequel, one presents different ways to take into account
the boundary conditions and one gives a convergence result in a simplified
case.
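
The counting principle recalled above can be written as a small helper. This is only a sketch: the function name is hypothetical and the diagonal Jacobians below contain made-up eigenvalues chosen to mimic the situation described (two fast eigenvalues of opposite sign and a small second eigenvalue that changes sign at $x = 1$).

```python
import numpy as np

def boundary_condition_count(jacobian_left, jacobian_right):
    """Number of conditions at x = 0 (positive eigenvalues) and at x = 1 (negative ones)."""
    lam_left = np.linalg.eigvals(jacobian_left).real
    lam_right = np.linalg.eigvals(jacobian_right).real
    return int(np.sum(lam_left > 0)), int(np.sum(lam_right < 0))

# Hypothetical eigenvalues (lambda_1, lambda_2, lambda_3) at both ends of the pipe.
left = np.diag([-300.0, 2.0, 300.0])
right_outflow = np.diag([-300.0, 1.5, 300.0])    # second eigenvalue positive at x = 1
right_backflow = np.diag([-300.0, -1.5, 300.0])  # second eigenvalue negative at x = 1

print(boundary_condition_count(left, right_outflow))
print(boundary_condition_count(left, right_backflow))
```

When the second eigenvalue changes sign at $x = 1$, the count at that end changes during the computation, which is exactly the situation in which the principle becomes hard to apply.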

2.2 Discretization of the problem

In order to discretize Problem (1), (2) and some boundary conditions, which
will be introduced later, let $h = 1/N$ (with $N \in \mathbb{N}^*$) be the mesh size and $k > 0$
be the time step (assumed to be constant, for the sake of simplicity). The
discrete unknowns are the values $w_i^n \in \mathbb{R}^3$ for $i \in \{1, \dots, N\}$ and $n \in \mathbb{N}$. The
discretization of the initial condition leads to

$$w_i^0 = \frac{1}{h} \int_{(i-1)h}^{ih} w_0(x)\, dx , \quad i \in \{1, \dots, N\} . \qquad (3)$$

For the computation of $w_i^n$ for $n > 0$, one uses an explicit, 3-point scheme:

$$\frac{h}{k} (w_i^{n+1} - w_i^n) + F_{i+\frac12}^n - F_{i-\frac12}^n = 0 , \quad i \in \{1, \dots, N\} ,\ n \in \mathbb{N} . \qquad (4)$$

For $i \in \{1, \dots, N-1\}$, one takes $F_{i+\frac12}^n = g(w_i^n, w_{i+1}^n)$, where $g$ is the numerical
flux. It has to satisfy, in particular, the classical consistency condition, namely
$g(a, a) = F(a)$, and needs to be chosen in order to obtain some stability
properties for the numerical scheme under a so called CFL condition on the
time step (see Sect. 3 for the study of a scalar model). In the case of two
phase flow in a pipeline, the classical numerical fluxes such as the Godunov
flux (see [9]) or the Roe flux (see [11]) may not be implemented, because of
computational difficulties. A convenient choice is obtained with a simplified
Roe flux, namely $g(a, b) = \frac{F(a) + F(b)}{2} + \frac12 |A(a, b)| (a - b)$, where $A(a, b)$ is some
approximation of the Jacobian matrix, depending on $a$ and $b$, but not satisfying
the so called Roe condition, see [8].

Remark 2. In fact, for the simulation of a two phase flow in a pipeline, the
magnitude of the so-called fast eigenvalues, $\lambda_1$ and $\lambda_3$, is much
greater than that of $\lambda_2$; the choice in [8] is to use an implicit scheme with
respect to the fast eigenvalues, whereas the eigenvalue $\lambda_2$, which corresponds
to the void fraction wave, is handled with an explicit second order discretization,
since the void fraction wave needs to be simulated precisely (see [8] for details).

Let us now define the fluxes $F_{1/2}^n$ and $F_{N+1/2}^n$ at the boundary.

2.3 Boundary conditions for the discretized problem

In order to compute $F_{1/2}^n$ (and similarly $F_{N+1/2}^n$), a good way is to know,
or to determine, some artificial value $w_0^n \in \mathbb{R}^3$ (and
$w_{N+1}^n \in \mathbb{R}^3$) and to take $F_{1/2}^n = g_0(w_0^n, w_1^n)$ (and
$F_{N+1/2}^n = g_1(w_N^n, w_{N+1}^n)$). The numerical fluxes $g_0$ and $g_1$ can be
chosen equal to $g$, but this is not at all necessary (see the convergence
result of Sect. 3); in fact, there are numerous situations where one should take
$g_0$ and $g_1$ different from $g$. Indeed, the scheme is often very sensitive to the
computation of the boundary fluxes and it is often worthwhile to use a more
42 T. Gallouet

precise, but also more expensive numerical flux (such as the Godunov flux, for
instance) for the computation of the boundary fluxes than for the computation
of the interior fluxes. The difficulty is now to determine these artificial values,
$w_0^n$ and $w_{N+1}^n$.
Remark 3. In some cases, the choice of $w_0^n$ and $w_{N+1}^n$ is quite easy. A well
known example is given by the wall-boundary condition for the Euler equations
(with a perfect gas state law or a more general state law). For the sake
of simplicity, let us mention the one-dimensional case; the generalization to
a multi-dimensional case is quite easy. The Euler equations may be written
in the form (1), corresponding to conservation of mass, momentum and energy,
with $w = (\rho, \rho u, E)^t$, where $\rho$ is the density of the fluid, $u$ its
velocity, and $E$ its energy. The wall-boundary condition at x = 0 is u = 0, and
the only component to compute for the boundary condition is the second component
of $F_{1/2}^n$, which is equal here to the pressure at x = 0 (since u = 0 at the
wall), say $p_{1/2}^n$. The value $w_1^n$ may be computed from the values $\rho_1^n$,
$u_1^n$ and $p_1^n$. A natural choice for $w_0^n$ is to take $\rho_0^n = \rho_1^n$,
$u_0^n = -u_1^n$ and $p_0^n = p_1^n$. The flux $F_{1/2}^n$ (that is the value
$p_{1/2}^n$) is then obtained with $F_{1/2}^n = g_0(w_0^n, w_1^n)$ and a convenient
choice of the numerical flux $g_0$. We suggest choosing $g_0$ as the Godunov flux (or
as a linearized Godunov flux, see [3] for instance). Numerical tests which were
performed in [3] show that this choice is very satisfactory, even in the difficult
case of a strong depressurization at the boundary. These tests also show that
the pressure obtained with the Roe flux is not so satisfactory and neither is
the choice $p_{1/2}^n = p_1^n$, which may seem natural (in particular, in 2D
simulations, using a dual mesh obtained with a finite element primal mesh).
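The mirror state of Remark 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a perfect gas with a ratio of specific heats `GAMMA = 1.4` (a value not given in the text), and hypothetical helper names `primitive_to_conservative` and `wall_ghost_state`. The Godunov flux $g_0$ itself is not implemented here.

```python
GAMMA = 1.4  # assumed perfect-gas constant (not specified in the text)


def primitive_to_conservative(rho, u, p, gamma=GAMMA):
    """Return w = (rho, rho*u, E), with E = p/(gamma-1) + rho*u^2/2 (perfect gas)."""
    return (rho, rho * u, p / (gamma - 1.0) + 0.5 * rho * u * u)


def wall_ghost_state(rho1, u1, p1):
    """Mirror state for the wall condition u = 0 at x = 0:
    rho_0 = rho_1, u_0 = -u_1, p_0 = p_1."""
    return (rho1, -u1, p1)


# First-cell state (rho_1, u_1, p_1) and the artificial state w_0 built from it.
rho1, u1, p1 = 1.0, -0.3, 1.0e5
w1 = primitive_to_conservative(rho1, u1, p1)
w0 = primitive_to_conservative(*wall_ghost_state(rho1, u1, p1))
```

With this choice, $w_0^n$ and $w_1^n$ share density and energy and have opposite momentum, so that a symmetric Riemann problem is posed at the wall.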
In most cases, however, the choice of $w_0^n$ and $w_{N+1}^n$ is not so easy. A possible
method, which is described in [4], is now laid out, for a fixed n and $g_0$ given:
1. Compute $DF(w_1^n)$, its eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ and a basis
of $\mathbb{R}^3$, $\{\varphi_1, \varphi_2, \varphi_3\}$, such that
$DF(w_1^n)\varphi_i = \lambda_i \varphi_i$, i = 1, 2, 3.
2. Write $w_1^n$ on the basis $\{\varphi_1, \varphi_2, \varphi_3\}$, namely
$w_1^n = \alpha_1\varphi_1 + \alpha_2\varphi_2 + \alpha_3\varphi_3$.
3. Let P be the number of positive eigenvalues; compute
$w_0^n = \beta_1\varphi_1 + \beta_2\varphi_2 + \beta_3\varphi_3$ and
$F_{1/2}^n = g_0(w_0^n, w_1^n)$, where the three unknowns $\beta_1$, $\beta_2$ and
$\beta_3$ are determined by the P equations stating the boundary conditions (note
that these equations involve the components of $F_{1/2}^n$) and by the 3 - P
equalities $\beta_i = \alpha_i$ for $\lambda_i < 0$.

This method leads, at each time step, to a nonlinear system of 3 equations
with 3 unknowns (except if $\lambda_i = 0$ for some i), namely $\beta_1$, $\beta_2$ and
$\beta_3$; note that some compatibility conditions are needed in order that this
nonlinear system has a solution. Several variants of this method are possible. For
instance, a boundary condition may be imposed on $w_0^n$ rather than $F_{1/2}^n$. A
similar method is, of course, possible at point x = 1 (changing the role of positive
and negative eigenvalues).
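As an illustration of steps 2 and 3, the following Python sketch treats the variant in which the boundary conditions are imposed on $w_0^n$ itself and are assumed to be linear, so that the system for the $\beta_i$ becomes linear. The eigenvalues and eigenvectors of step 1 are supposed to be supplied by the caller. The function names `solve_linear` and `ghost_state`, and the representation of the boundary conditions by rows `bc_rows` and values `bc_vals`, are hypothetical choices made for this sketch only; the general case, with conditions on $F_{1/2}^n$, needs a nonlinear solver.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def dot(row, vec):
    return sum(x * y for x, y in zip(row, vec))


def ghost_state(lams, phis, w1, bc_rows, bc_vals):
    """Steps 2-3 at x = 0, with linear conditions bc_rows[j] . w_0 = bc_vals[j].

    lams[i] is the eigenvalue lambda_i of DF(w_1), phis[i] the eigenvector phi_i.
    """
    n = len(w1)
    # Step 2: w_1 = sum_i alpha_i phi_i (the phi_i are the columns of the matrix).
    basis = [[phis[i][r] for i in range(n)] for r in range(n)]
    alpha = solve_linear(basis, w1)
    incoming = [i for i in range(n) if lams[i] > 0]
    outgoing = [i for i in range(n) if lams[i] < 0]
    if len(incoming) + len(outgoing) != n or len(incoming) != len(bc_rows):
        raise ValueError("need lambda_i != 0 and exactly P boundary conditions")
    # Step 3: beta_i = alpha_i for lambda_i < 0; the P other beta_i are
    # obtained from the P boundary conditions.
    beta = [0.0] * n
    for i in outgoing:
        beta[i] = alpha[i]
    if incoming:
        known = [sum(beta[i] * phis[i][r] for i in outgoing) for r in range(n)]
        mat = [[dot(row, phis[i]) for i in incoming] for row in bc_rows]
        rhs = [val - dot(row, known) for row, val in zip(bc_rows, bc_vals)]
        for i, value in zip(incoming, solve_linear(mat, rhs)):
            beta[i] = value
    return [sum(beta[i] * phis[i][r] for i in range(n)) for r in range(n)]
```

The check on the number of conditions reflects the remark above: the method breaks down if some $\lambda_i$ vanishes or if the number of prescribed conditions differs from P.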

This method is not always satisfactory. In the case of severe slugging for
the simulation of two phase flow in a pipeline, the method seems to perform
well at x = 0, where the eigenvalues $\lambda_1$ and $\lambda_2$ are always positive and
the two boundary conditions (gas and liquid fluxes) are convenient. However, at
x = 1, the second eigenvalue sometimes becomes negative and one needs a second
boundary condition (the first one is a condition on the pressure). A natural
condition seems to be $Q_l = 0$, where $Q_l$ is the second component of the flux F,
that is the liquid flux, but this condition does not lead to good results. Other
possible choices of this additional boundary condition at x = 1 were tested and
did not give good results. A possible interpretation of this problem is the fact
that the sign of $\lambda_2$ is computed with $w_N^n$. Roughly speaking, it is "too late"
when $\lambda_2(w_N^n)$ becomes negative (see Sect. 3 for the study of a simple scalar
case). Indeed, good results (in agreement with experiments) are obtained with
the unilateral condition $Q_l \ge 0$ (whatever the sign of $\lambda_2(w_N^n)$). It
consists in using the preceding method (for the boundary condition at x = 1) and in
replacing, in the numerical scheme (4), the second component of $F_{N+1/2}^n$ by its
positive part. Then, if $\lambda_2(w_N^n) < 0$, two boundary conditions are given at
x = 1 (pressure and $Q_l = 0$) and if $\lambda_2(w_N^n) \ge 0$, one boundary condition
is given at x = 1 (pressure) but, in (4), the second component of $F_{N+1/2}^n$ is
replaced by its positive part.
In the following section, we will try to understand the sense of this boundary
condition in a simplified scalar case.

3 The scalar case

A general convergence result is presented here in the case of a scalar equation.
Then this result will be applied to understand the sense of the boundary
condition, described at x = 1 in the previous section, in a simplified scalar
case.

3.1 A general convergence result

The unknown is now a function $u: (0,1) \times \mathbb{R}_+ \to \mathbb{R}$. The flux
is a function $f \in C^1(\mathbb{R}, \mathbb{R})$ (or $f: \mathbb{R} \to \mathbb{R}$
Lipschitz continuous) and the initial datum is $u_0 \in L^\infty((0,1))$. Let
$A, B \in \mathbb{R}$ be such that $A \le u_0 \le B$ a.e. The problem to solve is:

$$u_t + (f(u))_x = 0, \quad x \in (0,1),\ t \in \mathbb{R}_+, \tag{5}$$

with the initial condition:

$$u(x, 0) = u_0(x), \quad x \in (0,1), \tag{6}$$

and some boundary conditions which will be prescribed later.

As in the previous section, let $h = \frac{1}{N}$ (with $N \in \mathbb{N}^*$) be the
mesh size and $k > 0$ be the time step (assumed to be constant, for the sake of
simplicity). The discrete unknowns are now the values $u_i^n \in \mathbb{R}$ for
$i \in \{1, \dots, N\}$ and $n \in \mathbb{N}$. In order to define the approximate
solution a.e. in $(0,1) \times \mathbb{R}_+$, one sets $u_{h,k}(x, t) = u_i^n$ for
$x \in ((i-1)h, ih)$, $t \in (nk, (n+1)k)$, $i \in \{1, \dots, N\}$, $n \in \mathbb{N}$.
The discretization of the initial condition leads to

$$u_i^0 = \frac{1}{h} \int_{(i-1)h}^{ih} u_0(x)\,dx, \quad i \in \{1, \dots, N\}. \tag{7}$$

For the computation of $u_i^n$ for $n > 0$, one uses, as before, an explicit,
3-points scheme:

$$\frac{h}{k}\left(u_i^{n+1} - u_i^n\right) + f_{i+1/2}^n - f_{i-1/2}^n = 0, \quad i \in \{1, \dots, N\},\ n \in \mathbb{N}. \tag{8}$$

For $i \in \{1, \dots, N-1\}$, one takes

$$f_{i+1/2}^n = g(u_i^n, u_{i+1}^n), \tag{9}$$

where $g$ is the numerical flux. Sufficient conditions on $g: [A, B]^2 \to \mathbb{R}$,
in order to have a convergent scheme if $x \in \mathbb{R}$ instead of $(0, 1)$, are:

C1: $g$ is nondecreasing with respect to its first argument and nonincreasing
with respect to its second argument,
C2: $g(s, s) = f(s)$, for all $s \in [A, B]$,
C3: $g$ is Lipschitz continuous.

Let $L$ be a Lipschitz constant for $g$ (on $[A, B]^2$) and $\zeta > 0$. If $(0,1)$ is
replaced by $\mathbb{R}$, it is well known (see e.g. [4]) that, if
$k \le (1-\zeta)\frac{h}{L}$, the approximate solution $u_{h,k}$, that is the solution
defined by (7)-(9) (with $i \in \mathbb{Z}$), takes its values in $[A, B]$ and
converges towards the unique entropy weak solution of (5)-(6) in
$L^p_{loc}(\mathbb{R} \times \mathbb{R}_+)$ as $h \to 0$.
In the case $x \in (0,1)$ instead of $x \in \mathbb{R}$, one assumes the same
conditions on $g$, namely (C1)-(C3). In order to complete the scheme, one has to
define $f_{1/2}^n$ and $f_{N+1/2}^n$.

Let $\bar{u}, \bar{\bar{u}} \in L^\infty(\mathbb{R}_+)$ be such that
$A \le \bar{u}, \bar{\bar{u}} \le B$ a.e. on $\mathbb{R}_+$, let
$g_0, g_1: [A, B]^2 \to \mathbb{R}$, satisfying (C1)-(C3), and define:

$$\begin{aligned}
f_{1/2}^n &= g_0(\bar{u}^n, u_1^n), & \bar{u}^n &= \frac{1}{k}\int_{nk}^{(n+1)k} \bar{u}(t)\,dt,\\
f_{N+1/2}^n &= g_1(u_N^n, \bar{\bar{u}}^n), & \bar{\bar{u}}^n &= \frac{1}{k}\int_{nk}^{(n+1)k} \bar{\bar{u}}(t)\,dt.
\end{aligned} \tag{10}$$

Then a convergence theorem can be proven as in the case $x \in \mathbb{R}$, see [13]:

Theorem 1. Let $f \in C^1(\mathbb{R}, \mathbb{R})$ (or $f: \mathbb{R} \to \mathbb{R}$
Lipschitz continuous). Let $u_0 \in L^\infty((0,1))$,
$\bar{u}, \bar{\bar{u}} \in L^\infty(\mathbb{R}_+)$ and $A, B \in \mathbb{R}$ be such
that $A \le u_0 \le B$ a.e. on $(0,1)$, $A \le \bar{u}, \bar{\bar{u}} \le B$ a.e. on
$\mathbb{R}_+$. Let $g_0, g_1: [A,B]^2 \to \mathbb{R}$, satisfying (C1)-(C3). Let $L$ be
a common Lipschitz constant for $g$, $g_0$ and $g_1$ (on $[A,B]^2$) and let $\zeta > 0$.
Then, if $k \le (1-\zeta)\frac{h}{L}$, the equations (7)-(10) define an approximate
solution $u_{h,k}$ which takes its values in $[A, B]$ and converges towards the
unique solution of (11) in $L^p_{loc}([0,1] \times \mathbb{R}_+)$ for any
$1 \le p < \infty$, as $h \to 0$:

$$\begin{aligned}
& u \in L^\infty((0,1) \times (0,\infty)),\\
& \int_0^\infty\!\!\int_0^1 \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)(f(u)-f(\kappa))\varphi_x\big]\,dx\,dt\\
& \quad + M\int_0^\infty (\bar{u}(t)-\kappa)^\pm \varphi(0,t)\,dt + M\int_0^\infty (\bar{\bar{u}}(t)-\kappa)^\pm \varphi(1,t)\,dt\\
& \quad + \int_0^1 (u_0-\kappa)^\pm \varphi(x,0)\,dx \ge 0,\\
& \forall \kappa \in [A,B],\ \forall \varphi \in C_c^1([0,1] \times [0,\infty), \mathbb{R}_+).
\end{aligned} \tag{11}$$

In (11), $M$ is any bound for $|f'|$ on $[A,B]$ (and the solution of (11) does
not depend on the choice of $M$). The definition of $\mathrm{sign}_\pm$ is:
$\mathrm{sign}_+(s) = 1$ if $s > 0$, $\mathrm{sign}_+(s) = 0$ if $s \le 0$,
$\mathrm{sign}_-(s) = 0$ if $s \ge 0$, $\mathrm{sign}_-(s) = -1$ if $s < 0$.

Remark 4.
1. It is interesting to remark that this convergence result is also true if the
function $g$ depends on i and n, provided that $L$ is a common Lipschitz
constant for all these functions.
2. The definition (11) of solution of (5)-(6) with the "weak" boundary conditions
$\bar{u}$ and $\bar{\bar{u}}$ at x = 0 and x = 1 is essentially due to F. Otto, see
[10].
3. It is also interesting to remark that if one replaces, in (11), the two
entropies $(u - \kappa)^\pm$ by the sole entropy $|u - \kappa|$, one has an existence
result (since $|u - \kappa| = (u - \kappa)^+ + (u - \kappa)^-$) but no uniqueness
result, see [13] for a counter-example to uniqueness.
4. This convergence result can be generalized to the multidimensional case,
see Sect. 5 and [13].

If $u$, solution of (11), is regular enough (say $u \in C^1([0,1] \times \mathbb{R}_+)$,
for instance), $u$ satisfies $u(0,t) = \bar{u}(t)$ and $u(1,t) = \bar{\bar{u}}(t)$ in
the weak sense given in [1]. This condition is very simple if f is monotone:
If $f' > 0$, then $u(0,\cdot) = \bar{u}$ and $u$ does not depend on $\bar{\bar{u}}$.
If $f' < 0$, then $u(1,\cdot) = \bar{\bar{u}}$ and $u$ does not depend on $\bar{u}$.
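One time step of the scheme (8)-(10) can be sketched as follows in Python. The function name `step` and its argument names are hypothetical; the numerical fluxes `g`, `g0`, `g1` are passed as ordinary functions, and the time averages $\bar{u}^n$, $\bar{\bar{u}}^n$ of (10) are assumed to be computed by the caller. The small illustration uses $f(u) = u$ with the upstream flux $g(a,b) = f(a)$ and the ratio $k/h = 0.5$, values chosen only for this example.

```python
def step(u, lam, g, g0, g1, ubar_left, ubar_right):
    """One step of (8): u_i^{n+1} = u_i^n - lam*(f_{i+1/2}^n - f_{i-1/2}^n), lam = k/h."""
    n = len(u)
    flux = [0.0] * (n + 1)                 # flux[i] stands for f_{i+1/2}, i = 0..N
    flux[0] = g0(ubar_left, u[0])          # (10), boundary x = 0
    for i in range(1, n):
        flux[i] = g(u[i - 1], u[i])        # (9), interior interfaces
    flux[n] = g1(u[n - 1], ubar_right)     # (10), boundary x = 1
    return [u[i] - lam * (flux[i + 1] - flux[i]) for i in range(n)]


def upstream(a, b):
    """Upstream flux g(a, b) = f(a) for f(u) = u (so f' > 0)."""
    return a


u = [0.0, 0.0, 1.0, 0.0]                   # cell values u_i^n
u_next = step(u, 0.5, upstream, upstream, upstream, 2.0, -5.0)
```

In this illustration the datum given at x = 1 (here -5) has no influence on `u_next`, in agreement with the case $f' > 0$ discussed above.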

3.2 A very simple example

One considers here Equation (5), with initial condition (6) and weak boundary
conditions $\bar{u}$ and $\bar{\bar{u}}$ at x = 0 and x = 1, that is in the sense of
(11), in the particular case $f' > 0$. In this case, the main example of numerical
flux is $g = g_0 = g_1$, $g(a, b) = f(a)$, which leads to the well known upstream
scheme. With this choice of $g_0$ and $g_1$, using the notations of Sect. 3.1, the
boundary conditions are taken into account in the form:

$$f_{1/2}^n = f(\bar{u}^n), \quad f_{N+1/2}^n = f(u_N^n), \tag{12}$$

with $\bar{u}^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar{u}(t)\,dt$. One may apply the
general convergence theorem. The approximate solutions converge (as $h \to 0$)
towards the solution of (11). In this case, the approximate solutions, as well as
the solution of (11), do not depend on $\bar{\bar{u}}$.
In the case $f' < 0$ the main example is $g = g_0 = g_1$, $g(a, b) = f(b)$, which
also leads to the upstream scheme. The boundary conditions are taken into
account in the following way:

$$f_{1/2}^n = f(u_1^n), \quad f_{N+1/2}^n = f(\bar{\bar{u}}^n), \tag{13}$$

with $\bar{\bar{u}}^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar{\bar{u}}(t)\,dt$.


These simple cases suggest the following scheme for any f, which is the
scalar version of the scheme described in Sect. 2.2 (note that $f'(u)$ is the
Jacobian matrix at point $u \in \mathbb{R}$):
- Boundary condition at x = 0:

$$f_{1/2}^n = \begin{cases} f(\bar{u}^n), & \text{if } f'(u_1^n) > 0,\\ f(u_1^n), & \text{if } f'(u_1^n) < 0. \end{cases} \tag{14}$$

- Boundary condition at x = 1:

$$f_{N+1/2}^n = \begin{cases} f(\bar{\bar{u}}^n), & \text{if } f'(u_N^n) < 0,\\ f(u_N^n), & \text{if } f'(u_N^n) > 0. \end{cases} \tag{15}$$

This solution is not always satisfactory, as can be shown on the following
simple example with the Burgers equation:
Let $f(s) = s^2$, $u_0 = 1$ a.e. on $(0,1)$, $\bar{u} = 1$ a.e. on $\mathbb{R}_+$ and
$\bar{\bar{u}} = -2$ a.e. on $\mathbb{R}_+$. The exact solution which has to be
approached by the numerical scheme is the unique solution of (11) with these
values of $f$, $u_0$, $\bar{u}$ and $\bar{\bar{u}}$. Computing the approximate solution
with (7)-(9), the function $g$ satisfying (C2), and with (14)-(15), leads to an
approximate solution which is constant and equal to 1 for any h and k. Then it does
not converge (as h and k go to 0) towards the exact solution, which is not
constant and equal to 1 since, for the exact solution, a shock wave with a
negative speed starts from the point x = 1 at time t = 0. Indeed, one can also
remark that this approximate solution is the exact solution of (11) with the same
values of $f$, $u_0$, $\bar{u}$ and with any $\bar{\bar{u}}$ satisfying
$\bar{\bar{u}} \ge -1$ a.e. on $\mathbb{R}_+$. In order to obtain a convergent
approximation of the exact solution corresponding to $\bar{\bar{u}} = -2$, a good
choice is, instead of (15), $f_{N+1/2}^n = g_1(u_N^n, -2)$ with $g_1$ satisfying
(C1)-(C3).
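The Burgers example can be reproduced with a short Python experiment. This is a sketch under assumptions that are not in the original text: the interior flux $g$ and the alternative boundary flux $g_1$ are both taken as the Godunov flux for $f(s) = s^2$, the mesh has N = 100 cells, $k/h = 0.1$, and 500 steps are performed (final time t = 0.5). The flux at x = 0 is kept equal to $f(\bar{u}) = 1$, which agrees with (14) as long as $u_1^n > 0$. All function names are hypothetical.

```python
def f(s):
    return s * s


def godunov(a, b):
    """Godunov flux for the convex flux f(s) = s^2."""
    if a <= b:
        return 0.0 if a <= 0.0 <= b else min(f(a), f(b))
    return max(f(a), f(b))


def run(n_cells, n_steps, lam, right_flux):
    """Scheme (7)-(9) with f_{1/2} = f(ubar) = 1 and a user-chosen flux at x = 1."""
    u = [1.0] * n_cells                               # u_0 = 1
    for _ in range(n_steps):
        flux = [f(1.0)]                               # x = 0: ubar = 1, f'(u_1) > 0
        flux += [godunov(u[i - 1], u[i]) for i in range(1, n_cells)]
        flux.append(right_flux(u[-1]))
        u = [u[i] - lam * (flux[i + 1] - flux[i]) for i in range(n_cells)]
    return u


def flux_15(u_n):
    """Rule (15) with the datum -2: only the sign of f'(u_N) = 2 u_N is used."""
    return f(-2.0) if u_n < 0.0 else f(u_n)


def flux_godunov(u_n):
    """Alternative choice g_1(u_N, -2), with g_1 the Godunov flux."""
    return godunov(u_n, -2.0)


u_naive = run(100, 500, 0.1, flux_15)         # stays equal to 1 in every cell
u_good = run(100, 500, 0.1, flux_godunov)     # a shock enters from x = 1
```

With (15) the datum -2 is never seen, since $f'(u_N^n) > 0$ at every step. With the Godunov boundary flux, the state -2 enters the domain behind a shock whose speed, by the Rankine-Hugoniot relation, is $(f(1) - f(-2))/(1 - (-2)) = -1$.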

3.3 A simplified model for two phase flows in pipelines

It is now possible to understand the treatment of the boundary described
in Sect. 2 on a simplified model. This simplified model for two phase flows in
pipelines is given in [12]. For this model, the densities are constant so that there
are no longer pressure waves but only the void fraction wave, corresponding
to the second eigenvalue of the original system (1). It is also easy to see that
for this model, the total flux (that is the sum of the fluxes of the two phases)
is constant in space. One also assumes that this total flux is constant in time
(and positive). System (1) is then reduced to a scalar equation, Equation (5),
where the unknown, $u: (0,1) \times \mathbb{R}_+ \to \mathbb{R}$, is the gas fraction
which takes its values between 0 and 1.
The function f can be taken as $f(s) = as - bs^2$, where $a, b \in \mathbb{R}$ are
given and such that $0 < b < a < 2b$. In (5), the quantity $f(u)$ is the flux of gas.
Then $f(1) - f(u)$ is the flux of liquid. The function f is increasing between
0 and $u_M = a/(2b)$ and decreasing between $u_M$ and 1. An important value is
$u_m \in [0, u_M]$ such that $f(u_m) = f(1)$.
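For this quadratic flux, the value $u_m$ can be made explicit. The following short computation is not part of the original text; it only uses $f(s) = as - bs^2$ and $0 < b < a < 2b$.

```latex
\begin{aligned}
f(u_m) = f(1)
  &\iff a u_m - b u_m^2 = a - b \\
  &\iff (1 - u_m)\bigl(b u_m - (a - b)\bigr) = 0, \\
\text{so that}\quad u_m &= \frac{a}{b} - 1 \quad (\text{the root different from } 1), \\
0 < u_m < 1 &\quad \text{since } b < a < 2b, \\
u_m \le u_M = \frac{a}{2b} &\iff \frac{a}{2b} \le 1, \ \text{which holds since } a < 2b .
\end{aligned}
```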
One takes $u_0 = 0$ a.e. on [0,1] as an initial condition. At x = 0, the gas
flux is given (as in the complete model, see Sect. 2): one takes
$f(u(0, \cdot)) = \bar{f}$ with $\bar{f}(t) = c$ for $t < T$ and $\bar{f}(t) = 0$ for
$t \ge T$, where c and T are given with $c > f(1)$ and T large enough so that $f'$
changes sign at x = 1 during the simulation. Indeed, in this simplified model, it
is also necessary to take T not too large in order to avoid a problem at x = 0
(for T too large, $f'$ will also change sign at x = 0). The boundary condition at
x = 1 will be described on the discrete problem below.
The discretization of the problem is performed as before with (7)-(9), with
$g$ satisfying (C1)-(C3).
For the discretization of the boundary condition at x = 0, the method
described in Sect. 2 leads here to

$$f_{1/2}^n = \bar{f}(nk), \tag{16}$$

which is indeed in accordance with the fact that $f'(u_1^n) > 0$ for all n, at least
if T is not too large.
For the discretization of the boundary condition at x = 1, the first method
described in Sect. 2 and given in (15), using the sign of $f'(u_N^n)$, leads to

$$f_{N+1/2}^n = \begin{cases} f(u_N^n), & \text{if } u_N^n < u_M,\\ f(1), & \text{if } u_N^n > u_M, \end{cases} \tag{17}$$

and does not lead to the desired results. Note also that $f_{N+1/2}^n$, given by (17),
is a discontinuous function of $u_N^n$.
The second method, described in Sect. 2, uses the fact that the liquid flux
cannot be negative at x = 1. Since the liquid flux at x = 1 is
$f(1) - f_{N+1/2}^n$ and since $f(u_m) = f(1)$, this method leads to

$$f_{N+1/2}^n = \begin{cases} f(u_N^n), & \text{if } u_N^n \le u_m,\\ f(u_m), & \text{if } u_N^n > u_m. \end{cases} \tag{18}$$

Note that $f_{N+1/2}^n$, given by (18), is a continuous function of $u_N^n$. We shall
apply the convergence theorem, Theorem 1 given in Sect. 3.1, for the boundary
conditions (16) and (18), and understand the boundary conditions satisfied by
the limit of the approximate solutions. In order to do so, we need to find $g_0$
and $g_1$, satisfying (C1)-(C3), and $\bar{u}, \bar{\bar{u}} \in L^\infty(\mathbb{R}_+)$
such that $f_{1/2}^n$ and $f_{N+1/2}^n$, respectively defined by (16) and (18), satisfy
(10). Indeed, it is shown in [7] that both boundary fluxes $f_{1/2}^n$ and
$f_{N+1/2}^n$ may be expressed with the Godunov flux in the following way:

- Boundary flux at x = 1. One takes $\bar{\bar{u}} = 1$ a.e. on $\mathbb{R}_+$ and $g_1$
equal to the Godunov flux, that is $g_1 = g^G$ with

$$g^G(\alpha, \beta) = \begin{cases} \min\{f(s),\ s \in [\alpha, \beta]\} & \text{if } \alpha \le \beta,\\ \max\{f(s),\ s \in [\beta, \alpha]\} & \text{if } \alpha > \beta. \end{cases}$$

The formula (18) reads

$$f_{N+1/2}^n = g^G(u_N^n, 1) = \begin{cases} f(u_N^n) & \text{if } u_N^n \le u_m,\\ f(1) & \text{if } u_N^n > u_m. \end{cases} \tag{19}$$

- Boundary flux at x = 0. One assumes (for simplicity) that $T/k \in \mathbb{N}$. Let
$\alpha, \beta \in (0,1)$ be such that $\alpha < \beta$ and $f(\alpha) = f(\beta) = c$.
One takes

$$\bar{u}(t) = \begin{cases} \alpha & \text{if } t < T,\\ 0 & \text{if } t > T, \end{cases} \tag{20}$$

so that, recalling that $\bar{u}^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar{u}(t)\,dt$,

$$f(\bar{u}^n) = \begin{cases} c & \text{if } nk < T,\\ 0 & \text{if } nk \ge T. \end{cases}$$

Then, if $u_1^n \le \beta$, the formula (16) reads

$$f_{1/2}^n = g^G(\bar{u}^n, u_1^n), \tag{21}$$

since, in this case, $g^G(\bar{u}^n, u_1^n) = f(\bar{u}^n)$. The fact that
$u_1^n \le \beta$ is true for all n if T is not too large. If T is too large, the
convergence result can be applied with (21) instead of (16).
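The two boundary fluxes (17) and (18), and the identity (19) with the Godunov flux, can be checked numerically. The sketch below assumes the sample values a = 1.5 and b = 1 (which satisfy $0 < b < a < 2b$ but are not taken from the text), uses the explicit value $u_m = a/b - 1$ that follows from $f(u_m) = f(1)$, and relies on the concavity of f to evaluate the minimum and maximum in the Godunov flux. The function names are hypothetical.

```python
A, B = 1.5, 1.0            # hypothetical values with 0 < b < a < 2b


def f(s):
    return A * s - B * s * s


U_M = A / (2.0 * B)        # maximiser of f
U_m = A / B - 1.0          # root of f(u) = f(1) other than 1


def flux_17(u_n):
    """First method (17): switch on the sign of f'(u_N)."""
    return f(u_n) if u_n < U_M else f(1.0)


def flux_18(u_n):
    """Second method (18): the liquid flux f(1) - f_{N+1/2} stays nonnegative."""
    return f(u_n) if u_n <= U_m else f(U_m)


def godunov(a, b):
    """Godunov flux for the concave flux f."""
    lo, hi = min(a, b), max(a, b)
    if a <= b:
        return min(f(lo), f(hi))               # concave: minimum at an endpoint
    return f(U_M) if lo <= U_M <= hi else max(f(lo), f(hi))
```

Evaluating `godunov(u, 1.0)` on a grid of values of u in [0, 1] gives the same numbers as `flux_18(u)`, which is the content of (19), whereas `flux_17` jumps from $f(u_M)$ to $f(1)$ at $u = u_M$.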

It is now possible to apply Theorem 1. Let $L$ be a common Lipschitz constant
for $g$ and $g^G$ (on $[0,1]^2$) and let $\zeta > 0$. If $k \le (1-\zeta)\frac{h}{L}$,
the approximate solution $u_{h,k}$, that is the solution defined by (7)-(9), with
the boundary fluxes (19)-(21) (and $u_0 = 0$, $\bar{\bar{u}} = 1$ and $\bar{u}$ given by
(20)), takes its values in [0,1] and converges towards the unique solution of (22)
in $L^p_{loc}([0,1] \times \mathbb{R}_+)$ for any $1 \le p < \infty$, as $h \to 0$:

$$\begin{aligned}
& u \in L^\infty((0,1) \times (0,\infty)),\\
& \int_0^\infty\!\!\int_0^1 \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)(f(u)-f(\kappa))\varphi_x\big]\,dx\,dt\\
& \quad + M\int_0^\infty (\bar{u}(t)-\kappa)^\pm \varphi(0,t)\,dt + M\int_0^\infty (1-\kappa)^\pm \varphi(1,t)\,dt\\
& \quad + \int_0^1 (0-\kappa)^\pm \varphi(x,0)\,dx \ge 0,\\
& \forall \kappa \in [0,1],\ \forall \varphi \in C_c^1([0,1] \times [0,\infty), \mathbb{R}_+).
\end{aligned} \tag{22}$$

If $u$, solution of (22), is regular enough on $[0,1] \times (0,T)$, then it is
possible to prove that $u$ satisfies the boundary conditions, for $0 < t < T$, in the
following sense (see [13] and [7]):

- Boundary condition at x = 0 (recall that $\bar{u}$ is given by (20)):
$u(0,t) = \alpha$ or $u(0,t) \ge \beta$. In fact, if T is not too large, one has
$u(0,t) = \alpha$.
- Boundary condition at x = 1: $u(1,t) \le u_m$ or $u(1,t) = 1$.

Thanks to Theorem 1, it is possible to give other choices for $f_{N+1/2}^n$ for which
the approximate solutions obtained with this new choice of $f_{N+1/2}^n$ converge
towards the same function $u$, which is the unique solution of (22). Indeed, let
$\mathsf{h}: [0,1] \to \mathbb{R}$ be a nondecreasing function (not to be confused
with the mesh size h) such that $\mathsf{h} \le f$ and $\mathsf{h}(1) = f(1)$
and take:

$$f_{N+1/2}^n = \mathsf{h}(u_N^n). \tag{23}$$

One may construct a function $g_1$ satisfying (C1)-(C3) such that
$\mathsf{h}(s) = g_1(s, 1)$, for all $s \in [0,1]$, and then use Theorem 1. Let $L$ be a
common Lipschitz constant for $g$, $g^G$ and $g_1$ (on $[0,1]^2$) and let $\zeta > 0$.
If $k \le (1-\zeta)\frac{h}{L}$, the approximate solution $u_{h,k}$, that is the
solution defined by (7)-(9), with the boundary fluxes (23) and (21) (and $u_0 = 0$,
$\bar{\bar{u}} = 1$ and $\bar{u}$ given by (20)), takes its values in [0,1] and
converges towards the unique solution of (22) in
$L^p_{loc}([0,1] \times \mathbb{R}_+)$ for any $1 \le p < \infty$, as $h \to 0$.
Turning back to the complete system described in Sect. 2, the analysis of
this simplified model for two phase flows in pipelines may also suggest another
way to take into account the boundary condition at x = 1 (with a given
numerical flux $g_1$):

1. Compute $DF(w_N^n)$, its eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ and a basis
of $\mathbb{R}^3$, $\{\varphi_1, \varphi_2, \varphi_3\}$, such that
$DF(w_N^n)\varphi_i = \lambda_i\varphi_i$, i = 1, 2, 3.
2. Write $w_N^n$ on the basis $\{\varphi_1, \varphi_2, \varphi_3\}$, namely
$w_N^n = \alpha_1\varphi_1 + \alpha_2\varphi_2 + \alpha_3\varphi_3$.
3. Since $\lambda_3 < 0$ and since one wants $Q_l \ge 0$, compute
$w_{N+1}^n = \beta_1\varphi_1 + \beta_2\varphi_2 + \beta_3\varphi_3$ and
$F_{N+1/2}^n = g_1(w_N^n, w_{N+1}^n)$ with the following 3 conditions on the
components of $w_{N+1}^n$: usual condition on the pressure, $\beta_3 = \alpha_3$ and
$R_{N+1}^n = 1$, where $R_{N+1}^n$ is the gas fraction computed with $w_{N+1}^n$.

4 Two phase flow in a porous medium

A second example is given by the modelling of a two phase flow, oil and
water (for instance), in a porous medium. Phases are immiscible. Compressibility
and capillarity effects are neglected. The model is obtained using the
conservation of mass for each phase and Darcy's law. This study is limited to
the one dimensional case. In this case the pressure can be eliminated and the
problem is reduced to a single equation, namely (5) with:

$$f(u) = \frac{f_1(u)\,(\alpha + \beta f_2(u))}{f_1(u) + f_2(u)}. \tag{24}$$

The unknown is the saturation of one phase, say water, and is denoted
by $u$. The quantity $\alpha$ is the total flux, which is constant in space, thanks
to the incompressibility of the phases. One assumes also that it is constant
in time and positive. The quantity $\beta$ is the difference between the densities
of the phases. The functions $f_1$ and $f_2$ are the mobilities of the phases. The
function $f_1$ is nondecreasing, regular and satisfies $f_1(0) = 0$. The function
$f_2$ is nonincreasing, regular and satisfies $f_2(1) = 0$. The function $f_1 + f_2$
is bounded from below by a positive number.
Remark 5. For the equivalent two or three dimensional model, the pressure
cannot be eliminated and the resulting model is a coupled system of two partial
differential equations and two unknowns (pressure and saturation). The
problem solved by the limit of the approximate solutions is then
much more complicated to determine. See [6] for a partial study of this question.
Here again, an initial condition is prescribed, namely (6), with
$u_0 \in L^\infty((0,1))$, $0 \le u_0 \le 1$ a.e. The boundary condition will be given
later.
The numerical scheme is as in Sect. 3.1; it is given by (7) and (8) with (9).
The choice of the numerical flux, $g$, satisfying (C1)-(C3), is usually given, for
this model, using an "upwinding phase by phase", that is (see [2], for instance):

$$g(a, b) = \begin{cases} \dfrac{f_1(a)\,(\alpha + \beta f_2(a))}{f_1(a) + f_2(a)} & \text{if } -\alpha + \beta f_1(a) \le 0,\\[2ex] \dfrac{f_1(a)\,(\alpha + \beta f_2(b))}{f_1(a) + f_2(b)} & \text{if } -\alpha + \beta f_1(a) > 0. \end{cases} \tag{25}$$
Let us then define $f_{1/2}^n$ and $f_{N+1/2}^n$. One considers here the case of an
injection of pure water at x = 0. Then:

$$f_{1/2}^n = \alpha, \quad n \ge 0. \tag{26}$$

At x = 1, the boundary condition is quite complicated. A simple example
is (see [7] for a more complete study):

$$f_{N+1/2}^n = \frac{f_1(u_N^n)\,\alpha}{f_1(u_N^n) + f_2(u_N^n)}. \tag{27}$$

Then the approximate solution is given with (7)-(9), $g$ given by (25), and
(26)-(27).
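The flux (24), the phase-by-phase upwinding (25) and the boundary fluxes (26)-(27) can be sketched in Python as follows. The mobilities $f_1(u) = u^2$ and $f_2(u) = (1-u)^2$ and the values $\alpha = 1$, $\beta = 3$ are assumptions made only for this illustration (they satisfy the hypotheses stated above, with $\beta f_1(1) > \alpha$); they do not come from the text, and the function names are hypothetical.

```python
ALPHA, BETA = 1.0, 3.0     # hypothetical total flux and density difference


def f1(u):
    """Assumed water mobility: nondecreasing, f1(0) = 0."""
    return u * u


def f2(u):
    """Assumed oil mobility: nonincreasing, f2(1) = 0."""
    return (1.0 - u) ** 2


def f(u):
    """Flux (24)."""
    return f1(u) * (ALPHA + BETA * f2(u)) / (f1(u) + f2(u))


def g(a, b):
    """Phase-by-phase upwinding (25): f2 is evaluated at a or at b,
    depending on the sign of -alpha + beta * f1(a)."""
    m2 = f2(a) if -ALPHA + BETA * f1(a) <= 0.0 else f2(b)
    return f1(a) * (ALPHA + BETA * m2) / (f1(a) + m2)


def left_flux():
    """(26): injection of pure water at x = 0."""
    return ALPHA


def right_flux(u_n):
    """(27): simple example of boundary flux at x = 1."""
    return f1(u_n) * ALPHA / (f1(u_n) + f2(u_n))
```

By construction `g(s, s)` coincides with `f(s)`, which is condition (C2), and `f(1.0)` equals `ALPHA`, in agreement with $f(1) = \alpha$.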

In order to prove that the approximate solutions converge, as h and k go
to zero, and to determine the problem for which the limit of the approximate
solutions is the unique solution, one proceeds as in Sect. 3.3. One has to find
$g_0$ and $g_1$ satisfying (C1)-(C3) and
$\bar{u}, \bar{\bar{u}} \in L^\infty(\mathbb{R}_+)$ such that $f_{1/2}^n$ and
$f_{N+1/2}^n$, respectively defined by (26) and (27), satisfy (10). This is again
performed in [7]. The most interesting case is obtained for $\beta f_1(1) > \alpha$
and when the function $f$ is increasing on $(0, u_M)$ and decreasing on $(u_M, 1)$, as
in Sect. 3.3. In fact, the main point is the existence of a unique $u_m \in (0,1)$
such that $f(u_m) = f(1) = \alpha$ and that $f$ is increasing on $[0, u_m]$ and greater
or equal to $\alpha$ on $[u_m, 1]$. Then it is quite easy to prove that (26) yields

$$f_{1/2}^n = \alpha = g^G(u_m, u_1^n),$$

where $g^G$ is the Godunov flux given in Sect. 3.3.
For the boundary condition at x = 1, it is possible to construct (see [7]) a
function $g_1: [0,1]^2 \to \mathbb{R}$ satisfying (C1)-(C3) such that (27) gives:

$$f_{N+1/2}^n = g_1(u_N^n, 1).$$

It is now possible to use Theorem 1.
Let $L$ be a common Lipschitz constant for $g$ (given by (25)), $g^G$ and $g_1$ (on
$[0,1]^2$) and let $\zeta > 0$. If $k \le (1-\zeta)\frac{h}{L}$, the approximate
solution $u_{h,k}$, that is the solution defined by (7)-(9) (with $g$ given by (25)),
and by the boundary fluxes (26)-(27), takes its values in [0,1] and converges
towards the unique solution of (28) in $L^p_{loc}([0,1] \times \mathbb{R}_+)$ for any
$1 \le p < \infty$, as $h \to 0$:
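
$$\begin{aligned}
& u \in L^\infty((0,1) \times (0,\infty)),\\
& \int_0^\infty\!\!\int_0^1 \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)(f(u)-f(\kappa))\varphi_x\big]\,dx\,dt\\
& \quad + M\int_0^\infty (u_m-\kappa)^\pm \varphi(0,t)\,dt + M\int_0^\infty (1-\kappa)^\pm \varphi(1,t)\,dt\\
& \quad + \int_0^1 (u_0-\kappa)^\pm \varphi(x,0)\,dx \ge 0,\\
& \forall \kappa \in [0,1],\ \forall \varphi \in C_c^1([0,1] \times [0,\infty), \mathbb{R}_+),
\end{aligned} \tag{28}$$
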
where $M$ is a bound for $|f'|$ on [0,1] ($f$ is given by (24)). As in Sect. 3.3 it
is possible to give the sense of the boundary condition if $u$ is regular enough.
Indeed, let $u$ be a regular solution of (28). Then $u$ satisfies the boundary
conditions in the sense given by [1], that is:

$$\mathrm{sign}(u(0,t) - u_m)\,(f(u(0,t)) - f(\kappa)) \le 0, \quad \forall \kappa \in [u_m, u(0,t)], \text{ for a.e. } t \in \mathbb{R}_+,$$

$$\mathrm{sign}(u(1,t) - 1)\,(f(u(1,t)) - f(\kappa)) \ge 0, \quad \forall \kappa \in [1, u(1,t)], \text{ for a.e. } t \in \mathbb{R}_+,$$

with $[a, b] = \{ta + (1-t)b,\ t \in [0,1]\}$ and $\mathrm{sign}(s) = 1$ for $s > 0$,
$\mathrm{sign}(s) = -1$ for $s < 0$, $\mathrm{sign}(0) = 0$.
This gives $u(0,t) = u_m$ or $u(0,t) = 1$ and $u(1,t) \le u_m$ or $u(1,t) = 1$.
In particular, at x = 0, one has $f(u(0,t)) = \alpha$ (only water is injected) and,
at x = 1, $f(u(1,t)) < \alpha$ if $u(1,t) < u_m$ (which states that there is some oil
production).

5 The multidimensional scalar case


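In this section, a generalization of Theorem 1 is presented for the multidimensional
scalar case together with a rough sketch of proof. For the sake of
simplicity, one considers d = 2 (the extension to d = 3 is straightforward) and
a flux function under the form $v(x,t)f(u)$, with $\mathrm{div}(v(\cdot, t)) = 0$ (see
[13] for the general case of a flux function $f(x,t,u)$). This leads to the
following equation:

$$u_t + \mathrm{div}(v f(u)) = 0, \quad \text{in } \Omega \times (0,T), \tag{29}$$

where $\Omega$ is a bounded polygonal open set of $\mathbb{R}^2$, $T > 0$,
$f \in C^1(\mathbb{R}, \mathbb{R})$ (or $f: \mathbb{R} \to \mathbb{R}$ Lipschitz
continuous) and $v \in C^1(\mathbb{R}^2 \times [0,T], \mathbb{R}^2)$ with
$\mathrm{div}(v(\cdot, t)) = 0$ in $\mathbb{R}^2$ for all $t \in [0,T]$. The unknown is
$u: \Omega \times (0,T) \to \mathbb{R}$.
Let $u_0 \in L^\infty(\Omega)$ and $\bar{u} \in L^\infty(\partial\Omega \times (0,T))$.
Let $A, B \in \mathbb{R}$ be such that $A \le u_0 \le B$ a.e. on $\Omega$ and
$A \le \bar{u} \le B$ a.e. on $\partial\Omega \times (0,T)$. Following the work
of [10], an entropy weak solution of (29) with the initial condition $u_0$ and the
(weak) boundary condition $\bar{u}$ is a solution of (30):

$$\begin{aligned}
& u \in L^\infty(\Omega \times (0,T)),\\
& \int_0^T\!\!\int_\Omega \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)(f(u)-f(\kappa))\, v \cdot \mathrm{grad}\,\varphi\big]\,dx\,dt\\
& \quad + M\int_0^T\!\!\int_{\partial\Omega} (\bar{u}(x,t)-\kappa)^\pm \varphi(x,t)\,d\gamma(x)\,dt\\
& \quad + \int_\Omega (u_0-\kappa)^\pm \varphi(x,0)\,dx \ge 0,\\
& \forall \kappa \in [A,B],\ \forall \varphi \in C_c^1(\overline{\Omega} \times [0,T), \mathbb{R}_+),
\end{aligned} \tag{30}$$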

where $d\gamma(x)$ stands for the integration with respect to the one dimensional
Lebesgue measure on the boundary of $\Omega$ and $M$ is such that

$$\|v\|_\infty\,|f(s_1) - f(s_2)| \le M\,|s_1 - s_2|, \quad \forall s_1, s_2 \in [A, B],$$

where $\|v\|_\infty = \sup_{(x,t) \in \Omega \times [0,T]} |v(x,t)|$ (and $|\cdot|$ denotes
here the Euclidean norm in $\mathbb{R}^2$).

Remark 6.

1. If $u$ satisfies the family of inequalities (30), it is possible to prove that
$u$ is a solution of (29) (in a weak form), $u$ satisfies some entropy inequalities
in $\Omega \times (0,T)$, namely
$|u - \kappa|_t + \mathrm{div}(v(f(\max(u,\kappa)) - f(\min(u,\kappa)))) \le 0$ for all
$\kappa \in \mathbb{R}$, but also on the boundary $\partial\Omega$ and at t = 0. $u$
satisfies the initial condition ($u(\cdot, 0) = u_0$) and $u$ satisfies partially the
boundary condition. For instance, if $f' > 0$ and $u$ is regular enough, then
$u(x,t) = \bar{u}(x,t)$ if $x \in \partial\Omega$, $t \in (0,T)$ and
$v(x,t) \cdot n(x,t) < 0$, where $n$ is the outward normal vector to $\partial\Omega$.
2. Let $M \ge 1$. It is interesting to remark that $u$ is solution of (30) if and only
if $u$ is solution of (30) where the term
$\int_\Omega (u_0 - \kappa)^\pm \varphi(x,0)\,dx$ is replaced by
$M\int_\Omega (u_0 - \kappa)^\pm \varphi(x,0)\,dx$.

A sketch of proof of existence and uniqueness of the solution of (30) together
with the convergence of numerical approximations is now given, following [13].
STEP 1: APPROXIMATE SOLUTION. With a quite general mesh of $\Omega$ (with
triangles, for instance), denoted by $\mathcal{T}$, and a time step k, it is possible
to define an approximate solution, denoted by $u_{\mathcal{T},k}$, using some
numerical fluxes (on the edges of the mesh) satisfying conditions similar to
(C1)-(C3) in Sect. 3.1. Under a so-called CFL condition (like
$k \le (1-\zeta)\frac{h}{L}$ in Sect. 3.1), it is easy to prove that
$A \le u_{\mathcal{T},k} \le B$ a.e. on $\Omega \times (0,T)$. Unfortunately, it does
not seem easy to obtain directly a strong compactness result on the family of
approximate solutions (although this strong compactness result is true, as we
shall see below).
STEP 2: WEAK COMPACTNESS. Using only this $L^\infty$ bound on $u_{\mathcal{T},k}$, one
can assume (for convenient subsequences of sequences of approximate solutions)
that $u_{\mathcal{T},k} \to u$, as the mesh size goes to zero (with the CFL
condition), in a "nonlinear weak-* sense" (similar to the convergence towards
Young measures, see [4] for instance), that is
$u \in L^\infty(\Omega \times (0,T) \times (0,1))$ and

$$\int_0^T\!\!\int_\Omega g(u_{\mathcal{T},k}(x,t))\,\varphi(x,t)\,dx\,dt \to \int_0^1\!\!\int_0^T\!\!\int_\Omega g(u(x,t,\alpha))\,\varphi(x,t)\,dx\,dt\,d\alpha,$$

for all $\varphi \in L^1(\Omega \times (0,T))$ and all $g \in C(\mathbb{R}, \mathbb{R})$
(here $g$ denotes any continuous function, not the numerical flux).

STEP 3: PASSING TO THE LIMIT. Using the monotonicity of the numerical
fluxes, the approximate solutions satisfy some discrete entropy inequalities.
Passing to the limit in these inequalities gives that $u$ (defined in Step 2) is
solution of some inequalities which are very similar to (30), namely:

$$\begin{aligned}
& u \in L^\infty(\Omega \times (0,T) \times (0,1)),\\
& \int_0^1\!\!\int_0^T\!\!\int_\Omega \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)(f(u)-f(\kappa))\, v \cdot \mathrm{grad}\,\varphi\big]\,dx\,dt\,d\alpha\\
& \quad + M\int_0^T\!\!\int_{\partial\Omega} (\bar{u}(x,t)-\kappa)^\pm \varphi(x,t)\,d\gamma(x)\,dt\\
& \quad + \int_\Omega (u_0-\kappa)^\pm \varphi(x,0)\,dx \ge 0,\\
& \forall \kappa \in [A,B],\ \forall \varphi \in C_c^1(\overline{\Omega} \times [0,T), \mathbb{R}_+).
\end{aligned} \tag{31}$$

For this step, one chooses $M$ not only greater than the Lipschitz constant
of $\|v\|_\infty f$ on $[A,B]$, but also greater than the Lipschitz constant (on
$[A,B]^2$) of the numerical fluxes associated to the edges of the meshes (the
equivalent of $L$ in Theorem 1). This choice of $M$ is possible since the unique
solution of (30) does not depend on $M$ provided that $M$ is greater than the
Lipschitz constant of $\|v\|_\infty f$ on $[A,B]$ and since it is possible to choose
numerical fluxes (the Godunov flux, for instance) such that the Lipschitz constant
of these numerical fluxes is bounded by the Lipschitz constant of $\|v\|_\infty f$
(then, the present method leads to an existence result with $M$ only greater than
the Lipschitz constant of $\|v\|_\infty f$ on $[A,B]$, passing to the limit on
approximate solutions given with these numerical fluxes).
STEP 4: UNIQUENESS OF THE SOLUTION OF (31). In this step, the "doubling
variables" method of Krushkov is used to prove the uniqueness of the
solution of (31). Indeed, if $u$ and $w$ are two solutions of (31), the doubling
variables method leads to:

$$\begin{aligned}
& \int_0^1\!\!\int_0^1\!\!\int_0^T\!\!\int_\Omega |u(x,t,\alpha) - w(x,t,\beta)|\,\varphi_t\,dx\,dt\,d\alpha\,d\beta\\
& \quad + \int_0^1\!\!\int_0^1\!\!\int_0^T\!\!\int_\Omega (f(\max(u,w)) - f(\min(u,w)))\, v \cdot \mathrm{grad}\,\varphi\,dx\,dt\,d\alpha\,d\beta \ge 0,\\
& \forall \varphi \in C_c^1(\Omega \times [0,T), \mathbb{R}_+).
\end{aligned} \tag{32}$$

Taking $\varphi(x,t) = (T - t)^+$ in (32) (which is, indeed, possible) gives that $u$
does not depend on $\alpha$, $w$ does not depend on $\beta$ and $u = w$ a.e. on
$\Omega \times (0,T)$. As a result, $u$ is also the unique solution of (30).
STEP 5: CONCLUSION. Step 4 gives, in particular, the uniqueness of the
solution of (30). It also implies that the nonlinear weak-* limit of sequences
of approximate solutions is a solution of (30) and, therefore, it guarantees the
existence of the solution of (30). Furthermore, since the nonlinear weak-* limit
of sequences of approximate solutions does not depend on $\alpha$, it is quite easy
to deduce that this limit is "strong" in $L^p(\Omega \times (0,T))$ for any
$p \in [1,\infty)$ (see [4], for instance) and, thanks to the uniqueness of the
limit, the convergence holds without extraction of subsequences.

References
1. Bardos, C., LeRoux, A.Y., Nedelec, J.C. (1979): First order quasilinear equations
with boundary conditions. Comm. Partial Differential Equations, 9, 1017-1034
2. Brenier, Y., Jaffre, J. (1991): Upstream differencing for multiphase flow in reser-
voir simulation. SIAM J. Num. Ana. 28, 685-696
3. Buffard, T., Gallouet, T., Herard, J.M. (2000): A sequel to a rough Godunov
scheme: Application to real gases. Computers and Fluids, 29, 813-847
4. Eymard, R., Gallouet, T., Herbin, R. (2000): Finite volume methods. Handbook
of numerical analysis, Vol. VII, 713-1020. North-Holland, Amsterdam
5. Kagan, A.M., Linnik, Y.V., Rao, C.R. (1973): Characterization Problems in
Mathematical Statistics. Wiley, New York
6. Eymard, R., Gallouet, T. (2003): H-convergence and numerical schemes for ellip-
tic equations. SIAM Journal on Numerical Analysis, 41, Number 2, 539-562
7. Eymard, R., Gallouet, T., Vovelle, J.: Boundary conditions in the numerical ap-
proximation of some physical problems via finite volume schemes. Accepted for
publication in "Journal of CAM"
8. Faille, I., Heintze, E. (1999): A rough finite volume scheme for modeling two
phase flow in a pipeline. Computers and Fluids, 28, 213-241
9. Godunov, S. (1976): Resolution numerique des problemes multidimensionnels de
la dynamique des gaz. Editions de Moscou
10. Otto, F. (1996): Initial-boundary value problem for a scalar conservation law. C.
R. Acad. Sci. Paris Ser. I Math. 8, 729-734
11. Roe, P.L. (1981): Approximate Riemann solvers, parameter vectors, and differ-
ence schemes. J. Compo Phys., 43, 357-372.
12. Patault, S., Q.-H. Tran, Q.H. (1996): Modele et schema numerique du code
TACITE-NPW, tech. report, rapport IFP 42415
13. Vovelle, J. (2002): Convergence of finite volume monotone schemes for scalar
conservation laws on bounded domains. Num. Math., 3, 563-596
Fictitious Domain Methods in Shape
Optimization with Applications in
Free-Boundary Problems

Jaroslav Haslinger 1, Tomas Kozubek 2, Karl Kunisch 3 and Gunter Peichl 3

1 Charles University, Prague, Czech Republic, [email protected]


2 Technical University of Ostrava, Czech Republic, [email protected]
3 University of Graz, Austria, [email protected], [email protected]

Summary. This paper deals with a class of 2D shape optimization problems with
a 'flux' cost functional and a fictitious domain formulation of state constraints. These
constraints are given by nonhomogeneous Dirichlet boundary problems in bounded,
doubly connected domains. This approach is used for the numerical realization of
free-boundary problems of Bernoulli type.

1 Introduction
This paper deals with a particular shape optimization problem with a fictitious
domain (FD) formulation of the state equation. Solvers based on FD
formulations are nowadays among the efficient tools for solving the large
scale algebraic systems arising from discretizations of state problems. Because
the new FD problem is solved in a domain $\hat\Omega$ with a simple shape
(a box, e.g.), one can construct uniform partitions of $\hat\Omega$ and consequently
use fast solvers and special preconditioning techniques. FD solvers have
additional advantages when used in shape optimization. To see this, let us
recall the standard approach in shape optimization, which is based on boundary
variations of admissible domains. Let us suppose that a linear state problem is
solved by a standard finite element method and a gradient type method is used
for the minimization of the cost functional. Then the following steps have to be
performed after every change of the shape: (i) remeshing the new configuration;
(ii) assembling the new stiffness matrix and the right-hand side of the linear
algebraic system; (iii) solving this new system. As a result the computational
process is not efficient. As we shall see, FD solvers utilizing nonfitted meshes
completely avoid step (i) and partially avoid step (ii) since the stiffness matrix
remains the same for every admissible domain. The FD formulation that we use
in this paper is based on the dualization of Dirichlet conditions by boundary
Lagrange multipliers ([6], [8]). It turns out that this variant is appropriate in
shape optimization problems for the following reasons: the Lagrange multiplier,
being part of the solution, is equal to the conormal derivative on the searched
boundary of the solution to the original state problem. The conormal derivative
of the state appears in expressions for the shape derivative of cost functionals
([18]). In addition, the computed Lagrange multiplier is used when evaluating
our particular cost functional.
The paper is organized as follows. In Section 2 the optimal shape design
problem with the "flux" cost functional is defined and analyzed. State prob-
lems are given by Dirichlet boundary value problems in doubly connected,
bounded domains. To avoid difficulties with shape dependent function spaces
and for numerical purposes we use the above mentioned FD formulation of
the state problems. We introduce appropriate assumptions on the family of
admissible domains which guarantee the existence of optimal shapes. Conti-
nuity of solutions to the FD formulation with respect to domain variations
which plays a key role in the existence analysis is the main result of this sec-
tion. To avoid the evaluation of the dual trace norm which defines the cost
functional we give an alternative setting of the problem in which the standard
$H_0^1(\hat\Omega)$-norm is used for expressing the cost functional. Section 2.5 deals with
computational aspects of this approach. We present a finite element discretization
of the FD formulation. Then we briefly describe the modified controlled
random search (MCRS) method, i.e. the gradient free global minimization
method of the stochastic type which will be used for minimizing the cost functional.
For more details on computational aspects we refer to [11] and [9].
Finally, in Section 3 several Bernoulli free-boundary problems will be solved
using our approach. Both the gradient and MCRS methods will be used for
the numerical minimization of the cost functional.

2 Setting of the problem, existence analysis


In what follows we shall consider the following optimal shape design problem:

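$$\min_{\omega \in \mathcal{O}} \frac{1}{2} \left\| \frac{\partial u(\omega)}{\partial \nu_A} - Q \right\|^2_{-1/2,\Gamma_f(\omega)}, \quad Q \in \mathbb{R}^1, \qquad (\mathbb{P})$$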

where:
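- $\mathcal{O}$ is a family of admissible domains which consists of doubly connected
domains in $\mathbb{R}^2$ contained in a box $\hat\Omega$. The components of the boundary $\partial\omega$
are denoted by $\Gamma_0$ and $\Gamma_f(\omega)$. We shall suppose that $\Gamma_0$ is fixed and the
same for all $\omega \in \mathcal{O}$ while $\Gamma_f(\omega)$ is variable and defines the shape of $\omega$ (in our
presentation $\Gamma_f(\omega)$ is exterior to $\Gamma_0$ but one can also consider the opposite
situation);
- $\partial u(\omega)/\partial\nu_A$ denotes the conormal derivative of $u$ on $\Gamma_f(\omega)$, where $u$ solves the
Dirichlet state problem in $\omega$:

$$\begin{cases} -\operatorname{div}(A\nabla u(\omega)) = f & \text{in } \omega, \\ u(\omega) = g & \text{on } \Gamma_0, \\ u(\omega) = 0 & \text{on } \Gamma_f(\omega), \end{cases} \qquad (\mathcal{P}(\omega))$$

with $f \in L^2(\hat\Omega)$, $g \in H^{1/2}(\Gamma_0)$, $A \in L^\infty(\hat\Omega; \mathbb{R}^{2 \times 2})$ satisfying:

$$\exists \alpha > 0: \quad A(x)\xi \cdot \xi \ge \alpha \|\xi\|^2 \quad \forall \xi \in \mathbb{R}^2, \ \text{a.e. in } \hat\Omega.$$

The symbol $\|\cdot\|_{-1/2,\Gamma_f(\omega)}$ stands for the dual norm in $H^{-1/2}(\Gamma_f(\omega))$.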

Remark 1. Problem $(\mathbb{P})$ is closely related to Bernoulli free-boundary problems.
Indeed, let there exist $\omega^* \in \mathcal{O}$ such that

$$J(\omega^*) := \frac{1}{2} \left\| \frac{\partial u(\omega^*)}{\partial \nu_A} - Q \right\|^2_{-1/2,\Gamma_f(\omega^*)} = 0.$$

Then simultaneously $u(\omega^*) = 0$ and $\partial u(\omega^*)/\partial\nu_A = Q$ on $\Gamma_f(\omega^*)$, which is a typical
system of boundary conditions satisfied on free boundaries. A shape optimization
approach can be used for the numerical realization of this type of problems.
To illustrate how we proceed let us consider the exterior Bernoulli free-boundary
problem (see [3]): for $Q < 0$ given, find $\omega^* \in \mathcal{O}$ and $u^*: \omega^* \to \mathbb{R}^1$
such that

$$\begin{cases} \Delta u^* = 0 & \text{in } \omega^*, \\ u^* = 1 & \text{on } \Gamma_0, \\ u^* = 0, \ \dfrac{\partial u^*}{\partial \nu} = Q & \text{on } \Gamma_f(\omega^*). \end{cases} \qquad (1)$$

For $\omega \in \mathcal{O}$ given a priori, problem (1) is ill-posed due to the two conditions on
$\Gamma_f(\omega)$. A possible way to solve (1) is to drop one of the conditions on
$\Gamma_f(\omega)$ (say $\partial u^*/\partial\nu = Q$). The rest of the problem is now well posed and defines
the state problem $(\mathcal{P}(\omega))$. The remaining Neumann condition on $\Gamma_f(\omega)$ will be
satisfied by minimizing the cost functional $J$. If $\omega_{opt} \in \mathcal{O}$ is a solution to $(\mathbb{P})$
such that $J(\omega_{opt}) = 0$ then $\omega^* = \omega_{opt}$ solves (1). Let us mention however that
the assumptions on $\mathcal{O}$ which are specified below do not automatically ensure
$J(\omega_{opt}) = 0$.
Next we shall closely follow [10] where detailed proofs of all results can be
found.
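
As a concrete illustration of the exterior problem (1), consider a radially symmetric configuration, which is an assumption made here only for the example and is not one of the test cases of the paper: take $\Gamma_0$ to be the circle of radius `r0` and look for a circular free boundary of radius `R`. The harmonic state is then $u(r) = \ln(R/r)/\ln(R/r_0)$, and the Neumann condition reduces to the scalar equation $R \ln(R/r_0) = -1/Q$. The sketch below (function names are hypothetical) solves it by bisection.

```python
import math


def free_boundary_radius(q, r0=1.0, tol=1e-12):
    """Radius R of the circular free boundary for the exterior Bernoulli
    problem when the fixed boundary is the circle |x| = r0 (assumed,
    radially symmetric test case).

    The harmonic state is u(r) = ln(R/r) / ln(R/r0), so the outward normal
    derivative on |x| = R equals -1 / (R ln(R/r0)). Setting it equal to
    q < 0 gives the scalar equation R ln(R/r0) = -1/q.
    """
    if q >= 0:
        raise ValueError("the exterior problem requires q < 0")
    target = -1.0 / q

    def residual(r):
        # Strictly increasing for r > r0, negative at r = r0.
        return r * math.log(r / r0) - target

    lo, hi = r0, 2.0 * r0
    while residual(hi) < 0.0:      # bracket the unique root
        hi *= 2.0
    while hi - lo > tol * hi:      # plain bisection
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def normal_derivative(radius, r0=1.0):
    """Outward normal derivative of u on the circle of the given radius."""
    return -1.0 / (radius * math.log(radius / r0))


R = free_boundary_radius(-1.0)
print(R, normal_derivative(R))
```

In this symmetric setting the cost functional vanishes exactly at the computed radius, so it can serve as a simple sanity check for a shape optimization code before moving to general geometries.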

2.1 Parametrization of shapes

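We shall suppose that the outer components $\Gamma_f(\omega)$ of the boundary $\partial\omega$, $\omega \in \mathcal{O}$,
are parametrized by $2\pi$-periodic functions $\gamma: [0, 2\pi] \to \mathbb{R}^2$. We shall use the
following notation: $\Gamma_f(\omega) := \Gamma_\gamma$, meaning that $\Gamma_f(\omega)$ is the range of $\gamma$. Further,
$\omega_\gamma$ denotes the doubly connected domain between $\Gamma_0$ and $\Gamma_\gamma$. The family of
admissible domains can be specified by a class $S$ to which $\gamma$ belongs.
Definition 1. A function $\gamma: [0, 2\pi] \to \mathbb{R}^2$ belongs to $S$ if:
(S1) $\gamma \in C^2_{2\pi}$, i.e. $\gamma$ is $2\pi$-periodic, twice continuously differentiable on $[0, 2\pi]$;
(S2) $\exists a, \gamma_1, \gamma_2 > 0 \ \forall t \in [0, 2\pi]$: $|\gamma'(t)| \ge a$, $\|\gamma'\|_\infty \le \gamma_1$, $\|\gamma''\|_\infty \le \gamma_2$;
(S3) $\gamma$ is positively oriented;
(S4) $\omega_\gamma \subset \hat\Omega = (0, 1)^2$;
(S5) $\exists d > 0$: $\operatorname{dist}(\Gamma_0, \Gamma_\gamma) \ge d$, $\operatorname{dist}(\Gamma_\gamma, \partial\hat\Omega) \ge d$;
(S6) $\exists h > 0 \ \forall \gamma \in S \ \forall t \in [0, 2\pi]$: $\exists B_i, B_o$ open discs of radius $h$ such that
$B_i \subset \omega_\gamma$, $B_o \subset \hat\Omega \setminus \bar\omega_\gamma$, $\gamma(t) \in \bar B_i \cap \bar B_o$.

Remark 2. Assumption (S6) means that there exists a tubular neighborhood
of $\Gamma_\gamma$ of uniform thickness for all $\gamma \in S$ (see Fig. 1).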

Fig. 1. Geometry of the Bernoulli problem

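2.2 The fictitious domain formulation of $(\mathcal{P}(\omega_\gamma))$

The unit square $\hat\Omega$ which contains $\omega_\gamma$ for all $\gamma \in S$ will serve as the fictitious
domain in the FD formulation. Let us introduce the following spaces:

$$\hat V_g = \{ v \in H_0^1(\hat\Omega) \mid v = g \text{ on } \Gamma_0, \ v = 0 \text{ on } \Gamma_\gamma \}, \qquad \hat V_0 := \hat V_g \text{ with } g = 0.$$

Instead of $(\mathcal{P}(\omega_\gamma))$, $\gamma \in S$ fixed, we formulate the following problem on $\hat\Omega$:

$$\text{Find } \hat u \in \hat V_g: \quad a(\hat u, v) = (\hat f, v)_{0,\hat\Omega} \quad \forall v \in \hat V_0, \qquad (2)$$

where

$$a(\hat u, v) := \int_{\hat\Omega} A \nabla \hat u \cdot \nabla v \, dx$$

and $\hat f$ is the zero extension of $f$ from $\omega_\gamma$ into $\hat\Omega \setminus \bar\omega_\gamma$.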


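Remark 3. It would be possible to use any continuous extension of $f$ outside
of $\omega_\gamma$ but the zero extension is of a special significance.
It is easy to see that (2) has a unique solution $\hat u$. In addition, $\hat u|_{\omega_\gamma}$ solves
$(\mathcal{P}(\omega_\gamma))$.
Now we may look at the conditions on $\Gamma_0$ and $\Gamma_\gamma$ as constraints which
will be released by means of the Lagrange multipliers belonging to $H^{-1/2}(\Gamma_0)$,
$H^{-1/2}(\Gamma_\gamma)$, respectively. The FD formulation of $(\mathcal{P}(\omega_\gamma))$ reads as follows:

Find $(\hat u, \lambda_0, \lambda_\gamma) \in H_0^1(\hat\Omega) \times H^{-1/2}(\Gamma_0) \times H^{-1/2}(\Gamma_\gamma)$ s.t.

$$\begin{cases} a(\hat u, v) - \langle \lambda_0, \tau_0 v \rangle_{\Gamma_0} - \langle \lambda_\gamma, \tau_\gamma v \rangle_{\Gamma_\gamma} = (\hat f, v)_{0,\hat\Omega} & \forall v \in H_0^1(\hat\Omega), \\ \langle \mu_0, \tau_0 \hat u \rangle_{\Gamma_0} + \langle \mu_\gamma, \tau_\gamma \hat u \rangle_{\Gamma_\gamma} = \langle \mu_0, g \rangle_{\Gamma_0} & \forall (\mu_0, \mu_\gamma) \in H^{-1/2}(\Gamma_0) \times H^{-1/2}(\Gamma_\gamma). \end{cases} \qquad (\hat{\mathcal{P}}(\omega_\gamma))$$

Here $\tau_0: H_0^1(\hat\Omega) \to H^{1/2}(\Gamma_0)$, $\tau_\gamma: H_0^1(\hat\Omega) \to H^{1/2}(\Gamma_\gamma)$ stand for the trace
mappings and $\langle \cdot, \cdot \rangle_{\Gamma_0}$, $\langle \cdot, \cdot \rangle_{\Gamma_\gamma}$ denote the respective duality pairings. Using
results of [2] one can easily prove
Theorem 1. There exists a unique solution $(\hat u, \lambda_0, \lambda_\gamma)$ to $(\hat{\mathcal{P}}(\omega_\gamma))$. In addition,
$u := \hat u|_{\omega_\gamma}$ solves $(\mathcal{P}(\omega_\gamma))$ and $\lambda_\gamma = \partial u/\partial\nu_A$ on $\Gamma_\gamma$.

Remark 4. The equality $\lambda_\gamma = \partial u/\partial\nu_A$ on $\Gamma_\gamma$ is due to the zero extension of $f$
outside of $\omega_\gamma$. For any other extension one only knows that $\lambda_\gamma = [\partial \hat u/\partial\nu_A]_{\Gamma_\gamma}$,
where $[\,\cdot\,]_{\Gamma_\gamma}$ denotes the jump of the corresponding quantity across $\Gamma_\gamma$.

The previous result motivates us to consider $(\mathbb{P})$ in the following form:

$$\min_{\gamma \in S_L} \frac{1}{2} \| \lambda_\gamma - Q \|^2_{-1/2,\Gamma_\gamma}, \qquad (\hat{\mathbb{P}})$$

where $\lambda_\gamma \in H^{-1/2}(\Gamma_\gamma)$ is the last component of the solution to $(\hat{\mathcal{P}}(\omega_\gamma))$ and
$S_L \subseteq S$ is a compact subset of $S$.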

2.3 Periodic function spaces on $[0, 2\pi]$

To prove the existence of optimal shapes in $(\hat{\mathbb{P}})$ one has to show that the
solutions to $(\hat{\mathcal{P}}(\omega_\gamma))$ depend continuously on variations of $\gamma \in S$. One of the
difficulties that we face in any shape optimization problem is the fact that the
functions have their own, variable domain of definition. To handle this difficulty
we pass to reference, shape independent function spaces. This will be done for
the trace space $H^{1/2}(\Gamma_\gamma)$ and its dual $H^{-1/2}(\Gamma_\gamma)$. Using the parametrization
of $\Gamma_\gamma$ we shall replace $H^{1/2}(\Gamma_\gamma)$ by the reference space $H^{1/2}_{2\pi}$ and $H^{-1/2}(\Gamma_\gamma)$
by $H^{-1/2}_{2\pi}$.
The space of $2\pi$-periodic, square integrable functions will be denoted by
$L^2_{2\pi}$. The reference trace space $H^{1/2}_{2\pi}$ is defined as follows (see [13]):

$$H^{1/2}_{2\pi} = \{ \varphi \in L^2_{2\pi} \mid \|\varphi\|_{1/2,2\pi} < \infty \},$$

where

$$\|\varphi\|^2_{1/2,2\pi} := \|\varphi\|^2_{L^2_{2\pi}} + \int_0^{2\pi}\!\!\int_0^{2\pi} \frac{|\varphi(t) - \varphi(s)|^2}{|\sin((t - s)/2)|^2} \, dt \, ds.$$

Next we shall define the trace mapping $\hat\tau_\gamma$ from $H_0^1(\hat\Omega)$ into the reference space
$H^{1/2}_{2\pi}$. To this end we introduce the following spaces on $\Gamma_\gamma$, $\gamma \in S$:

$$H^{1/2}_p(\Gamma_\gamma) = \{ \varphi \in L^2(\Gamma_\gamma) \mid \varphi \circ \gamma \in H^{1/2}_{2\pi} \}$$

with the norm

$$\|\varphi\|_{1/2,p} := \|\varphi \circ \gamma\|_{1/2,2\pi}$$

and the standard Sobolev space $H^{1/2}(\Gamma_\gamma)$ endowed with the norm ([15]):

$$\|\varphi\|^2_{1/2,\gamma} := \|\varphi\|^2_{L^2(\Gamma_\gamma)} + \int_{\Gamma_\gamma}\!\int_{\Gamma_\gamma} \frac{|\varphi(x) - \varphi(y)|^2}{|x - y|^2} \, ds_x \, ds_y = \int_0^{2\pi} |\varphi \circ \gamma|^2 |\gamma'| \, dt + \int_0^{2\pi}\!\!\int_0^{2\pi} \frac{|\varphi \circ \gamma(t) - \varphi \circ \gamma(s)|^2 \, |\gamma'(t)| \, |\gamma'(s)|}{|\gamma(t) - \gamma(s)|^2} \, dt \, ds,$$

making use of the parametrization of $\Gamma_\gamma$. The relation between $H^{1/2}_p(\Gamma_\gamma)$ and
$H^{1/2}(\Gamma_\gamma)$ follows from the next lemma.
Lemma 1. The spaces $H^{1/2}_p(\Gamma_\gamma)$ and $H^{1/2}(\Gamma_\gamma)$ coincide as sets and are topologically
equivalent, uniformly with respect to $\gamma \in S$, i.e. there exist positive
constants $c_1$, $c_2$ such that

$$c_1 \|\varphi\|_{1/2,\gamma} \le \|\varphi \circ \gamma\|_{1/2,2\pi} \le c_2 \|\varphi\|_{1/2,\gamma}$$

holds for every $\varphi \in H^{1/2}(\Gamma_\gamma)$ and every $\gamma \in S$.

Let $i_\gamma: H^{1/2}(\Gamma_\gamma) \to H^{1/2}_p(\Gamma_\gamma)$ be the identity mapping and $I_\gamma:
H^{1/2}_p(\Gamma_\gamma) \to H^{1/2}_{2\pi}$ be an isometry defined by $I_\gamma(\varphi) := \varphi \circ \gamma$. Then the trace
mapping $\hat\tau_\gamma: H_0^1(\hat\Omega) \to H^{1/2}_{2\pi}$ is given by

$$\hat\tau_\gamma := I_\gamma \circ i_\gamma \circ \tau_\gamma.$$

The trace mapping $\hat\tau_\gamma$ enjoys properties which are useful in the existence analysis
for $(\hat{\mathbb{P}})$.
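Lemma 2. 1) There exists a constant $c > 0$ such that

(3)

2) If $\gamma_n \to \gamma$ in $C^1_{2\pi}$, $\gamma_n, \gamma \in S$, then

$$\hat\tau_{\gamma_n} v \to \hat\tau_\gamma v \quad \text{in } H^{1/2}_{2\pi} \qquad (4)$$

holds for every $v \in H_0^1(\hat\Omega)$.

The mapping $\hat\tau_\gamma$ maps $H_0^1(\hat\Omega)$ onto $H^{1/2}_{2\pi}$. This follows from
Lemma 3. (Inverse trace property) There exists a continuous extension mapping
$E_\gamma: H^{1/2}_{2\pi} \to H_0^1(\hat\Omega)$ such that

$$\hat\tau_\gamma E_\gamma \varphi = \varphi \quad \forall \varphi \in H^{1/2}_{2\pi}$$

and
(5)
where $c > 0$ does not depend on $\gamma \in S$. In addition, supp $E_\gamma \varphi$ is contained in
the $h$-neighborhood of $\Gamma_\gamma$.
For the proof of the previous lemmas we refer to [10].

In a similar way one can associate with any $\mu_\gamma \in H^{-1/2}(\Gamma_\gamma)$ a unique
element $\hat\mu_\gamma \in H^{-1/2}_{2\pi}$:

where $\hat\mu_\gamma := I_\gamma^{-*} i_\gamma^{-*} \mu_\gamma$ and $\langle \cdot, \cdot \rangle_{2\pi}$ denotes the duality pairing between $H^{-1/2}_{2\pi}$
and $H^{1/2}_{2\pi}$.

This enables us to rewrite problem $(\hat{\mathcal{P}}(\omega_\gamma))$ in the following equivalent
form in which only shape independent function spaces appear:

Find $(\hat u, \lambda_0, \hat\lambda_\gamma) \in H_0^1(\hat\Omega) \times H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}$ s.t.

$$\begin{cases} a(\hat u, v) - \langle \lambda_0, \tau_0 v \rangle_{\Gamma_0} - \langle \hat\lambda_\gamma, \hat\tau_\gamma v \rangle_{2\pi} = (\hat f, v)_{0,\hat\Omega} & \forall v \in H_0^1(\hat\Omega), \\ \langle \mu_0, \tau_0 \hat u \rangle_{\Gamma_0} + \langle \hat\mu, \hat\tau_\gamma \hat u \rangle_{2\pi} = \langle \mu_0, g \rangle_{\Gamma_0} & \forall (\mu_0, \hat\mu) \in H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}. \end{cases} \qquad (\hat{\mathcal{P}}(\omega_\gamma))_{ref}$$

Since

we obtain the following transformation of $(\hat{\mathbb{P}})$:

$(\hat{\mathbb{P}})_{ref}$

where $\hat\lambda_\gamma \in H^{-1/2}_{2\pi}$ is the last component of the solution to $(\hat{\mathcal{P}}(\omega_\gamma))_{ref}$.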

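2.4 Existence analysis for $(\hat{\mathbb{P}})_{ref}$

As we have already mentioned, the continuous dependence of solutions to
$(\hat{\mathcal{P}}(\omega_\gamma))_{ref}$ on $\gamma \in S$ plays a central role in the existence analysis.
To prove this property we shall need the following auxiliary results.
Lemma 4. The set of solutions $\{(\hat u_\gamma, \lambda_{0\gamma}, \hat\lambda_\gamma), \ \gamma \in S\}$ to $(\hat{\mathcal{P}}(\omega_\gamma))_{ref}$ is
bounded in $H_0^1(\hat\Omega) \times H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}$.
Lemma 5. (Stability of the Dirichlet boundary condition) Let $\gamma_n \to \gamma$ in $C^1_{2\pi}$,
$\gamma_n, \gamma \in S$ and let $\{v_n\}$ be any sequence in $H_0^1(\hat\Omega)$ such that $\tau_0 v_n = g$, $\hat\tau_{\gamma_n} v_n = 0$.
If $v_n \rightharpoonup v$ in $H_0^1(\hat\Omega)$, then $\tau_0 v = g$ and $\hat\tau_\gamma v = 0$.

Continuity of solutions to $(\hat{\mathcal{P}}(\omega_\gamma))_{ref}$ with respect to $\gamma \in S$ now follows
from the next theorem.
Theorem 2. Let $\gamma_n \to \gamma$ in $C^1_{2\pi}$, $\gamma_n, \gamma \in S$ and $(\hat u_n, \lambda_{0n}, \hat\lambda_n) \in H_0^1(\hat\Omega) \times
H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}$ be the solution to $(\hat{\mathcal{P}}(\omega_{\gamma_n}))_{ref}$. Then

(6)

(7)

Proof. By Lemma 4 we may assume that there exists a subsequence of
$\{(\hat u_n, \lambda_{0n}, \hat\lambda_n)\}$ (denoted by the same symbol) such that

$$(\hat u_n, \lambda_{0n}, \hat\lambda_n) \rightharpoonup (\hat u, \lambda_0, \hat\lambda_\gamma) \quad \text{in } H_0^1(\hat\Omega) \times H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}. \qquad (8)$$

From Lemma 5 it follows that $\tau_0 \hat u = g$, $\hat\tau_\gamma \hat u = 0$ so that the second equation
in $(\hat{\mathcal{P}}(\omega_\gamma))_{ref}$ is satisfied. Let $v \in H_0^1(\hat\Omega)$ be fixed. Then

$$\lim_{n\to\infty} \langle \hat\lambda_n, \hat\tau_{\gamma_n} v \rangle_{2\pi} = \lim_{n\to\infty} \{ a(\hat u_n, v) - \langle \lambda_{0n}, \tau_0 v \rangle_{\Gamma_0} - (\hat f, v)_{0,\hat\Omega} \} = a(\hat u, v) - \langle \lambda_0, \tau_0 v \rangle_{\Gamma_0} - (\hat f, v)_{0,\hat\Omega} \qquad (9)$$

as follows from (8). On the other hand

$$\lim_{n\to\infty} \langle \hat\lambda_n, \hat\tau_{\gamma_n} v \rangle_{2\pi} = \lim_{n\to\infty} \{ \langle \hat\lambda_n, \hat\tau_{\gamma_n} v - \hat\tau_\gamma v \rangle_{2\pi} + \langle \hat\lambda_n, \hat\tau_\gamma v \rangle_{2\pi} \} = \langle \hat\lambda_\gamma, \hat\tau_\gamma v \rangle_{2\pi},$$

making use of (4) and (8). From this and (9) we see that $(\hat u, \lambda_0, \hat\lambda_\gamma)$ solves
$(\hat{\mathcal{P}}(\omega_\gamma))_{ref}$. In addition, the whole sequence in (8) tends to $(\hat u, \lambda_0, \hat\lambda_\gamma)$. Strong
convergence of $\{\hat u_n\}$ follows from

using that $\tau_0 \hat u_n = g$ and $\hat\tau_{\gamma_n} \hat u_n = 0$.

Let $\varphi \in H^{1/2}(\Gamma_0)$ be given and $v \in H_0^1(\hat\Omega)$ be its continuous extension:
$\tau_0 v = \varphi$, $\|v\|_{1,\hat\Omega} \le c \|\varphi\|_{1/2,\Gamma_0}$ with supp $v$ being in a $\delta$-neighborhood of $\Gamma_0$,
$\delta \le d$. Then

yielding strong convergence of $\{\lambda_{0n}\}$ to $\lambda_0$. For the proof of (7), which is more
technical, we refer to [10]. □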

It remains to specify a compact subset $S_L$ of $S$. Let $L > 0$ be given and
define

$$S_L = \{ \gamma \in S \mid |\gamma''(t) - \gamma''(s)| \le L |t - s| \quad \forall t, s \in [0, 2\pi] \}. \qquad (10)$$

Theorem 3. Let $S_L$ be defined by (10). Then $(\hat{\mathbb{P}})_{ref}$ has a solution.

The proof follows from Theorem 2 and continuity of the cost functional.

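The dual norm defining the cost functional is hard to evaluate. For this
reason we use another functional which also measures the distance between
$\partial u/\partial\nu_A$ and the target $Q$ and which is easy to compute. To this end we introduce
the new norm in $H^{1/2}(\Gamma_\gamma)$:

$$[\varphi]_{1/2,\Gamma_\gamma} := \inf_{\substack{v \in H_0^1(\hat\Omega) \\ \tau_\gamma v = \varphi}} |v|_{1,\hat\Omega}. \qquad (11)$$

It is easy to show that the respective dual norm is given by

$$\|\mu\|_{-1/2,\Gamma_\gamma} = |z|_{1,\hat\Omega}, \qquad (12)$$

where $z$ is the solution of the transmission problem:

$$\text{Find } z := z(\mu) \in H_0^1(\hat\Omega) \text{ such that } \quad (\nabla z, \nabla v)_{0,\hat\Omega} = \langle \mu, v \rangle_{\Gamma_\gamma} \quad \forall v \in H_0^1(\hat\Omega). \qquad (\mathcal{A}(\mu))$$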
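
The identity (12) can be checked in a few lines. The following is only a sketch of the standard argument, written in the notation of (11) and (12) and under the assumption that the pairing $\langle \mu, v \rangle_{\Gamma_\gamma}$ in the transmission problem is understood as $\langle \mu, \tau_\gamma v \rangle_{\Gamma_\gamma}$; the detailed analysis is the one referred to in [9], [10].

```latex
% Sketch: the dual of the norm (11) equals the H^1_0-seminorm of z(\mu).
\begin{align*}
\langle \mu, \tau_\gamma v \rangle_{\Gamma_\gamma}
  &= (\nabla z, \nabla v)_{0,\hat\Omega}
   \le |z|_{1,\hat\Omega}\, |v|_{1,\hat\Omega}
   && \forall v \in H_0^1(\hat\Omega), \\
\langle \mu, \varphi \rangle_{\Gamma_\gamma}
  &\le |z|_{1,\hat\Omega} \inf_{\tau_\gamma v = \varphi} |v|_{1,\hat\Omega}
   = |z|_{1,\hat\Omega}\, [\varphi]_{1/2,\Gamma_\gamma}
   && \text{(upper bound for every } \varphi\text{)}, \\
\langle \mu, \tau_\gamma z \rangle_{\Gamma_\gamma}
  &= |z|_{1,\hat\Omega}^2
   \ge |z|_{1,\hat\Omega}\, [\tau_\gamma z]_{1/2,\Gamma_\gamma}
   && \text{(bound attained at } \varphi = \tau_\gamma z\text{)}.
\end{align*}
```

The first line is the Cauchy-Schwarz inequality, the second takes the infimum over all extensions of $\varphi$, and the third uses $[\tau_\gamma z]_{1/2,\Gamma_\gamma} \le |z|_{1,\hat\Omega}$, so the supremum defining the dual norm equals $|z|_{1,\hat\Omega}$.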
This leads to a new definition of the shape optimization problem $(\mathbb{P})$:

$$\min_{\omega \in \mathcal{O}} \frac{1}{2} |z(u(\omega))|^2_{1,\hat\Omega}, \qquad (\tilde{\mathbb{P}})$$

where $z(u(\omega))$ solves $(\mathcal{A}(\mu))$ with $\mu = \partial u/\partial\nu_A - Q$. Observe that $z(u(\omega)) = 0$ if
and only if $\partial u/\partial\nu_A = Q$, i.e. the zero value (if any) of the cost functionals in
$(\mathbb{P})$ and $(\tilde{\mathbb{P}})$ is realized by the same shapes.

The mathematical analysis for $(\tilde{\mathbb{P}})$ is done in [9]. It is worth noticing that
the existence of optimal shapes in $(\tilde{\mathbb{P}})$ is ensured under weaker assumptions on
$\mathcal{O}$. Instead of $C^{2,2}$ in $(\mathbb{P})$ one needs only $C^{1,1}$ boundaries in $(\tilde{\mathbb{P}})$. This is due
to the fact that the cost functional in $(\tilde{\mathbb{P}})$ is expressed by a volume integral.

2.5 Computational aspects

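In this section we briefly describe the discretization of $(\tilde{\mathbb{P}})$. For more details we
refer to [9]. Instead of the family of admissible domains $\mathcal{O}$ we shall introduce
a new family $\mathcal{O}_\kappa$ which contains domains with boundaries defined by a finite
number $d := d(\kappa)$ of parameters, e.g. splines. For $\omega_\kappa \in \mathcal{O}_\kappa$ given we define
a discretization of $(\hat{\mathcal{P}}(\omega_\kappa))$ as follows:

Find $(u_h, \lambda_{H_0}, \lambda_{H_\gamma}) \in V_h \times \Lambda_{H_0} \times \Lambda_{H_\gamma}$ s.t.

$$\begin{cases} a(u_h, v_h) - [\lambda_{H_0}, v_h]_0 - [\lambda_{H_\gamma}, v_h]_\gamma = (\hat f, v_h)_{0,\hat\Omega} & \forall v_h \in V_h, \\ [\mu_{H_0}, u_h]_0 + [\mu_{H_\gamma}, u_h]_\gamma = [\mu_{H_0}, \tilde g]_0 & \forall (\mu_{H_0}, \mu_{H_\gamma}) \in \Lambda_{H_0} \times \Lambda_{H_\gamma}. \end{cases} \qquad (\hat{\mathcal{P}}(\omega_\kappa))_h^H$$

Here $V_h$ is a discretization of $H_0^1(\hat\Omega)$, $\Lambda_{H_0}$, $\Lambda_{H_\gamma}$ are appropriate discretizations
of $H^{-1/2}(\Gamma_0)$, $H^{-1/2}(\Gamma_\gamma)$, respectively, $\tilde g$ is an approximation of $g$ and
$[\cdot, \cdot]_0$, $[\cdot, \cdot]_\gamma$ are approximations of the duality pairings $\langle \cdot, \cdot \rangle_{\Gamma_0}$, $\langle \cdot, \cdot \rangle_{\Gamma_\gamma}$, respectively.
The symbols $h$, $H_0$ and $H_\gamma$ denote the norms of the respective partitions used
for the construction of $V_h$, $\Lambda_{H_0}$, $\Lambda_{H_\gamma}$, respectively, and $H := (H_0, H_\gamma)$. In what
follows we shall suppose that the partition $\mathcal{T}_h$ of $\hat\Omega$ characterizing $V_h$ does not
depend on the geometry of $\omega_\kappa$.

To ensure the existence and uniqueness of solutions to $(\hat{\mathcal{P}}(\omega_\kappa))_h^H$ we shall
need the following stability condition:

For the detailed mathematical analysis of $(\hat{\mathcal{P}}(\omega_\kappa))_h^H$, in particular for the verification
of the LBB-condition, we refer to [6]. Having $\lambda_{H_\gamma}$ at our disposal we
define the discretization of $(\mathcal{A}(\omega_\kappa))$:

$$\text{Find } z_h := z_h(\lambda_{H_\gamma}) \in V_h \text{ such that } \quad (\nabla z_h, \nabla v_h)_{0,\hat\Omega} = [\lambda_{H_\gamma} - Q, v_h]_\gamma \quad \forall v_h \in V_h. \qquad (\mathcal{A}(\omega_\kappa))_h^H$$

Finally, the discretization of $(\tilde{\mathbb{P}})$ reads as follows:

$$\min_{\omega_\kappa \in \mathcal{O}_\kappa} \frac{1}{2} |z_h(\lambda_{H_\gamma})|^2_{1,\hat\Omega},$$

where $z_h(\lambda_{H_\gamma})$ solves $(\mathcal{A}(\omega_\kappa))_h^H$.

To see better the structure of the discrete problem we now present its
algebraic form. Any $\omega_\kappa \in \mathcal{O}_\kappa$ is characterized by a vector $\boldsymbol{\kappa} \in \mathbb{R}^d$ of discrete
design variables. The family $\mathcal{O}_\kappa$ can be identified with a compact subset $U \subseteq
\mathbb{R}^d$. For a given $\boldsymbol{\kappa} \in U$ we first solve the saddle-point problem:

Find $(u, \lambda_0, \lambda_\gamma) \in \mathbb{R}^n \times \mathbb{R}^{m_1} \times \mathbb{R}^{m_2}$ s.t.

$$\begin{cases} \mathbb{A} u - \mathbb{B}_0^T \lambda_0 - \mathbb{B}_\gamma^T(\boldsymbol{\kappa}) \lambda_\gamma = f(\boldsymbol{\kappa}), \\ \mathbb{B}_0 u = g, \quad \mathbb{B}_\gamma(\boldsymbol{\kappa}) u = 0, \end{cases} \qquad (\mathcal{P}(\boldsymbol{\kappa}))$$

then the linear system:

where $\mathbb{A}$ is the stiffness matrix and $\mathbb{B}_0$, $\mathbb{B}_\gamma(\boldsymbol{\kappa})$ are matrices coupling the primal
variable $u$ with the Lagrange multipliers $\lambda_0$, $\lambda_\gamma$, respectively. Notice that only
the matrix $\mathbb{B}_\gamma$ and the right hand side $f$ depend on $\boldsymbol{\kappa}$ but not $\mathbb{A}$! Therefore, $\mathbb{A}$
needs to be assembled only once. We finally arrive at the following non-linear
mathematical programming problem:

$(\tilde{\mathbb{P}})_\kappa$

where $z(\boldsymbol{\kappa})$ solves $(\mathcal{A}(\boldsymbol{\kappa}))$.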
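
The structural point of the algebraic form, namely that the stiffness matrix is assembled once while only the coupling matrix changes with the design, can be illustrated on a one-dimensional toy analogue. The sketch below is not the authors' solver: it assumes the fictitious domain (0, 1) with P1 elements, a zero right-hand side, and a single "boundary" constraint that pins the value 1 at one mesh node; all function names are hypothetical. The computed multiplier equals the jump of the derivative across the constrained node, in the spirit of Remark 4.

```python
def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def stiffness(n):
    """P1 stiffness matrix of -u'' on (0, 1) with n interior nodes."""
    h = 1.0 / (n + 1)
    return [[(2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0) / h
             for j in range(n)] for i in range(n)]


def state_solve(A, node, value=1.0):
    """Solve A u - B^T lam = 0, B u = value, where B picks a single node."""
    n = len(A)
    B = [1.0 if j == node else 0.0 for j in range(n)]
    K = [A[i] + [-B[i]] for i in range(n)] + [B + [0.0]]
    sol = solve(K, [0.0] * n + [value])
    return sol[:n], sol[n]


A = stiffness(7)            # assembled once, independent of the "design"
for node in (3, 1):         # two designs: constraint at x = 0.5 and x = 0.25
    u, lam = state_solve(A, node)
    print(node, lam)        # lam = 1/x + 1/(1 - x), the derivative jump
```

In a realistic code one would of course exploit the fixed matrix further, for instance by factorizing it once and reusing the factorization for every design; the dense elimination here only keeps the example self-contained.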


A traditional way of solving $(\tilde{\mathbb{P}})_\kappa$ is based on derivative information. It is
known from the theory that fictitious domain methods which use non-fitted
meshes may reduce the smoothness of the control-to-state mapping (see [11],
[12]). In addition, the classical gradient minimization techniques are local
methods. To obtain a global minimizer one should use global minimization
methods which do not need any gradient information. For our class of problems
we use a stochastic type method, namely the modified controlled random
search (MCRS) algorithm [14].
This algorithm uses ideas of the simplex method [16] and the CRS (Controlled
Random Search) algorithm [17]. It starts with a population $P$ of
$N$ points ($N \gg n$) which are chosen at random in the search space $X$
($n := \dim(X)$). A new trial point $x$ is generated from a simplex $S$ (a set
of $n + 1$ linearly independent points of a population $P$ in $X$) by the following
operation called reflection:

x = g - Y(z - g),
where $z$ is one (randomly chosen) vertex of the simplex $S$, $g$ is the center
of gravity of the remaining $n$ vertices of the simplex and $Y$ is a randomized
multiplicative factor. Thus the new point $x$ is obtained from the reflection of
the point $z$ with respect to $g$. Let $x_{max}$ be the point with the largest objective
function value among the $N$ points currently stored in $P$. If $f(x) < f(x_{max})$
then $x_{max} := x$, i.e. the worst point in the population is replaced by the new
trial point. The process continues until a stopping criterion is fulfilled.
The main modification of the original CRS algorithm consists in randomizing
the multiplicative factor $Y$. The best results in most tested examples were
obtained with $Y$ distributed uniformly in $[0, \alpha[$ with $\alpha$ ranging from 4 to 8 and
$N = \max(5n, n^2)$. For more details on this algorithm we refer to [14].
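
The reflection loop described above can be sketched in a few lines. This is a minimal illustration and not the implementation of [14]: the box-shaped search space, the rejection of trial points that leave the box, the fixed iteration budget used as stopping criterion, and all names (`mcrs`, `bounds`, `max_evals`) are assumptions made for the example, and the linear independence of the simplex vertices is not checked.

```python
import random


def mcrs(f, bounds, max_evals=2000, alpha=4.0, seed=0):
    """Minimal MCRS-style search on a box.

    Returns (x_best, f_best, f_start), where f_start is the best value
    found in the initial random population.
    """
    rng = random.Random(seed)
    n = len(bounds)
    size = max(5 * n, n * n)                      # N = max(5n, n^2)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(size)]
    vals = [f(x) for x in pop]
    f_start = min(vals)
    for _ in range(max_evals - size):             # fixed budget of trials
        simplex = rng.sample(range(size), n + 1)  # n + 1 distinct members
        z = pop[simplex[0]]                       # vertex to be reflected
        rest = [pop[i] for i in simplex[1:]]
        g = [sum(p[j] for p in rest) / n for j in range(n)]
        y = rng.uniform(0.0, alpha)               # randomized factor Y
        x = [g[j] - y * (z[j] - g[j]) for j in range(n)]
        if any(not lo <= xj <= hi for xj, (lo, hi) in zip(x, bounds)):
            continue                              # trial point left the box
        fx = f(x)
        worst = max(range(size), key=vals.__getitem__)
        if fx < vals[worst]:
            pop[worst], vals[worst] = x, fx       # replace the worst point
    best = min(range(size), key=vals.__getitem__)
    return pop[best], vals[best], f_start


def sphere(x):
    """Simple convex test objective."""
    return sum(v * v for v in x)


x_best, f_best, f_start = mcrs(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(f_start, f_best)
```

Because a trial point only ever replaces the current worst member, the best value in the population can never increase; in shape optimization each call of `f` would correspond to one solve of the state and transmission problems.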

3 Applications in free-boundary problems


In this section we show that the shape optimization problem $(\tilde{\mathbb{P}})$ with the
FD solver of the state problem represents an efficient computational tool for
solving a large class of free-boundary problems. Since free-boundary problems
are very well investigated from the theoretical point of view they may serve as
benchmarks for testing the reliability of the method. In what follows we use
our approach for solving exterior and interior Bernoulli free-boundary (BFB)
problems and a dam problem.

3.1 Bernoulli free boundary problems

We shall be concerned with the exterior as well as the interior BFB problem.
Unlike in the exterior BFB problem defined by (1), the unknown component
$\Gamma_f(\omega^*)$ of the boundary is interior to $\Gamma_0$ in the interior BFB problem and the
respective boundary conditions on $\Gamma_0$ and $\Gamma_f(\omega^*)$ read as follows (see [3]):

$$u^* = 0 \ \text{on } \Gamma_0, \qquad u^* = 1, \ \frac{\partial u^*}{\partial \nu} = Q \ \text{on } \Gamma_f(\omega^*), \qquad (13)$$

where $Q$ is a positive constant. Since the Dirichlet condition on $\Gamma_f(\omega^*)$ is
nonhomogeneous, one has to introduce the new variable $\tilde u := u - 1$ in order to
ensure that $\lambda_\gamma = \partial u/\partial\nu$ on $\Gamma_f(\omega^*)$.
The discrete family $\mathcal{O}_\kappa$ consists of all doubly connected domains $\omega_\kappa$ whose
variable component of the boundary is realized by piecewise second order
Bezier curves. The discretization parameter $\kappa$ is related to the number of
Bezier segments. The space $V_h \subset H_0^1(\hat\Omega)$ is realized by continuous, piecewise
linear functions over a uniform partition $\mathcal{T}_h$ of $\hat\Omega$. Further, $\Lambda_{H_0}$, $\Lambda_{H_\gamma}$ are spaces
of piecewise constant functions over partitions $\mathcal{T}_{H_0}$, $\mathcal{T}_{H_\gamma}$ of polygonal approximations
$\tilde\Gamma_0$, $\tilde\Gamma_\gamma$ of $\Gamma_0$, $\Gamma_\gamma$, respectively (see Fig. 2). Finally, the duality pairings
$[\cdot, \cdot]_0$, $[\cdot, \cdot]_\gamma$ are realized by the $L^2(\tilde\Gamma_0)$, $L^2(\tilde\Gamma_\gamma)$-scalar products, respectively. In
order to satisfy the LBB-condition, the partitions $\mathcal{T}_{H_0}$, $\mathcal{T}_{H_\gamma}$ are constructed in
such a way that $H_0 \ge 3h$, $H_\gamma \ge 3h$. For more details on the practical realization
we refer to [9], [11].

Fig. 2. FE partitions

Both external and internal problems were computed
with the following data: $\hat\Omega = (0, 10)^2$, $h = 10/64$, the number of Bezier segments
$\kappa = 10$ and $\Gamma_0$ is L-shaped. The examples are computed for different
values of the target $Q$ by using two cost functionals, namely:
(a) $J_1(\omega)$ as in the definition of $(\tilde{\mathbb{P}})$;

(b) $J_2(\omega) = \frac{1}{2} \|\lambda_{H_\gamma} - Q\|^2_{-1/2,\tilde\Gamma_\gamma}$, where

$$\|v\|_{-1/2,\tilde\Gamma_\gamma} = \|v \circ \tilde\gamma\|_{-1/2,[0,1]} \quad \text{and} \quad \|v\|^2_{-1/2,[0,1]} = \sum_{n=-\infty}^{\infty} \frac{|c_n|^2}{1 + |n|}.$$

Here $\tilde\gamma$ denotes a piecewise linear parametrization representing $\tilde\Gamma_\gamma$.
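
The series in variant (b) is straightforward to approximate once the coefficients are available. The sketch below assumes, as the formula suggests but the text does not state explicitly, that $c_n$ are the Fourier coefficients of the 1-periodic function $v \circ \tilde\gamma$ on $[0, 1]$; they are approximated by a discrete Fourier transform of equidistant samples, and the sum is truncated. The function names are hypothetical.

```python
import cmath
import math


def fourier_coefficients(samples):
    """Approximate c_n = int_0^1 v(t) exp(-2 pi i n t) dt from the samples
    v(j/N), j = 0..N-1, for the frequencies |n| <= (N - 1) // 2."""
    N = len(samples)
    half = (N - 1) // 2
    coeffs = {}
    for n in range(-half, half + 1):
        total = sum(v * cmath.exp(-2j * math.pi * n * j / N)
                    for j, v in enumerate(samples))
        coeffs[n] = total / N
    return coeffs


def dual_norm_squared(samples):
    """Truncated sum of |c_n|^2 / (1 + |n|)."""
    return sum(abs(c) ** 2 / (1 + abs(n))
               for n, c in fourier_coefficients(samples).items())


N = 64
v = [math.cos(2 * math.pi * j / N) for j in range(N)]
print(dual_norm_squared(v))   # c_1 = c_-1 = 1/2, so the sum is close to 1/4
```

For a piecewise constant multiplier one would sample it on the boundary partition; the direct summation above costs O(N^2) operations, which is harmless for the few dozen boundary segments used in the examples.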

The results obtained by the MCRS method after 2000 function evaluations are
shown in Figs. 3-4.
Fig. 3. Exterior BFB: variant (a), variant (b)

Fig. 4. Interior BFB: variant (a), variant (b)

3.2 A gradient approach

In this subsection we outline a gradient based approach to the optimal shape
design problem $(\mathbb{P})$ with the $H^{-1/2}(\Gamma_f)$-norm replaced by the $L^2(\Gamma_f)$-norm.
A rigorous foundation of the following arguments as well as a detailed numerical
analysis will be carried out in a future paper. For simplicity we consider
the following variant

$$\min_{\omega \in \mathcal{O}_\Pi} \frac{1}{2} \int_{\Gamma_f} \left| \frac{\partial u(\omega)}{\partial \nu} - Q \right|^2 d\sigma, \qquad (14)$$

where $\partial u(\omega)/\partial\nu$ denotes the outer normal derivative of the solution to

$$\begin{cases} -\Delta u = 0 & \text{in } \omega, \\ u = g & \text{on } \Gamma_0, \\ u = 0 & \text{on } \Gamma_f \end{cases} \qquad (15)$$

with $\Gamma_0$ and $\Gamma_f$ as in Section 2. We assume that the searched component of
the boundary $\Gamma_p := \Gamma_f$ of every admissible domain $\omega_p$, $p \in \Pi$, is the union
of a fixed number $k$ of adjoining arcs $\Gamma_{p,i}$, $\Gamma_p = \bigcup_{i=1}^k \Gamma_{p,i}$, such that each arc
$\Gamma_{p,i}$ can be described by a quadratic Bezier curve. More precisely, given an
ordered set of distinct control nodes $x_1, \ldots, x_k$, $x_i = (\xi_i, \eta_i)$, $i = 1, \ldots, k$,
the $i$-th Bezier arc $\Gamma_{p,i}$ is determined by the triple $(m_{i-1}, x_i, m_i)$ with $m_i =
\frac{1}{2}(x_i + x_{i+1})$, $i = 1, \ldots, k$. Here we set $x_{k+1} = x_0$ and $m_0 = m_k$. Thus $\Gamma_{p,i}$ can
be parametrized by

$$\gamma_i(t) = m_{i-1}(1 - t)^2 + 2 x_i t(1 - t) + m_i t^2, \quad t \in [0, 1].$$

Defining $B: [0, 1] \to \mathbb{R}^{1 \times 3}$ (a $(1 \times 3)$ matrix), $t \mapsto \frac{1}{2}[(1 - t)^2, \ 1 + 2t(1 - t), \ t^2]$
and

$$X_i = \begin{pmatrix} \xi_{i-1} & \eta_{i-1} \\ \xi_i & \eta_i \\ \xi_{i+1} & \eta_{i+1} \end{pmatrix}$$

this parametrization can be compactly written as

$$\gamma_i(t) = B(t) X_i, \quad t \in [0, 1], \ i = 1, \ldots, k.$$

Shifting the parameter interval

$$B_i(t) = B(t - (i - 1)), \qquad \gamma_i(t) = B_i(t) X_i, \quad t \in [i - 1, i], \ i = 1, \ldots, k,$$

one obtains a parametrization $\gamma: [0, k] \to \mathbb{R}^2$ of $\Gamma_p$ defined by $\gamma|_{[i-1,i]} = \gamma_i$.
Thus each index $p \in \mathbb{R}^{k \times 2}$ corresponds to an ordered list of coordinates of
control nodes and $\Pi \subset \mathbb{R}^{k \times 2}$ describes some a priori assumptions on the
location of the control nodes.
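
The compact form $\gamma_i(t) = B(t) X_i$ translates directly into code. The following sketch evaluates the closed piecewise quadratic Bezier curve from a list of control nodes; it uses 0-based arc indices and treats the node indices cyclically (an assumption made here so that the curve closes), and the function names are hypothetical.

```python
def midpoints(nodes):
    """m_i = (x_i + x_{i+1}) / 2 with cyclic indexing (closed curve)."""
    k = len(nodes)
    return [tuple(0.5 * (a + b) for a, b in zip(nodes[i], nodes[(i + 1) % k]))
            for i in range(k)]


def bezier_point(nodes, i, t):
    """Point gamma_i(t), t in [0, 1], on the i-th arc (0-based index i)."""
    k = len(nodes)
    prev_x, x, next_x = nodes[(i - 1) % k], nodes[i], nodes[(i + 1) % k]
    # Row vector B(t) = 1/2 [(1 - t)^2, 1 + 2t(1 - t), t^2]; entries sum to 1.
    basis = (0.5 * (1 - t) ** 2, 0.5 * (1 + 2 * t * (1 - t)), 0.5 * t ** 2)
    return tuple(basis[0] * p + basis[1] * c + basis[2] * q
                 for p, c, q in zip(prev_x, x, next_x))


def curve_point(nodes, s):
    """Point gamma(s) for the global parameter s in [0, k]."""
    k = len(nodes)
    i = min(int(s), k - 1)
    return bezier_point(nodes, i, s - i)


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(bezier_point(square, 0, 1.0), bezier_point(square, 1, 0.0))
```

Consecutive arcs meet at the midpoints $m_i$ with a common tangent, which is why this construction yields a $C^1$ closed curve without any additional constraints on the control nodes.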
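Let $J(p)$ denote the value of the cost functional defined in (14) computed
by solving (15) on $\omega_p$ corresponding to the configuration $p$ of the control nodes.
The Gateaux derivative of $J$ at $p$ in the direction $\delta p$ is defined by

$$J'(p)\delta p = \lim_{s \to 0+} \frac{1}{s} \big( J(p + s\,\delta p) - J(p) \big).$$

Proceeding formally one can derive the following representation

$$J'(p)\delta p = -\int_{\Gamma_p} \frac{\partial u}{\partial \nu} \frac{\partial \mu}{\partial \nu} \, \alpha \, d\sigma - \frac{1}{2} \int_{\Gamma_p} K \left( \left( \frac{\partial u}{\partial \nu} \right)^2 - Q^2 \right) \alpha \, d\sigma. \qquad (16)$$

Above $\alpha$ describes the normal component of the displacement of $\Gamma_p$ caused by
the perturbation $\delta p$ of the configuration $p$ and $K$ stands for the curvature of
$\Gamma_p$. The auxiliary variable $\mu$ is the solution of the adjoint equation

$$\begin{cases} -\Delta \mu = 0 & \text{in } \omega_p, \\ \mu = 0 & \text{on } \Gamma_0, \\ \mu = \dfrac{\partial u}{\partial \nu} - Q & \text{on } \Gamma_p \end{cases} \qquad (17)$$

and $u$ is the solution of (15) on $\omega_p$. Let us introduce the shorthand notation

$$J'(p)\delta p = \int_{\Gamma_p} \mathcal{K} \alpha \, d\sigma \qquad (18)$$

and let $\delta X_i \in \mathbb{R}^{3 \times 2}$ hold the perturbation of the coordinates of the nodes $x_{i-1}$,
$x_i$, $x_{i+1}$ specified by $\delta p$. If we insert

$$\alpha(t) = B_i(t)\, \delta X_i \, \nu(t), \quad t \in [i - 1, i], \ i = 1, \ldots, k,$$

into (18) we obtain

$$J'(p)\delta p = \sum_{i=1}^{k} \int_{i-1}^{i} \mathcal{K}(t) B_i(t)\, \delta X_i \, \nu(t) \, |\gamma'(t)| \, dt.$$

If we just perturb the $j$-th coordinate of the $l$-th control node by $\delta r$, $j = 1, 2$,
this will affect only the Bezier arcs $\Gamma_{p,l-1}$, $\Gamma_{p,l}$ and $\Gamma_{p,l+1}$ determined
by $(m_{l-2}, x_{l-1}, m_{l-1})$, $(m_{l-1}, x_l, m_l)$ and $(m_l, x_{l+1}, m_{l+1})$, respectively. This
results in

$$J'(p)\delta p = \frac{1}{2} \left[ \int_{l-2}^{l-1} \mathcal{K}(t)\, t^2 \nu_j(t) |\gamma'(t)| \, dt + \int_{l-1}^{l} \mathcal{K}(t) (1 + 2t(1 - t)) \nu_j(t) |\gamma'(t)| \, dt + \int_{l}^{l+1} \mathcal{K}(t) (1 - t)^2 \nu_j(t) |\gamma'(t)| \, dt \right] \delta r := g_{lj}\, \delta r \qquad (19)$$

with obvious modifications for $l = 1$ and $l = k$. Hence the gradient $J'(p)$ is
represented by the matrix $G \in \mathbb{R}^{k \times 2}$ with entries $G_{lj} = g_{lj}$ determined by
(19).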
We demonstrate the feasibility of this approach by applying this concept to
the Bernoulli problem described in Section 3.1. Figure 5.1 shows the computed
free boundaries (solid line) after 13, 12 and 19 optimization steps based on
the gradient information supplied by (19) for Q = -1.0, Q = -0.5 and Q =
-0.35, respectively. The dotted line indicates the initial configuration, stars
mark the initial and circles the final positions of the control nodes. For
comparison purposes we also show the free boundaries (dashed line) obtained
in Section 3.1 by a global method. Figure 5.2 illustrates the decrease of J on
a logarithmic scale.

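Fig. 5. Exterior BFB (gradient approach): 1. Found free boundaries, 2. Minimization history

3.3 A dam problem

We conclude this paper by solving a dam problem [4], [5]. The classical boundary
variation approach has been used in [1] and [7]. The vertical wall $\hat\Omega$ made
of a non-homogeneous material separates two water levels of height $y_1$, $y_2$. One
wants to find a curve separating the wet and dry part of $\hat\Omega$. The mathematical
model leads to the following free-boundary problem:

Find $\Omega_\varphi \subseteq \hat\Omega$ and $u: \Omega_\varphi \to \mathbb{R}^1$ such that

$$\begin{cases} -\operatorname{div}(k \nabla u) = 0 & \text{in } \Omega_\varphi, \\ u = y_i & \text{on } \Gamma_i, \ i = 1, 2, \\ k \dfrac{\partial u}{\partial \nu} = 0 & \text{on } \Gamma_0 \cup \Gamma_\varphi, \\ u = y & \text{on } \Gamma_\varphi \cup \Gamma_\alpha, \end{cases} \qquad (20)$$

where $k \in L^\infty(\hat\Omega)$, $k \ge k_0 > 0$ and the partition of $\partial\Omega_\varphi$ into $\Gamma_0$, $\Gamma_1$, $\Gamma_2$, $\Gamma_\varphi$
and $\Gamma_\alpha$ is shown in Fig. 6. The Neumann condition on $\Gamma_\varphi$ will be satisfied by
minimizing the cost functional $J(\varphi) = \frac{1}{2} \| k \, \partial u/\partial\nu \|^2_{-1/2,\Gamma_\varphi}$, whereas the rest of
system (20) defines a well-posed state problem $(\mathcal{P}(\varphi))$ for any $\varphi \in U_{ad}$. If $\hat\Omega$
is made of a homogeneous material, corresponding to a constant $k > 0$, $U_{ad}$
consists of smooth, concave and decreasing functions defined over $\Gamma_0$. After
substituting $\tilde u := u - y$ for given $\varphi \in U_{ad}$ we obtain the new state problem
with homogeneous Dirichlet data on $\Gamma_\varphi \cup \Gamma_\alpha$:

$$\begin{cases} -\operatorname{div}(k \nabla \tilde u) = \dfrac{\partial k}{\partial y} & \text{in } \Omega_\varphi, \\ \tilde u = y_i - y & \text{on } \Gamma_i, \ i = 1, 2, \\ \tilde u = 0 & \text{on } \Gamma_\varphi \cup \Gamma_\alpha, \\ \dfrac{\partial \tilde u}{\partial \nu} = -1 & \text{on } \Gamma_0. \end{cases} \qquad (\tilde{\mathcal{P}}(\varphi))$$

Fig. 6. Geometry of the dam problem

The FD formulation of $(\tilde{\mathcal{P}}(\varphi))$ is now straightforward:

Find $(\hat u, \lambda_\varphi) \in V_g \times H^{-1/2}(\Gamma_\varphi)$ such that

$$\begin{cases} \displaystyle\int_{\hat\Omega} \hat k \nabla \hat u \cdot \nabla v \, dx \, dy = -\int_{\hat\Omega} \hat k \frac{\partial v}{\partial y} \, dx \, dy + \langle \lambda_\varphi, v \rangle_{\Gamma_\varphi} & \forall v \in V_0, \\ \langle \mu, \hat u \rangle_{\Gamma_\varphi} = 0 & \forall \mu \in H^{-1/2}(\Gamma_\varphi), \end{cases} \qquad (\hat{\mathcal{P}}(\varphi))$$

where

$$V_g = \{ v \in H^1(\hat\Omega) \mid v = g \text{ on } \partial\hat\Omega \setminus \Gamma_0 \}, \qquad g = \begin{cases} y_i - y & \text{on } \Gamma_i, \ i = 1, 2, \\ 0 & \text{on } \partial\hat\Omega \setminus (\Gamma_1 \cup \Gamma_2 \cup \Gamma_0) \end{cases}$$

and $\hat k$ is the zero extension of $k$ from $\Omega_\varphi$ to $\hat\Omega$. The cost functional to be
minimized now takes the form
(21)

where $\nu_y$ is the $y$-component of $\nu$. In computations the cost functional (21)
is replaced by the $L^2(\Gamma_\varphi)$-norm. As before the spaces $V_g$ and $H^{-1/2}(\Gamma_\varphi)$ are
approximated by continuous, piecewise linear and piecewise constant functions,
respectively, and the unknown component $\Gamma_\varphi$ by piecewise, second order Bezier
curves. The example was solved with the following data: $\hat\Omega = (0, 1.62) \times (0, 4)$,
$h = 1/32$, $y_1 = 3.22$, $y_2 = 0.84$, $k = 1$ and the number of Bezier segments
$\kappa = 6$. The free boundary found by the MCRS method after 2000 function
evaluations is shown in Fig. 7.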

Fig. 7. Found free boundary

Conclusions
The variant of the FD method presented in this paper provides an efficient
computational tool for the numerical realization of shape optimization problems.
Its main features are the following:
- it enables us to solve state problems efficiently;
- the use of non-fitted meshes in discretized problems avoids remeshing of the
fictitious domain after any change of the shape of an optimized structure.
Consequently, the whole process is more "user friendly" compared with
the standard boundary variation technique;
- the (co)normal derivative of the state which appears in the shape derivative
of cost functionals is a part of the solution to the FD formulation;
- it can be utilized with minor changes for the numerical realization of a large
class of free-boundary problems.

Acknowledgement. Research was supported in part by the AKTION Czech Republic-Austria
under 36p6, by the grant IAA1075005 (J. Haslinger and T. Kozubek),
201/02/D102 (T. Kozubek) and MSM 113200007.

References
1. D. Begis and R. Glowinski (1975): Application de la méthode des éléments finis
à l'approximation d'un problème de domaine optimal. Méthodes de résolution
des problèmes approchés, Applied Mathematics & Optimization, 2, 130-169.
2. F. Brezzi and M. Fortin (1991): Mixed and hybrid finite element methods.
Springer-Verlag, New York.
3. M. Flucher and M. Rumpf (1997): Bernoulli's free-boundary problem, qualitative
theory and numerical approximation, J. Reine Angew. Math., 486, 165-204.
4. C. Baiocchi and A. Friedman (1977): A filtration problem in a porous medium
with variable permeability, Ann. Mat. Pura Appl., 114, 377-393.
5. M. Chipot (1984): Variational inequalities and flow in porous media, Springer-Verlag,
New York.
6. V. Girault and R. Glowinski (1995): Error analysis of a fictitious domain method
applied to a Dirichlet problem, Japan J. Indust. Appl. Math., 12, 487-514.
7. J. Haslinger, K.H. Hoffmann and R. Makinen (1993): Optimal control/dual ap-
proach for the numerical solution of a dam problem, Advances in Math. Sciences
and Appl., 2, 189-213.
8. J. Haslinger and A. Klarbring (1995): Fictitious domain / mixed finite element
approach for a class of optimal shape design problems. RAIRO M2 AN, 29, 435-
450.
9. J. Haslinger, T. Kozubek, K. Kunisch and G. Peichl (2003): Shape Optimiza-
tion and Fictitious Domain Approach for Solving Free Boundary Problems of
Bernoulli Type, COAP, 26, 231-251.
10. J. Haslinger, T. Kozubek, K. Kunisch and G. Peichl (2003): An embedding
domain approach for a class of 2-D shape optimization problems: mathematical
analysis, (to appear in Journal of Mathematical Analysis and Applications)
11. J. Haslinger and R.A.E. Makinen (2003): Introduction to Shape Optimization,
Theory, Approximation, and Computation. SIAM, Philadelphia.
12. J. Haslinger and P. Neittaanmaki (1996): Finite Element Approximation for
Optimal Shape, Material and Topology Design. Second Edition, J. Wiley & Sons,
Chichester.
13. R. Kress (1989): Linear Integral Equations, Springer-Verlag.
14. I. Křivý and J. Tvrdík (1995): The controlled random search algorithm in optimizing
regression models, Comput. Statist. and Data Anal., 20, pp. 229-234.
15. J. Nečas (1967): Les Méthodes Directes en Théorie des Équations Elliptiques,
Masson, Paris.
16. J.A. Nelder and R. Mead (1964): A simplex method for function minimization,
Computer J., 7, 308-313.
17. W.L. Price (1976): A controlled random search procedure for global optimisation,
Computer J., 20, 367-370.
18. J. Sokolowski and J.P. Zolesio (1992): Introduction to Shape Optimization,
Shape Sensitivity Analysis, Springer Series in Computational Mathematics, 16,
Springer-Verlag, Berlin.
Part II

Contributed Papers
Domain Decomposition Method for a Class of
Non-Linear Elliptic Equation with Arbitrary
Growth Nonlinearity and Data Measure

Nour Eddine Alaa 1 and Jean Rodolphe Roche 2

1 Département de Mathématiques et Informatique, Université des Sciences et
Techniques Cadi Ayyad, B.P. 618, Guéliz, Marrakech, Maroc,
alaa@fstg-marrakech.ac.ma
2 I.E.C.N., Université Henri Poincaré, B.P. 239, 54506 Vandoeuvre les Nancy,
France, [email protected]

Summary. In this paper we show the existence of weak solutions for a nonlinear
elliptic equation with arbitrary growth of the nonlinearity and measure data.
A numerical algorithm to compute an approximation of the weak solution is
described and analysed. In a first step a super-solution is computed using a domain
decomposition method. A numerical example is presented and discussed.

1 Introduction

This work deals with weak solutions of the following quasi-linear elliptic problem
with Dirichlet boundary conditions:

$$
\begin{cases}
-u''(t) + G(t, u'(t)) = F(t, u(t)) + f(t) & \text{in } (0,1), \\
u(0) = u(1) = 0,
\end{cases} \tag{1}
$$

where $G, F : [0,1] \times \mathbb{R} \to [0, +\infty[$ are measurable and continuous with respect
to $u'$ and $u$, and $f$ is a given finite nonnegative measure on $(0,1)$.
The main goal is to present a numerical analysis of these weak solutions
and to study their existence and uniqueness. Such problems arise from biological,
chemical and physical systems and various methods have been proposed
to study existence, uniqueness, qualitative properties and numerical simulation
of such solutions (see [8]). When $f$ is regular, it is proved in [9] that
if (1) has a nonnegative super-solution in $W_0^{1,\infty}$ then (1) has a solution in
$W_0^{1,\infty} \cap W^{2,p}$. Many authors dealt with this problem when $f$ is irregular and
$G$ is sub-quadratic with respect to $u'$, namely:

$$|G(t, r)| \le c\,(g(t) + |r|^2), \quad g \in L^1(0,1), \quad c > 0. \tag{2}$$

They showed that, if $G$ satisfies (2), (1) has a solution $u \in H_0^1(0,1)$ provided
that (1) has a super-solution in $W^{1,\infty}(0,1)$, see [6], [5]. The case where the
super-solution itself is irregular has been treated in [2]: if the super-solution
belongs to $H_0^1(0,1)$ then (1) has a solution in $H_0^1(0,1)$ provided that $G$ satisfies
(2).
In this work we are particularly interested in situations where $f$ is irregular
and where the growths of $G$ with respect to $u'$ and $F$ with respect to $u$ are
arbitrary. Let us be more precise about the model problem:

$$
\begin{cases}
-u''(t) + |u'(t)|^q = |u(t)|^p + f(t) & \text{in } (0,1), \\
u(0) = u(1) = 0,
\end{cases} \tag{3}
$$

where $p, q \ge 1$ and $f \in M_B^+(0,1)$, the set of nonnegative finite measures on
$(0,1)$. We show here that if the semi-linear problem:

$$
\begin{cases}
-w''(t) = |w(t)|^p + f(t) & \text{in } (0,1), \\
w(0) = w(1) = 0,
\end{cases} \tag{4}
$$

has a solution then (3) has a solution. Note that no restrictions on
$p$ and $q$ are imposed. For an elegant study of (4), see the work of Pierre and
Baras [4]. If $w'(0) = +\infty$ or $w'(1) = -\infty$ then $w \notin W_0^{1,\infty}$ and obviously
the classical approach fails to provide existence of solutions of (3) and new
techniques have to be used.
Another approach studied here is the numerical approximation of the so-
lution of problem (1). In this approach the most important difficulties are to
determine the uniqueness and the blowup of the solution.
The general algorithm for the numerical approximation of this equation is
the application of the Newton method to the discrete version of problem (1):

$$\text{Find } U \in \mathbb{R}^m \text{ such that } AU = H(U), \tag{5}$$

where $A$ is a sparse matrix and $H : \mathbb{R}^m \to \mathbb{R}^m$ is a nonlinear operator.
The Newton algorithm is given by:

$$
\begin{cases}
\text{choose } U^0 \text{ in a neighbourhood of the solution and solve until convergence} \\
(A - H'(U^k) I_d)\,(U^{k+1} - U^k) = -A U^k + H(U^k),
\end{cases} \tag{6}
$$

where $H'(U^k)$ is the Jacobian matrix of the operator $H$ evaluated at $U^k$ and
$I_d$ is the identity matrix in $\mathbb{R}^m$. This method converges quadratically when it
converges. Convergence depends in particular on the choice of $U^0$ and on the
existence and uniqueness of solutions of the linear system (6). In the case of
problem (1) the matrix $A - H'(U^k) I_d$ is often singular.
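
The Newton iteration (6) can be illustrated on a small example. The following Python sketch is not the authors' implementation. It assumes a centered finite-difference matrix for $-u''$ on a uniform grid, a nonlinearity acting componentwise (so that the Jacobian of $H$ is diagonal), and the smooth test data $H(U) = U^2 + 1$, i.e. problem (4) with $p = 2$ and a regular $f = 1$, for which the Newton matrix stays non-singular. The names `dirichlet_laplacian` and `newton_semilinear` are hypothetical.

```python
import numpy as np

def dirichlet_laplacian(m):
    """Centered-difference matrix of -u'' on (0,1) with m interior nodes (assumed discretization)."""
    h = 1.0 / (m + 1)
    return (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2

def newton_semilinear(A, H, dH, U0, tol=1e-10, max_iter=50):
    """Newton iteration for A U = H(U); H acts componentwise, dH returns its derivative."""
    U = U0.copy()
    for k in range(max_iter):
        residual = -A @ U + H(U)             # right-hand side of (6)
        if np.linalg.norm(residual, np.inf) < tol:
            return U, k
        J = A - np.diag(dH(U))               # Newton matrix of (6) for a diagonal Jacobian
        U = U + np.linalg.solve(J, residual)
    raise RuntimeError("Newton iteration did not converge")

# Test data (assumption): -u'' = u^2 + 1 with homogeneous Dirichlet conditions.
A = dirichlet_laplacian(99)
U, iters = newton_semilinear(A, lambda U: U**2 + 1.0, lambda U: 2.0 * U, np.zeros(99))
```

For this benign right-hand side the iteration started from $U^0 = 0$ converges in a few steps. With measure data or a larger initial guess the matrix `J` can become singular, which is the difficulty addressed by the domain decomposition below.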
To overcome this difficulty we introduce a domain decomposition to compute
an approximation of $\delta U^k = U^{k+1} - U^k$ by solving a sequence of
problems of type (1) in subsets $D_i$ of $(0,1)$, such that $D = \bigcup_{i=1,K} D_i$. The idea
of the method comes from the following remark [11]:

Lemma 1. Let $0 \le a < b \le 1$ and $a_i \in L^\infty(0,1)$ for $i = 1, 2$. If $|b - a|$ is small
enough then the operator $-\frac{d^2}{dt^2} - a_1(t)\frac{d}{dt} - a_2(t) I_d$ has an inverse in $(a, b)$.


We have organized this paper in the following manner. In section 2 we give
the exact setting of the problem, we present an approximate equation for (1)
and we prove that the existence of weak super-solutions implies the existence
of weak solutions, without any restriction on the growth of $G$ with respect to
$u'$; this result is a generalization of the classical results of [9], [6] and [2].
In section 3 we present an approximation scheme for problem (1) based on
the Schwarz overlapping domain decomposition method, combined with the finite
element method.
This work was supported by the French grant "Action Intégrée MA/02/33".

2 Mathematical analysis of the problem

Throughout this paper we suppose that $f$ is a nonnegative finite measure
on $(0,1)$ and $G, F : [0,1] \times \mathbb{R} \to [0, +\infty)$ are measurable.

The functions $r \mapsto G(t,r), F(t,r)$ are continuous for a.e. $t$, (7)
$F(t,\cdot)$ is nondecreasing and $G(t,\cdot)$ is convex, (8)
$\forall r \in \mathbb{R}$, $G(\cdot, r), F(\cdot, r)$ are integrable on $(0,1)$, (9)
$G(t,0) = \min\{G(t,r),\ r \in \mathbb{R}\} = 0$ and $F(t,0) = 0$. (10)

We introduce the notion of weak solution, super-solution and sub-solution used
here.

Definition 1. A function $u$ is said to be a weak solution of (1) if

$$
\begin{cases}
u \in W^{1,\infty}_{loc}(0,1) \cap C_0[0,1], \\
-u''(t) + G(t, u'(t)) = F(t, u(t)) + f \quad \text{in } \mathcal{D}'(0,1)
\end{cases} \tag{11}
$$

(replace $=$ in (11) by $\ge$ for a weak super-solution and by $\le$ for a weak sub-solution).

Remark 1. In (11), $u \in W^{1,\infty}_{loc}(0,1)$; using (9) we have $G(t, u'(t))$ and
$F(t, u(t)) \in L^1_{loc}(0,1)$. Hence every term in (11) makes sense.

This enables us to state the main result of this paper.

Theorem 1. Assume that (7)-(10) and $f \in M_B^+(0,1)$ hold. Assume that there
exists a weak solution $w$ to the problem

$$
\begin{cases}
w \in W^{1,\infty}_{loc}(0,1) \cap C_0[0,1], \\
-w'' = F(\cdot, w) + f \quad \text{in } \mathcal{D}'(0,1).
\end{cases} \tag{12}
$$

Then $w$ is a super-solution of (1) and there exists a weak solution $u$ of (1) such
that $u \le w$.

Remark 2. 1) It should be noted that there are no growth restrictions on the
lower order nonlinearities $F$ and $G$ w.r.t. $u$ and $u'$ respectively. Hence the
present theorem extends some results proposed in [2], [6].
2) For any finite nonnegative measure $f$, the problem:

$$
\begin{cases}
\underline{w} \in W_0^{1,\infty}(0,1), \\
-\underline{w}'' + G(t, \underline{w}') = f \quad \text{in } \mathcal{D}'(0,1)
\end{cases} \tag{13}
$$

has a unique solution $\underline{w}$, see [1], and note that $\underline{w}$ is a sub-solution of
the problem (11).
In order to prove Theorem 1, we consider $G_n(t,\cdot)$, for $n \ge 0$, the Yosida
approximation of $G(t,\cdot)$, which increases a.e. to $G(t,\cdot)$ as $n$ tends to infinity.
Note that $G_n(t,\cdot)$ satisfies (7)-(10) and
(14)
According to the result given in [1], [6] there exists a sequence $(u_n)$ of solutions
to the problem:

$$
\begin{cases}
u_{n+1} \in W_0^{1,\infty}(0,1), \\
-u_{n+1}'' + G_{n+1}(t, u_{n+1}') = F(u_n) + f \quad \text{in } \mathcal{D}'(0,1),
\end{cases} \tag{15}
$$

where $u_0 = w$. To have estimates when passing to the limit, we give the
following three lemmas, see [1].
Lemma 2. Let $a(t) \in L^1_{loc}(0,1)$, $v \in W^{1,\infty}_{loc}(0,1) \cap C_0[0,1]$ such that

$$
\begin{cases}
a(t)\, v'(t) \in L^1_{loc}(0,1), \\
-v'' - a v' \ge 0 \quad \text{in } \mathcal{D}'(0,1).
\end{cases} \tag{16}
$$

Then $v \ge 0$ in $[0,1]$.
Lemma 3. Let $u \in W^{1,\infty}_{loc}(0,1)$, $\underline{u}, v \in L^\infty(0,1)$ and $\mu \in M_B^+(0,1)$ such that:

$$
\begin{cases}
\underline{u} \le u \le v \quad \text{in } ]0,1[, \\
-u'' \le \mu \quad \text{in } \mathcal{D}'(0,1), \\
-v'' \ge \mu \quad \text{in } \mathcal{D}'(0,1).
\end{cases} \tag{17}
$$

Then $u \in W^{1,\infty}_{loc}(0,1)$, and

$$|u'(t)| \le \frac{1}{d(t;a,b)} \left( c(a,b) + \|\underline{u}\|_{L^\infty} + \|v\|_{L^\infty} + \|\mu\|_{M_B} \right) \tag{18}$$

for all $0 < a < b < 1$, where $d(t;a,b) = \min(b - t, t - a)$ and $c(a,b)$ is
a constant depending on $a$ and $b$.
Lemma 3 will provide $W^{1,\infty}_{loc}(0,1)$ estimates for the approximate solutions
$u_n$. But this estimate does not allow us to pass to the limit in the nonlinear
terms. We need the strong convergence of $u_n$ in $W^{1,\infty}_{loc}(0,1)$. We obtain this
result from the following lemma.

Lemma 4. Let $(u_n)_n \subset W_0^{1,\infty}(0,1)$ such that

$u_n \to u$ strongly in $L^\infty(0,1)$ (19)
Un,
3Q ::; U ::;
{ in V' (0, 1)
- U" ::; f1 (20)
w" ::; f1 in V' (0, 1)
Then u~ ---+ u' strongly in L Z;:C (0, 1)
Proof of Theorem 1. In a first step we have by direct application
of the maximum principle (see [2]) that

$$\underline{w} \le u_{n+1} \quad \text{for all } n \ge 0. \tag{21}$$

In a second step, since $F$ is monotone w.r.t. $r$, we can prove by induction that

$$u_{n+1} \le u_n \le w \quad \text{in } [0,1] \text{ for all } n \ge 0. \tag{22}$$

By Lemma 3, $u_n$ is bounded in $W^{1,\infty}_{loc}(0,1) \cap C_0[0,1]$ independently of $n$.
Therefore there exists a subsequence, still denoted by $(u_n)$ for simplicity,
which converges to $u$ strongly in $L^\infty(0,1)$ as $n \to \infty$. Also $u_{n+1}'$ converges
to $u'$ strongly in $L^1_{loc}(0,1)$ and a.e. in $(0,1)$. Then from lemma (3) we have
$u_{n+1}'$ converging to $u'$ strongly in $L^\infty_{loc}(0,1)$, and

Ilu~IIL~(a,b) ::; K(a,b)(c(a,b) + IlwIIL~(O,l) + IIIIIMB + 113QIIL~(O,l)) (23)


°
where K(a, b) = 1/T) and < T) < a < T)+b < l.

°
Since G(t,.) and F(t,.) are continuous with respect to the two last argu-
ments, we have for all < a < b < 1
G(t,u~+1)' F(t,u n ) converges to G(t,u') , F(t,u) a.e. t E (0,1). (24)
On the other hand, for a.e t E (a, b)
IG(t,u~+1(t)l::; max IG(t,r)1 = e(t) (25)
Irl :'0 c' (a,b)
and
max IF(t, s) 1= 8(t) (26)
Isl:'Omax( IlwIIL~(O,l)' 11!£IIL~(O,l»)

and e, 8 E Lfoc (0, 1) from (9). Using Lebesgue's dominate convergence Theo-
rem (see [7]), we also have;
G(t,u~+l)' F(t,u n ) converges to G(t,u'), F(t,u) in Ll(a,b) respectively
(27)
Now, we can pass to the limit in (15), and if 'P E V (0, 1) with sup 'P C [a, b]
then
°= n-+oolim (-u~+1 + G(u~+1) - F(u n ) , 'P) = (-u"
- F(u) , 'P) + G(u')
(28)
where $\langle \cdot, \cdot \rangle$ denotes the duality pairing between $\mathcal{D}'(0,1)$ and $\mathcal{D}(0,1)$. The theorem
follows.

3 Numerical method

In this section we present the numerical method to solve equation (1). Formally
the iterative method produces a sequence of numerical solutions of (15) in
$H_0^1(0,1)$ with a first guess which is a super-solution of (1), in our case a
solution of problem (12).
Then the algorithm can be formulated in the following way:
1) Find $w \in H_0^1(0,1)$ such that:

$$(P_1) \qquad -w'' \ge F(\cdot, w) + f \tag{29}$$

2) Given $u_0 = w$ we compute a sequence, solution in $H_0^1(0,1)$ of the nonlinear
equation:
(30)

Both problems $(P_1)$ and $(P_2)$ are nonlinear, and if $(P_1)$ has a solution,
we prove in Theorem 1 that the solution of $(P_2)$ is also a solution of
equation (1). Problem $(P_2)$ has a unique solution and its numerical computation
by the Newton method is straightforward. To solve the nonlinear equation $(P_1)$,
which presents some interesting difficulties, we construct a sequence $w^k$ such
that each $w^k$ is the solution of a linear problem and $w^k$ converges to $w$.
Let $w^0 = 0$; we define $w^{k+1} = w^k + \theta$ where $\theta$ is the solution of the
following linear problem:

$$
(P_3) \quad
\begin{cases}
-\theta'' - \dfrac{\partial F(w^k)}{\partial w}\,\theta = (w^k)'' + F(w^k) + f & \text{in } (0,1), \\
\theta(0) = \theta(1) = 0.
\end{cases} \tag{31}
$$

Then at each iteration we have to solve the linear problem $(P_3)$. To achieve
this, we consider a weak formulation of the problem and use the finite element
method.
To simplify the text we reformulate $(P_3)$ in the following way: find $v \in H_0^1(a,b)$
such that:

$$
(P_4) \quad
\begin{cases}
-v''(t) + c(t)\, v(t) = h(t) & \text{in } (a,b), \\
v(a) = v(b) = 0,
\end{cases} \tag{32}
$$

where $h \in M_B(a,b)$, the set of finite measures in $(a,b)$, and $c(t) \in L^2(a,b)$,
without restriction on its sign. We assume $c_\infty = \|c\|_{L^\infty(a,b)}$ is bounded.
According to Lemma 1, problem $(P_4)$ has a solution in a domain $(a,b)$
small enough.
If $V = H_0^1(0,1)$ then the weak formulation of $(P_4)$ is:

$$\text{find } v \in V: \quad a(v,w) = \int_a^b \left( v'w' + c(x)\, v w \right) dx = \int_a^b h\, w\, dx = (h, w) \quad \forall w \in V. \tag{33}$$

Thanks to the Poincaré inequality we have:

$$a(w,w) = (w',w') + (c(t)w, w) \ge \left( \frac{c_0}{|b-a|} - c_\infty \right)(w,w). \tag{34}$$

Thus the bilinear form $a(w,v)$ is coercive if $|b - a| < c_0 / c_\infty$.
This remark is of great interest, because it can be exploited to obtain
a numerical solution of $(P_4)$ using a domain decomposition technique. In other
words, this means that the domain partition should be determined by the
behavior of $\|\partial F(w^k)/\partial w\|_\infty$.
The aim of this paragraph is to introduce the Schwarz overlapping domain
decomposition method [10] applied to problem $(P_4)$. To simplify, without loss
of generality, we assume that we can consider a two-domain decomposition
$(a,b) = (a,\beta) \cup (\alpha,b)$ such that:

$$\alpha < \beta \quad \text{and} \quad (\beta - a),\ (b - \alpha) < \min\left( \frac{c_0}{c_\infty}, \frac{\pi}{2\sqrt{c_\infty}} \right). \tag{35}$$

Then, if $v^0$ is an initialization function, defined on $(a,b)$ and vanishing at $a$
and $b$, we define for $k \ge 0$ two sequences $v_i^k$, $i = 1, 2$, solving the following
problems:
(36)

and

$$
\begin{cases}
-(v_2^{k+1})''(t) + c(t)\, v_2^{k+1}(t) = h(t) & \text{in } (\alpha, b), \\
v_2^{k+1}(\alpha) = v_1^k(\alpha); \quad v_2^{k+1}(b) = 0.
\end{cases} \tag{37}
$$
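
The two-subdomain iteration can be sketched numerically. The Python code below is an illustration under stated assumptions, not the authors' finite element code: it uses centered finite differences instead of finite elements, a constant negative coefficient $c = -2$, the right-hand side $h = 1$, the interval $(a,b) = (0,1)$ and the overlap $(\alpha,\beta) = (0.4, 0.6)$. Each subdomain solve takes its interface value from the previous iterate of the other subdomain, as in (37). The function names are hypothetical.

```python
import numpy as np

def solve_dirichlet(t, c, h_rhs, left, right):
    """Solve -v'' + c v = h on the uniform grid t with v(t[0]) = left, v(t[-1]) = right."""
    n = len(t) - 2
    dt = t[1] - t[0]
    M = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / dt**2 + np.diag(c[1:-1])
    rhs = h_rhs[1:-1].copy()
    rhs[0] += left / dt**2           # boundary values enter the right-hand side
    rhs[-1] += right / dt**2
    v = np.empty(len(t))
    v[0], v[-1] = left, right
    v[1:-1] = np.linalg.solve(M, rhs)
    return v

def schwarz(t, c, h_rhs, ia, ib, max_iter=500, tol=1e-12):
    """Overlapping Schwarz iteration on (t[0], t[ib]) and (t[ia], t[-1]), with ia < ib."""
    v1 = np.zeros(ib + 1)            # iterate on (a, beta), zero initial guess
    v2 = np.zeros(len(t) - ia)       # iterate on (alpha, b), zero initial guess
    for k in range(max_iter):
        new1 = solve_dirichlet(t[:ib + 1], c[:ib + 1], h_rhs[:ib + 1], 0.0, v2[ib - ia])
        new2 = solve_dirichlet(t[ia:], c[ia:], h_rhs[ia:], v1[ia], 0.0)
        change = max(np.max(np.abs(new1 - v1)), np.max(np.abs(new2 - v2)))
        v1, v2 = new1, new2
        if change < tol:
            return v1, v2, k + 1
    return v1, v2, max_iter

# Assumed test data: c = -2 (negative sign allowed), h = 1, alpha = 0.4, beta = 0.6.
t = np.linspace(0.0, 1.0, 101)
c = np.full_like(t, -2.0)
h_rhs = np.ones_like(t)
ia, ib = 40, 60
v1, v2, sweeps = schwarz(t, c, h_rhs, ia, ib)
v_global = solve_dirichlet(t, c, h_rhs, 0.0, 0.0)   # reference: single-domain solve
```

On this example both subdomain iterates agree with the single-domain solution up to the stopping tolerance, and the number of sweeps grows as the overlap $\beta - \alpha$ shrinks.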
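Now to prove the convergence of the Schwarz overlapping domain decomposition
algorithm applied to problem $(P_4)$ we consider two problems:

$$
(P_{4.1}) \quad
\begin{cases}
-v_1''(t) + c(t)\, v_1(t) = h(t) & \text{in } (a, \beta), \\
v_1(a) = 0; \quad v_1(\beta) = v_2(\beta)
\end{cases} \tag{38}
$$

and:

$$
(P_{4.2}) \quad
\begin{cases}
-v_2''(t) + c(t)\, v_2(t) = h(t) & \text{in } (\alpha, b), \\
v_2(\alpha) = v_1(\alpha); \quad v_2(b) = 0.
\end{cases} \tag{39}
$$

Let $v$ be defined by $v = v_1$ in $(a,\beta)$ and $v = v_2$ in $(\alpha,b)$; then $v_1 = v_2$ in $(\alpha,\beta)$. We
suppose the existence of a solution of $(P_{4.1})$ in $C(a,\beta)$ and a solution of $(P_{4.2})$ in
$C(\alpha,b)$.
Theorem 2. Assume $a$, $b$, $\alpha$ and $\beta$ are such that $\alpha < \beta$ and $(\beta - a),\ (b - \alpha) <
\min\left( \frac{c_0}{c_\infty}, \frac{\pi}{2\sqrt{c_\infty}} \right)$. Then the sequence $v^k$ converges to $v$ in $C(a,\beta)$ and $C(\alpha,b)$.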

Proof:
Let $d^k = v_1^k - v$ in $(a,\beta)$ and $e^k = v_2^k - v$ in $(\alpha,b)$. We prove the following
inequality:

where $\gamma < 1$.
The difference $d^k$ satisfies the following equation.

(41)

and $e^k$ satisfies a similar equation in $(\alpha,b)$ with boundary conditions
$e^{k+1}(\alpha) = d^k(\alpha)$; $e^{k+1}(b) = 0$.
If we consider the following equation:

$$
(P_\varphi) \quad
\begin{cases}
-\varphi''(t) - c_\infty\, \varphi(t) = 0 & \text{in } (a,\beta), \\
\varphi(a) = 0; \quad \varphi(\beta) = |e^{k+1}(\beta)|,
\end{cases} \tag{42}
$$

then $\varphi(t) = |e^{k+1}(\beta)|\, \dfrac{\sin(\sqrt{c_\infty}\,(t-a))}{\sin(\sqrt{c_\infty}\,(\beta-a))}$. This solution is unique and positive
if $(\beta - a) < \frac{\pi}{2\sqrt{c_\infty}}$. If we consider the difference $z = \varphi - d^{k+2}$ it is easy to
prove that $z \ge 0$ and $d^{k+2} \le \varphi \le |e^{k+1}(\beta)|$. If now $z = \varphi + d^{k+2}$ we have
also $z \ge 0$ and $-|e^{k+1}(\beta)| \le -\varphi \le d^{k+2}(t)$, $\forall t$ in $(a,\beta)$. Then the inequality
$\|d^{k+2}\|_\infty \le |e^{k+1}(\beta)| \le \|e^{k+1}\|_\infty$ holds.
To prove that $|e^{k+1}(\beta)| \le \gamma\, \|d^k\|_\infty$ with $\gamma < 1$, we consider the equation:

$$
(P_\psi) \quad
\begin{cases}
-\psi''(t) - c_\infty\, \psi(t) = 0 & \text{in } (\alpha, b), \\
\psi(\alpha) = |d^k(\alpha)| \text{ and } \psi(b) = 0.
\end{cases} \tag{43}
$$

The solution of this equation is given by: $\psi(t) = |d^k(\alpha)|\, \dfrac{\sin(\sqrt{c_\infty}\,(b-t))}{\sin(\sqrt{c_\infty}\,(b-\alpha))}$. This
solution is positive if $(b - \alpha) < \frac{\pi}{2\sqrt{c_\infty}}$ and we have $\psi(t) \ge |e^{k+1}(t)|$ $\forall t \in (\alpha,b)$.
Then $|e^{k+1}(\beta)| \le \psi(\beta) \le \gamma\, |d^k(\alpha)|$ with $\gamma = \dfrac{\sin[\sqrt{c_\infty}\,(b-\beta)]}{\sin[\sqrt{c_\infty}\,(b-\alpha)]}$. The coefficient
$\gamma$ is smaller than 1 only if $\alpha < \beta$.
In conclusion we have $\|d^{k+2}\|_\infty < \|d^k\|_\infty$ if $\alpha < \beta$ and $(\beta - a),\ (b - \alpha) <
\min\left( \frac{c_0}{c_\infty}, \frac{\pi}{2\sqrt{c_\infty}} \right)$. In the same way we prove that $\|e^{k+2}\|_\infty < \|e^k\|_\infty$ if $\alpha < \beta$
and $(\beta - a),\ (b - \alpha) < \min\left( \frac{c_0}{c_\infty}, \frac{\pi}{2\sqrt{c_\infty}} \right)$.
We conclude that the Schwarz overlapping domain decomposition method
applied to problem $(P_4)$ converges.

4 Numerical Results

The algorithm introduced in the previous section has been implemented numerically
for the model problem (3) with $p = q = 3$ and $f(t)$ a Dirac mass at $1/3$.
To study the convergence history of the numerical simulation plotted in
Figure 1 we consider two steps. In the first step, where we compute a super-solution,
we observe the evolution of the number of sub-domains: it goes from
$m = 2$ sub-domains to $m = 10$ sub-domains in five iterations according to
criterion (35). The simulation stops after 17 iterations when the residual is of the
order $10^{-11}$.
In the second step, starting with the super-solution computed in the previous
step, we perform nine iterations of the Yosida approximation described in
section 2, and the simulation stops when the correction computed is of the
order $10^{-11}$ in the uniform norm.

Fig. 1. Example $f(t) = 8\,\delta_t$, $t = 1/3$, $m = 10$ (curves: computed solution and computed super-solution on $[0, 1]$)

References
1. Alaa, N. (1989): Etude d'équations elliptiques non-linéaires à dépendance convexe
en le gradient et à données mesures, Thèse de Doctorat, Université de Nancy I
2. Alaa, N., Pierre, M. (1993): Weak solution of some quasi-linear elliptic equations
with data measures, SIAM J. Math. Anal., 24, 23-35
3. Alaa, N., Iguernane, M. (2002): Weak periodic solutions of some quasi-linear
parabolic equations with data measure, J. Inequal. Pure Appl. Math. 3
4. Baras, P., Pierre, M. (1984): Criteres d'existence de solutions positives pour des
equations semi-lineaires, Annales Fourier Grenoble, 24, 1985-2006
5. Bensoussan, A., Boccardo, L., Murat, F. (1988): On a nonlinear partial differen-
tial equation having natural growth terms and unbounded solution, Ann. Inst.
Henri Poincare, 5, 347-364
6. Boccardo, L., Murat, F., Puel, J.P. (1989): Existence results for some quasi-linear
parabolic equations, Nonlinear Analysis Theory Method and Applications, 13,
373-392
7. Brezis, H. (1983): Analyse fonctionnelle theorie et applications, Masson
8. Levin, S.A., Hallam, Th. G., Gross, L.J. (1989): Applied Mathematical Ecology,
Biomathematics 18, Springer Verlag
9. Lions, P. L. (1980): Résolution de problèmes elliptiques quasilinéaires, Arch. Rational
Mech. Anal., 74, 335-353
10. Quarteroni, A.,Valli, A.(1999): Domain decomposition Methods for Partial Dif-
ferential Equations, Oxford Science Publications
11. Witomski, P. (1983): Sur la resolution numerique de quelques problemes non-
lineaires, These d'Etat, Universite Scientifique et medicale de Grenoble
Variants of Relaxation Schemes and the Lattice
Boltzmann Model Relaxation Systems

Mapundi Kondwani Banda 1

Darmstadt University of Technology, Schlossgartenstr. 7, D-64289 Darmstadt


[email protected]

Summary. In the low Mach number limit of the Lattice Boltzmann type models
one obtains the incompressible Navier-Stokes equation. This is achieved by asymp-
totic analysis. Moreover in the course of this analysis, the Lattice Boltzmann Model
reduces to a relaxation system which can be discretized using relaxation schemes.
We present two variants of the relaxation schemes characterized by local approxima-
tion of characteristic speeds and a multidimensional flux approximation. These are
applied to relaxation systems. Their performance will be discussed with reference to
test cases of isothermal incompressible flow.

1 Introduction

Many kinetic equations or discrete velocity models of kinetic equations yield
in the limit for small Knudsen and Mach numbers an approximation of the
incompressible Navier-Stokes (INS) equations. A classical example is given
by the discrete velocity models used for Lattice-Boltzmann methods, see [1].
These discrete velocity models can be viewed as relaxation systems for the INS
equations.
Relaxation type schemes have been used successfully to discretize such
relaxation systems. In particular, a large number of numerical methods for
kinetic equations with stiff relaxation terms have been considered in fluid dy-
namic or diffusive limits. For these relaxation methods and asymptotic pre-
serving methods, we refer to [2, 3] and for more general applications of relax-
ation schemes to the recent review paper [4]. Such a multiscale based approach
provides an alternative to understanding the numerical transition from kinetic
models to the continuum models. This can be used as a platform for developing
alternative numerical schemes for INS. In the context of hyperbolic conserva-
tion laws relaxation schemes provide efficient high resolution and Riemann
solver free numerical methods. Here we introduce an HLL type of relaxation
scheme and apply it in the context of INS. Further by choosing a different
method for computing cell averages a scheme that can be considered multidi-
mensional is realized. This paper follows closely the work presented in [6].

2 Lattice-Boltzmann type Discrete Velocity Models and Simplified Relaxation Systems

2.1 The Lattice-Boltzmann moment system

Consider kinetic equations with a diffusive scaling with a small parameter, $\epsilon$,
together with a rescaling of velocity. This scaling describes the small Knudsen
and low Mach number limit of kinetic equations, see [7] for details. Under these
transformations, we obtain

(1)

which describes the evolution of a particle density $f(\mathbf{x}, \mathbf{v}, t)$ with $\mathbf{x} = (x, y) \in \mathbb{R}^2$
and $\mathbf{v} = (v_1, v_2) \in \mathbb{R}^2$. For discrete velocity models in two dimensions
(2-D) consider a model with nine velocities ($N = 9$). In the discrete case,
the $\mathbf{v}$-dependence of the particle distribution $f(\mathbf{x}, \mathbf{v}, t)$ is uniquely determined
through 9 functions $f_i(\mathbf{x}, t) = f(\mathbf{x}, \mathbf{c}_i, t)$, $i = 0, \dots, 8$, where $\mathbf{c}_i$ are the discrete
velocities.
A discrete moment $M$ of order $m \in \mathbb{N}$ of $f$ is defined by $M(t, \mathbf{x}) =
\langle f(\mathbf{x}, \mathbf{v}, t), P(\mathbf{v}) \rangle_v$, where $P$ is a $\mathbf{v}$-polynomial of degree $m \in \mathbb{N}$. In the following
we denote the components of the velocity by $\mathbf{u} = (u, v)$. The scalar product
is defined as $\langle g_1, g_2 \rangle_v = \sum_v g_1(v)\, g_2(v)$. Equation (1) is transformed into an
equivalent set of moment equations (see also [8, 9] for a similar approach)
using moments based on $\mathbf{v}$-polynomials [10]. The mass and momentum density
are given by the zeroth and first order moments of $f$, $\langle f, 1 \rangle_v = \rho$ and
$\langle f, v_1 \rangle_v = \rho u$, $\langle f, v_2 \rangle_v = \rho v$. The second order moments form a symmetric
tensor, $\Theta = (\Theta^x\ \Theta^y)$, and the remaining moments are set to $q$ and $s$. The
equations of mass and momentum conservation are
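$$\partial_t \rho + \operatorname{div} \rho \mathbf{u} = 0, \qquad \partial_t \rho \mathbf{u} + \operatorname{div} \Theta + \frac{1}{3\epsilon^{2\alpha}} \nabla \rho = 0. \tag{2}$$

Here, the divergence is applied to the rows of $\Theta$. The equation for $\Theta$ is

$$\partial_t \Theta + \frac{2}{3\epsilon^{2\alpha}} S[\rho \mathbf{u}] + \frac{1}{3} Q[q] = -\frac{1}{\epsilon^{1+\alpha}\tau} (\Theta - \rho\, \mathbf{u} \otimes \mathbf{u}), \tag{3}$$

where

$$S[\mathbf{u}] = \frac{1}{2} \begin{pmatrix} 2\partial_x u & \partial_y u + \partial_x v \\ \partial_y u + \partial_x v & 2\partial_y v \end{pmatrix}.$$

Since $Q[q]$, $q$ and $s$ are not needed to derive INS, they will be ignored.
From the momentum equation in (2) one can deduce that as $\epsilon \to 0$, $\rho$
approaches a constant $\bar{\rho}$ and can be written as $\rho = \bar{\rho}(1 + 3\epsilon^{2\alpha} p)$ to obtain $O(1)$
terms. Hence

$$\partial_t p + \frac{1}{3\epsilon^{2\alpha}} \operatorname{div} \mathbf{u} = -\operatorname{div} p \mathbf{u}. \tag{4}$$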
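
For $\epsilon \to 0$, equation (3) yields at the lowest order

(5)

Since $2 \operatorname{div} S[\mathbf{u}] = (\Delta + \nabla \operatorname{div})\mathbf{u}$, we obtain from (4) and (5) the INS equations
as limiting system for $\alpha = 1$:

$$\operatorname{div} \mathbf{u} = 0, \qquad \partial_t \mathbf{u} + \operatorname{div} \mathbf{u} \otimes \mathbf{u} + \nabla p = \frac{\tau}{3} \Delta \mathbf{u}, \tag{6}$$

where the Reynolds number is related to the relaxation time by $Re = 3/\tau$.
For $0 < \alpha < 1$, we obtain the incompressible Euler equations

$$\operatorname{div} \mathbf{u} = 0, \qquad \partial_t \mathbf{u} + \operatorname{div} (\mathbf{u} \otimes \mathbf{u}) + \nabla p = 0.$$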

2.2 Simplified Relaxation Systems

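By neglecting the lower order terms in the above equations (3) and (4) and
setting $\rho \equiv 1$, we can introduce a simplified relaxation system:

$$\partial_t p + \frac{1}{\epsilon^{2\alpha}} \operatorname{div} \mathbf{u} = 0, \qquad \partial_t \mathbf{u} + \operatorname{div} \Theta + \nabla p = 0,$$
$$\partial_t \Theta + \nabla a[\mathbf{u}] + \frac{2}{\epsilon^{2\alpha}} S_\epsilon[\mathbf{u}] = -\frac{1}{\epsilon^{1+\alpha}\tau} (\Theta - \mathbf{u} \otimes \mathbf{u}), \tag{7}$$

where

$$S_\epsilon[\mathbf{u}] = S[\mathbf{u}] - \frac{\epsilon^{2\alpha}}{2} \nabla a[\mathbf{u}].$$

We have added and subtracted the term

with $\mathbf{a} = (a, b)^T \in \mathbb{R}^2$. Obviously the limit equations for this system are again
the INS equations with Reynolds number $Re = 1/\tau$.
Considering the nonstiff advection parts in (7) separately for $\mathbf{u}$ and $\Theta$, we
obtain a hyperbolic system with characteristic speeds $\pm a$ and $\pm b$ in x and y
directions:
$\partial_t \mathbf{u} + \operatorname{div} \Theta = 0$, (8)
As we will see in the last section, $a$ is chosen depending on the local speed. In
the x-direction (8) can be written as: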

(9)

A variant of this scheme incorporates a simple linear HLL-type Riemann solver.
To incorporate the HLL type formulation, equation (9) is written as:

I
(a- + a+)I )(u)=o
8 X x
(10)

where $a^-$ and $a^+$ are determined by some algorithm [17]. The submatrices 0
and I are 2 x 2 zero and identity matrices. Observe that if $a^- = -a^+$ the
formulation (9) is obtained.

3 Numerical Schemes

3.1 Space Discretizations

To discretize the equations in space a uniform grid in x and y with grid points
$(x_i, y_j)$ and spacing $h$ is used. Consider the linear system (8). For the x-direction
$\Theta^x \pm a\mathbf{u}$ are the characteristic variables associated with the characteristic
speeds $\pm a$, while for the y-direction the characteristic variables associated
with the characteristic speeds $\pm b$ are $\Theta^y \pm b\mathbf{u}$. Similarly the characteristic
variables associated with the x-direction for the HLL type scheme in equation
(10) are

w+ = 2a+ (8 X
(a+ - a-)
- a-u) and W- = (a:~:-) (8 X - a+u). (11)

The values of the characteristic variables will be determined at cell edges


as in [2]. Hence the numerical fluxes at cell boundaries are:

8f+1/2j = ~ ( 8fj + 8f+1j) - ~ ( Ui+lj - Uij)

+~ (a;j(8 + au) -
X
a;+1j (8 X - au)),

Ui+1/2j = ~ ( Uij + Ui+1j) - 21a ( 8f+1j - 8fj)

+ L (a;j (8 X + au) + a;+lj (8 X


- au))

in the case of second order method. If minmod slope limiting is applied then
atj are given by atj(z) = ~minmod( Zij - Zi-1j, Zi+1j - Zij)'
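
The minmod limiter used in these slopes can be sketched as follows. This is an illustrative Python fragment rather than the paper's code: it works on a periodic one-dimensional array, and it returns the unscaled limited difference, since the scaling prefactor of the slope formula is not legible in the source. The names `minmod` and `limited_differences` are hypothetical.

```python
import numpy as np

def minmod(a, b):
    """Return the argument of smaller magnitude where a and b share a sign, else 0."""
    return np.where(a * b > 0.0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def limited_differences(z):
    """minmod(z_i - z_{i-1}, z_{i+1} - z_i) for a periodic 1-D array z (no scaling applied)."""
    backward = z - np.roll(z, 1)     # z_i - z_{i-1}
    forward = np.roll(z, -1) - z     # z_{i+1} - z_i
    return minmod(backward, forward)

pairs = minmod(np.array([1.0, -1.0, 2.0, -3.0]), np.array([2.0, 1.0, 0.5, -1.0]))
slopes = limited_differences(np.array([0.0, 1.0, 3.0, 3.0, 1.0]))
```

At local extrema the two one-sided differences have opposite signs (or one vanishes), so the limited difference is zero and the reconstruction falls back to first order there.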
For the HLL type scheme the numerical fluxes at cell edges are
We denote by Fh, F~ the discretization of the convective parts div E> and
\7a[u] in equation (7), respectively. Using the numerical fluxes above one ob-
tains

Fh(E>, u) = ~ (E>f+1/2j - E>f-1/2j) + ~ (E>;J+1/2 - E>;j_1/2)'


2
Fh(E>, u) = (1h (a 2Ui+1/2,j - a2 ) 'h1(b2Uij+1/2 - b2
Ui-1/2j )).
Uij-1/2

And for the HLL type scheme we get the following convective term:

F~(E>, u) = (~( -a- a+Ui+1/2,j + a- a+ui- 1/ 2j)


+~((a- +a+)E>f+1/2,j - (a- + a+)E>f_1/2j) ,
~ ( -b-b+Uij+1/2 + b-b+Uij_1/2)
+~ ((b- + b+)E>;J+1/2 - (b- + b+)E>;j_1/2) ).
To obtain a multidimensional scheme the computation of the cell averages is
modified. To take advantages of diagonal points of the cell the trapezoidal
approximation is used. The cell averages are thus defined as:

E>ij = ~ (t(u~E) + f(u~W) + f(u~E) + f(u~W)) (12)

h N E(NW) =Pij (xi±~'YJ+~ ) , u SE(SW) =Pij (xi±~'Yj_~ ) an d Pij ( X,Y ) =


u ij
were ij
Zij + (Zx)ij(X - Xi) + (Zy)ij(Y - Yj). The slopes (ZX)ij and (ZY)ij are (at least
first order) approximations to derivatives Zx and ZY' respectively. The flux is
the equilibrium flux (5), f(u) = u Q9 u - (2E 1- a T)/3S[u] at E ----7 0.
Denote the discrete gradient by Gh and the discrete divergence by Dh
which are given by second order centered differences. Second order centered
difference approximations of Sf and S are denoted by S1. and Sh. Hence we
obtain the spatial discretization:
. 1
P + -Dh
E2a
.U= °'
u + Fh(E>, u) + GhP = 0, (13)

e + F~(E>, u) + E;a S1.(u) = - E1 ;a T (E> - u Q9 u)


or equivalently

Dh . GhP - 2E2"'p = -Dh . Fh(e, u), u + Fh(e, u) + GhP = 0,


<3 + F~(e, u) = - E1;"'T (e - u 0 u + 2E 1-"'TSi,(U)).
A corresponding high order upwind based space discretization for the INS
equations is obtained considering the limit of the above discretization as E -+ 0

3.2 Time Discretizations

To treat only the limit equations ($\epsilon = 0$) any explicit high order Runge-Kutta
method [11] can be used in combination with a Poisson solver and the limiting
(relaxed) spatial discretization.
Further one would want to discretize the relaxation system (13) for all
ranges of the parameter $\epsilon$. This allows the study of the numerical passage
from the Boltzmann to the INS regime. An implicit-explicit (IMEX) Runge-Kutta
method of the type recently developed in [12] is used as suggested in
[6]. We denote the time step by $k$ and use superscript $n$ to denote the time
iterations. For the second order time discretization a two stage IMEX Runge-Kutta
method [12] which guarantees second order accuracy in the stiff limit is
chosen. For $\epsilon \to 0$ this scheme suggests a formulation for a second order time
discretization of the INS equations, i.e. the projection is taken at every step:
Step 1:

Llpn+1/2 = k~ div (un - k"diVe n ),

Un+1/2 = un _ k,,( dive n + \7pn+1/2) ,


e n+1/2 = u n +1/ 2 0 u n +1/2 - 2E 1-"'TS[Un +1/2].

Step 2:

Llpn+l = k1" div (un - k (5 div en + (1-5) div e n+1/2 + (1_,,)\7pn+1/ 2) ),


u n+1 = un _ k (5 div en + (1 - 5)e n+1/2 + (1 - ,,)\7pn+1/2 + ,,\7pn+1) ,
e n +1 = u n +1 0 u n +1 - 2E 1-"'TS[Un +1].

with" = 1- V2/2 and 5 = 1-1/2", which gives a scheme for the incompress-
ible Euler equations (0 < a < 1) and INS equations for a = 1.

Guided by the above formulation we obtain the second-order modified
Runge-Kutta scheme for INS:
Step 1:

Step 2:

Llpn+l = ~ div (un - ~ ( div en + div e n+l/ 2 ) ) ,

un+1 = un _ ~(diven + diven+1/2)_~\7pn+l.

In addition, we do not have to compute $\Theta$ at every time step unless it is
needed. We rather compute $\operatorname{div} \Theta$ instead. This reduces the number of variables.
The usual hyperbolic and parabolic CFL conditions have to be fulfilled
to guarantee stability.

4 Numerical Examples and Results

The cases for E = 0 will be tested in this section. The space discretization in
equation (13) will be applied. For slope limiting the van Leer limiter is used.
Example 1: Taylor-Vortex Flow - Accuracy test
Relaxation schemes will be first tested on incompressible Euler equations,
i.e. 1.1 = T = 0.0 augmented with smooth periodic initial data. The test admits
the following exact solution [13]:

u(x,y,t) = -COS(27rX) sin(27rY) exp(-2vt);


(14)
v(x, y, t) = sin(27rx) cos(27rY) exp( -21.1t); x, y E [0,1].

Taking $h = 1/N$ with $N = 32, 64, 128$, and $256$, the solution is computed up
to $t = 2.0$ and the $L^1$ and $L^2$ norms of the errors and convergence rates of $u$
(the velocity component in the x-direction) are listed in Table 1 below.
The following schemes have been tested: $\mathcal{R}^{\varepsilon=0,2}_{\mathrm{TVD}}$ (second-order relaxed
scheme with TVD time integration); $\mathcal{R}^{\varepsilon=0,2}_{\mathrm{DIRK}}$ (second-order relaxed scheme with
DIRK time integration).
All examples use a CFL number of 0.475 based on the local flow velocity.
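The Rate columns of the tables are consistent with the usual estimate of the observed order from two errors on successively halved grids. A small sketch, assuming this definition (the function name is hypothetical), applied to the L1 errors 2.99588e-3 and 6.16432e-4 reported for the TVD scheme in Table 1:

```python
import math


def convergence_rate(err_coarse, err_fine, refinement=2.0):
    """Observed order of accuracy from errors on two grids with h_coarse / h_fine = refinement."""
    return math.log(err_coarse / err_fine) / math.log(refinement)


# L1 errors of the TVD scheme taken from Table 1 (rows N = 64 and N = 128).
rate = convergence_rate(2.99588e-3, 6.16432e-4)
print(rate)
```

The result agrees with the tabulated rate of 2.281 to three decimal places.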

Example 2: Travelling wave - accuracy test


This numerical test was used by Minion and Brown in [14]. The computational
domain of unit length is doubly periodic. The exact solution of the
Navier-Stokes equations for this problem is:

Table 1. Error norms for the incompressible Euler problem with the initial condition
in (14) at t = 2.0 using relaxation schemes, ν = 0.0.

N     Scheme            L1-error          Rate     L2-error          Rate
64    R^{ε=0,2}_TVD     2.99588 · 10^-3   2.5683   4.10701 · 10^-3   2.4753
64    R^{ε=0,2}_DIRK    2.80480 · 10^-3   2.5351   3.91299 · 10^-3   2.4382
128   R^{ε=0,2}_TVD     6.16432 · 10^-4   2.281    9.55745 · 10^-4   2.1034
128   R^{ε=0,2}_DIRK    5.97678 · 10^-4   2.2305   9.28992 · 10^-4   2.0745
256   R^{ε=0,2}_TVD     1.13089 · 10^-4   2.4465   1.8214 · 10^-4    2.3916
256   R^{ε=0,2}_DIRK    1.05389 · 10^-4   2.5036   1.64960 · 10^-4   2.4935

\[
\begin{aligned}
u(x,y,t) &= 1 + 2\cos(2\pi(x-t))\sin(2\pi(y-t))\exp(-8\pi^2\nu t),\\
v(x,y,t) &= 1 - 2\sin(2\pi(x-t))\cos(2\pi(y-t))\exp(-8\pi^2\nu t),\\
p(x,y,t) &= -\left(\cos(4\pi(x-t)) + \cos(4\pi(y-t))\right)\exp(-16\pi^2\nu t).
\end{aligned}
\tag{15}
\]

Numerical results are presented in Table 2.

Table 2. Error norms for the incompressible Euler problem with the initial condition
in (15) at t = 0.7 using relaxation schemes, ν = 0.0.

N     Scheme            L1-error          Rate     L2-error          Rate
64    R^{ε=0,2}_TVD     1.3222 · 10^-2    1.4691   1.60313 · 10^-2   1.5942
64    R^{ε=0,2}_DIRK    1.29641 · 10^-2   1.5573   1.57197 · 10^-2   1.5656
128   R^{ε=0,2}_TVD     2.63745 · 10^-3   2.3257   3.26585 · 10^-3   2.2954
128   R^{ε=0,2}_DIRK    2.59158 · 10^-3   2.3226   3.20928 · 10^-3   2.2923
256   R^{ε=0,2}_TVD     3.18838 · 10^-4   3.0482   4.2203 · 10^-4    2.952
256   R^{ε=0,2}_DIRK    3.1598 · 10^-4    3.0359   4.18859 · 10^-4   2.9377

Example 3: Doubly Periodic Shear Layer


This test was introduced by Bell, Colella and Glaz in [15]. In the periodic,
two-dimensional computational domain of size [1 x 1], the following velocity
fields are generated as initial conditions:

\[
u(x,y,0) = \begin{cases} \tanh(\rho(y - 1/4)), & y \le 1/2,\\ \tanh(\rho(3/4 - y)), & y > 1/2, \end{cases}
\qquad v(x,y,0) = \delta\sin(2\pi x),
\]
where $\rho$ is the shear layer width parameter and $\delta$ is the strength of the initial
perturbation. The strength coefficient $\delta = 0.05$ is kept unchanged. The
vorticity profiles are displayed using 20 equidistant contours.
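The initial data of the shear layer test are simple to set up. The following sketch evaluates them at a single point; the function name is hypothetical, the width parameter is written rho and the perturbation strength delta, and the default rho = 30 is the thick-layer value quoted in the caption of Fig. 1.

```python
import math


def shear_layer_initial(x, y, rho=30.0, delta=0.05):
    """Initial velocity (u, v) of the doubly periodic shear layer on the unit square."""
    if y <= 0.5:
        u = math.tanh(rho * (y - 0.25))
    else:
        u = math.tanh(rho * (0.75 - y))
    v = delta * math.sin(2.0 * math.pi * x)
    return u, v


print(shear_layer_initial(0.25, 0.25))
```

The two branches of u agree at y = 1/2, so the profile is continuous; a larger rho gives a thinner layer.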
In Figure 1, vorticity profiles of solutions obtained by applying different
schemes without recourse to slope limiters are shown.

Fig. 1. Thick shear layer ($\rho = 30$) results for the Euler case $\varepsilon = 0$, vorticity at
T = 1.2, N = 128: staggered central scheme [16] (a); Godunov-based scheme [15] (b);
relaxation scheme with TVD time integration (c); and relaxation scheme with relaxed
DIRK time integration (d).

Further, the grid was refined to N = 256. The same computation
was repeated with the van Leer limiter applied to the relaxation-based
schemes. The results are shown in Figure 2. In Figure 3 a cut of the v velocity at
x = 0.5 is shown.

Fig. 2. Comparison of thick shear layer results for the incompressible Euler case,
multidimensional vs. dimension-by-dimension approach, vorticity at T = 1.2, N = 256:
(a) second-order DIRK, (b) second-order TVD, (c) second-order multidimensional with
DIRK, (d) second-order multidimensional with TVD.

The relaxation system was also tested on the thin shear layer problem, $\rho = 80$.
The Navier-Stokes equation was considered for t = 1.0 using a grid
of N = 256. A comparison of the time evolution of total kinetic energy and
enstrophy in the incompressible Euler equation up to time t = 2.0 was made.
Figure 4 presents the decay of the total kinetic energy of the
flow and the history of the mean enstrophy.
In all three examples we observe that the relaxed schemes ($\varepsilon = 0$)
perform reasonably well in spite of their simplicity. Although the DIRK
formulation suggests the projection structure, the direct TVD formulation gives better
qualitative results. The DIRK formulation tends to be more dissipative.
Nevertheless, both tend to achieve their expected convergence rate. Further we
Fig. 3. The cut at x = 0.5 of v at t = 1.2 for the thick shear layer problem
(N = 128, second-order schemes) computed with the staggered central scheme
('Central'), the Godunov projection method ('Centered'), the relaxation-based scheme
with DIRK time integration ('Centered DIRK') and the relaxation-based scheme with
TVD time integration ('Centered TVD') (left). The cut at x = 0.5 of v at t = 1.2 for
the thick shear layer problem (N = 256, DIRK time integration) computed with the
Godunov projection method ('Centered'), the relaxation-based scheme ('2nd Order')
and the multidimensional relaxation-based scheme ('2nd Order Mult.') (right).

Fig. 4. Comparison of total kinetic energy (a) and enstrophy (b) over time up to
t = 2.0, Re = 10000, for the thin shear layer problem (incompressible Navier-Stokes
equation) computed with the Godunov projection method ('Centered'), the
multidimensional relaxation-based scheme with TVD time integration ('Centered TVD')
and the multidimensional relaxation-based scheme with DIRK time integration
('Centered DIRK').

Fig. 5. Comparison of thin shear layer results for the Navier-Stokes case, vorticity at
T = 1.0, N = 256, Re = 10,000: (a) second-order multidimensional DIRK scheme,
(b) second-order multidimensional TVD scheme (contour levels from -59.9293 to
59.9293).

observed that for the DIRK scheme there is no significant difference between
the multidimensional and the dimension-by-dimension scheme. A comparison with
other schemes shows that relaxation schemes are less dissipative than central
schemes, while on some occasions they are very close to the Godunov scheme.
The resolution of the solution for thin shear layer problems shows that the
scheme has a lot of potential for improvement. The implementation of the
HLL formulation and relaxing schemes ($\varepsilon \neq 0$) is underway to investigate how
this can be achieved.
Acknowledgement: This work was supported by a DFG Grant KL 1105/9.

References

1. Chen, S., Doolen, G.D. (1998): Lattice Boltzmann Method for Fluid Flows. Ann.
Rev. Fluid Mech., 30, 329-364
2. Jin, S., Xin, Z. (1995): The Relaxation Schemes for Systems of Conservation Laws
in Arbitrary Space Dimensions. Comm. Pure Appl. Math., 48, 235-277
3. Klar, A. (1998): An Asymptotic-Induced Scheme for Nonstationary Transport
Equations in the Diffusive Limit. SIAM J. Num. Anal., 35, 1073-1094
4. Naldi, G., Pareschi, L., Toscani, G. (to appear): Relaxation schemes for PDEs and
applications to second and fourth order degenerate diffusion problems. Surveys
in Mathematics for Industry.
5. LeVeque, R.J., Pelanti, M. (2001): A Class of Approximate Riemann Solvers and
their Relation to Relaxation Schemes. J. Comp. Phys., 172, 572-591
6. Banda, M.K., Klar, A., Pareschi, L., Seaïd, M. (2001): Lattice-Boltzmann type
relaxation systems and high order relaxation schemes for the incompressible
Navier-Stokes equation. Preprint.
7. De Masi, A., Esposito, R., Lebowitz, J.L. (1989): Incompressible Navier Stokes
and Euler Limits of the Boltzmann Equation. Comm. Pure Appl. Math., 42, 1189
8. Klar, A. (1999): Relaxation Schemes for a Lattice Boltzmann type discrete
velocity model and numerical Navier Stokes limit. J. Comp. Phys., 148, 1-17
9. d'Humières, D. (1992): Generalized Lattice-Boltzmann Equations. In: AIAA
Rarefied Gas Dynamics: Theory and Applications, Progress in Astronautics and
Aeronautics, 159, 450-458
10. Giraud, L., d'Humières, D., Lallemand, P. (1998): A lattice Boltzmann model
for Jeffreys viscoelastic fluid. Europhys. Lett., 42, 625-630
11. Shu, C.W., Osher, S. (1988): Efficient implementation of essentially
non-oscillatory shock-capturing schemes. J. Comp. Phys., 77, 439-471
12. Ascher, U., Ruuth, S., Spiteri, R. (1997): Implicit-explicit Runge-Kutta methods
for time-dependent partial differential equations. Appl. Numer. Math., 25,
151-167
13. Chorin, A.J. (1968): Numerical Solution of the Navier-Stokes equations. Math.
Comp., 22, 745-762
14. Brown, D., Minion, M. (1997): Performance of under-resolved two-dimensional
incompressible flow simulations, II. J. Comp. Phys., 138, 734-765
15. Bell, J.B., Colella, P., Glaz, H.M. (1989): A second order projection method for
incompressible Navier-Stokes equations. J. Comp. Phys., 85(2), 257-283
16. Kupferman, R., Tadmor, E. (1997): A fast high-resolution second-order central
scheme for incompressible flows. Proc. Nat. Acad. Sci., 94, 4848
17. Toro, E.F. (1997): Riemann Solvers and numerical methods for fluid dynamics.
Springer, Berlin
A Time Semi-Implicit Relaxation Scheme for
Two-Phase Flows in Pipelines

Michael Baudin¹, Frédéric Coquel² and Quang-Huy Tran¹

¹ Institut Français du Pétrole, Rueil-Malmaison, France, [email protected]
² Laboratoire Jacques-Louis Lions, Université Pierre et Marie Curie, Paris, France

Summary. The aim of this paper is to present a relaxation scheme designed to
approximate the solutions of the system of conservation laws which arises in the
modeling of two-phase flows in oil and gas pipelines. The main idea is to relax only
the highly non-linear closure laws, thanks to a Lagrangian change of coordinates.
By construction, all the fields of the nonlinear hyperbolic relaxation system are
linearly degenerate. In addition to the simplicity of the evaluation of the flux function,
the relaxation coefficients can be easily devised so as to guarantee the positivity of
the mass fractions. We propose to use an integration in time which is explicit with
respect to the small eigenvalues and linearly implicit with respect to the large
eigenvalues. In the first part, we construct a second-order explicit relaxation scheme based
on a Godunov method. In the second part, we present the semi-implicit relaxation
scheme. A "stiff" numerical simulation of an industrial case is shown.

Introduction

The aim of this paper is to present a relaxation scheme designed to approximate
the solutions of the system of conservation laws which arises in the
modeling of two-phase flows in oil and gas pipelines [10]. The system, made up
of three conservation laws, is a drift-flux type model and is closed by two
thermodynamic and hydrodynamic models. The complexity of these models makes
classical numerical schemes such as Godunov or Roe schemes very difficult to
use. Our feeling is that only a "rough" scheme would be able to successfully
meet the challenge of nonlinearities.
An essential property of our system is that it possesses fast characteristic
speeds (acoustic waves) and slow ones (corresponding to the mass transport).
From the engineer's standpoint, the crucial aspect is the petroleum transport
and not the acoustics. That is why we will never use an explicit scheme, the
time step of which is limited by the CFL stability condition: the time required
for the simulation would be prohibitive due to the large characteristic speeds.
Inspired by [7], I. Faille and E. Heintze have proposed in [6] a VFRoe-type
scheme which is sufficiently rough (diffusive) to handle very stiff industrial
cases. The scheme shown in [6] is linearly implicit with respect to the large
eigenvalues and explicit with respect to the small eigenvalues. Therefore this
semi-implicit scheme will combine accuracy on the waves propagating at the
small velocities and reduced CPU-time because the CFL time step limitation
only applies to small eigenvalues.
But the VFRoe scheme also suffers from a few drawbacks. The first one
is that the scheme is still CPU-time consuming because it needs to compute
numerically the eigenvalues of the Jacobian matrix of the system. The second
one is that these eigenvalues are not necessarily real, that is, the system is
not always hyperbolic. It is observed in [4] that the system is hyperbolic only
if the slip between the phases is not too large.
This is why the authors of [2] designed an explicit relaxation scheme. The
approach of [2] is close in spirit to Jin & Xin's [11] but differs in the fact that
only the genuine non-linearities are relaxed, namely the pressure and the
hydrodynamic laws. In a first step, the genuine nonlinearities are exhibited by means
of a Lagrangian change of coordinates. Then, these are relaxed. Finally, the
equations are brought back to the Eulerian frame. The resulting relaxation
system is automatically hyperbolic and has all its fields linearly degenerate.
This scheme is less CPU-time consuming than the VFRoe-type one because
the most complex (algorithmically speaking) step is the computation of only
two relaxation coefficients. The relaxation coefficients can be easily devised so
as to guarantee two properties: first, the stability of the first order asymptotic
system computed thanks to the Chapman-Enskog expansion and, second, the
positivity of the mass fractions. As a first attempt to design a fast and rough
numerical scheme for two-phase flows, these results were encouraging.
The goal of this paper is to work out a semi-implicit version of the explicit
scheme [2]. In essence, the extension is possible because of the linear degeneracy
property of the relaxation system.
The paper is organized as follows. In section 1, we introduce the two-phase
flow model together with the boundary conditions and the characteristics of
this model. In section 2, we develop the second order explicit relaxation scheme
with the computation of new relaxation coefficients and boundary conditions.
In section 3, we present the semi-implicit scheme and section 4 is devoted to
the numerical results.
Most of the results presented here are extracted from [1], [2] and [3].

1 Two-phase flow model

1.1 Equations

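In the flow, the mixture is characterized by its density $\rho$, its velocity $v$ and its
gas (resp. liquid) mass fraction $Y$ (resp. $X = 1 - Y$). The model is governed
by the following system of conservation laws:

\[
\begin{aligned}
\partial_t \rho + \partial_x(\rho v) &= 0,\\
\partial_t(\rho v) + \partial_x\left(\rho v^2 + P(u)\right) &= S(u),\\
\partial_t(\rho Y) + \partial_x\left(\rho Y v - \sigma(u)\right) &= 0,
\end{aligned}
\tag{1}
\]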
for all $x \in \mathbb{R}$ and $t \ge 0$, where the unknown is $u := (\rho, \rho v, \rho Y)$. Here the source
term $S$ includes gravity and wall friction terms, which are given functions of
the unknown. The following two functions of the unknown $u$,

\[
\sigma(u) = \rho Y(1-Y)\,\Phi(u), \qquad P(u) = p(u) + \rho Y(1-Y)\,\Phi(u)^2, \tag{2}
\]

are highly non-linear closure laws. The natural phase space associated with
such variables then reads:

\[
\Omega = \left\{ u = (\rho, \rho v, \rho Y) \in \mathbb{R}^3;\ \rho > 0,\ v \in \mathbb{R},\ Y \in [0,1] \right\}.
\]

For the pressure $p$, we consider a perfect gas and a compressible
liquid. The pressure law can be put in the general form $p = p(\rho, \rho Y)$ and
is a smooth function. We consider a general algebraic hydrodynamic law of the
type

\[
v_L - v_G = \Phi(u), \quad \forall u \in \Omega, \tag{3}
\]

in order to close (1). In (3), the mapping $\Phi$ is assumed to be smooth enough.
In practical situations, $\Phi$ turns out to be nonlinear in the unknown $u$ (see [4],
[12] for instance).
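The closure laws (2) are cheap to evaluate once the pressure and the slip are known. A minimal sketch, assuming the caller supplies the values of p(u) and of the slip function from whatever thermodynamic and hydrodynamic models are in use; the function name and the numbers in the example are hypothetical.

```python
def closure_terms(rho, Y, phi, p):
    """Return (sigma, P) from (2), given density rho, gas mass fraction Y,
    slip value phi = v_L - v_G and pressure p."""
    weight = rho * Y * (1.0 - Y)
    sigma = weight * phi
    P = p + weight * phi ** 2
    return sigma, P


# Illustrative values only.
print(closure_terms(rho=100.0, Y=0.5, phi=2.0, p=1.0e5))
# For a single-phase state (Y = 0 or Y = 1) the slip terms vanish.
print(closure_terms(rho=800.0, Y=0.0, phi=2.0, p=1.0e5))
```

The second call illustrates that in a single-phase state sigma is zero and P reduces to the thermodynamic pressure p.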

1.2 Boundary and initial conditions


At the inlet of the pipe (x = 0), the mass flowrates are given as functions of
time, i.e.
\[
(\rho v)_\alpha(0, t) = q_\alpha(t), \quad t \ge 0, \quad \alpha = L, G,
\]
where $(\rho v)_L$ (resp. $(\rho v)_G$) denotes the mass flux of the liquid (resp. the
gas). At the outlet of the pipe (x = L, the length of the pipeline), the pressure
is a given function of time, i.e.
\[
p(L, t) = p^{L}(t), \quad t \ge 0.
\]
We will treat cases in which the flow is induced only by variations of the
boundary conditions: in such experiments, the initial condition is the steady
state computed from the values of the boundary conditions at time t = 0.

1.3 Characteristics of the model


There is no analytical expression for the physical flux of the considered
system, except for very simple hydrodynamic laws. Therefore the eigenvalues of
the system (1) are not known in full generality. However, in most common
situations, i.e., for usual values of $u$, the system is hyperbolic and has three
real eigenvalues $\lambda_1 < \lambda_2 < \lambda_3$ with $\lambda_1 < 0$ and $\lambda_3 > 0$.
The interesting property of the system is that the eigenvalues $\lambda_{1,3}$
correspond to the acoustic waves (or "pressure waves") and propagate at fast
speeds thanks to the compressibility of the fluid. The eigenvalue $\lambda_2$, which has
a variable sign, corresponds to the kinematic waves and propagates at slow
speeds with the fluid. These properties imply that the large eigenvalues are
10-100 times bigger than the small eigenvalue, i.e. $|\lambda_{1,3}| \gg |\lambda_2|$.

2 Explicit relaxation scheme

2.1 The relaxation model

We relax all the genuine non-linearities of the equilibrium system (1) which
appear by means of a Lagrangian change of coordinates. Then, we introduce
two new state variables $\Sigma$ and $\Pi$ which are intended to coincide respectively
with $\sigma(u)$ and $P(u)$ in the limit of the relaxation parameter $\lambda \to \infty$. Finally, back in
Eulerian coordinates, we propose as a relaxation model the following system:

\[
\begin{aligned}
\partial_t \rho + \partial_x(\rho v) &= 0,\\
\partial_t(\rho v) + \partial_x(\rho v^2 + \Pi) &= S(u),\\
\partial_t(\rho \Pi) + \partial_x(\rho \Pi v + a^2 v) &= \lambda\rho\,(P(u) - \Pi),\\
\partial_t(\rho Y) + \partial_x(\rho Y v - \Sigma) &= 0,\\
\partial_t(\rho \Sigma) + \partial_x(\rho \Sigma v - b^2 Y) &= \lambda\rho\,(\sigma(u) - \Sigma),
\end{aligned}
\tag{4}
\]

where $a$ and $b$ are two real positive parameters that we call "relaxation
coefficients". The above relaxation system will be given hereafter the convenient
abstract form:

\[
\partial_t \mathbf{v} + \partial_x Q(\mathbf{v}) = \lambda R(\mathbf{v}) + S(\mathbf{v}), \quad t > 0,\ x \in \mathbb{R}, \tag{5}
\]

where the functions $Q$, $R$ and $S$ have clear definitions.


The first order system with no source term extracted from (4) admits five
real eigenvalues, $v$, $v \pm a\tau$, $v \pm b\tau$ ($\tau = 1/\rho$), and five linearly independent
corresponding eigenvectors. Consequently, the first order system with no source
term extracted from (4) is hyperbolic. Moreover, each eigenvalue is associated
with a linearly degenerate field.
Because of the specific values of the functions $P$ and $\sigma$, we always have
$a > b$. This is in harmony with the fact that $a$ is associated with pressure
waves and $b$ with kinematic waves.

2.2 First order explicit scheme

The pipeline is made of $I$ cells, denoted $(M_i)_{i=1,\dots,I}$. Let $x_i$ be the center of the
cell and $\Delta x$ its length. We also denote by $x_{i+1/2} = (x_i + x_{i+1})/2$ the interface
between two cells, by $x_{1/2} = 0$ the inlet boundary interface and by $x_{I+1/2} = L$
the outlet boundary interface. Let $\Delta t^n = t^{n+1} - t^n$ be the time step. Let
$u_i^n \approx u(x_i, t^n)$ be the discrete unknown. The numerical scheme is based on
the following splitting method.
1. Relaxation. We take $\lambda = \infty$ and solve the ODE system $\partial_t \mathbf{v} = \lambda R(\mathbf{v})$ by
projecting the variables on the equilibrium manifold, i.e. we set $\Pi_i^n := P(u_i^n)$
and $\Sigma_i^n := \sigma(u_i^n)$.
2. Evolution. We take $\lambda = 0$ and solve the system $\partial_t \mathbf{v} + \partial_x Q(\mathbf{v}) = S(\mathbf{v})$ over
one time step in order to reach the time $t = t^{n+1}$.

In the evolution step, we consider the system $\partial_t \mathbf{v} + \partial_x Q(\mathbf{v}) = S(\mathbf{v})$ that we
approximate by a classical Godunov finite volume scheme which is based on
the update $\mathbf{v}_i^{n+1} = \mathbf{v}_i^n - \frac{\Delta t}{\Delta x}\left(Q(\mathbf{v}^*_{i+1/2}) - Q(\mathbf{v}^*_{i-1/2})\right)$. The Riemann solution
$\mathbf{v}^*_{i+1/2}$ on the interface $i + 1/2$ is made of six constant states separated by five
contact discontinuities (see [2] for details). It is easily computed and leads to
a low cost numerical scheme.
The time step of the explicit relaxation scheme is limited by a CFL condition.
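The conservative update of the evolution step can be sketched for a scalar model problem. The code below is an illustration only: it does not implement the five-wave Riemann solver of [2], but uses upwind transport at constant speed as a stand-in interface flux, and all names are hypothetical.

```python
def finite_volume_step(cells, interface_flux, dt, dx, left_ghost, right_ghost):
    """One update v_i <- v_i - dt/dx * (F_{i+1/2} - F_{i-1/2}) on a uniform grid."""
    padded = [left_ghost] + list(cells) + [right_ghost]
    # fluxes[j] is the flux through the interface between padded[j] and padded[j + 1].
    fluxes = [interface_flux(padded[j], padded[j + 1]) for j in range(len(padded) - 1)]
    return [cells[i] - dt / dx * (fluxes[i + 1] - fluxes[i]) for i in range(len(cells))]


def upwind_flux(left, right, speed=1.0):
    """Stand-in flux: exact interface flux for linear advection at constant speed."""
    return speed * left if speed >= 0.0 else speed * right


new_cells = finite_volume_step([0.0, 1.0, 0.0, 0.0], upwind_flux,
                               dt=0.5, dx=1.0, left_ghost=0.0, right_ghost=0.0)
print(new_cells, sum(new_cells))
```

Because every interior interface flux is added to one cell and subtracted from its neighbour, the total of the cell values changes only through the boundary fluxes.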

2.3 Relaxation coefficients

We present in this section the computation of the relaxation coefficients $a$ and
$b$. We consider one local pair $(a_{i+1/2}, b_{i+1/2})$ on each interface $x_{i+1/2}$ in
order to minimize the numerical dissipation. The relaxation coefficients are
designed in order to ensure:
1. the stability of the first order asymptotic equilibrium system thanks to the
Chapman-Enskog expansion
\[
\Pi = P + \lambda^{-1}\Pi_1 + O(\lambda^{-2}). \tag{6}
\]
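The Chapman-Enskog analysis justifies the following choice:

\[
\begin{aligned}
a_{i+1/2} &= \sqrt{\max\left(A(u_L), A(u_R)\right)}, & A(u) &= -P_\tau(u) + P_v^2(u),\\
b_{i+1/2} &= \sqrt{\max\left(B(u_L), B(u_R)\right)}, & B(u) &= \sigma_Y^2(u).
\end{aligned}
\tag{7}
\]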
2. physical properties of the approximate solution.
It is possible to compute the relaxation coefficients $a$ and $b$ in order to
satisfy, at the discrete level, the following two basic physical properties:
the positivity of the density and, above all, the maximum principle on the
gas mass fraction, i.e., $Y_i^n \in [0,1]$. This is done by taking $a_{i+1/2}$ and $b_{i+1/2}$
large enough, which results in an increased, but well-adjusted, amount of
numerical dissipation.
Finally, on one interface $x_{i+1/2}$, each relaxation coefficient is the maximum of
the values required by conditions 1 and 2, denoted here by superscripts (1) and (2):
$a_{i+1/2} = \max(a^{(1)}_{i+1/2}, a^{(2)}_{i+1/2})$ and $b_{i+1/2} = \max(b^{(1)}_{i+1/2}, b^{(2)}_{i+1/2})$.
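The way the two requirements are combined on one interface can be sketched as follows. This is an assumption-laden illustration: the caller is supposed to have evaluated the stability bounds (the quantities written A or B in the text) on the left and right states, and `positivity_value` stands for whatever lower bound the positivity and maximum-principle analysis provides; both the function name and that parameter are hypothetical.

```python
import math


def interface_coefficient(bound_left, bound_right, positivity_value=0.0):
    """Relaxation coefficient on one interface.

    bound_left, bound_right: non-negative stability bounds on the two states.
    positivity_value: lower bound from the physical-property requirement.
    """
    stability_value = math.sqrt(max(bound_left, bound_right))
    return max(stability_value, positivity_value)


print(interface_coefficient(4.0, 9.0))
print(interface_coefficient(4.0, 9.0, positivity_value=5.0))
```

Taking the larger of the two values only ever increases the coefficient, which matches the remark that the physical properties are obtained at the price of some extra numerical dissipation.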

2.4 Second order in space and time

The scheme is extended to second-order accuracy in space by using the classical
MUSCL (Monotonic Upstream Scheme for Conservation Laws) technique
([8], [9]). Instead of taking a constant approximate solution on each cell, we
construct a linear approximation. The limited slopes are those of the "physical
variable" $(\rho, Y, v)$ rather than those of the conservative variable $u$ [6]. We
choose the classical minmod slope limiter.
The scheme is extended to second-order accuracy in time thanks to
a Runge-Kutta second-order procedure.
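The minmod-limited linear reconstruction can be sketched for one scalar variable on a uniform grid. In the scheme described above it would be applied to each physical variable separately; the function names here are hypothetical and boundary cells are simply skipped.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if a and b share a sign, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def muscl_interface_values(cells):
    """Left and right face values for each interior cell of a uniform grid."""
    faces = []
    for i in range(1, len(cells) - 1):
        slope = minmod(cells[i] - cells[i - 1], cells[i + 1] - cells[i])
        faces.append((cells[i] - 0.5 * slope, cells[i] + 0.5 * slope))
    return faces


print(muscl_interface_values([0.0, 1.0, 3.0]))
print(muscl_interface_values([0.0, 1.0, 0.0]))
```

At a local extremum the two one-sided differences have opposite signs, so the slope is set to zero and the reconstruction falls back to the first-order constant state.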

3 Semi-implicit relaxation scheme

3.1 First order linearly implicit scheme

The linearly implicit scheme is classically split into three steps.

1. In the physical step, one computes a "predictor" using the explicit scheme

Here $H$ is the numerical flux, based on a Godunov method.

2. In the mathematical step, one solves the linear equation in $\delta\mathbf{v}$

(9)

where $\alpha_{i-1}$, $\beta_i$ and $\gamma_{i+1}$ are 5 x 5 matrices involving partial derivatives of
the numerical flux.
This consists, at each time step, in solving a linear system $A\,\delta\mathbf{v} = b$ by,
for example, Gaussian elimination. The matrix $A$ is a block tridiagonal
matrix with each block of size 5 x 5: the resulting matrix is a band matrix
with 9 extra-diagonal terms.
3. Finally, we can update the conservative variable of the cell $i$ by
$\mathbf{v}_i^{n+1} = \mathbf{v}_i^n + \delta\mathbf{v}_i$.
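The band structure quoted in step 2 can be checked by brute force. The sketch below reads "9 extra-diagonal terms" as nine sub-diagonals and nine super-diagonals, which is an interpretation rather than something the text states explicitly; the function name is hypothetical.

```python
def half_bandwidth(block_size, n_cells):
    """Largest |row - col| over the nonzero pattern of a block tridiagonal matrix
    whose blocks are dense block_size x block_size matrices."""
    widest = 0
    for cell in range(n_cells):
        for neighbour in (cell - 1, cell, cell + 1):
            if 0 <= neighbour < n_cells:
                for r in range(block_size):
                    for c in range(block_size):
                        row = cell * block_size + r
                        col = neighbour * block_size + c
                        widest = max(widest, abs(row - col))
    return widest


print(half_bandwidth(5, 3))
```

For dense m x m blocks the half bandwidth is 2m - 1, so a smaller block size directly narrows the band that a banded solver has to store and factorize.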
Now, the remaining question is: how to construct the matrices $\alpha$, $\beta$ and
$\gamma$? It is easy (see [6]) to compute these matrices when the explicit scheme
is a Roe-type scheme. The main objective is therefore to put the explicit
relaxation scheme in Roe's form. Such a rewriting is based on a shock curve
decomposition and is possible when the solution of the Riemann problem is
made of shocks and contact discontinuities (which is the case for the relaxation
system). This computation is detailed in [1, 3].

3.2 First order semi-implicit relaxation scheme

The relaxation scheme is constructed so as to be:

- linearly implicit on the fast waves of speed $v \pm a\tau$ ("pressure waves"),
- explicit on the slow waves of speed $v$ ("kinematic waves") and the associated
relaxation waves of speed $v \pm b\tau$.
Accordingly, when one computes the partial derivatives of the numerical flux,
only the terms associated with the largest eigenvalues are kept: we nullify
the entries of the diagonal matrix involved in the computation of Roe's matrix.
The eigenvalues of the Jacobian matrices are modified in the same way. The
matrices $\alpha$, $\beta$ and $\gamma$ are then computed with these modified matrices.
Time step of the semi-implicit relaxation scheme. At each time step,
one computes two auxiliary time steps. The explicit time step is computed
on the basis of the small eigenvalues of the relaxation system with a CFL
number of 0.5. The linearly implicit time step is computed on the basis of
the large eigenvalues of the relaxation system with a CFL number of 20. The
final time step is the minimum of these two time steps. The CFL number
20, experimentally determined, turns out to be a good compromise between a
small CPU-time and an acceptable amount of smearing out of the numerical
solution.
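The time-step selection just described can be sketched for a single cell. This is a simplified illustration: in practice the maxima would be taken over all cells or interfaces, the wave speeds are assumed nonzero, and the function name and the numerical values in the example are hypothetical.

```python
def semi_implicit_time_step(dx, v, a, b, tau, cfl_explicit=0.5, cfl_implicit=20.0):
    """Time step from the slow (explicit) and fast (linearly implicit) wave speeds."""
    slow = max(abs(v), abs(v + b * tau), abs(v - b * tau))
    fast = max(abs(v + a * tau), abs(v - a * tau))
    dt_explicit = cfl_explicit * dx / slow
    dt_implicit = cfl_implicit * dx / fast
    return min(dt_explicit, dt_implicit)


# Illustrative values: dx in metres, speeds in m/s.
dt = semi_implicit_time_step(dx=1.6, v=1.0, a=300.0, b=2.0, tau=1.0)
print(dt)
```

With these numbers the fast waves still set the limit, but with a CFL number of 20 instead of 0.5; a fully explicit scheme would have to use 0.5 on the fast waves as well.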

Implicit projection. We follow the ideas of Chalons [5]. The use of a linearly
implicit scheme implies that the projection step gives steady states that
are not accurate. The solution is to link the evolutions of the relaxed variables
associated with an implicit field to the evolutions of the equilibrium variables.
This modifies the linear system to be solved in the mathematical step. It
enables us to reduce the size of each block from 5 x 5 to 4 x 4 and therefore
to reduce the size of the global linear system. Numerical experiments show that
not only the steady solutions are more accurate (as already shown in [5]), but
even transient solutions are more accurate.

3.3 Second order semi-implicit scheme

The slope limiters used to build the second order explicit scheme lead to a
non-differentiable expression. That is why we choose a simplified version in which
we do not differentiate the nonlinear operator involved in the second order
correction. In the physical step, we first compute the second order correction
of the states $u_{L,R}$ and evaluate the numerical flux. In the mathematical step,
we solve the first order linear system but the derivatives are computed with
the corrected states $u_{L,R}$.
In order to have second order accuracy in time, we again use the Runge-Kutta
2 procedure. Since the scheme is only semi-implicit, order 2 in time
is only achieved on the explicit waves, and the linearly implicit waves are solved
with order 1 in time.

4 Numerical results
In this section, we show the numerical results of the semi-implicit relaxation
scheme: we consider a real-life problem in which the solution is driven by the
changes of the boundary conditions.
The details of this test case are shown in Figure 1 and the results are given
in Figure 2.
In this experiment [6], the inlet gas flow rate is decreased from 0.114 to 0
kg/s. As the mass flowrates are small, the decrease in the inlet gas mass flow
rate gives rise to negative oil velocities in the upper part of the riser. Therefore,
Extracted from: [6]

Geometry of the pipe and discretization:
- Length: 80 m, vertical
- Diameter: 0.146 m
- Cell size: 1.6 m
Closure laws:
- Thermodynamic: compressible liquid, perfect gas
- Hydrodynamic: like Zuber-Findlay
Source terms: friction + gravity.
Boundary and initial conditions:
- Stabilization period: 10 s
- Transient period: 100 s
- Inlet condition on gas flowrate: 0.114 to 0 kg/s
- Inlet condition on liquid flowrate: 1.628 kg/s
- Outlet condition on pressure: 10^6 Pa

Fig. 1. Detail of the experiment, Zuber-Findlay-like law with boundary conditions

a void fraction front between a "single-phase gas" state and a "two-phase" state
propagates from the outlet down the pipe. Simultaneously, the change in the
inlet gas mass flowrate induces another void fraction front which propagates
from the inlet up the pipeline. These two fronts meet around x = 47 m at
time t = 260 s to form a unique discontinuity wave which propagates toward
the outlet of the pipe. At time t = 200 s, this discontinuity wave reaches the
outlet and the pipe turns to a single-phase liquid steady state.
This test case is quite stiff. During the simulation, the scheme must handle
two-phase states, liquid states and gas states. Moreover, the discontinuity
propagating at the end of the simulation is between two one-phase states.
The classical VFRoe scheme is not rough enough to handle this case,
and that is why the authors of [6] introduced more numerical diffusion in
their VFRoe-TACITE scheme. The numerical results show a good agreement
between the two schemes.

Fig. 2. Experiment 4, gas surface fraction as a function of time (s). Left: TACITE
results. Right: relaxation.

The difference between the two schemes is mainly the CPU cost of the
two simulations. The TACITE scheme requires 22 minutes and the relaxation
scheme requires 7 minutes. See [1] for a discussion on this point.

Conclusion

We have presented a second order semi-implicit relaxation scheme. The main
difficulty was to express the relaxation Godunov scheme in Roe's form, that is
to say, to compute Roe's matrix. The semi-implicit relaxation scheme is then
explicit for the slow waves and linearly implicit for the fast waves, which enables
us to reduce the CPU time as well as to be more accurate on slow waves.
Numerical experiments show a good agreement with a VFRoe-type scheme on
realistic problems.
The main open issue is the extension of this scheme to compositional
flows, which play a very important role in the petroleum industry. The energy
equation should also be investigated in order to take care of thermal effects.
These systems are explored in [1]. Another effort could be made in developing
rough boundary conditions: the explicit relaxation only ensures the physical
properties inside the domain and not at the boundaries.

Acknowledgments

The authors wish to thank C. Berthon, I. Faille and F. Willien for the numerous
helpful discussions we had.

References
1. M. BAUDIN, Méthodes de relaxation pour la simulation des écoulements
diphasiques dans les conduites pétrolières, PhD thesis, Université Pierre et Marie
Curie, 2003.
2. M. BAUDIN, C. BERTHON, F. COQUEL, R. MASSON, AND Q.-H. TRAN, A
relaxation method for two-phase flow models with hydrodynamic closure law,
submitted, (2002).
3. M. BAUDIN, F. COQUEL, AND Q.-H. TRAN, A semi-implicit relaxation scheme
for modeling two-phase flow in a pipeline, in preparation, (2002).
4. S. BENZONI-GAVAGE, Analyse Numérique des Modèles Hydrodynamiques
d'Écoulements Diphasiques Instationnaires dans les Réseaux de Production
Pétrolière, PhD thesis, École Normale Supérieure de Lyon, 1991.
5. C. CHALONS, Bilans d'entropie discrets dans l'approximation numérique des
chocs non classiques. Application aux équations de Navier-Stokes multi-pression
2D et à quelques systèmes visco-capillaires, PhD thesis, Université Pierre et Marie
Curie, Paris VI, novembre 2002.
6. I. FAILLE AND E. HEINTZE, A rough finite volume scheme for modeling two phase
flow in a pipeline, Computers and Fluids, 28 (1999), pp. 213-241.
7. T. GALLOUËT AND J.-M. MASELLA, A rough Godunov scheme, in Comptes Rendus
à l'Académie des Sciences, Paris, 1996, p. 77.
8. E. GODLEWSKI AND P.-A. RAVIART, Hyperbolic systems of conservation laws,
Mathématiques et Applications, SMAI, Ellipses, 1991.
9. R. J. LEVEQUE, Numerical methods for conservation laws, Lectures in
Mathematics, ETH Zürich, Birkhäuser Verlag, Berlin, (1992).
10. C. PAUCHON, H. DHULESIA, G. BINH-CIRLOT, AND J. FABRE, TACITE: a
transient tool for multiphase pipeline and well simulation, in SPE Annual Technical
Conference, New Orleans, 1994.
11. Z. XIN AND S. JIN, The relaxation schemes for systems of conservation laws in
arbitrary space dimensions, Comm. Pure Appl. Math., 48 (1995), pp. 235-276.
12. N. ZUBER AND J. FINDLAY, Average volumetric concentration in two-phase flow
systems, J. Heat Transfer, C87 (1965), pp. 453-458.
Computational Study of Field Scale BTEX
Transport and Biodegradation in the
Subsurface

Markus Bause

Institut für Angewandte Mathematik, Universität Erlangen-Nürnberg,
Martensstr. 3, 91058 Erlangen, Germany (bause@am.uni-erlangen.de)

Summary. In this work we simulate numerically the transport and biodegradation
of an organic contaminant (BTEX) in the subsurface. A "real world" contaminated
site is considered and realistic laboratory-derived or field-measured input parame-
ters are used. The biodegradation of the dissolved contaminant plume is modelled by
Monod type kinetics. For the computations we use an approximation scheme which
is based on a higher order finite element method and the two step backward differ-
entiation formula and was proposed and carefully analyzed by the author in a recent
work [3]. It was successfully applied to test (benchmark) problems reported in the
literature (cf. [10, 13]) as well as complex scenarios with an additional numerical
computation of the flow field by solving the parabolic-elliptic degenerate Richards
equation; cf. [3]. The higher order approximation scheme has been shown to reduce
significantly the amount of inherent numerical diffusion compared to lower order ones.
This avoids an artificial transverse mixing of the species, which would lead to a strong
overestimation of the biodegradation process and to wrong predictions.

1 Introduction
Groundwater contamination by biodegradable organic compounds has be-
come a serious and widespread environmental problem in industrialized coun-
tries. Major organic contaminants include petroleum fuels (gasoline, diesel),
petroleum byproducts (coal tar, coal-tar creosote), and chlorinated solvents. In
many cases, groundwater contains a mixture of organic contaminants, either
due to the complex mixture in many non-aqueous phase liquids (NAPLs; e.g.,
gasoline) or due to co-disposal/co-spillage (e.g., landfill leachates). The degradation
of these contaminants is controlled to a large extent by the biological
and geochemical conditions in the groundwater. Fortunately, biodegradation
tends to attenuate at least some organics during groundwater transport.
The question of whether active remediation is required, or whether natural
processes of attenuation (passive remediation) will be sufficient is a critical
issue in "real world" situations. Passive or intrinsic remediation is generally
preferred, if feasible, due to the potential to, firstly, eliminate permanently
contaminants through biogeochemical transformation or mineralization and,
secondly, avoid expensive biological, chemical and physical treatments. However,
the possible attenuation of organic compounds and the impact of that
contamination on a groundwater resource are difficult to predict since field sampling
limitations make it difficult to develop an accurate mass balance. Numerical
models can be used to help answer these questions, predict the long-term
evolution of the contaminant plume and evaluate factors limiting biodegradation.
However, a predictive capability for decision-making can only be found
in advanced contaminant transport models which include the full range of the
controlling processes and efficient, accurate and reliable numerical methods for
solving these equations. Although the understanding of conservative transport
and the effect of medium heterogeneity on transport are now well-advanced,
methods for modelling bioreactive processes, in particular at the field scale,
are less well understood; cf. [6]. Also, the accuracy of numerical techniques in
the context of bioreactive transport has been little explored so far.
In a recent work [3], the author proposed a higher order approximation
scheme (cf. Sec. 3), based on finite element methods and backward differentia-
tion formulae, for biochemically reacting contaminant transport in the subsur-
face with Monod type kinetics (cf. Sec. 2). The numerical scheme was carefully
compared to a recently published adaptive finite volume approach (cf. [10,13])
by recomputing some computational experiments of biodegradation processes
presented in [10, 13]. The higher order finite element techniques seemingly pro-
vided more accurate results than the finite volume methods; cf. [3]. Further,
the simulations presented in [3] have clearly borne out that in order to en-
sure the reliability of the numerical discretization of the bioreactive transport
model, it is of importance to use higher order approximation schemes, in par-
ticular, for the spatial discretization. Using lower order methods may lead to
an overrepresentation of the transverse mixing of the substances and, thereby,
to a significant overprediction of the biodegradation process; cf. [6, 13]. Com-
pletely wrong solutions are obtained, even if the spatial grid is locally refined
and adapted to the solution. Higher order methods help to overcome these
difficulties because they introduce less numerical diffusion.
In this work we use the approximation scheme suggested by the author in [3]
to study the long-term evolution and biodegradation of a "real world" field
scale benzene, toluene, ethylbenzene and xylene (BTEX) plume in the subsurface. The
contaminated site is located in the northern part of the city of Geretsried in Germany,
close to Munich, and was recently analyzed within the interdisciplinary network
project "Sustainable Remediation involving Natural Attenuation" that
was supported by the Bavarian State Ministry for Regional Development and
Environmental Affairs; cf. Sec. 4 or contact [7] for further information. At this
site, large quantities of mineral oil were infiltrated into the soil between 1948
and 1989 by a chemical laundry. Despite the cleaning up performed in 2001, a
significant concentration of BTEX is still being measured there. The plan for
the paper is now as follows. In Sec. 2 we introduce the mathematical model
describing the transport and Monod type biodegradation of organic contaminants
in the subsurface. In Sec. 3 the numerical discretization techniques are
briefly introduced. The simulated results for the expansion and movement of
the contaminant plume are presented in Sec. 4.

2 Governing equations

Microbial activity in the subsurface is dependent on the bioavailability of all
substrates utilized by the microorganisms. The main substrates are the electron
donor, the electron acceptor, and the primary carbon source. In the standard
case of metabolic aerobic degradation, oxygen is the electron acceptor,
and the contaminant to be degraded acts both as the electron donor and the
primary carbon source; cf. [5]. Here, for the sake of simplicity, only the basic
process of aerobic degradation of a single substrate (xylene) is considered. The
principles hold equally for multiple electron donors or acceptors.
For this "simple" scenario, biomass growth is assumed to follow the double
Monod kinetics, also referred to as the double Michaelis-Menten law; cf. [5, 6,
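15]. Then the governing equations for the electron donor $c_D$ [M L$^{-3}$], electron
acceptor $c_A$ [M L$^{-3}$] and immobile biomass $c_X$ [M L$^{-3}$] are respectively given
by
\[
\begin{aligned}
\partial_t(\Theta c_i) - \nabla\cdot(D_i \nabla c_i - q\,c_i) &= -\alpha_i M, \\
\partial_t c_X + k_d c_X &= Y \Big(1 - \frac{c_X}{c_{X\max}}\Big) M,
\end{aligned}
\tag{1}
\]
for $i = D, A$, where the Monod term $M$ [M L$^{-3}$ T$^{-1}$] is defined by
\[
M = \Theta\,\mu_{\max}\,c_X\,
\frac{c_D}{K_D + c_D}\,\frac{K_{ID}}{K_{ID} + c_D}\,
\frac{c_A}{K_A + c_A}\,\frac{K_{IA}}{K_{IA} + c_A}.
\tag{2}
\]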
Thus, we have to consider a coupled system of partial and ordinary differential
equations. In (1), $\Theta$ [-] denotes the volumetric water content, $q$ [L T$^{-1}$] the
Darcy velocity vector (volumetric flux) and $D_i$, $i = D, A$, [L$^2$ T$^{-1}$] with

(3)

the dispersion tensor following the Scheidegger parametrization (cf. [14]) in
which $d_i$, $i = D, A$, [L$^2$ T$^{-1}$] is the molecular diffusion, $I$ [-] the identity matrix
and $\beta_l$ [L] and $\beta_t$ [L] are the longitudinal and transverse dispersivities,
respectively. We have $\alpha_D = 1$ [-]. The constant $\alpha_A$ [-] denotes the electron
acceptor to donor mass ratio, $k_d$ [T$^{-1}$] is the first order decay rate for the
biomass, $Y$ [-] is the microbial yield coefficient per unit electron donor consumed
(mg biomass per mg electron donor) and $c_{X\max}$ [M L$^{-3}$] is the maximum
biomass concentration. In (2), $\mu_{\max}$ [T$^{-1}$] denotes the maximum growth rate,
$K_i$, $i = D, A$, [M L$^{-3}$] is the half-utilization constant of the electron donor and
acceptor, respectively, and $K_{Ii}$, $i = D, A$, [M L$^{-3}$] is the Haldane inhibition
concentration of the electron donor and acceptor, respectively. The inhibition
term $K_{Ii}/(K_{Ii} + c_i)$, $i = D, A$, proposed by Haldane [8] and Andrews [1], yields
a slower microbial growth and, therefore, a slower effective electron donor utilization
rate at higher concentrations; cf. [15]. In the numerical model, unrealistically
high biomass concentrations are avoided by introducing the term
$1 - c_X/c_{X\max}$ in the second of the equations (1). If microbial growth is not
restricted in the model, simulated microbial concentrations may become very
large, especially in source areas with continuous electron donor and acceptor
supply. In real aquifers, the size of the biomass is limited, for example, due
to a lack of available pore space, production of inhibitory metabolites, lack
of nutrients and viral attack. The constant $c_{X\max}$ represents the maximum
microbial concentration at which the biomass reaches a quasi-steady state.
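
The double Monod term (2) and the capped biomass growth in (1) are simple enough to evaluate pointwise. The following is a minimal sketch, not the implementation used for the computations; the function names are hypothetical, and the default parameter values are the ones listed for the field site in Sec. 4 (with $K_{IA} = \infty$, i.e. no inhibition by the acceptor).

```python
import math


def monod_term(c_d, c_a, c_x, theta=0.3, mu_max=4.13,
               k_d_half=0.79, k_a_half=0.1,
               k_id=91.7, k_ia=math.inf):
    """Double Monod term M of (2), including the Haldane inhibition factors."""
    def inhibition(k_i, c):
        # K_I / (K_I + c); equals 1 in the limit K_I -> infinity (no inhibition)
        return 1.0 if math.isinf(k_i) else k_i / (k_i + c)

    donor = c_d / (k_d_half + c_d) * inhibition(k_id, c_d)
    acceptor = c_a / (k_a_half + c_a) * inhibition(k_ia, c_a)
    return theta * mu_max * c_x * donor * acceptor


def biomass_rhs(c_x, m, decay=0.001, yield_coeff=0.52, c_x_max=1.0):
    """Time derivative of c_X from the second equation in (1)."""
    return yield_coeff * (1.0 - c_x / c_x_max) * m - decay * c_x
```

The term vanishes as soon as either the donor or the acceptor is absent, and the growth part of `biomass_rhs` switches off when `c_x` reaches `c_x_max`, which is the capping mechanism described above.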
We consider solving (1), (2) over $(0,T) \times \Omega$ where $\Omega \subset \mathbb{R}^d$, $d = 2,3$, is a
two- or three-dimensional bounded domain and the system (1)-(2) is supplied
with initial conditions
\[
c_D(0,\cdot) = c_{D,0}, \quad c_A(0,\cdot) = c_{A,0}, \quad c_X(0,\cdot) = c_{X,0}
\quad \text{in } \Omega \text{ at } t = 0, \tag{4}
\]
and nonhomogeneous Dirichlet and Robin boundary conditions for $c_i$, $i = D, A$,
\[
c_i = g_i \quad \text{on } (0,T) \times \Gamma_D, \qquad
-(D_i \nabla c_i - q\,c_i) \cdot \nu = h_i \quad \text{on } (0,T) \times \Gamma_R. \tag{5}
\]
Here, $\nu$ denotes the outer unit normal to the boundary $\partial\Omega = \Gamma_D \cup \Gamma_R$ of $\Omega$. The
existence of a global unique non-negative solution $c_D, c_A \in W_p^{1,2}((0,T) \times \Omega)$,
with $p > 2$, and $c_X \in C^1([0,T]; C(\Omega))$ to (1)-(5) for any given $T \in (0,\infty)$
was recently proved; cf. [12]. In particular, the non-negativeness of $c_D$, $c_A$, $c_X$
can be ensured. The proof can be carried over to $d = 3$. For the definition of
$W_p^{2,1}((0,T) \times \Omega)$ we refer to [11].
In our computational experiment, the velocity vector $q$ [L T$^{-1}$] in (1), (3)
and (5) is prescribed analytically. This is done due to a lack of information and
measurements of the flow field for the considered site; cf. Sec. 4. For numerical
simulations of contaminant transport and biodegradation scenarios where the
flow field is additionally computed numerically by solving the parabolic-elliptic
degenerate Richards equation we refer to [3].
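
With a prescribed velocity, the dispersion tensor can be evaluated directly. The sketch below assumes the commonly used form of the Scheidegger parametrization, $D_i = (\Theta d_i + \beta_t |q|)\, I + (\beta_l - \beta_t)\, q q^T / |q|$; this explicit formula, the placement of $\Theta$ in it, and the function name are assumptions of the sketch and are not taken from the text.

```python
import math


def dispersion_tensor(q, theta, d_mol, beta_l, beta_t):
    """2x2 dispersion tensor for a velocity q = (qx, qy), assumed Scheidegger form."""
    qx, qy = q
    norm = math.hypot(qx, qy)
    iso = theta * d_mol + beta_t * norm      # isotropic part
    if norm == 0.0:
        # no flow: only molecular diffusion remains
        return [[iso, 0.0], [0.0, iso]]
    aniso = (beta_l - beta_t) / norm         # weight of q q^T
    return [[iso + aniso * qx * qx, aniso * qx * qy],
            [aniso * qx * qy, iso + aniso * qy * qy]]
```

In this form the flow direction is an eigenvector with eigenvalue $\Theta d_i + \beta_l |q|$ and the transverse direction one with eigenvalue $\Theta d_i + \beta_t |q|$, which is why the transverse dispersivity governs the mixing across the plume fringe.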
The established regularity of solutions to problem (1)-(5) is, by far, too
weak to justify the use of higher order approximation schemes. However, one
may expect that a higher regularity of the solution still holds in some sense
locally. This might be sufficient to get a significant advantage of higher order
approximation schemes over lower order ones. Such superiority of the higher
order methods was recently confirmed by numerical computations; cf. [3]. Further,
higher order regularity results for solutions to equations (1)-(5) are established
in a forthcoming paper; cf. [4].

3 Discretization and solution techniques


We shall now briefly describe our numerical methods and solution techniques
for solving the equations (1)-(5). For the spatial discretization we use conforming
finite element methods. Here, for simplicity, we assume that $\Omega \subset \mathbb{R}^2$
is a polygonal bounded domain. If either the whole boundary of $\Omega$ or at least
some part of it is curved, we adapt the mesh to the boundary by using the
isoparametric counterparts of the finite elements introduced below; cf. [3]. In
our computations we will consider non-vanishing Dirichlet boundary values $g_i$,
$i = D, A$, in (5). However, for simplicity, the variational formulation of (1)-(5)
is given for homogeneous Dirichlet boundary conditions only. Nonhomogeneous
boundary values are incorporated by standard techniques.
Let $\mathcal{T}_h = \{K\}$ be a finite decomposition of mesh size $h$ of $\Omega$ into triangles.
The decompositions are assumed to be regular, i.e., "face to face". We use
standard conforming $P_2$ elements for the Monod model (1)-(5). The approximation
spaces $V_h$ for the electron donor and acceptor $C_i$, $i = D, A$, and $X_h$ for
the biomass $C_X$ are thus defined as
\[
V_h = \{C_i \in C(\overline{\Omega}) \mid C_i|_K \in P_2(K) \text{ for } K \in \mathcal{T}_h\} \cap W^{1,2}_{0,\Gamma_D}
\quad\text{and}\quad
X_h = \{C_X \in C(\overline{\Omega}) \mid C_X|_K \in P_2(K) \text{ for } K \in \mathcal{T}_h\}.
\]
By $P_j(K)$, $j \in \mathbb{N}$, we denote the space of all polynomials of maximum degree
$j$. Further, $W^{1,2}_{0,\Gamma_D} = \{c \in W^{1,2}(\Omega) \mid c = 0 \text{ on } \Gamma_D\}$. Hence, the spatial
discretization of $c_D$, $c_A$ and $c_X$ converges formally of third order with respect
to the norm in $L^2(\Omega)$. Advection-dominated transport of the mobile species
(electron donor and acceptor) introducing local numerical instabilities in the
solution can efficiently be captured by the streamline upwind Petrov-Galerkin
method (SUPG); cf. [3, 9].
For the temporal discretization of problems (1)-(5) we consider a mesh
$\{t_n\}$, $n = 0, \ldots, N$, with $t_0 = 0$ and $t_N = T$, for the time variable $t$ and define
$\tau_n = t_{n+1} - t_n$. Due to the generally high stiffness of semidiscretizations of flow
and transport problems, implicit schemes should be preferred in the choice
of time-stepping methods for solving these problems. The backward Euler
method is robust and has excellent stability properties, but it is inaccurate
because it converges with first order only, and it is also strongly damping. So, it should
only be used for nonstationary calculations which aim to iterate towards the
steady limit. A scheme having similar stability properties as the backward
Euler method but being of second order accuracy is the two step backward
differentiation formula BDF2 which we use in our computations. Further, to
increase the efficiency of the calculations, an adaptive time stepping procedure
was developed and tested in [3] for the proposed discretization of the transport
and biodegradation model (1)-(5).
Now, we suppose that sequences $\{q(t_n), \Theta(t_n)\} \subset W^{1,\infty}(\Omega) \times L^\infty(\Omega)$ are
explicitly prescribed. Let $P_{Z_h}$ denote the $L^2$-projection onto the finite element
space $Z_h$. The discretization of the Monod model (1)-(5) by the Galerkin
method and the two step backward differentiation formula BDF2 then reads
as follows:
Set $C_i^0 = P_{V_h} c_{i,0}$ and $C_X^0 = P_{X_h} c_{X,0}$. For all time steps $n = 0, \ldots, N-2$
compute approximations $C_i^{n+2} \in V_h$, $i = D, A$, and $C_X^{n+2} \in X_h$ by solving the
equations
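\[
\begin{aligned}
&\gamma_{n+2}\,(\Theta(t_{n+2}) C_i^{n+2}, v_i) - \gamma_{n+1}\,(\Theta(t_{n+1}) C_i^{n+1}, v_i) + \gamma_n\,(\Theta(t_n) C_i^{n}, v_i) \\
&\quad + \tau_{n+1}\,(q(t_{n+2}) \cdot \nabla C_i^{n+2}, v_i) + \tau_{n+1}\,(D_i(t_{n+2}) \nabla C_i^{n+2}, \nabla v_i) \\
&\quad - \tau_{n+1}\,((q \cdot \nu)\, C_i^{n+2}, v_i)_{\Gamma_R} + \tau_{n+1}\,(\nabla \cdot q(t_{n+1})\, C_i^{n+2}, v_i) \\
&= -\tau_{n+1}\,(\alpha_i U^{n+2}, v_i) - \tau_{n+1}\,(h_i, v_i)_{\Gamma_R},
\end{aligned}
\tag{6}
\]
for all $v_i \in V_h$, $i = D, A$, and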

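\[
\gamma_{n+2}\, C_X^{n+2} - \gamma_{n+1}\, C_X^{n+1} + \gamma_n\, C_X^{n} + \tau_{n+1} k_d\, C_X^{n+2}
= \tau_{n+1}\, Y \Big(1 - \frac{C_X^{n+2}}{c_{X\max}}\Big)\, U^{n+2}
\tag{7}
\]
for all nodes $(x_j)_{j=1,\ldots,M}$ associated with degrees of freedom of $C_X^{n+2}$, where
$U^{n+2}$ denotes the Monod term (2) evaluated at the discrete solution at time $t_{n+2}$.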

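By $(\cdot,\cdot)$ and $(\cdot,\cdot)_{\Gamma_R}$ we denote the standard $L^2$ inner product in $L^2(\Omega)$
and $L^2(\Gamma_R)$, respectively. Further, in (6) and (7) we use the notation $\gamma_{n+2} =
1 + \tau_{n+1}/(\tau_{n+1} + \tau_n)$, $\gamma_{n+1} = 1 + \tau_{n+1}/\tau_n$ and $\gamma_n = \tau_{n+1}^2/((\tau_{n+1} + \tau_n)\tau_n)$.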
The time step sizes $\tau_{n+1}$ can be chosen adaptively; cf. [3]. Clearly, identity
(7) formulates a pointwise condition for all nodes associated with degrees of
freedom of $C_X^{n+2}$. Using a variational equation instead of (7), analogously to
(6), leads to stability problems and severe oscillations.
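
The variable-step BDF2 weights $\gamma_{n+2}$, $\gamma_{n+1}$, $\gamma_n$ can be checked on a scalar model problem. The following is a minimal sketch, assuming the test equation $y' = -\lambda y$ in place of the transport system; the function names are hypothetical. The first step is taken by backward Euler substeps, in the spirit of a two step method that needs a starting value.

```python
import math


def bdf2_coefficients(tau_next, tau_prev):
    """Weights (gamma_{n+2}, gamma_{n+1}, gamma_n) for steps tau_{n+1}, tau_n."""
    g2 = 1.0 + tau_next / (tau_next + tau_prev)
    g1 = 1.0 + tau_next / tau_prev
    g0 = tau_next ** 2 / ((tau_next + tau_prev) * tau_prev)
    return g2, g1, g0


def bdf2_decay(lam, y0, times, substeps=4):
    """Integrate y' = -lam * y on the grid `times` with variable-step BDF2."""
    # Starting procedure: first step by `substeps` backward Euler substeps.
    tau0 = times[1] - times[0]
    y1 = y0
    for _ in range(substeps):
        y1 = y1 / (1.0 + lam * tau0 / substeps)
    ys = [y0, y1]
    for n in range(len(times) - 2):
        tau_prev = times[n + 1] - times[n]
        tau_next = times[n + 2] - times[n + 1]
        g2, g1, g0 = bdf2_coefficients(tau_next, tau_prev)
        # g2*y_{n+2} - g1*y_{n+1} + g0*y_n = -tau_next * lam * y_{n+2}
        ys.append((g1 * ys[-1] - g0 * ys[-2]) / (g2 + tau_next * lam))
    return ys
```

For equal step sizes the weights reduce to the familiar values 3/2, 2 and 1/2, and for any step ratio they satisfy $\gamma_{n+2} - \gamma_{n+1} + \gamma_n = 0$, so constants are reproduced exactly.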
As usual, for the test functions $v_i$ we choose the basis functions of $V_h$.
Further, let $C_X^i \in X_h$, $i = n, n+1, n+2$, be represented in terms of the finite
element basis functions $\{w_j\}_{j=1}^M$ of $X_h$, i.e., $X_h = \operatorname{span}\{w_j \mid 1 \le j \le M\}$ and
$C_X^i = \sum_{j=1}^M \xi_j^i w_j$, for $i = n, n+1, n+2$, where the vector $\xi^i = (\xi_1^i, \ldots, \xi_M^i)$
denotes the degrees of freedom of $C_X^i$. Then, (7) amounts to solving the system
of equations in the unknown vector $\xi^{n+2}$,

where $U^{n+2} = (U^{n+2}(x_1), \ldots, U^{n+2}(x_M))$.


Fig. 1. Computational domain (left) for contaminated site Geretsried and initial
concentration of electron donor xylene (middle) and electron acceptor oxygen (right)

Since BDF2 is a two step method, we need a starting procedure to
compute appropriate approximations $C_D^1$, $C_A^1$ and $C_X^1$ of $c_D(t_1,\cdot)$, $c_A(t_1,\cdot)$ and
$c_X(t_1,\cdot)$, respectively. Here, the first time step is done by performing $M$ substeps
of the backward Euler method with step size $\tau_0/M$. In our computations
we use $M = 4$. To solve the resulting nonlinear systems of equations, a damped
version of Newton's method is applied. The linear problems of the Newton iteration
are solved by standard Krylov space methods like GMRES, for instance,
with SSOR preconditioning. For the future we plan to use multigrid methods
for the linear solver, which is motivated by our former experiences with computing
variably saturated subsurface flow; cf. [2]. In the simple case of a single
electron donor and acceptor and a single biomass an alternative treatment of
the transport and biodegradation problem (1)-(5) seems to be possible. After a
temporal discretization of (1), one may resolve the time-discrete version of the
second of the equations (1) for the biomass concentration $c_X$ and substitute
the resulting identity into the time-discrete version of the first of the equations
(1). Thus, the biomass concentration is eliminated from the nonlinear system
of equations and can be computed in a postprocessing procedure which leads
to smaller systems of linear equations to be solved. However, a generalization
of such an approach to the case of multiple microbial populations and, in particular,
its implementation seems to be more complex than an explicit treatment
of the ordinary differential equations for the biomass. Therefore, the approach
is not considered here.
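
To illustrate what resolving the time-discrete biomass equation means, consider a single backward Euler step instead of BDF2 (a simplifying assumption of this sketch). Writing the Monod term as $M = r\,c_X$ with a given specific rate $r$, the step $(c_X^{1} - c_X^{0})/\tau + k_d c_X^{1} = Y (1 - c_X^{1}/c_{X\max})\, r\, c_X^{1}$ is a scalar quadratic equation at each node, whose non-negative root can be written in closed form. The function name and the use of backward Euler are hypothetical choices for illustration.

```python
import math


def biomass_backward_euler(c_x_old, rate, tau, decay=0.001,
                           yield_coeff=0.52, c_x_max=1.0):
    """One backward Euler step for the biomass ODE with M = rate * c_X.

    Solves a*c^2 + b*c - c_x_old = 0 for its non-negative root.
    """
    if c_x_old == 0.0:
        return 0.0
    a = tau * yield_coeff * rate / c_x_max
    b = 1.0 + tau * decay - tau * yield_coeff * rate
    s = math.sqrt(b * b + 4.0 * a * c_x_old)
    if b >= 0.0:
        # rationalized form, also valid for a == 0 (no growth)
        return 2.0 * c_x_old / (b + s)
    # b < 0 implies a > 0; this form avoids cancellation in that case
    return (s - b) / (2.0 * a)
```

Since the polynomial is negative at zero and positive at $c_{X\max}$ (for $c_X^0 < c_{X\max}$), the computed value stays below the maximum biomass concentration for every step size.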

4 Computational results

We shall now present our computational results obtained for a "real world"
residual waste and thereby provide valuable insights into the complex interactions
of biological, chemical and physical processes that are involved in natural
attenuation phenomena. The contaminated site is located in the city of Geretsried
in Bavaria (Germany) and was recently analyzed within a network project;
contact [7] for details. Large quantities of mineral oil were infiltrated into the
soil between 1948 and 1989 by a chemical laundry. Despite some active remediation,
a significant concentration of BTEX is still being measured there.
The computational domain $\Omega$ in the subsurface is visualized in Fig. 1 with
lengths given in meters. For simplicity and due to a lack of information, a constant
groundwater flow field parallel to the left and right boundary of $\Omega$ is
supposed. The flow direction is from the lower to the upper boundary adjacent
to the river Isar. The measured concentration profiles at the current
state, assumed to be the initial state of the simulation, are shown in Fig. 1.
The shapes within the profiles were slightly idealized. We restrict ourselves to
the electron donor xylene which here is the main component of BTEX with
70-90%. The xylene concentration inside the ellipse is 12.5 [mg/l] and 0 [mg/l]
elsewhere. Biodegradation of xylene with either oxygen or nitrate as electron
acceptor is observed. Here, we consider the first case. We have a small rectangu-
lar domain overlapping with the xylene ellipse where the oxygen concentration
is 2 [mg/l] and 4.0 [mg/l] in its center. The ambient oxygen concentration is
8.6 [mg/l]. The measured higher oxygen concentration inside the contaminant
source puzzles us and has not been completely understood yet. One reason
for that might be a lower permeability inside the contaminant source due to
a lack of available pore space. Further investigations are necessary. Therefore
and due to the lack of information in particular about the spatial variation
of the flow field, our simulations have to be considered rather as a qualitative
analysis of the biodegradation process than as a quantitative prediction of the
xylene degradation. Nevertheless they contribute to a better understanding of
natural attenuation phenomena for this site.
In our calculations we used reliable field-measured and laboratory-derived
input parameters that were given in [15]. They proved to describe adequately
field scale degradation provided that all controlling factors are incorporated
in the field scale model. The flow field $q = (0.045, 0.15)^T$, with time given in
days, and the diffusion-dispersion parameters were obtained by measurements.
Precisely, we put $\Theta = 0.3$, $d_D = d_A = 8.64 \cdot 10^{-5}$, $\beta_t = 2.0$, $\beta_l = 10.2$,
$\alpha_A = 3.16$, $k_d = 0.001$, $Y = 0.52$, $c_{X\max} = 1.0$, $\mu_{\max} = 4.13$, $K_D = 0.79$,
$K_A = 0.1$, $K_{ID} = 91.7$, $K_{IA} = \infty$.
The initial concentration of the biomass was $c_X(0,\cdot) = 0.003$ in $\Omega$. We
chose homogeneous Neumann boundary conditions at the left, right and upper
boundary and a Dirichlet condition at the lower one. The computations were
done on an almost uniform grid with 12322 elements. The time step sizes were
chosen adaptively; cf. [3]. The calculated concentration profiles of the electron
donor (contaminant) xylene, acceptor oxygen and biomass are visualized in
Fig. 2 to 4. For comparison, the problem was recomputed on a very fine mesh
with 197152 elements. This was done on a Linux cluster with 16 processors.
No significant changes in the computed profiles were observed. This shows that
the proposed numerical scheme reliably predicts the degradation rates even on
relatively coarse meshes, which is in agreement with the results presented in
[3].

Fig. 2. Concentration of electron donor xylene at T = 20 days, 1.5 years and 4.5
years

Fig. 2 to 4 show that the contaminant is transported by the flow field to the
upper boundary of $\Omega$ adjacent to the river. Simultaneously, it is degraded by
a reaction between electron donor, acceptor and biomass. The initial oxygen
concentration inside the contaminant plume becomes depleted within a few
days which is not consistent with the measured profile and might result from
an insufficient description of the flow and permeability conditions inside the
contaminant source. The reaction between the species is restricted to those
regions where their concentrations are sufficiently large. Basically, it is the
interface between the electron donor and the surrounding region where still
enough acceptor is available. If a numerical method with much artificial diffusion
is used, this interface between the species smears out and the reaction
takes place in a larger region. Then, the contaminant is degraded too fast;
cf. [3, 6, 10, 13]. In particular, this happens if lower order methods are applied
on meshes that are not highly refined; cf. [3, 13].
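
The smearing of a sharp interface by a low order method can be reproduced in a few lines. The sketch below is a one-dimensional analogue only: it advects a step profile with the first-order upwind scheme, which is an assumption made for illustration and not one of the schemes compared in [3, 13]. All names are hypothetical.

```python
def upwind_advect(u, courant, steps):
    """First-order upwind scheme for u_t + a u_x = 0 with a > 0.

    `courant` is a * dt / dx; the inflow value u[0] is kept fixed.
    """
    u = list(u)
    for _ in range(steps):
        new = [u[0]]
        for j in range(1, len(u)):
            new.append(u[j] - courant * (u[j] - u[j - 1]))
        u = new
    return u


def interface_width(u, lo=0.01, hi=0.99):
    """Number of cells in which the profile is strictly between lo and hi."""
    return sum(1 for v in u if lo < v < hi)


n_cells = 200
step = [1.0 if j < 50 else 0.0 for j in range(n_cells)]
smeared = upwind_advect(step, courant=0.5, steps=100)
```

The initial step has no intermediate cells, whereas `interface_width(smeared)` covers a band of cells after 100 steps. In a reactive setting, such an artificially widened band is exactly the region where donor and acceptor overlap and react.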

References
1. J.F. Andrews. A mathematical model for the continuous culture of microorgan-
isms utilizing inhibitory substrates, Biotechnol. Bioeng., 10:707-723, 1968.

Fig. 3. Concentration of electron acceptor oxygen at T = 20 days, 1.5 years and 4.5
years

2. M. Bause, P. Knabner. Computation of variably saturated subsurface flow by
adaptive mixed hybrid finite element methods, Adv. Water Resour., submitted,
2002.
3. M. Bause, P. Knabner. Numerical simulation of contaminant biodegradation
by higher order methods and adaptive time stepping, Comput. Vis. Sci., accepted,
2004.
4. M. Bause, W. Merz. Higher order regularity and approximation of solutions to
the Monod biodegradation model, Appl. Numer. Math., submitted, 2004.
5. R.C. Borden, P.B. Bedient. Transport of dissolved hydrocarbons influenced by
oxygen limited biodegradation: 1. Theoretical development, Water Resour. Res.,
22:1973-1982, 1986.
6. O.A. Cirpka, E.O. Frind, R. Helmig. Numerical simulation of biodegradation
controlled by transverse mixing, J. Contam. Hydrol., 40:159-182, 1999.
7. Gesellschaft zur Altlastensanierung in Bayern mbH (GAB mbH), Innere Wiener
Straße 11a/I, 81667 München, Germany (http://www.altlasten-bayern.de).
8. J.B.S. Haldane. Enzymes, M.I.T. Press, Cambridge, MA, 1965.
9. T.J.R. Hughes, A.N. Brooks. A multidimensional upwind scheme with no
crosswind diffusion. Finite Element Methods for Convection Dominated Flows,
T.J.R. Hughes (ed.), 19-35, ASME, New York, 1979.
10. R. Klöfkorn, D. Kröner, M. Ohlberger. Local adaptive methods for convection
dominated problems, Internat. J. Numer. Methods Fluids, 40:79-91, 2002.
11. O.A. Ladyzenskaya, V.A. Solonnikov, N.N. Ural'ceva. Linear and Quasilinear
Equations of Parabolic Type, Am. Math. Soc., Providence, RI, 1968.

Fig. 4. Concentration of biomass at T = 20 days, 1.5 years and 4.5 years

12. W. Merz. Global existence result of the Monod model, Adv. Math. Sci., accepted,
2003.
13. M. Ohlberger, C. Rohde. Adaptive finite volume approximations of weakly cou-
pled convection dominated parabolic systems, IMA J. Numer. Anal., 22:253-
280, 2002.
14. A.E. Scheidegger. General theory of dispersion in porous media, J. Geo-
phys. Res., 66(10):3273-3278, 1961.
15. M. Schirmer, J.W. Molson, E.O. Frind, J.F. Barker. Biodegradation modelling
of a dissolved gasoline plume applying independent laboratory and field param-
eters, J. Contam. Hydrol., 46:339-374, 2000.
A Two-Level Stabilization Scheme for the
Navier-Stokes Equations

Roland Becker¹ and Malte Braack²

1 Laboratoire de Mathematiques Appliquees
Universite de Pau et des Pays de l'Adour
BP 1155, 64013 PAU Cedex, France
roland. [email protected]
2 Institut für Angewandte Mathematik
Universität Heidelberg
Im Neuenheimer Feld 294, 69120 Heidelberg, Germany
malte. [email protected]

Summary. As an alternative to classical stabilization schemes as, for instance,
Galerkin-Least-Squares or streamline diffusion techniques, a stable equal-order finite
element scheme for the Navier-Stokes equation is proposed. The approach is
based on filtering small-scale fluctuations of pressure and velocities by local pro-
jections. For the Stokes system, we prove stability and analyze the arising system
matrix. Furthermore, the transport equation is analyzed with respect to stability and
an a-priori estimate is given.

1 Introduction

In this note we present a discretization of the stationary Navier-Stokes equations
based on equal-order finite elements. We combine the pressure stabilization
for the Stokes equations developed in [2] with a similar technique for the
nonlinear convection term. The entire approach is based on the use of two
discrete spaces $W_{2h}$, $V_h$. We use finite elements on quadrilateral meshes. The
discrete space $V_h$ corresponds to bilinear finite element functions and $W_{2h}$ to
piecewise constant elements on a globally coarser mesh.
Although stabilization by weighted least-squares terms, as for instance
Galerkin-Least-Squares (GLS) or streamline-upwind Petrov-Galerkin
(SUPG), see [10, 9, 6, 14], is now classical and provides a rather general frame-
work, there is a certain need for different stabilization techniques, see the more
recent approaches [4, 8, 2, 5]. One common feature of the new approaches is
that they have better local conservation properties than the classical ones.
Further, the difficulties of SUPG for higher order polynomials might be overcome.
The choice of the stabilization parameter no longer depends on the
constants of an inverse estimate, see [2].
Our motivation for development of new stabilized schemes comes from two
fields of application: a) reacting flows with complex chemistry and b) optimal
control of incompressible flows. In the first case, SUPG-like stabilization leads
to enormous coupling terms which are rather difficult and time-consuming to
compute, see [3]. Further, the design of robust Newton-type methods in this
context is not evident since one has to decide whether or not to take into
account the derivative of each stabilization term in the approximate Jacobian.
Related to the just mentioned problem of computing the derivatives of the
stabilization terms is the discretization of optimal control problems. Here, the
critical terms directly influence the quality of the computed gradients of the
cost functional, see [1] for a discussion.
In Section 2, we describe the two-level discretization of the stationary
Navier-Stokes equations. In the following sections, we present some aspects
of the analysis: Section 3 deals with the arising algebraic system for the Stokes
equations and Section 4 with stabilization of the convective terms.

2 Two-level scheme for the Navier-Stokes equations

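Let $\Omega \subset \mathbb{R}^2$ be a polygonal domain. We want to solve the Navier-Stokes
equations for an incompressible fluid with fluid velocity $v$ and pressure $p$,
\[ (v \cdot \nabla) v - \nu \Delta v + \nabla p = f \quad \text{in } \Omega, \tag{1} \]
\[ \operatorname{div} v = 0 \quad \text{in } \Omega, \tag{2} \]
supplied with homogeneous Dirichlet boundary conditions:
\[ v = 0 \quad \text{on } \partial\Omega, \tag{3} \]
and the normalization of the pressure
\[ \int_\Omega p \, dx = 0. \tag{4} \]
In (1), $f \in L^2(\Omega)^2$ represents given data. For the weak formulation of (1)-(4)
we introduce the following notations:
\[
u := (p, v) \in X := L^2(\Omega)/\mathbb{R} \times H_0^1(\Omega)^2, \qquad
a(u)(\phi) := ((v \cdot \nabla) v, \psi) + \nu (\nabla v, \nabla \psi) - (p, \operatorname{div} \psi) + (\operatorname{div} v, \xi),
\tag{5}
\]
for test functions $\phi = (\psi, \xi) \in X$. Now the weak formulation of (1)-(4) reads
in compact notation:
\[ a(u)(\phi) = (f, \psi) \quad \forall \phi \in X. \tag{6} \]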

We denote by $\mathcal{T}_h$ a shape regular partition of the domain into quadrilaterals.
Hanging nodes are allowed with moderation for ease of local mesh refinement.
We consider two finite element spaces $W_{2h}$, $V_h$, which are constructed in the
following way.
$V_h$ consists of bilinear finite elements on $\mathcal{T}_h$. On hanging nodes, the finite
element functions are interpolated by the neighbor nodes so that no degrees
of freedom are present on those irregular nodes. The space $W_{2h}$ consists of
constants on each cell of a coarser mesh $\mathcal{T}_{2h}$ obtained by one global coarsening
of $\mathcal{T}_h$: Each quadrilateral $K \in \mathcal{T}_{2h}$ is cut into four new quadrilaterals (dividing
all lengths of edges of $K$ by 2) in order to obtain the fine partition $\mathcal{T}_h$. Note
that the functions in $W_{2h}$ are discontinuous across edges of elements of $\mathcal{T}_{2h}$. We
indicate the subspaces of discrete functions respecting the Dirichlet condition
by an additional subscript, $V_{h,0} \subset V_h$. The restriction to a mean value of zero,
cf. (4), is denoted by $V_h^0$. The discrete solution $u_h = (v_h, p_h)$ is searched in the
discrete space $X_h := V_h^0 \times V_{h,0}^2$.
Furthermore, we use the $L^2$-projection onto the piecewise constants, $P_W :
L^2(\Omega) \to W_{2h}$, and the fluctuation operator $\pi_h : V_h \to V_h$:
\[ \pi_h := I - P_W, \tag{7} \]

where I denotes the identity mapping.
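
The action of $\pi_h$ is easiest to see in one space dimension. The sketch below is a 1D analogue under stated assumptions: the fine mesh is uniform, each coarse cell consists of two fine cells, and the operator is applied to cellwise constant data such as the gradient of a piecewise linear function. The function names are hypothetical.

```python
def coarse_projection(values):
    """L2 projection of fine-cell constants onto coarse cells.

    Each coarse cell is a pair of equally sized fine cells, so the
    projection is the mean over the pair.
    """
    if len(values) % 2:
        raise ValueError("need an even number of fine cells")
    projected = []
    for k in range(0, len(values), 2):
        mean = 0.5 * (values[k] + values[k + 1])
        projected += [mean, mean]
    return projected


def fluctuation(values):
    """Fluctuation operator pi_h = I - P_W on fine-cell constants."""
    return [v - m for v, m in zip(values, coarse_projection(values))]


def cell_gradients(nodal, h):
    """Cellwise gradient of a piecewise linear function on a uniform mesh."""
    return [(nodal[j + 1] - nodal[j]) / h for j in range(len(nodal) - 1)]
```

The gradient of a globally linear function has no fluctuation, while a sawtooth on the fine mesh is pure fluctuation. This is the sense in which only the small scales are filtered.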


We use the following stabilization terms, defined on $X_h \times X_h$:

Here, $\alpha$ and $\delta$ denote piecewise constant functions which depend among other
things on the local cell size $h_K$. The precise definition is:
\[
\delta|_K := \min\Big( \frac{h_K^2}{\nu}, \frac{h_K}{\|v\|_{\infty,K}} \Big). \tag{9}
\]
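
Reading (9) as the minimum of a diffusive scale $h_K^2/\nu$ and a convective scale $h_K/\|v\|_{\infty,K}$, the cellwise parameter can be sketched as follows. The function name and the treatment of a vanishing velocity are assumptions of this sketch.

```python
def delta_parameter(h_k, nu, v_max):
    """Cellwise parameter min(h_K^2 / nu, h_K / |v|_max) as read from (9)."""
    diffusive = h_k * h_k / nu
    if v_max == 0.0:
        # no convection on this cell: only the diffusive scale applies
        return diffusive
    return min(diffusive, h_k / v_max)
```

The two scales coincide when the cell Peclet number $h_K \|v\|_{\infty,K}/\nu$ equals one; for larger values the convective scale is the smaller one.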

The discrete problem reads: Find $u_h \in X_h$ such that

(10)

One remarkable feature of (10) is that the stabilization terms only act
on the diagonal of the coupled system. The structure of the stabilization is
unchanged if additional lower order terms are added to the equations.
Our numerical experience shows that the resulting scheme has very similar
properties to SUPG concerning stability and accuracy. In the following we
present some aspects of the analysis of (10).

3 Structure of the system matrix for the Stokes equations

As the first step, we consider the proposed stabilization in the case of the Stokes
equations. It can be easily seen that the stabilization term $(\pi_h \nabla p_h, \alpha \pi_h \nabla \xi)$ in
(8) leads to a larger stencil for the pressure than the original one coming from
the Galerkin part. The discrete problem for the Stokes equations reads: Find
$u_h \in X_h$ such that

(11)
holds, where now
\[ a(u, \phi) := \nu (\nabla v, \nabla \psi) - (p, \operatorname{div} \psi) + (\operatorname{div} v, \xi), \tag{12} \]
\[ s(u, \phi) := (\pi_h \nabla p, \alpha \pi_h \nabla \xi). \tag{13} \]
The analysis presented in [2] gives us the following error estimate in terms
of the $L^2$-norm $\|\cdot\|$ and Sobolev norms $\|\cdot\|_{H^m(\Omega)}$.
Theorem 1. Let $p \in H^2(\Omega)$ and $v \in H^3(\Omega)^2$. Then, if $V_h$ consists of isoparametric
biquadratic functions, the solution $(p_h, v_h)$ of (10) for the bilinear forms
(12), (13) allows for the following error estimate:
\[
\|p - p_h\| + \|\nabla (v - v_h)\| \le C h^2 \big( \|p\|_{H^2(\Omega)} + \|v\|_{H^3(\Omega)} \big), \tag{14}
\]
where $h$ denotes the maximal mesh size.

A similar error estimate holds true for the classical Taylor-Hood element
with biquadratic velocities and bilinear pressure (which does not require any
stabilization), see [7]. One might therefore wonder what the additional pressure
degrees of freedom in our scheme produce. The answer to this question is
provided in the following.
We denote by $\underline{v}_h \in \mathbb{R}^{2n}$ the vector of coefficients of the function $v_h$ with
respect to the canonical finite element nodal basis $\{\psi_h^i\}$, i.e., $v_h = \sum_i (\underline{v}_h)_i \psi_h^i$.
Now, we split the pressure $p_h$ into a coarse grid part $\bar{p}_h$ and small-scale
fluctuations $p_h'$:
(15)
The coefficient vectors $\underline{\bar{p}}_h$ and $\underline{p}_h'$ are defined analogously by the nodal basis.
Then, the matrix representation of the linear system (11) reads

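\[
\begin{pmatrix} A & -B^{*} & -B'^{*} \\ B & S_1 & S_2^{*} \\ B' & S_2 & S_3 \end{pmatrix}
\begin{pmatrix} \underline{v}_h \\ \underline{\bar{p}}_h \\ \underline{p}_h' \end{pmatrix}
=
\begin{pmatrix} \underline{f}_h \\ 0 \\ 0 \end{pmatrix},
\]

where $\underline{f}_h$ has the obvious meaning. The matrix $A$ stands for the Laplacian, $B$
and $B'$ for the divergence, and $S$ for the stabilization.
Block elimination of the pressure component $\underline{p}_h'$ leads to:

where
\[ \tilde{A} = A + D, \qquad \tilde{S} = S_1 - S_2^{*} S_3^{-1} S_2, \qquad \tilde{B} = B - S_2^{*} S_3^{-1} B'. \]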
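
The elimination can be retraced in a few lines. The following derivation is only a sketch: it assumes that the system has the block rows $(A, -B^{*}, -B'^{*})$, $(B, S_1, S_2^{*})$, $(B', S_2, S_3)$ acting on the coefficient vectors of $v_h$, the coarse pressure $\bar{p}_h$ and the fluctuation $p_h'$, that $S_3$ is symmetric and invertible, and that $D = B'^{*} S_3^{-1} B'$, which is consistent with the identity $(D v_h, v_h) = (S_3^{-1} B' v_h, B' v_h)$ used in the proof of Remark 1. Underlined symbols denote coefficient vectors.

```latex
\begin{align*}
\underline{p}_h' &= -S_3^{-1}\bigl(B' \underline{v}_h + S_2\, \underline{\bar{p}}_h\bigr)
  && \text{(third block row)}, \\
\bigl(A + B'^{*} S_3^{-1} B'\bigr)\underline{v}_h
  - \bigl(B - S_2^{*} S_3^{-1} B'\bigr)^{*} \underline{\bar{p}}_h &= \underline{f}_h
  && \text{(first block row)}, \\
\bigl(B - S_2^{*} S_3^{-1} B'\bigr)\underline{v}_h
  + \bigl(S_1 - S_2^{*} S_3^{-1} S_2\bigr)\underline{\bar{p}}_h &= 0
  && \text{(second block row)}.
\end{align*}
```

Under these assumptions the reduced system has the same saddle-point structure as before, with $A$, $B$ and the stabilization block replaced by $A + D$, $B - S_2^{*} S_3^{-1} B'$ and $S_1 - S_2^{*} S_3^{-1} S_2$.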
We provide the following result as an interpretation of our findings.

Remark 1. The additional diagonal block D in the system matrix is suspected


to act as an additional stabilization term controlling the discrete divergence.
A Two-Level Stabilization Scheme for the Navier-Stokes Equations 127

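We denote the $\ell^2$ scalar product by $(\cdot,\cdot)$. On quasi-regular meshes where all
cells are parallelograms, we have a mesh-size independent constant $c > 0$ so
that
\[ \frac{1}{c} \sum_{K \in T_h} \|\operatorname{div} v_h\|_K^2 \le (D \underline{v}_h, \underline{v}_h) \le c \sum_{K \in T_h} \|\operatorname{div} v_h\|_K^2. \tag{16} \]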

Such a term is of common use in stabilized schemes, see [13].

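Proof. By definition of $D$ it holds:

\[ (D \underline{v}_h, \underline{v}_h) = (S_3^{-1} B' \underline{v}_h, B' \underline{v}_h). \]
We denote by $I_f$ the index set of fine grid nodes $N_i$, and by $\xi_i \in V_h$ the
standard nodal hat function of node $N_i$ with support $P_i$. It holds

\[ (B' \underline{v}_h, B' \underline{v}_h) = \|B' \underline{v}_h\|^2 = \sum_{i \in I_f} (\operatorname{div} v_h, \xi_i)^2. \]

Since $S_3$ scales like $h^2$, we get $(D \underline{v}_h, \underline{v}_h) \le c \sum_{K \in T_h} \|\operatorname{div} v_h\|_K^2$. The proof for
the opposite direction

\[ \frac{1}{c} \sum_{K \in T_h} h_K^2 \|\operatorname{div} v_h\|_K^2 \le (B' \underline{v}_h, B' \underline{v}_h) \]

will be given in [12].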

4 Analysis for a transport equation

We consider the following transport equation with a given constant transport
vector $\beta \in \mathbb{R}^2$ and given continuous data $f$ and $g$:

\[ u + (\beta \cdot \nabla) u = f \ \text{ in } \Omega, \qquad u = g \ \text{ on } \Gamma_-, \tag{17} \]

where $\Gamma_-$ is the inflow part of the boundary:

\[ \Gamma_- := \{ x \in \partial\Omega : n(x) \cdot \beta < 0 \}. \]

Denoting the $L^2$-scalar product on the boundary $\partial\Omega$ by $\langle \cdot, \cdot \rangle$, we define the
following bilinear form and linear functional:

\[ a(u, \phi) := (u, \phi) + ((\beta \cdot \nabla) u, \phi) - \langle (\beta \cdot n)_- u, \phi \rangle, \]
\[ l(\phi) := (f, \phi) - \langle (\beta \cdot n)_- \tilde{g}, \phi \rangle. \]
Here we have used the notation $x_- := \min(x, 0)$, and $\tilde{g}$ is a prolongation of $g$ to
the whole boundary. Later on, we will use the notation $x_+ := \max(x, 0)$. Then,
the continuous solution $u$ of equation (17) satisfies the variational equation:

\[ a(u, \phi) = l(\phi) \quad \forall \phi \in V, \tag{18} \]

where, for instance, $V = H^1(\Omega)$. The discretization of (17) to be considered
here is based on the stabilized version of (18) using the space of continuous
piecewise bilinear finite elements $V_h$ as before:

(19)
The stabilization term is defined similarly to one part of the stabilization used for the
Navier-Stokes equations (8):

From the following stability result we obtain existence and uniqueness of the
discrete solution:
Lemma 1. We have

(20)
with

(21)

Proof. Follows from integration by parts.

In the following we give an error estimate. The proof is very similar to the
classical one in [11] for the same equation supplied with SUPG stabilization.
However, notice that in contrast to the proof for SUPG, we only have control
over the streamline derivative of the fluctuations $\pi_h[(\beta \cdot \nabla) u_h]$ in (21).

Theorem 2. Let $u$ be the continuous solution to (17) satisfying $u \in H^2(\Omega)$
and $u_h$ the discrete solution of (19). Then we have the following estimate:

(22)
The estimate is similar to the standard estimate for SUPG or the discontinuous
Galerkin method [11]. With respect to the interpolation error we lose a power
of 1/2.
Proof. By $j_h : V \to V_h$ we denote the modified Scott-Zhang interpolation
operator introduced in [2] which has the following orthogonality property:

(23)
and allows for optimal interpolation in $L^2(\Omega)$ and $H^1(\Omega)$. That is, there exists
a constant $C$ such that

\[ \|u - j_h u\| \le C h^2 \|u\|_{H^2(\Omega)}, \]
\[ \|\nabla(u - j_h u)\| \le C h \|u\|_{H^2(\Omega)}. \]

We split the error as $u - u_h = \eta - \xi$, $\xi := u_h - j_h u$, $\eta := u - j_h u$. Due to this
interpolation result, it is sufficient to show

(24)
with (another) constant $C$ for proving the assertion. We have:

\begin{align*}
|||\xi|||^2 &= a(u_h, \xi) + s(u_h, \xi) - a(j_h u, \xi) - s(j_h u, \xi) \\
&= l(\xi) - a(j_h u, \xi) - s(j_h u, \xi) \\
&= a(u, \xi) - a(j_h u, \xi) - s(j_h u, \xi) \\
&= a(\eta, \xi) - s(j_h u, \xi) \\
&= (\eta, \xi) + ((\beta \cdot \nabla)\eta, \xi) - \langle (\beta \cdot n)_- \eta, \xi \rangle - s(j_h u, \xi).
\end{align*}

The only critical terms are $((\beta \cdot \nabla)\eta, \xi)$ and $s(j_h u, \xi)$. We use partial integration
to obtain

\begin{align*}
((\beta \cdot \nabla)\eta, \xi) &= -(\eta, (\beta \cdot \nabla)\xi) + \langle (\beta \cdot n)\eta, \xi \rangle \\
&= -(\eta, \pi_h (\beta \cdot \nabla)\xi) + \langle (\beta \cdot n)\eta, \xi \rangle.
\end{align*}
In the last line, we have used the orthogonality property (23) of the interpolation
operator $j_h$. Furthermore, the stabilization term can be bounded as
follows:

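\begin{align*}
s(j_h u, \xi) &\le h^{1/2} \Bigl( \sum_{K \in T_{2h}} \|\pi_h (\beta \cdot \nabla) j_h u\|_K^2 \Bigr)^{1/2} |||\xi||| \\
&\le C h^{3/2} \|u\|_{H^2(\Omega)} |||\xi|||.
\end{align*}
Here, the last line is obtained by stability of the $L^2$-projection $P_w$ and the
interpolation property of $j_h$:
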
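Now we get:

\begin{align*}
|||\xi|||^2 &= (\eta, \xi) - (\eta, \pi_h[(\beta \cdot \nabla)\xi]) + \langle (\beta \cdot n)_+ \eta, \xi \rangle - s(j_h u, \xi) \\
&\le \|\eta\| \|\xi\| + \|\sqrt{\delta^{-1}}\, \eta\| \, \|\sqrt{\delta}\, \pi_h[(\beta \cdot \nabla)\xi]\| \\
&\quad + \|\sqrt{(\beta \cdot n)_+}\, \eta\|_{\partial\Omega} \|\sqrt{(\beta \cdot n)_+}\, \xi\|_{\partial\Omega} + C h^{3/2} \|u\|_{H^2(\Omega)} |||\xi||| \\
&\le \Bigl( \|\eta\| + \|\sqrt{\delta^{-1}}\, \eta\| + \|\sqrt{(\beta \cdot n)_+}\, \eta\|_{\partial\Omega} + C h^{3/2} \|u\|_{H^2(\Omega)} \Bigr) |||\xi|||,
\end{align*}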

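and hence by the trace theorem:

\begin{align*}
|||\xi||| &\le \|\eta\| + \|\sqrt{\delta^{-1}}\, \eta\| + \|\sqrt{(\beta \cdot n)_+}\, \eta\|_{\partial\Omega} + C h^{3/2} \|u\|_{H^2(\Omega)} \\
&\le C h^{-1/2} \|\eta\| + C h^{3/2} \|u\|_{H^2(\Omega)} \\
&\le C h^{3/2} \|u\|_{H^2(\Omega)},
\end{align*}

which shows (24) since $\|\xi\| \le |||\xi|||$.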

References

1. R. Becker. Adaptive Finite Elements for optimal control problems. Habilitationsschrift,
Institut für Angewandte Mathematik, Universität Heidelberg, 2001.
2. R. Becker and M. Braack. A finite element pressure gradient stabilization for
the Stokes equations based on local projections. Calcolo, 38(4):173-199, 2001.
3. M. Braack. An Adaptive Finite Element Method for Reactive Flow Problems.
PhD Dissertation, Universität Heidelberg, 1998.
4. R. Codina and J. Blasco. A finite element formulation for the Stokes problem
allowing equal velocity-pressure interpolation. Comput. Methods Appl. Mech.
Engrg., 143:373-391, 1997.
5. E. Burman and P. Hansbo. Edge stabilization for Galerkin approximations of the
generalized Stokes' problem. Submitted to M2AN, 2004.
6. L.P. Franca and S.L. Frey. Stabilized finite element methods: II. The incompressible
Navier-Stokes equations. Comput. Methods Appl. Mech. Engrg., 99:209-233,
1992.
7. V. Girault and P.-A. Raviart. Finite Elements for the Navier Stokes Equations.
Springer, Berlin, 1986.
8. J.-L. Guermond. Stabilization of Galerkin approximations of transport equations
by subgrid modeling. Model. Math. Anal. Numer., 33(6):1293-1316, 1999.
9. P. Hansbo and A. Szepessy. A velocity-pressure streamline diffusion finite element
method for the incompressible Navier-Stokes equations. Comput. Methods
Appl. Mech. Engrg., 84:175-192, 1990.
10. T.J.R. Hughes, L.P. Franca, and M. Balestra. A new finite element formulation
for computational fluid dynamics: V. Circumventing the Babuska-Brezzi condition:
A stable Petrov-Galerkin formulation for the Stokes problem accommodating
equal order interpolation. Comput. Methods Appl. Mech. Engrg., 59:89-99, 1986.
11. C. Johnson. Numerical Solution of Partial Differential Equations by the Finite
Element Method. Cambridge University Press, Cambridge, UK, 1987.
12. R. Becker and M. Braack. A two-level finite element stabilization for Navier-Stokes.
In preparation, 2004.
13. T. Gelhard, G. Lube, and M.A. Olshanskii. Stabilized finite element schemes with
LBB-stable elements for incompressible flows. Submitted, 2003.
14. L. Tobiska and R. Verfürth. Analysis of a streamline diffusion finite element
method for the Stokes and Navier-Stokes equations. SIAM J. Numer. Anal.,
33(1):107-127, 1996.
A Posteriori Error Estimates for Parameter
Identification

Roland Becker1 and Boris Vexler2

1 Laboratoire de Mathématiques Appliquées, Université de Pau et des Pays de l'Adour,
BP 1155, 64013 Pau Cedex, France
roland. [email protected]
2 Institut für Angewandte Mathematik, Universität Heidelberg, Im Neuenheimer
Feld 294,
69120 Heidelberg, Germany
boris. [email protected]

Summary. In this paper we present an a posteriori error estimator for parameter


identification problems governed by partial differential equations. This estimator
aims to control the error in parameters due to the discretization by finite elements.
It is used in an adaptive mesh refinement algorithm generating a sequence of locally
refined meshes for efficient computation of the parameters. Comparison with some
heuristic mesh refinement algorithms is done for a simple example inverse problem.

1 Introduction
We consider parameter identification problems involving a finite number of
unknown parameters in the following form: The state variable $u$ in an appropriate
Hilbert space $V$ is determined by a partial differential equation (state
equation) in weak form:
\[ a(u, q)(\phi) = f(\phi) \quad \forall \phi \in V, \tag{1} \]
where $q \in Q = \mathbb{R}^{n_p}$ denotes the unknown parameters. The form $a$ is defined
on the Hilbert space $V \times Q \times V$ and the linear functional $f \in V'$ represents
the right-hand side of the state equation, where $V'$ denotes the dual space of
$V$. Further, we are given an observation operator $C : V \to Z$, which maps
the state variable $u$ to the space of measurements $Z = \mathbb{R}^{n_m}$, where we assume
$n_m \ge n_p$. We denote by $\langle \cdot, \cdot \rangle_Z$ the scalar product of $Z$ and by $\|\cdot\|_Z$ the
corresponding norm. Similar notations are used for the scalar product and norm
in the space $Q$.

The values of the parameters are estimated from a given set of measurements
$\bar{C} \in Z$ using a least squares approach such that we obtain a constrained
optimization problem with the cost functional $J : V \to \mathbb{R}$:
\[ \text{Minimize } J(u) := \tfrac{1}{2} \|C(u) - \bar{C}\|_Z^2 \tag{2} \]

under the constraint (1). Here, the cost functional is the squared norm of the
residual $R_{LS}$ defined by
\[ R_{LS}(u) := \bar{C} - C(u). \tag{3} \]
The state equation is discretized by conforming finite elements on a regular
mesh $T_h$, resulting in a finite element space $V_h \subset V$, see e.g. Ciarlet [8] for the
standard construction. In order to ease mesh refinement, the cells are allowed
to have nodes which lie on midpoints of faces of neighboring cells, but at most
one such hanging node is permitted for each face, see Carey & Oden [7] for
implementation details.
The discrete state $u_h \in V_h$ and parameter $q_h \in Q$ are determined by:

\[ \text{Minimize } J(u_h) \tag{4} \]


under the constraint

(5)
Due to the finite dimension of $Q$, we suppose the parameter $q_h$ in (4) to be
sought in the space $Q$.

The paper is organized as follows: In the next section we describe an optimization
algorithm for solving the problem (4, 5) on a fixed mesh $T_h$. Section 3
is devoted to a posteriori error estimation. Here, we present an a posteriori
error estimator for the error in the parameters $E(q) - E(q_h)$ for a given error
functional $E : Q \to \mathbb{R}$. This error estimator is developed in [5]. It is based on
the optimal control approach to a posteriori error estimation from Becker &
Rannacher [4]. However, a direct application of the techniques described in [4]
leads to an estimator which controls the error in the cost functional $J$ (2). In
general, such an estimator does not provide useful error bounds for the pa-
rameters, in contrast to the approach presented here. In Section 4 we discuss a
numerical example illustrating the usage of the error estimator. The presented
approach is compared with some heuristic methods with respect to the quality
of generated meshes. Conclusions are given in the last section.

2 Optimization algorithm
In this section we discuss an optimization algorithm for solving the problem
(4, 5) on a fixed mesh $T_h$.
Under the assumption of regularity of the partial derivative $a'_u$, the implicit
function theorem in Banach spaces implies the existence of an open set $Q_0 \subset Q$,
containing the optimal parameter $q$, and a continuously differentiable solution
operator $S : Q_0 \to V$, $q \mapsto S(q)$, so that (1) is fulfilled for $u = S(q)$. This
allows us to reformulate the problem (1, 2) as an unconstrained optimization
problem:

\[ \text{Minimize } j(q) := \tfrac{1}{2} \|c(q) - \bar{C}\|_Z^2, \quad q \in Q, \tag{6} \]

where the reduced observation operator $c$ is given by $c(q) = C(S(q))$. Denoting
by $G = c'(q)$ the Jacobian matrix of the reduced observation operator $c$, the
first-order necessary condition $j'(q) = 0$ for (6) reads:

\[ G^{*}(c(q) - \bar{C}) = 0, \tag{7} \]

where $G^{*}$ denotes the transpose of $G$. The Jacobian matrix $G$ can be obtained
using tangent solutions $w_j \in V$ determined by:

Then, one easily shows that the entries of the matrix $G$ are given by
$G_{ij} = C_i'(u)(w_j)$.
Similarly to the continuous case, we introduce a discrete solution operator
$S_h : Q \to V_h$ for the discretized state equation (5) and obtain an unconstrained
formulation of the discretized problem (4, 5) by:

(9)

where $c_h$ is the discrete reduced observation operator defined by $c_h(q_h) =
C(S_h(q_h))$. The corresponding Jacobian matrix $G_h$ can be computed similarly
to the continuous case using the discrete tangent solutions $w_{j,h} \in V_h$ determined
by the discrete version of (8).
The problem (9) is solved iteratively starting with an initial guess $q_h^0$ and
using the recursive setting $q_h^{k+1} = q_h^k + \delta q_h$. The update $\delta q_h$ is obtained using
a symmetric approximation $H_k$ of the Hessian $\nabla^2 j_h(q_h^k)$ as the solution of the
system of linear equations:

(10)

where $G_h = c_h'(q_h^k)$. The most widely used choice of the matrix $H_k = G_h^{*} G_h$
leads to the Gauss-Newton algorithm, see e.g. Nocedal & Wright [10].
For one step of the Gauss-Newton algorithm the state equation and $n_p$
tangent problems (8) have to be solved, which originate from the same linear
operator but with different right-hand sides. Due to the small dimension $n_p$ of
the parameter space $Q$, the solution of (10) is not a computational bottleneck.
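
The Gauss-Newton iteration described above can be illustrated on a small surrogate problem. The following sketch is not the authors' code. It assumes a one-dimensional analogue of a diffusion-reaction state equation, $-q u'' + s u = 2$ on $(0,1)$ with homogeneous Dirichlet data, whose closed-form solution replaces the finite element solve. The measurement points and the function names are hypothetical, and the Jacobian is approximated by central differences instead of tangent solves.

```python
import math

S = 200.0                           # assumed reaction coefficient
POINTS = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical measurement points


def state(x, q):
    """Closed-form solution of -q u'' + S u = 2 on (0, 1) with u(0) = u(1) = 0."""
    k = math.sqrt(S / q)
    return (2.0 / S) * (1.0 - math.cosh(k * (x - 0.5)) / math.cosh(0.5 * k))


def observe(q):
    """Reduced observation operator c(q): state values at the measurement points."""
    return [state(x, q) for x in POINTS]


def jacobian(q, rel_step=1e-6):
    """Column G = c'(q), approximated by central differences (stand-in for tangent solves)."""
    dq = rel_step * q
    plus, minus = observe(q + dq), observe(q - dq)
    return [(a - b) / (2.0 * dq) for a, b in zip(plus, minus)]


def gauss_newton(q0, data, tol=1e-12, max_iter=50):
    """Iterate q <- q + dq, where (G^T G) dq = G^T (data - c(q)); scalar parameter."""
    q = q0
    for _ in range(max_iter):
        residual = [d - c for d, c in zip(data, observe(q))]
        g = jacobian(q)
        gtg = sum(gi * gi for gi in g)
        step = sum(gi * ri for gi, ri in zip(g, residual)) / gtg
        q += step
        if abs(step) < tol:
            break
    return q


measurements = observe(1.0)  # synthetic data generated with the parameter q = 1
q_estimate = gauss_newton(1.5, measurements)
```

With a single parameter the linear system for the update reduces to one division, which reflects the remark that the cost of the update is negligible compared with the state and tangent solves.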

3 A posteriori error estimation

In this section we present an error estimator for the error with respect to a
given error functional $E : Q \to \mathbb{R}$. The precise error representation is given in
the following theorem. Here, we use an interpolation operator $i_h : V \to V_h$,
see e.g. Clement [9].

Theorem 1. Let $(u, q)$ be a solution of the parameter identification problem
(1, 2) and $(u_h, q_h)$ the corresponding discrete solution of the problem (4, 5).
Then, for a given error functional $E$, we have that:

\[ E(q) - E(q_h) = \tfrac{1}{2} \rho(u_h)(y - i_h y) + \tfrac{1}{2} \rho^{*}(u_h, y_h)(u - i_h u) + \mathcal{P} + \mathcal{R}, \tag{11} \]

where $y \in V$ is the solution of the adjoint problem:
\[ a'_u(u, q)(\phi, y) = -\langle G (G^{*} G)^{-1} \nabla E(q), C'(u)(\phi) \rangle_Z \quad \forall \phi \in V \tag{12} \]
and $\rho(\cdot)(\cdot)$ and $\rho^{*}(\cdot)(\cdot)$ are the residuals of the state and adjoint equation defined by:
\begin{align*}
\rho(u_h)(\phi) &:= f(\phi) - a(u_h, q_h)(\phi), \\
\rho^{*}(u_h, y_h)(\phi) &:= -\langle G_h (G_h^{*} G_h)^{-1} \nabla E(q_h), C'(u_h)(\phi) \rangle_Z - a'_u(u_h, q_h)(\phi, y_h). \tag{13}
\end{align*}
The remainder term $\mathcal{R}$ (due to linearization) is quadratic in the error and the
additional remainder term $\mathcal{P}$ admits the estimate:

where $e_u := u - u_h$, $e_q := q - q_h$ and $\delta_h \phi := \phi - i_h \phi$ is an interpolation error
operator. The mean tangent solution $v \in V$ is given by
\[ v = -\sum_{j=1}^{n_p} \bigl( (G^{*} G)^{-1} \nabla E(q) \bigr)_j w_j \tag{15} \]

and the normalized adjoint solution $z \in V$ is determined by:

\[ a'_u(u, q)(\phi, z) = \Bigl\langle -\frac{R_{LS}(u)}{\|R_{LS}(u)\|}, C'(u)(\phi) \Bigr\rangle_Z \quad \forall \phi \in V, \tag{16} \]

if the least squares residual $R_{LS}(u)$ does not vanish; otherwise we set $z = 0$.
The constant $C$ does not depend on the mesh parameter $h$ nor on the measurements
$\bar{C}$.
Proof. For the proof we refer to [5].
For evaluation of the error estimator, denoted by $\eta_h$, the local interpolation
errors $y - i_h y$ and $u - i_h u$ have to be approximated. In our numerical examples,
we use interpolation of the computed bilinear finite element solutions $y_h$ and
$u_h$ on the space of biquadratic finite elements on patches of cells. The main
computational cost for the a posteriori error estimator described above is the
solution of one auxiliary equation (12). This is cheap, even in comparison with
only one Gauss-Newton step, which includes the solution of the (nonlinear) state
equation and of the several (linear) tangent equations.

In order to illustrate the typical use of the error estimator $\eta_h$, we sketch
a generic adaptive mesh refinement algorithm. Such an algorithm generates
a sequence of locally refined meshes and corresponding finite element spaces
until the estimated error with respect to E is below a given tolerance TOL.

Adaptive Mesh Refinement Algorithm

1. Choose an initial mesh $T_{h_0}$ and set $k = 0$
2. Construct the finite element space $V_{h_k}$
3. Compute $u_{h_k} \in V_{h_k}$, $q_{h_k} \in Q$ solving (4, 5)
4. Evaluate the a posteriori error estimator $\eta_{h_k}$
5. If $\eta_{h_k} \le$ TOL quit
6. Refine $T_{h_k} \to T_{h_{k+1}}$ using information from $\eta_{h_k}$
7. Increment k and go to 2.
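
The control flow of such a loop can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the estimator of this paper: the "solve" step is nodal interpolation of a known function on a 1D mesh, the indicator is an assumed midpoint-deviation measure, and cells are marked when their indicator reaches half of the largest one. All function names are hypothetical.

```python
import math


def indicators(nodes, f):
    """Cell-wise indicator: cell width times midpoint deviation from the linear interpolant."""
    result = []
    for a, b in zip(nodes[:-1], nodes[1:]):
        mid = 0.5 * (a + b)
        result.append((b - a) * abs(f(mid) - 0.5 * (f(a) + f(b))))
    return result


def refine(nodes, eta_cells, fraction=0.5):
    """Bisect every cell whose indicator is at least `fraction` times the largest one."""
    threshold = fraction * max(eta_cells)
    new_nodes = [nodes[0]]
    for (a, b), eta in zip(zip(nodes[:-1], nodes[1:]), eta_cells):
        if eta >= threshold:
            new_nodes.append(0.5 * (a + b))
        new_nodes.append(b)
    return new_nodes


def adapt(f, tol, n_initial=4, max_steps=100):
    """Estimate-mark-refine loop on [0, 1]; stops once the summed indicator is below tol."""
    nodes = [i / n_initial for i in range(n_initial + 1)]  # step 1: initial mesh
    for _ in range(max_steps):
        eta_cells = indicators(nodes, f)                   # steps 2-4: "solve" and estimate
        if sum(eta_cells) <= tol:                          # step 5: stopping test
            break
        nodes = refine(nodes, eta_cells)                   # step 6: local refinement
    return nodes, sum(indicators(nodes, f))


# The square root has a singular derivative at 0, so refinement concentrates there.
mesh, eta_final = adapt(math.sqrt, tol=1e-4)
widths = [b - a for a, b in zip(mesh[:-1], mesh[1:])]
```

The resulting mesh is strongly graded towards the singularity, which is the qualitative behaviour one expects from estimator-driven local refinement.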

Remark 1. In step 3, the parameter identification problem is solved on a fixed


mesh. As initial data, we use the values from the computation on the previous
mesh. This allows us to avoid unnecessary iterations of the optimization loop
on fine meshes.

4 Numerical result

In this section we compare our general approach to mesh refinement for param-
eter identification problems with some heuristic methods. We consider three
types of heuristic approaches for mesh refinement: a strategy based only on
the information obtained from the computed state variable, a strategy based
only on the a priori knowledge of the structure of the observation operator and
a strategy, which combines both types of information.

We consider the following diffusion-reaction equation with unknown coefficient
$q$ in the unit square $\Omega = (0,1)^2$:

\[ -q \Delta u + s u = 2 \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega, \tag{17} \]

where $s$ is chosen as $s = 200$. The parameter $q$ is estimated using measurements
given by the values of the state variable at nine different points $\xi_i$, see Figure 1.
The exact value of the parameter is $q = 1$.
The components of the corresponding observation operator $C$ have the
following form:
(18)
and the parameter identification problem is formulated as follows: For $(u, q) \in
V \times Q$ with $V = H_0^1(\Omega)$ and $Q = \mathbb{R}$
\[ \text{Minimize } \tfrac{1}{2} \sum_{i=1}^{9} (u(\xi_i) - \bar{C}_i)^2 \tag{19} \]

Fig. 1. The computational domain with measurement points marked by circles

under the constraint (17), where $\bar{C}_i$ denote the components of the measurement
vector $\bar{C} \in Z = \mathbb{R}^9$ and are given by the values of the state variable $u$ for the
exact parameter $q$, i.e. $\bar{C}_i = u(\xi_i)$.
First, we compare the quality of meshes generated by our a posteriori error
estimator for this problem with a typical strategy based on a posteriori information
obtained from the state variable, i.e. with mesh refinement guided by
one of the well-known "energy"-type error estimators for the uncontrolled equation,
see e.g. Bank & Weiser [2] and Babuska & Miller [1]. These estimators aim
to control the error $u - u_h$ in the $H^1$-norm, but they do not take the structure
of the parameter identification problem into account. As seen from Figure 2, adaptive
refinement based on the "energy" estimator leads to a similar reduction of the
error as global refinement. However, the strategy based on our error estimator
leads to an obvious saving in the number of unknowns necessary to achieve a
prescribed accuracy level.


Fig. 2. Errors in q for different refinement strategies vs. number of nodes (global
refinement, "energy"-based refinement and refinement resulting from our a posteriori
error estimator)

Next, we compare our strategy for mesh refinement with the following
heuristic approach: In each iteration of the mesh refinement we refine the
cells which lie close to one of the measurement points, i.e. in the set

\[ \bigcup_i \{ x : \|x - \xi_i\| \le r \}, \tag{20} \]

where $r \in \mathbb{R}_+$ is a given number. In contrast to our approach, this strategy
is unable to weight the relative importance of the measurement points. The
corresponding comparison is done in Figure 3 for two choices of $r$ ($r = 0.04$
and $r = 0.1$).
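
The marking rule behind (20) is easy to state in code. The sketch below is only an illustration under stated assumptions: it uses a uniform $n \times n$ mesh of the unit square, decides membership by the cell center (the text does not specify how "close" is tested), and uses two made-up measurement points, since the actual nine points are only shown in Figure 1. The function names are hypothetical.

```python
import math


def near_measurement_point(center, points, r):
    """True if `center` lies in the union of closed discs of radius r around the points."""
    cx, cy = center
    return any(math.hypot(cx - px, cy - py) <= r for px, py in points)


def mark_cells(n, points, r):
    """Indices (i, j) of cells of a uniform n x n mesh whose centers lie in the set (20)."""
    h = 1.0 / n
    marked = []
    for i in range(n):
        for j in range(n):
            center = ((i + 0.5) * h, (j + 0.5) * h)
            if near_measurement_point(center, points, r):
                marked.append((i, j))
    return marked


# Hypothetical measurement points, chosen for illustration only.
points = [(0.25, 0.25), (0.75, 0.5)]
marked_small = mark_cells(16, points, r=0.04)
marked_large = mark_cells(16, points, r=0.1)
```

Every point is treated alike by this rule, which is exactly the limitation noted above: the radius $r$ is the only tuning knob, and no point is weighted according to its influence on the parameter.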


Fig. 3. Errors in q for different refinement strategies vs. number of nodes (global
refinement, refinement across measurement points and refinement resulting from our
a posteriori error estimator)

After several steps one does not observe any error reduction despite in-
creasing the number of nodes for both choices of r in the described strategy,
as could be expected. Typical meshes resulting from application of our a pos-
teriori error estimator, "energy" estimator and the last strategy are shown in
Figure 4.

We also compare our mesh refinement procedure with a combination of the
two preceding heuristic methods. In this strategy both the cells marked by the "energy"
estimator and the cells around the measurement points (20) are refined. The
corresponding comparison with our mesh refinement procedure is made in
Figure 5 for two choices of $r$ ($r = 0.04$ and $r = 0.1$).
The typical meshes resulting from application of this strategy for two
choices of r (r = 0.04 and r = 0.1) are shown in Figure 6.

Fig. 4. Typical meshes produced by our a posteriori error estimator (left), energy
error estimator (right) and the refinement across the measurement points

5 Conclusions
We presented an a posteriori error estimator for finite element discretization
of parameter identification problems. This error estimator is cheap to evaluate


Fig. 5. Errors in q for different refinement strategies vs. number of nodes (global
refinement, refinement produced by combining the refinement across measurement points
and "energy"-based refinement, and refinement resulting from our a posteriori error
estimator)

Fig. 6. Typical meshes produced by combining the refinement according to "energy"


error estimator and the refinement across the measurement points for r = 0.1 (left)
and r = 0.04 (right)

and assesses the error we are interested in, i.e. the error in the parameters. We compared
our approach with some heuristic methods with respect to the quality
of the generated meshes. The presented error estimator has been successfully applied
to parameter identification in CFD problems, see Becker & Vexler [6], and to
estimation of chemical models in multidimensional reactive flows, see Becker,
Vexler & Braack [3].

References
1. Babuska, I., Miller, A.D. (1987): A feedback finite element method with a posteriori
error estimation. Comput. Methods Appl. Mech. Engrg., 61:1-40
2. Bank, R.E., Weiser, A. (1985): Some a posteriori error estimators for elliptic
partial differential equations. Math. Comp., 44:283-301
3. Becker, R., Braack, M., Vexler, B. (2003): Numerical parameter estimation for
chemical models in multidimensional reactive flows. Combustion Theory and
Modelling, submitted
4. Becker, R., Rannacher, R. (2001): An optimal control approach to a posteriori
error estimation in finite element methods. In Acta Numerica 2001 (A. Iserles,
ed.), Cambridge University Press, Cambridge
5. Becker, R., Vexler, B. (2003): A posteriori error estimation for finite element
discretization of parameter identification problems. Numerische Mathematik,
published online
6. Becker, R., Vexler, B. (2003): Calibration of PDE Models and Sensitivity Analysis
with Adaptive Finite Elements: Application to CFD Problems. Journal of
Computational Physics, submitted
7. Carey, G.F., Oden, J.T. (1984): Finite Elements, Computational Aspects. Vol. III.
Prentice-Hall
8. Ciarlet, P.G. (1978): The Finite Element Method for Elliptic Problems. North-
Holland Publishing Company, Amsterdam
9. Clement, Ph. (1975): Approximation by finite element functions using local
regularization. Revue Franc. Automat. Inform. Rech. Operat. 9(R-2), 77-84
10. Nocedal, J., Wright, S.J. (1999): Numerical Optimization. Springer Series in
Operations Research, Springer New York
On a Phase-Field Model with Advection

Michal Beneš

Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering,


Czech Technical University in Prague, Trojanova 13, 120 00 Prague 2, Czech
Republic [email protected]

Summary. In this contribution, we present a phase-field model of advected mean-


curvature flow and of advected pattern formation in solidification. The model is
based on the approach presented in [3], where an extensive literature list on meth-
ods treating mean-curvature problems can be found. The model represents a step
towards simulation of solidification processes where the melt motion is important.
We give basic mathematical information concerning the weak solution of the model
equations, introduce a numerical scheme based on the Finite-Difference Method, and
show several numerical studies demonstrating basic qualitative effects of advection in
the given context.

Mean curvature flow with advection. The problem of mean curvature
flow of hypersurfaces (see [7, 6, 5]) is usually set as follows:

normal velocity = -mean curvature + forcing,

in the normal direction to a closed hypersurface $\Gamma$. In this work, we consider
the law modified by an imposed velocity field $V$, which means that the hypersurface
is advected by the vector field as well. Using the notations $n_\Gamma$ for
the Euclidean normal vector to $\Gamma$, $v_\Gamma$ for the normal velocity, $\kappa_\Gamma$ for the mean
curvature, and $F$ for the forcing term, we formulate the motion law for $\Gamma$ as
follows
\[ v_\Gamma - V \cdot n_\Gamma = -\kappa_\Gamma + F. \tag{1} \]
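
As a quick sanity check of (1), one may consider the planar case with $F = 0$ and a constant velocity field $V$, taking $\Gamma(t)$ to be a circle of radius $r(t)$ centred at $c(t)$, with $\kappa_\Gamma = 1/r$ for the outward normal. This is a standard textbook example added for illustration; the sign convention for $\kappa_\Gamma$ and the symbols $r$, $c$, $r_0$ are assumptions, not taken from the original text.

```latex
\begin{align*}
v_\Gamma &= \dot{r} + \dot{c} \cdot n_\Gamma, \qquad \dot{c} = V
  \;\Longrightarrow\; v_\Gamma - V \cdot n_\Gamma = \dot{r}, \\
\dot{r} &= -\frac{1}{r}
  \;\Longrightarrow\; r(t) = \sqrt{r_0^2 - 2t}, \qquad 0 \le t < \tfrac{1}{2} r_0^2 .
\end{align*}
```

In this special case the advection merely translates the circle with velocity $V$, while the curvature term shrinks it exactly as in the case without advection.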
Equation (1) has its origin in the modified Stefan problem with advection
describing the solidification of crystalline materials where the bulk mixture of
liquid and solid is carried by an imposed vector field $V$:

(2)

\[ \frac{\partial u}{\partial n_\Gamma}\Big|_s - \frac{\partial u}{\partial n_\Gamma}\Big|_l = L (v_\Gamma - V \cdot n_\Gamma) \quad \text{on } \Gamma(t), \]

\[ F(u) = \kappa_\Gamma + \alpha (v_\Gamma - V \cdot n_\Gamma) \quad \text{on } \Gamma(t), \]

where $u$ denotes the temperature field, $u^{*}$ the melting-point temperature,
$\Omega_s$, $\Omega_l$ the solid and liquid subdomains of $\Omega$, $L$ the latent heat per unit volume,
$\alpha$ a material parameter, $F(u)$ a coupling term ($\sim u - u^{*}$), and
$\frac{d_m}{dt} = \frac{\partial}{\partial t} + V \cdot \nabla$ the material derivative.

For details of physical context, we refer the reader to [8, 10]. Obviously,
the above given physical problem simplifies the problem of density driven flow
around the solidifying structure in a real material. In this general case, the
law of momentum conservation would be needed, and the boundary condi-
tions should also correspond to the conservation of quantities in question. Our
purpose is to study the problem (2) by a diffuse-interface approach which al-
lows us to design a suitable numerical algorithm and to perform qualitative
numerical studies.

Allen-Cahn equation with advection. The law (1) can be treated by
a particular method of level-set type (see [3]), which relies on solution properties
of a reaction-diffusion equation of Allen-Cahn type (see [9]). We refer the
reader to [3] for a sample of the literature on this topic. Introducing the
thickness $\xi > 0$ of the diffuse layer surrounding $\Gamma$ and the polynomial
$f_0(p) = a p (1-p)(p - \tfrac{1}{2})$ as the minus derivative of a double-well function $w_0$, we present
the phase-field approximation of the advected mean curvature flow
as follows

(3)

In Figure 1, we illustrate the expected fact that the $\tfrac{1}{2}$-level set of a solution to
(3) converges to the set $\Gamma(t)$ evolved by (1); see [2] for the no-advection case.
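
The relation between the cubic nonlinearity and a double-well function can be checked numerically. The sketch below assumes the particular antiderivative $w_0(p) = \tfrac{a}{4} p^2 (1-p)^2$, which is one admissible choice (any additive constant would do) and is not stated in the text. The helper names are hypothetical.

```python
def w0(p, a=1.0):
    """Assumed double-well function with minima at p = 0 and p = 1."""
    return 0.25 * a * p * p * (1.0 - p) ** 2


def f0(p, a=1.0):
    """Cubic nonlinearity f_0(p) = a p (1 - p)(p - 1/2)."""
    return a * p * (1.0 - p) * (p - 0.5)


def minus_derivative_w0(p, a=1.0, step=1e-6):
    """Central-difference approximation of -w_0'(p)."""
    return -(w0(p + step, a) - w0(p - step, a)) / (2.0 * step)


# Compare f_0 with -w_0' at a few sample points inside and outside [0, 1].
samples = [-0.5, 0.0, 0.2, 0.5, 0.8, 1.0, 1.5]
max_gap = max(abs(f0(p) - minus_derivative_w0(p)) for p in samples)
```

The zeros of `f0` at 0, 1/2 and 1 correspond to the two wells and the barrier of `w0`, which is why the 1/2-level set is the natural marker of the interface.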

Fig. 1. Schematic relationship to the original motion law

Phase-field equations with advection. The above indicated approach can
be used to approximate (or regularize) the physical problem (2). In this case,
we obtain a complete system of phase-field equations with advection reading
as

\begin{align*}
\frac{d_m u}{dt} &= \Delta u + L \chi'(p) \frac{d_m p}{dt}, \tag{4} \\
\alpha \xi^2 \frac{d_m p}{dt} &= \xi^2 \Delta p + f_0(p) + F(u) \xi^2 |\nabla p|,
\end{align*}


with the initial conditions $u|_{t=0} = u_{ini}$, $p|_{t=0} = p_{ini}$, and with the homogeneous
Dirichlet boundary conditions (set for the sake of simplicity). Additionally,
we assume that $F(u)$ is a bounded Lipschitz-continuous function.
The enthalpy of the system $\mathcal{H}(u) = u - L \chi(p)$ is expressed by means of
a focusing function $\chi$, which is monotone with bounded, Lipschitz-continuous
derivative:

\[ \chi(0) = 0, \quad \chi(0.5) = 0.5, \quad \chi(1) = 1, \quad \operatorname{supp}(\chi') \subset (0,1). \]

In the following theorem, we set $\chi(p) = p$ for simplicity, although the computations
are performed for a nontrivial $\chi$. The general case is investigated in
[2]. Considering a bounded domain $\Omega \subset \mathbb{R}^2$ with $C^2$ boundary, we can state
the following basic property of the system (4):
Theorem 1. If $u_{ini}, p_{ini} \in H_0^1(\Omega)$, $V \in L^\infty(\Omega; \mathbb{R}^2)$, and $\xi$ remains fixed, then
there is a unique solution $u, p \in L^2(0, T; H_0^1(\Omega))$ of the weak problem
$\forall v, q \in \mathcal{D}(\Omega)$, a.e. in $(0, T)$:
\begin{align*}
\frac{d}{dt}(u - Lp, v) + (V \cdot \nabla(u - Lp), v) + (\nabla u, \nabla v) &= 0, \tag{5} \\
\alpha \xi^2 \Bigl( \frac{d}{dt}(p, q) + (V \cdot \nabla p, q) \Bigr) + \xi^2 (\nabla p, \nabla q) &= (f_0(p), q) + \xi^2 (F(u) |\nabla p|, q), \\
u(0) = u_{ini}, \quad p(0) &= p_{ini},
\end{align*}
for which

Proof. The proof is an extension of the result stated in [2]. We concentrate on
issues closely related to the advection terms in both equations. By means of a
total set in $L^2(\Omega)$ (e.g., consisting of eigenvectors of $-\Delta$) denoted as $\{v_i\}_{i \in \mathbb{N}}$,
we define a finite-dimensional subspace

\[ V_m = \operatorname{span}\{v_i\}_{i \in \mathbb{N}_m}, \quad \text{where } \mathbb{N}_m = \{1, \dots, m\}, \]

and consider the projector $P_m : L^2(\Omega) \to V_m$. By means of the Faedo-Galerkin
method, we derive a semi-discrete scheme $\forall v, q \in V_m$

\begin{align*}
(\partial_t(u^m - Lp^m), v) + (V \cdot \nabla(u^m - Lp^m), v) + (\nabla u^m, \nabla v) &= 0 \quad \text{a.e. in } (0, T), \\
\alpha \xi^2 \bigl( (\partial_t p^m, q) + (V \cdot \nabla p^m, q) \bigr) + \xi^2 (\nabla p^m, \nabla q) &= (f_0(p^m), q) + \xi^2 (F(u^m) |\nabla p^m|, q), \tag{6} \\
u^m(0) = P_m u_{ini}, \quad p^m(0) &= P_m p_{ini}.
\end{align*}
The approximate solution is given by basis functions of $V_m$

In fact, the semi-discrete scheme is a system of ODEs for unknown functions
of time $\gamma_i$, $\beta_i$. The next step is the derivation of energy estimates. We multiply the
equations by $\dot{\gamma}_i(t)$, $\dot{\beta}_i(t)$, respectively, and sum over $i \in \mathbb{N}_m$:
\begin{align*}
\|\partial_t u^m\|^2 + \frac{1}{2} \frac{d}{dt} \|\nabla u^m\|^2 + (V \cdot \nabla(u^m - Lp^m), \partial_t u^m) &= L (\partial_t p^m, \partial_t u^m), \\
\alpha \xi^2 \|\partial_t p^m\|^2 + \frac{\xi^2}{2} \frac{d}{dt} \|\nabla p^m\|^2 + \alpha \xi^2 (V \cdot \nabla p^m, \partial_t p^m)
&= (f_0(p^m), \partial_t p^m) + \xi^2 (F(u^m) |\nabla p^m|, \partial_t p^m).
\end{align*}

Consequently, we obtain the inequalities

where $w_0' = -f_0$, $|F(u)| \le C_F$, $V = \|V\|_\infty$. They allow us to use the compactness
method in the same manner as it is presented in [2]. Namely, the
theorem assumptions together with the above estimates processed by the
Gronwall lemma give that, independently of $m$, $\nabla u^m$, $\nabla p^m$ are bounded in
$L^\infty(0, T; L^2(\Omega))$, and $p^m$ are bounded in $L^\infty(0, T; L^s(\Omega))$ for each finite time
$T > 0$, and for any $s \ge 1$. Repeated integration says that $\frac{\partial u^m}{\partial t}$, $\frac{\partial p^m}{\partial t}$ are
bounded in $L^2(0, T; L^2(\Omega))$ for each finite time $T > 0$, independently of $m$.
Therefore, we are able to pass to a weak limit $u^{m'} \rightharpoonup u$ in $L^2(0, T; H_0^1(\Omega))$,
$p^{m'} \rightharpoonup p$ in $L^2(0, T; H_0^1(\Omega) \cap L^4(\Omega))$ via a subsequence $m'$, and additionally,
thanks to the compact-imbedding theorem with the assumptions

$\{u^m\}_{m=1}^{\infty}$ bounded in $L^2(0, T; H_0^1(\Omega))$, $\{\frac{\partial u^m}{\partial t}\}_{m=1}^{\infty}$ bounded in $L^2(0, T; L^2(\Omega))$,

$\{p^m\}_{m=1}^{\infty}$ bounded in $L^4(0, T; H_0^1(\Omega) \cap L^4(\Omega))$,

$\{\frac{\partial p^m}{\partial t}\}_{m=1}^{\infty}$ bounded in $L^2(0, T; L^2(\Omega))$,

to the strong limits $u$ in $L^2(0, T; L^2(\Omega))$, $p$ in $L^4(0, T; L^4(\Omega))$.

The passage to the limit in the semi-discrete scheme (6) can be accomplished
due to the following facts:
1. $\nabla p^{m'}$ converges strongly in $L^2(0, T; L^2(\Omega; \mathbb{R}^2))$ to $\nabla p$ (Lemma 3.4 of [2]),
$\nabla u^{m'}$ converges weakly in $L^2(0, T; L^2(\Omega; \mathbb{R}^2))$ to $\nabla u$,
2. $V \cdot \nabla(u^{m'} - Lp^{m'})$ converges weakly in $L^2(0, T; L^2(\Omega))$ to $V \cdot \nabla(u - Lp)$,
$V \cdot \nabla p^{m'}$ converges weakly in $L^2(0, T; L^2(\Omega))$ to $V \cdot \nabla p$,
3. $f_0(p^{m'})$ converges weakly in $L^{4/3}(0, T; L^{4/3}(\Omega))$ (polynomial nonlinearity and
Aubin lemma),
4. $F(u^{m'}) |\nabla p^{m'}|$ converges weakly to $F(u) |\nabla p|$ in $L^2(0, T; L^2(\Omega))$ (Lemma 3.5
of [2]),

5. $P_{m'}p_{ini}$, $P_{m'}u_{ini}$ converge strongly to $p_{ini}$, $u_{ini}$ in $L^2(\Omega)$,
6. $p^{m'}(0) = P_{m'}p_{ini}$, $u^{m'}(0) = P_{m'}u_{ini}$.

Additionally, we observe that the function $p$ belongs to $L^2(0,T;H_0^1(\Omega)\cap H^2(\Omega))$
(Lemma 3.3 of [2]). The weak solution satisfies the initial condition (again see
[2] for details).
Due to Lipschitz-continuity of $F$ with the Lipschitz constant denoted by
$L_F$, we prove uniqueness of the solution of (5). We consider two solutions
of the problem (5), denoted by $[u_1, p_1]$ and $[u_2, p_2]$. Subtracting corresponding
systems of equations and denoting $[u_{12}, p_{12}] = [u_1 - u_2, p_1 - p_2]$, multiplying
the first equation by $u_{12} - Lp_{12}$ and the second equation by $p_{12}$, we have

\[
\frac{1}{2}\frac{d}{dt}\|u_{12} - Lp_{12}\|^2 + \|\nabla(u_{12} - Lp_{12})\|^2
  + L(\nabla p_{12}, \nabla(u_{12} - Lp_{12}))
  + (\mathbf{V}\cdot\nabla(u_{12} - Lp_{12}), u_{12} - Lp_{12}) = 0 \quad \text{in } (0,T),
\]
\[
\frac{1}{2}\alpha\xi^2\frac{d}{dt}\|p_{12}\|^2 + \xi^2\|\nabla p_{12}\|^2
  + \alpha\xi^2(\mathbf{V}\cdot\nabla p_{12}, p_{12})
  = (f_0(p_1) - f_0(p_2), p_{12})
  + \xi^2(F(u_1)|\nabla p_1| - F(u_2)|\nabla p_2|, p_{12}) \quad \text{in } (0,T),
\]
\[
(u_{12} - Lp_{12})(0) = 0, \quad p_{12}(0) = 0.
\]

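Due to the form of $f_0$, we have that

Using the Schwarz inequality, we get

\[
\frac{d}{dt}\|u_{12} - Lp_{12}\|^2 \le V^2\|u_{12} - Lp_{12}\|^2 + L^2\|\nabla p_{12}\|^2,
\]
\[
\frac{1}{2}\alpha\xi^2\frac{d}{dt}\|p_{12}\|^2 + \xi^2\|\nabla p_{12}\|^2
  \le \frac{7a}{4}\|p_{12}\|^2
  + \xi^2 L_F\|u_{12}\|\,\|\nabla p_1\|_{L^4(\Omega)}\|p_{12}\|_{L^4(\Omega)}
  + (\xi^2 C_F + \alpha\xi^2 V)\|\nabla p_{12}\|\,\|p_{12}\| \tag{7}
\]
in $(0,T)$. Due to the Young inequality, we obtain

\[
\frac{d}{dt}\|u_{12} - Lp_{12}\|^2 \le L^2\|\nabla p_{12}\|^2 + V^2\|u_{12} - Lp_{12}\|^2 \quad \text{in } (0,T),
\]
\[
\frac{1}{2}\alpha\xi^2\frac{d}{dt}\|p_{12}\|^2 + \frac{\xi^2}{2}\|\nabla p_{12}\|^2
  \le \frac{3}{2}\Big(C_F^2\xi^2 + \frac{7a}{6} + \alpha^2V^2\xi^2
      + 2L^2\xi^2L_F^2C_4^2\|\nabla p_1\|^2_{L^4(\Omega)}\Big)\|p_{12}\|^2
  + 3\xi^2L_F^2C_4^2\|\nabla p_1\|^2_{L^4(\Omega)}\|u_{12} - Lp_{12}\|^2. \tag{8}
\]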
Combining these inequalities, we have in (0, T):

(9)

with

Considering the fact that there is a constant $C_p$ for which

where $C_4$ is the norm of the imbedding $H_0^1(\Omega)$ into $L^4(\Omega)$, and
$p_i \in L^2(0,T;H^2(\Omega))$, we have that

as follows from the Gronwall lemma. $\square$
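
For reference, the differential form of the Gronwall lemma invoked in this proof can be stated as follows. The symbols $y$ and $a$ are generic placeholders introduced here, not notation from the text: $y$ is assumed absolutely continuous on $[0,T)$ and $a$ integrable on $(0,T)$.

```latex
\[
  y'(t) \le a(t)\,y(t) \quad \text{for a.e. } t \in (0,T)
  \qquad\Longrightarrow\qquad
  y(t) \le y(0)\,\exp\!\Big(\int_0^t a(s)\,ds\Big), \quad t \in [0,T).
\]
```

In a uniqueness argument of this kind one would presumably take $y$ to be a weighted sum of $\|u_{12} - Lp_{12}\|^2$ and $\|p_{12}\|^2$ (an assumption, since the combined inequality is not reproduced above). Because $y \ge 0$ and $y(0) = 0$, the bound forces $y \equiv 0$ as soon as the coefficient $a$ is integrable, which is where the regularity $p_i \in L^2(0,T;H^2(\Omega))$ enters through the factor $\|\nabla p_1\|^2_{L^4(\Omega)}$.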

Asymptotic behaviour. A priori estimates in the above given proof imply


that
02
E~[P](t)::; Edp](O)exp{ 2~t} t E (O,T),

where we denoted

and additionally, there is an estimate for the time derivative given by

We can therefore state that the function $p = p(t, x; \xi)$ tends to a stepwise
constant function as in Theorem 2.2 of [4]. As in [2], the matching procedure
recovers the Gibbs-Thomson law $\alpha(v_\Gamma - \mathbf{V}\cdot n_\Gamma) = -\kappa_\Gamma + F(u) + O(\xi)$, as
well as the Stefan condition at $\Gamma$.
Remark. As indicated in [1], the model (4) can be modified by incorporating
anisotropy into coefficients of (1). The above given analysis is applicable again,
namely for weak anisotropies.

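Method of lines in 2D. In the computations, we take $\Omega = (0, L_1) \times (0, L_2)$
and use a uniform rectangular grid. The following notations are introduced:

\[
h = (h_1, h_2), \quad h_1 = \frac{L_1}{N_1}, \quad h_2 = \frac{L_2}{N_2}, \quad u_{ij} = u(ih_1, jh_2),
\]
\[
\omega_h = \{[ih_1, jh_2] \mid i = 1, \dots, N_1 - 1;\ j = 1, \dots, N_2 - 1\},
\]
\[
\bar\omega_h = \{[ih_1, jh_2] \mid i = 0, \dots, N_1;\ j = 0, \dots, N_2\}, \quad \gamma_h = \bar\omega_h - \omega_h,
\]
\[
u_{\bar x_1, ij} = \frac{u_{ij} - u_{i-1,j}}{h_1}, \quad u_{x_1, ij} = \frac{u_{i+1,j} - u_{ij}}{h_1}, \quad
u_{\bar x_2, ij} = \frac{u_{ij} - u_{i,j-1}}{h_2}, \quad u_{x_2, ij} = \frac{u_{i,j+1} - u_{ij}}{h_2},
\]
\[
u_{\bar x_1 x_1, ij} = \frac{1}{h_1^2}(u_{i+1,j} - 2u_{ij} + u_{i-1,j}), \quad
u_{\bar x_2 x_2, ij} = \frac{1}{h_2^2}(u_{i,j+1} - 2u_{ij} + u_{i,j-1}),
\]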

and

The set of grid functions is denoted by $\mathcal{H}_h$. The semi-discrete scheme has the
following form

\[
\Big(\frac{d}{dt} + \mathbf{V}\cdot\nabla_h\Big)(u^h - L\chi(p^h)) = \Delta_h u^h,
\]
\[
\alpha\xi^2\Big(\frac{d}{dt} + \mathbf{V}\cdot\nabla_h\Big)p^h
  = \xi^2\Delta_h p^h + f_0(p^h) + \xi^2|\nabla_h p^h|F(u^h) \quad \text{on } \omega_h,
\]
\[
u^h|_{\gamma_h} = 0, \quad p^h|_{\gamma_h} = 0,
\]
\[
u^h(0) = P_h u_{ini}, \quad p^h(0) = P_h p_{ini},
\]
where its solution is a map $u^h, p^h : \langle 0, T\rangle \to \mathcal{H}_h$ and $P_h : C(\bar\Omega) \to \mathcal{H}_h$ is
a restriction operator. The stability and convergence of the scheme can be
investigated in a way very similar to the proof of Theorem 1. The scheme
is designed to meet real conditions, where the diffusion and growth processes
dominate the advection.
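
The right-hand side of the semi-discrete scheme can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the author's implementation: the grid gradient is taken as a central difference (the definition of the discrete gradient is not reproduced above), $\chi(p) = p$, $f_0(p) = a\,p(1-p)(p-\tfrac12)$, and `F` is a hypothetical bounded Lipschitz coupling. The default parameter values are borrowed from the figure captions, but their roles inside `f0` and `F` are assumed.

```python
import numpy as np

def laplacian(w, h1, h2):
    """Five-point discrete Laplacian on interior nodes; boundary entries stay 0."""
    out = np.zeros_like(w)
    out[1:-1, 1:-1] = (
        (w[2:, 1:-1] - 2.0 * w[1:-1, 1:-1] + w[:-2, 1:-1]) / h1**2
        + (w[1:-1, 2:] - 2.0 * w[1:-1, 1:-1] + w[1:-1, :-2]) / h2**2
    )
    return out

def gradient(w, h1, h2):
    """Central-difference gradient on interior nodes (an assumed choice)."""
    gx = np.zeros_like(w)
    gy = np.zeros_like(w)
    gx[1:-1, 1:-1] = (w[2:, 1:-1] - w[:-2, 1:-1]) / (2.0 * h1)
    gy[1:-1, 1:-1] = (w[1:-1, 2:] - w[1:-1, :-2]) / (2.0 * h2)
    return gx, gy

def rhs(u, p, h1, h2, V=(0.0, -1.0), L=2.0, alpha=3.0, xi=0.015, a=4.0, beta=300.0):
    """Return (du/dt, dp/dt) for the semi-discrete system with chi(p) = p."""
    f0 = a * p * (1.0 - p) * (p - 0.5)   # assumed cubic reaction term
    F = np.tanh(beta * u)                # hypothetical bounded, Lipschitz coupling
    px, py = gradient(p, h1, h2)
    # Phase equation solved for dp/dt.
    dp = -(V[0] * px + V[1] * py) + (
        xi**2 * laplacian(p, h1, h2) + f0 + xi**2 * np.hypot(px, py) * F
    ) / (alpha * xi**2)
    # Heat equation: d/dt(u - L p) + V.grad(u - L p) = Laplacian(u).
    wx, wy = gradient(u - L * p, h1, h2)
    du = laplacian(u, h1, h2) - (V[0] * wx + V[1] * wy) + L * dp
    # Homogeneous Dirichlet data: boundary values do not change in time.
    for d in (du, dp):
        d[0, :] = d[-1, :] = 0.0
        d[:, 0] = d[:, -1] = 0.0
    return du, dp
```

Any ODE integrator can then advance the grid functions in time, for example an explicit Euler step `u, p = u + dt * du, p + dt * dp` with a sufficiently small `dt`.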

Computational results. We present several results of advected curve dynamics
and advected pattern formation in solidification. Figure 2 shows how
the advection field influences particular situations of curve dynamics. In Figure
2a, the circle of critical radius is carried down, and when it interacts with the
domain boundary, it starts to shrink. Such a circle obviously remains
unchanged when no advection is imposed. In Figure 2b, the initial circle is
converted to an anisotropic pattern due to the anisotropy incorporated into
the model. When it expands, it interacts with the boundary. In Figure 2c, a
circle shrinks while being carried around by advection. In Figure 2d, a circle
at critical radius is carried around by advection. As there is no interaction
with the boundary, it remains unchanged.

In Figure 3, we observe a single-dendrite growth. The pattern falls downwards.
In Figure 4, three nucleation sites were imposed. The dendrites grow,
interact and fall downwards, where they touch the domain boundary.

[Figure 2, four panels]
(a) Circle at critical radius: $\xi = 0.008$, $h = 0.022$, $F = -20.0$, $dt = 0.00015$, $\mathbf{V} = (0.0, -100.0)$
(b) Expansion with anisotropy: $\xi = 0.008$, $h = 0.022$, $F = -100.0$, $dt = 0.00015$, $\mathbf{V} = (0.0, -100.0)$
(c) Circle shrinking with rotation: $\xi = 0.008$, $h = 0.022$, $F = 0.0$, $dt = 0.00015$, $\mathbf{V}$: 1000
(d) Circle at critical radius with rotation: $\xi = 0.008$, $h = 0.022$, $F = -25.0$, $dt = 0.00015$, $\mathbf{V}$: 1000

Fig. 2. Various situations of the advected mean curvature flow

Acknowledgement. The author was partly supported by the project of the


Czech Grant Agency No. 201/01/0676. Some computations were performed
under the support of the Institute of Computer Science, Czech Academy of
Sciences within the project No. A1030103 of the Grant Agency of the Czech
Academy of Sciences.
Fig. 3. Solidification of a single falling dendrite - parameters are $u^* = 1.0$, $u_{ini} = 0.0$,
$L = 2.0$, $\beta = 300$, $a = 4.0$, $\alpha = 3$, $L_1 = L_2 = 3.0$, initial radius = 0.025,
$N_1 = N_2 = 300$, $\xi = 0.015$

References
1. M. Beneš, Anisotropic phase-field model with focused latent-heat release, FREE
BOUNDARY PROBLEMS: Theory and Applications II (Chiba, Japan), 2000,
GAKUTO International Series Mathematical Sciences and Applications, Vol. 14,
pp. 18-30.
2. M. Beneš, Mathematical analysis of phase-field equations with numerically efficient
coupling terms, Interfaces and Free Boundaries 3 (2001), 201-221.
3. M. Beneš, Mathematical and computational aspects of solidification of pure
substances, Acta Mathematica Universitatis Comenianae 70, No. 1 (2001), 123-152.
4. M. Beneš and K. Mikula, Simulation of anisotropic motion by mean curvature -
comparison of phase-field and sharp-interface approaches, Acta Math. Univ.
Comenianae 67, No. 1 (1998), 17-42.
5. Y.-G. Chen, Y. Giga, and S. Goto, Uniqueness and existence of viscosity solutions
of generalized mean curvature flow equations, J. Diff. Geom. 33 (1991), 749-786.
6. L.C. Evans, H.M. Soner, and P.E. Souganidis, Phase transitions and generalized
motion by mean curvature, Comm. Pure Appl. Math. 45 (1992), 1097-1123.
7. L.C. Evans and J. Spruck, Motion of level sets by mean curvature I, J. Diff.
Geom. 33 (1991), 635-681.
8. M. Gurtin, On the two-phase Stefan problem with interfacial energy and entropy,
Arch. Rational Mech. Anal. 96 (1986), 200-240.
9. T. Ohta, M. Mimura, and R. Kobayashi, Higher-dimensional localized patterns
in excitable media, Physica D 34 (1989), 115-144.
10. A. Visintin, Models of phase transitions, Birkhäuser, Boston, 1996.
[Figure 4: pattern at time level t = 0.09, solid and liquid regions]

Fig. 4. Solidification of three falling dendrites - parameters are $u^* = 1.0$, $u_{ini} = 0.0$,
$L = 2.0$, $\beta = 300$, $a = 4.0$, $\alpha = 3$, $L_1 = L_2 = 3.0$, initial radius = 0.025,
$N_1 = N_2 = 300$, $\xi = 0.015$
Fast Evaluation of Eddy Current Integral
Operators

Steffen Börm

Max-Planck-Institut für Mathematik in den Naturwissenschaften
Inselstraße 22-26, 04103 Leipzig, Germany
[email protected]

Summary. Boundary element formulations for eddy current problems are based on
non-local operators. Discretizing these operators by standard Galerkin techniques
leads to large dense matrices. In order to treat the discretized system efficiently, we
cannot store these dense matrices directly, but use data-sparse approximations.
We present an approach based on piecewise polynomial interpolation of the
underlying kernel functions. The resulting $\mathcal{H}^2$-matrix approximation can be stored using
only $O(nm^3)$ units of storage, where $n$ is the number of degrees of freedom and $m$
is the order of the interpolation. Construction and evaluation of the approximated
matrix requires only $O(nm^3)$ operations.
This paper presents joint work with Jörg Ostrowski.

1 Introduction

1.1 Problem

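The eddy-current model introduced in [6] in combination with impedance
boundary conditions [9] leads to the variational equation

\[
\begin{aligned}
a(v, u) - b(v, \phi) &= f(v) && \text{for all } v \in V, \\
-b(u, \psi) - q(\psi, \phi) &= \ell(\psi) && \text{for all } \psi \in W
\end{aligned}
\tag{1}
\]

for the unknown vector field $u \in V$ and the unknown potential $\phi \in W$, where
the bilinear forms have the form

\[
a(v, u) = \int_\Gamma\int_\Gamma \langle \operatorname{curl}_\Gamma v(x), \operatorname{curl}_\Gamma u(y)\rangle\, \Phi(x, y)\, dy\, dx + \text{sparse},
\]
\[
q(\psi, \phi) = \int_\Gamma\int_\Gamma \langle \mathbf{curl}_\Gamma \psi(x), \mathbf{curl}_\Gamma \phi(y)\rangle\, \Phi(x, y)\, dy\, dx,
\]
\[
\begin{aligned}
b(v, \phi) ={}& \int_\Gamma\int_\Gamma \langle \mathbf{curl}_\Gamma \phi(y), v(x)\rangle\, \langle \operatorname{grad}_x \Phi(x, y), n(x)\rangle\, dy\, dx \\
&- \int_\Gamma\int_\Gamma \langle \mathbf{curl}_\Gamma \phi(y), n(x)\rangle\, \langle \operatorname{grad}_x \Phi(x, y), v(x)\rangle\, dy\, dx + \text{sparse}.
\end{aligned}
\]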

Here, $\Phi$ denotes the singularity function for the Laplace equation, i.e.,

\[
\Phi(x, y) = \frac{1}{4\pi\|x - y\|}, \tag{2}
\]

and the surface differential operators are defined by

\[
\operatorname{curl}_\Gamma u := \langle n, \operatorname{curl} u\rangle, \qquad \mathbf{curl}_\Gamma \psi := (\operatorname{grad} \psi) \times n.
\]
The sparse parts of the bilinear forms can be treated by standard techniques,
so we consider only the non-local components of the system.

1.2 Compression

Discretizing the non-local operators leads to large densely populated matrices
that cannot be handled efficiently by standard techniques. We therefore use
the $\mathcal{H}^2$-matrix approximation technique [2] to reduce the storage requirements
and the complexity of the discretization and the matrix-vector multiplication.
The application of $\mathcal{H}^2$-matrices to the problem of the eddy current model was
investigated in [7] and [3].
We remark that there is a close relationship of $\mathcal{H}^2$-matrices to the panel
clustering technique [5] and the fast multipole method for integral operators
[8, 4]. While the multipole technique applies an expansion specially designed
for the kernel function under investigation in order to reach the, in some sense
optimal, complexity of $O(n \log^2 n)$, the $\mathcal{H}^2$-matrix technique has a complexity
of $O(n \log^3 n)$ but can be applied to any asymptotically smooth (cf. (4)) kernel
function.

2 Approximation
2.1 Discretization

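We approximate the surface $\Gamma$ by a polyhedron $\Gamma_h$ described by a conforming
triangulation. Its triangles, edges and nodes are denoted by $\mathcal{T}$, $\mathcal{E}$ and $\mathcal{N}$.
For each edge $e \in \mathcal{E}$, we denote the surface edge element basis function (cf.
[3]) by $b_e$ and define the finite-dimensional subspace $V_h := \operatorname{span}\{b_e : e \in \mathcal{E}\}$
of $V$. For each node $v \in \mathcal{N}$, the surface nodal basis function is denoted by $\psi_v$,
and $W_h := \operatorname{span}\{\psi_v : v \in \mathcal{N}\}$ is a finite-dimensional subspace of $W$.
The standard Galerkin approach with these basis functions leads to

\[
\begin{pmatrix} A & -B \\ -B^\top & -Q \end{pmatrix}
\begin{pmatrix} \mathbf{u} \\ \boldsymbol{\phi} \end{pmatrix} = \text{rhs}
\]

defined by

\[
A_{ef} = a(b_e, b_f), \quad Q_{vw} = q(\psi_v, \psi_w) \quad \text{and} \quad B_{ew} = b(b_e, \psi_w)
\]

for $e, f \in \mathcal{E}$ and $v, w \in \mathcal{N}$, where all matrices are densely populated.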

2.2 Simplification

For $b \in V_h$ and $\psi \in W_h$, the functions $\operatorname{curl}_\Gamma b$ and $\mathbf{curl}_\Gamma \psi$ are piecewise
constant. Therefore we introduce the auxiliary space $X_h$ spanned by piecewise
constant basis functions $\chi_t$ for $t \in \mathcal{T}$ and the auxiliary matrix $G$ defined by

\[
G_{ts} := \int_\Gamma\int_\Gamma \chi_t(x)\, \Phi(x, y)\, \chi_s(y)\, dy\, dx
\]

and observe that

\[
A = L_1^\top G L_1 + \text{sparse} \quad \text{and} \quad [\ldots] \tag{3}
\]

hold for sparse matrices $L_1$ and $L_2$. This representation is more efficient than
the original one, but it is still based on a densely populated matrix.

3 Approximation

3.1 Interpolation

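The idea is to replace the kernel function $\Phi$ on a sub-domain $\tau \times \sigma \subseteq \Gamma \times \Gamma$
by

\[
\tilde\Phi^{\tau,\sigma}(x, y) := \sum_{\nu=1}^{k}\sum_{\mu=1}^{k} \Phi(x_\nu^\tau, x_\mu^\sigma)\, \mathcal{L}_\nu^\tau(x)\, \mathcal{L}_\mu^\sigma(y),
\]

where $(x_\nu^\tau)_{\nu=1}^k$ and $(x_\mu^\sigma)_{\mu=1}^k$ are interpolation points and $(\mathcal{L}_\nu^\tau)_{\nu=1}^k$ and $(\mathcal{L}_\mu^\sigma)_{\mu=1}^k$
are the corresponding Lagrange polynomials.
This results in an approximation of the local matrix

\[
\begin{aligned}
G^{\tau,\sigma}_{ts} &:= \int_\tau\int_\sigma \chi_t(x)\, \Phi(x, y)\, \chi_s(y)\, dy\, dx
  \approx \int_\tau\int_\sigma \chi_t(x)\, \tilde\Phi^{\tau,\sigma}(x, y)\, \chi_s(y)\, dy\, dx \\
&= \sum_{\nu=1}^{k}\sum_{\mu=1}^{k}
  \underbrace{\Phi(x_\nu^\tau, x_\mu^\sigma)}_{=:S^{\tau,\sigma}_{\nu\mu}}
  \underbrace{\int_\tau \chi_t(x)\, \mathcal{L}_\nu^\tau(x)\, dx}_{=:V^\tau_{t\nu}}
  \underbrace{\int_\sigma \chi_s(y)\, \mathcal{L}_\mu^\sigma(y)\, dy}_{=:V^\sigma_{s\mu}}
  = \big(V^\tau S^{\tau,\sigma} (V^\sigma)^\top\big)_{ts}.
\end{aligned}
\]

The approximation requires only $2nk + k^2$ units of storage. For typical
interpolation schemes, $k$ will be much smaller than $n$, so the factorized representation
will be much more efficient (cf. Figure 1).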
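
The factorization above can be tried out numerically. The following sketch is an illustration under simplifying assumptions, not the construction used in the paper: it replaces the Galerkin integrals against $\chi_t$ by point evaluations at randomly chosen points, uses tensor-product Chebyshev interpolation of order `m` on two well-separated unit cubes, and all function names are hypothetical.

```python
import numpy as np

def cheb_points(a, b, m):
    """The m+1 Chebyshev points (roots of T_{m+1}) mapped to [a, b]."""
    j = np.arange(m + 1)
    return 0.5 * (a + b) + 0.5 * (b - a) * np.cos((2 * j + 1) * np.pi / (2 * (m + 1)))

def lagrange_matrix(nodes, x):
    """Matrix with entries L_nu(x_i) for the Lagrange basis on the given nodes."""
    out = np.ones((x.size, nodes.size))
    for nu in range(nodes.size):
        for mu in range(nodes.size):
            if mu != nu:
                out[:, nu] *= (x - nodes[mu]) / (nodes[nu] - nodes[mu])
    return out

def cluster_basis(points, lo, hi, m):
    """Tensor-product Lagrange values, shape (n, (m+1)**3), and the 3D interpolation points."""
    nodes = [cheb_points(lo[d], hi[d], m) for d in range(3)]
    L0, L1, L2 = (lagrange_matrix(nodes[d], points[:, d]) for d in range(3))
    V = np.einsum("ia,ib,ic->iabc", L0, L1, L2).reshape(points.shape[0], -1)
    grid = np.stack(np.meshgrid(*nodes, indexing="ij"), axis=-1).reshape(-1, 3)
    return V, grid

def kernel(x, y):
    """Singularity function 1 / (4 pi |x - y|) for all pairs of rows of x and y."""
    diff = x[:, None, :] - y[None, :, :]
    return 1.0 / (4.0 * np.pi * np.linalg.norm(diff, axis=2))

# Two well-separated clusters of sample points (stand-ins for triangles).
rng = np.random.default_rng(0)
x = rng.random((400, 3))            # points in [0, 1]^3
y = rng.random((400, 3)) + 3.0      # points in [3, 4]^3
m = 4                               # interpolation order, rank k = (m + 1)^3 = 125
V, xg = cluster_basis(x, np.zeros(3), np.ones(3), m)
W, yg = cluster_basis(y, 3.0 * np.ones(3), 4.0 * np.ones(3), m)
S = kernel(xg, yg)                  # k x k coefficient matrix
G = kernel(x, y)                    # dense reference block
rel_err = np.linalg.norm(G - V @ S @ W.T) / np.linalg.norm(G)
```

Only `V`, `W` and `S` need to be kept, which corresponds to the $2nk + k^2$ storage count; the dense block `G` is formed here solely to measure the error.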
Fig. 1. Compressed representation of local matrices

Typical interpolation schemes, e.g., tensor-product Chebyshev interpolation,
work only for smooth functions. Since the function $\Phi$ is not globally
smooth, we cannot hope to find a factorized representation for the entire
matrix $G$. Instead, we make use of the fact that $\Phi$ is asymptotically smooth, i.e.,
that there are constants $C_{\mathrm{asymp}}, c_0, d \in \mathbb{R}_{>0}$ such that

holds for $x, y \in \mathbb{R}^3$ with $x \neq y$.

Combining this inequality with standard estimates for the interpolation
error, we find that there is a polynomial $C_{\mathrm{apx}}$ such that

holds for tensor-product Chebyshev interpolation of order $m \in \mathbb{N}$ (corresponding
to a rank of $k = (m + 1)^d$) on axis-parallel boxes $B_\tau$ and $B_\sigma$ satisfying
$\tau \subseteq B_\tau$ and $\sigma \subseteq B_\sigma$.
To ensure a uniform rate of convergence, the admissibility condition

(5)
has to hold for a constant $\eta \in \mathbb{R}_{>0}$ (cf. Figure 2).

Fig. 2. Admissibility condition

3.2 Cluster tree and block partition

Since we can apply our interpolation scheme only to sub-domains $\tau \times \sigma$ satisfying
the admissibility condition (5), we have to split the entire domain $\Gamma_h \times \Gamma_h$
into sub-domains that either fulfil this condition or are so small that we can
store the corresponding local matrix block $G^{\tau,\sigma}$ in the standard format.
We construct the splitting of $\Gamma_h \times \Gamma_h$ using a hierarchical splitting of the
domain $\Gamma_h$ that is called a cluster tree:
Definition 1. A tree $C$ is called a cluster tree for a set $\Gamma_h$ if
- the set $\Gamma_h$ is the root of $C$, i.e., $\operatorname{root}(C) = \Gamma_h$, and
- if a node $\tau \in C$ is not a leaf, then it is the (up to sets of measure zero)
  disjoint union of its sons, i.e.,
  \[ \tau = \bigcup_{\tau' \in \operatorname{sons}(\tau)} \tau'. \]
Each node $\tau \in C$ is called a cluster.

A cluster tree can be constructed from an arbitrary set of triangles by binary
space partitioning: We start with the root cluster containing all the triangles,
corresponding to the entire domain, split it into two son clusters and repeat
the procedure recursively until the clusters contain fewer than $k$ triangles.
Using the cluster tree, we can construct a partition
$P \subseteq \{\tau \times \sigma : \tau, \sigma \in C\}$
of $\Gamma_h \times \Gamma_h$ containing only admissible and small blocks (cf. [1]).
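
The two constructions just described can be sketched as follows. This is a minimal illustration rather than the paper's code: triangles are represented by points (for instance their centroids), clusters are split at the median along the longest edge of their bounding box, and the admissibility test is assumed to have the common form "largest box diameter at most $\eta$ times the box distance", since condition (5) is not reproduced above. The names `Cluster`, `build_tree`, `admissible` and `build_partition` are hypothetical, and `leaf_size` must be at least 1.

```python
import numpy as np

class Cluster:
    """A node of the cluster tree: an index set together with its bounding box."""
    def __init__(self, idx, pts):
        self.idx = idx
        self.lo = pts[idx].min(axis=0)
        self.hi = pts[idx].max(axis=0)
        self.sons = []

def build_tree(idx, pts, leaf_size):
    """Binary space partitioning: split at the median along the longest box edge."""
    c = Cluster(idx, pts)
    if idx.size > leaf_size:
        d = int(np.argmax(c.hi - c.lo))
        order = idx[np.argsort(pts[idx, d])]
        half = order.size // 2
        c.sons = [build_tree(order[:half], pts, leaf_size),
                  build_tree(order[half:], pts, leaf_size)]
    return c

def admissible(t, s, eta):
    """Assumed admissibility test: max box diameter <= eta * distance of the boxes."""
    diam = max(np.linalg.norm(t.hi - t.lo), np.linalg.norm(s.hi - s.lo))
    gap = np.maximum(0.0, np.maximum(t.lo - s.hi, s.lo - t.hi))
    dist = np.linalg.norm(gap)
    return dist > 0.0 and diam <= eta * dist

def build_partition(t, s, eta, far, near):
    """Collect admissible blocks in `far` and small inadmissible leaf blocks in `near`."""
    if admissible(t, s, eta):
        far.append((t, s))
    elif not t.sons or not s.sons:
        near.append((t, s))
    else:
        for t1 in t.sons:
            for s1 in s.sons:
                build_partition(t1, s1, eta, far, near)

# Sample points on the unit sphere as stand-ins for triangle centroids.
rng = np.random.default_rng(1)
pts = rng.normal(size=(1024, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
root = build_tree(np.arange(pts.shape[0]), pts, leaf_size=16)
far, near = [], []
build_partition(root, root, 2.0, far, near)
```

Every index pair ends up in exactly one block of `far` or `near`, which is the partition property needed for the block-wise matrix approximation.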

3.3 Construction of matrices

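The matrix $V^\tau$ for each cluster $\tau$ is given by

\[
V^\tau_{t\nu} := \int_\tau \chi_t(x)\, \mathcal{L}_\nu^\tau(x)\, dx.
\]

Since $\chi_t$ and $\mathcal{L}_\nu^\tau$ are polynomials, the entries of the matrix can be computed
by standard quadrature.
Since the coefficient matrix $S^{\tau,\sigma}$ is defined by

\[
S^{\tau,\sigma}_{\nu\mu} := \Phi(x_\nu^\tau, x_\mu^\sigma)
\]

for each admissible sub-domain $\tau \times \sigma$, it can be constructed by simply
evaluating the expression (2).
Setting
$P_{\mathrm{far}} := \{\tau \times \sigma \in P : \tau \times \sigma \text{ is admissible}\}$, $P_{\mathrm{near}} := P \setminus P_{\mathrm{far}}$,
the approximation of the matrix $G$ is defined by

\[
\tilde G := \sum_{\tau\times\sigma \in P_{\mathrm{far}}} V^\tau S^{\tau,\sigma} (V^\sigma)^\top
  + \sum_{\tau\times\sigma \in P_{\mathrm{near}}} G^{\tau,\sigma}. \tag{6}
\]

The approximation (6) requires $O(nk \log n) = O(nm^3 \log n)$ units of storage,
and the approximation error decreases exponentially in $m$.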

4 Nested cluster basis


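Now, we aim to use the special structure of the matrices $V^\tau$, the cluster bases,
in order to reduce the storage complexity for the matrix $G$.

4.1 Nested approximation spaces

Let $\tau$ be a cluster and $\tau'$ one of its sons. Since we use the same order of
interpolation for all clusters, we have

\[
\mathcal{L}_\nu^\tau = \sum_{\nu'=1}^{k} T^{\tau',\tau}_{\nu'\nu}\, \mathcal{L}_{\nu'}^{\tau'}
\quad \text{with} \quad T^{\tau',\tau}_{\nu'\nu} := \mathcal{L}_\nu^\tau(x_{\nu'}^{\tau'}),
\qquad \text{which implies} \qquad
V^\tau = \sum_{\tau' \in \operatorname{sons}(\tau)} V^{\tau'} T^{\tau',\tau}, \tag{7}
\]

so we need to store the $n \times k$-matrix $V^\tau$ only for clusters without sons, since
we can reconstruct it for all other clusters by using the $k \times k$-matrices $T^{\tau',\tau}$
(cf. Figure 3). This reduces the storage complexity to $O(nk) = O(nm^3)$.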
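
The nestedness behind the transfer matrices is easy to check in one dimension. The sketch below is an illustration with assumed names and a simplified setting: it uses point evaluations instead of the integrals against $\chi_t$ (the identity carries over by linearity), a parent interval $[0, 1]$, its son $[0, 0.5]$, and Chebyshev points of the same order on both.

```python
import numpy as np

def cheb_points(a, b, m):
    """The m+1 Chebyshev points (roots of T_{m+1}) mapped to [a, b]."""
    j = np.arange(m + 1)
    return 0.5 * (a + b) + 0.5 * (b - a) * np.cos((2 * j + 1) * np.pi / (2 * (m + 1)))

def lagrange_matrix(nodes, x):
    """Matrix with entries L_nu(x_i) for the Lagrange basis on the given nodes."""
    out = np.ones((x.size, nodes.size))
    for nu in range(nodes.size):
        for mu in range(nodes.size):
            if mu != nu:
                out[:, nu] *= (x - nodes[mu]) / (nodes[nu] - nodes[mu])
    return out

m = 3
parent_nodes = cheb_points(0.0, 1.0, m)   # interpolation points of the parent cluster
son_nodes = cheb_points(0.0, 0.5, m)      # interpolation points of the son cluster

# Transfer matrix: parent Lagrange polynomials evaluated at the son's points.
T = lagrange_matrix(parent_nodes, son_nodes)

# Cluster bases evaluated at sample points that lie in the son.
x = np.linspace(0.0, 0.5, 50)
V_parent = lagrange_matrix(parent_nodes, x)
V_son = lagrange_matrix(son_nodes, x)

# Parent polynomials have degree m, so interpolating them on the son is exact.
nested_error = np.abs(V_parent - V_son @ T).max()
```

Because the parent's Lagrange polynomials are reproduced exactly by interpolation of the same order on the son, `nested_error` is at the level of rounding errors, so only the small matrix `T` has to be stored for the parent.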

Fig. 3. Representation of cluster basis matrices by transfer matrices

4.2 Fast matrix-vector multiplication

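The equation (7) can also be used to perform the matrix-vector multiplication
efficiently: We introduce the auxiliary variables

\[
u^\sigma := (V^\sigma)^\top u, \qquad
y^\tau := \sum_{\sigma:\ \tau\times\sigma \in P_{\mathrm{far}}} S^{\tau,\sigma} u^\sigma
\]

for clusters $\tau, \sigma \in C$ and find

\[
\begin{aligned}
\tilde G u &= \sum_{(\tau,\sigma) \in P_{\mathrm{near}}} G^{\tau,\sigma} u
  + \sum_{(\tau,\sigma) \in P_{\mathrm{far}}} V^\tau S^{\tau,\sigma} (V^\sigma)^\top u \\
&= \sum_{(\tau,\sigma) \in P_{\mathrm{near}}} G^{\tau,\sigma} u
  + \sum_{(\tau,\sigma) \in P_{\mathrm{far}}} V^\tau S^{\tau,\sigma} u^\sigma \\
&= \sum_{(\tau,\sigma) \in P_{\mathrm{near}}} G^{\tau,\sigma} u
  + \sum_{\tau \in C} V^\tau y^\tau =: v.
\end{aligned}
\]

For typical domain splittings, the set $\{\sigma : \tau \times \sigma \in P_{\mathrm{far}}\}$ contains only a small
number of elements, so the computation of $y^\tau$ can be done in complexity $O(nk)$
for all $\tau \in C$. This leaves us with the task of evaluating $u^\sigma = (V^\sigma)^\top u$ and $V^\tau y^\tau$
efficiently. Due to (7), we have

\[
u^\sigma = (V^\sigma)^\top u
  = \sum_{\sigma' \in \operatorname{sons}(\sigma)} (T^{\sigma',\sigma})^\top (V^{\sigma'})^\top u
  = \sum_{\sigma' \in \operatorname{sons}(\sigma)} (T^{\sigma',\sigma})^\top u^{\sigma'},
\]

so we can use a recursive procedure to compute the vectors $u^\sigma$ for all $\sigma \in C$
in $O(nk)$ operations. A similar recursion can be used to construct $\sum_{\tau \in C} V^\tau y^\tau$
in $O(nk)$ operations.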

5 Approximation of the double layer potential

Now that we know how to store and multiply by the matrices $A$ and $Q$ in
complexity $O(nk)$, we only have to handle the matrix $B$ efficiently.
We replace $\Phi$ by the local approximations $\tilde\Phi^{\tau,\sigma}$ and $\operatorname{grad}_x \Phi$ by $\operatorname{grad}_x \tilde\Phi^{\tau,\sigma}$
and recall the sparse lifting matrix $L_2$ from (3) in order to find that

$\tilde B =$

is a good approximation of $B$, where

\[
W^{\sigma,\ell}_{j\mu} = \int_\sigma (b_j)_\ell(x)\, \langle \operatorname{grad} \mathcal{L}_\mu^\sigma(x), n(x)\rangle
  - n_\ell(x)\, \langle \operatorname{grad} \mathcal{L}_\mu^\sigma(x), b_j(x)\rangle\, dx
\]

for $j \in \mathcal{E}$, $\mu \in \{1, \dots, k\}$ and $\ell \in \{1, \dots, 3\}$.
For clusters with sons, the matrices $W^{\sigma,\ell}$ can again be expressed by the
transfer matrices $T^{\sigma',\sigma}$, so we can re-use most of the matrices involved in the
approximation of $G$ and require additional storage only for clusters without
sons and for the near-field.

6 Numerical experiment

We approximate the auxiliary matrix $G$ on the unit sphere using a local
interpolation operator of order $m = 2$. We use the admissibility condition (5) for $\eta = 2$
and find the results given in Table 1. We can see that the approximation error
is stable, while the memory and time requirements grow linearly.

Table 1. Approximation of the auxiliary matrix G

        n   Mem[MB]   Mem[KB]/n   Build[s]   MVM[ms]   Error
      512       3.9        7.68          1         4   2.3e-4
     2048      21.6       10.54          5        46   4.2e-4
     8192     113.9       13.90         18       269   4.3e-4
    32768     461.4       14.08         79      1138   4.2e-4
   131072    2024.4       15.45        305      4990   -
   524288    7976.1       15.21       1224     19772   -
  2097152   32023.8       15.27       4974     83181   -

References

1. S. BÖRM, L. GRASEDYCK, AND W. HACKBUSCH, Introduction to hierarchical
matrices with applications, Engineering Analysis with Boundary Elements, 27 (2003),
pp. 405-422.
2. S. BÖRM AND W. HACKBUSCH, $\mathcal{H}^2$-matrix approximation of integral operators by
interpolation, Applied Numerical Mathematics, 43 (2002), pp. 129-143.
3. S. BÖRM AND J. OSTROWSKI, Fast evaluation of boundary integral operators
arising from an eddy current problem, Tech. Rep. 33, Max Planck Institute for
Mathematics in the Sciences, 2003. To appear in Journal of Computational Physics.
4. L. GREENGARD AND V. ROKHLIN, A fast algorithm for particle simulations,
Journal of Computational Physics, 73 (1987), pp. 325-348.
5. W. HACKBUSCH AND Z. P. NOWAK, On the fast matrix multiplication in the
boundary element method by panel clustering, Numerische Mathematik, 54 (1989),
pp. 463-491.
6. R. HIPTMAIR, Symmetric coupling for eddy current problems, SIAM J. Numer.
Anal., 40 (2002), pp. 41-65.
7. J. OSTROWSKI, Boundary Element Methods for Inductive Hardening,
PhD thesis, University of Tübingen, Germany, 2003. Available under
http://w210.ub.uni-tuebingen.de/dbt/volltexte/2003/672.
8. V. ROKHLIN, Rapid solution of integral equations of classical potential theory,
Journal of Computational Physics, 60 (1985), pp. 187-207.
9. C. SCHWAB AND O. STERZ, A scalar BEM for time harmonic eddy current
problems with impedance boundary conditions, in Scientific Computing in Electrical
Engineering, M. Günther, D. Hecht, and U. van Rienen, eds., vol. 18 of Lecture
Notes in Computational Science and Engineering, Springer, Berlin, 2001,
pp. 129-136.
Adaptive Computation of Reactive Flows with
Local Mesh Refinement and Model Adaptation

Malte Braack 1 and Alexandre Ern 2

1 Institut für Angewandte Mathematik, Universität Heidelberg
malte. [email protected]
2 CERMICS, École nationale des ponts et chaussées, Marne-la-Vallée, France
[email protected]

Summary. An adaptive method for reactive flows involving locally refined meshes
and different types of diffusion models is proposed. Starting with a less exact diffusion
model, the model is changed locally throughout the computational domain to a more
accurate and much more expensive model. An a posteriori error estimator provides
reliable information on where to refine the mesh and where to adapt the model.
Discretization and modeling errors are equilibrated.

1 Introduction
The underlying equations describing reactive flows are well-known but may
involve models of different complexity, scales and accuracy. In various cases, the
most accurate and validated model cannot be chosen in numerical simulations
because of the large amount of computational costs. Simpler models usually
need less computing time and involve less couplings between variables. For
instance, the choice of diffusion models in gas mixtures is not straightforward.
Although multicomponent diffusion models are accepted to be accurate [8],
simpler and less accurate models, e.g. Fick's law, are widely used in practice
for two- and three-dimensional simulations. While simple diffusion models, as
for instance Fick's law, may involve only diagonal diffusion, multicomponent
diffusion models lead to couplings between all chemical variables. For implicit
solvers, these couplings lead to a huge fill-in in the (sparse) Jacobians. Due to
the resulting high numerical cost of complex models, it is desirable to apply
the complex diffusion model only in those regions of the computational domain
where necessary; for instance in the flame front where a complex balance of
reaction, convection and diffusion phenomena takes place. However, it is not
a priori known, where an accurate diffusion model is necessary. In this work,
we present an adaptive method which automatically detects the regions where
an accurate diffusion model is important.
In the previous work [6], the mathematical background for a posteriori
control of modeling errors and discretization errors is given. Other work
addressing the estimation of modeling error includes [9, 13, 14]. For measuring
the modeling error, the variational formulation of the partial differential
equation together with a duality argument is used. The a posteriori error estimator
for the discretization error is obtained by using Galerkin orthogonality of a
finite element discretization [3]. These two aspects allow us to measure the
overall error in terms of user-defined output functionals of the numerical solution.
Both types of adaptivity (mesh-size and diffusion model) are merged in such
a way that both sources of errors are equilibrated and subsequently reduced.
This strategy avoids the situation that the accuracy of the solution is affected
by a poor diffusion model although very fine meshes are used, and vice versa.
As an extension of the theoretical fundamentals in [6], here we present the
application of mesh size adaptation and diffusion model adaptation and show
numerical results for a combustion problem with two types of diffusion models.
We use the following notations: The usual $L^2$-scalar product in a subdomain
$\omega \subset \Omega$ will be denoted by $(\cdot,\cdot)_\omega$. By $(u, v) = (u, v)_\Omega$ we denote the
integration over the entire computational domain $\Omega \subset \mathbb{R}^d$, $d = 2, 3$. We
denote the velocity by $v$, pressure by $p$, temperature by $T$, the $n_s$ species mass
fractions by $y_k$, $k = 1, \dots, n_s$, and density by $\rho$.

2 Variational formulation of the underlying equations


We start from the basic equations in variational formulation for steady-state
reactive viscous flow describing the conservation of mass, momentum, energy
and species mass fractions. For this we assemble all variables in the vector
$u = (p, v, T, y_1, \dots, y_s)$ which is an element of a functional Hilbert space $V$.
For test functions $\phi = (\xi, \psi, \sigma, \tau_1, \dots, \tau_s) \in V$, $s := n_s - 1$, the following
nonlinear forms are used:
\[
\begin{aligned}
a_1(u)(\phi) &:= (\operatorname{div}(\rho v), \xi), \\
a_2(u)(\phi) &:= (\rho (v\cdot\nabla)v, \psi) + (\pi, \nabla\psi) - (p, \operatorname{div}\psi), \\
a_3(u)(\phi) &:= (\rho c_p v\cdot\nabla T, \sigma) + (\lambda\nabla T, \nabla\sigma) + \sum_{k=1}^{n_s} (h_k m_k \omega_k, \sigma), \\
a_4(u)(\phi) &:= \sum_{k=1}^{s} \{(\rho v\cdot\nabla y_k, \tau_k) - (m_k\omega_k, \tau_k)\}, \\
a(u)(\phi) &:= \sum_{i=1}^{4} a_i(u)(\phi).
\end{aligned}
\]
The mass fraction $y_{n_s}$ of the last species is set to $y_{n_s} = 1 - \sum_{i=1}^{s} y_i$, in order
to ensure that the sum over all $y_k$ is equal to 1. The density is considered as
a coefficient determined by the gas law

\[
\rho = \frac{p\,m}{R\,T}, \qquad m = \Big(\sum_{k=1}^{n_s} \frac{y_k}{m_k}\Big)^{-1} \tag{1}
\]

with the universal gas constant $R$ and the mean molar weight $m$. Coefficients
are the viscosity $\mu$, the heat capacity $c_p$ at constant pressure, and the heat
conductivity $\lambda$. For each species $k$, we have the molecular weight $m_k$, the
specific enthalpy $h_k$, and the molecular production rate $\omega_k$. The viscous tensor
$\pi$ is given as usual for compressible Newtonian fluids. For ease of presentation,
in the form $a_3(u)(\phi)$ the effect of temperature flux due to diffusion fluxes of
species with different specific heat capacities is neglected.
The form $a_3(u)(\phi)$ does not yet include species diffusion. We consider the
following two models for the mass diffusion fluxes $F_k$:
(i) Fick's law: diagonal diffusion driven by the gradient of mole fractions
$x_k = y_k m / m_k$,
(2)
The diffusion coefficients $D_k^* = D_k^*(y)$ are given by an empirical law which
is about 10% accurate, see [12].
(ii) Multicomponent diffusion: a full diffusion matrix driven by the gradients
of species mole fractions:

\[
F_k^m = -\rho y_k \Big\{ \sum_{l} D_{kl}\nabla x_l + \theta_k \nabla(\log T) \Big\}. \tag{3}
\]

The diffusion coefficients $D_{kl}$ are given only implicitly as solutions of linear
systems. Therefore, the computational costs are higher than for the previous
diffusion model. However, this form of diffusion flux can be derived from the
theory of gases, see [8, 10].
Recent DNS computations [7] investigate the differences of these two models
and show a noticeable impact of model (ii), especially for lean and rich hydrogen
flames.
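
The algebraic relations used so far, the gas law (1) and the mole fractions $x_k = y_k m / m_k$, can be sketched as small helper functions. This is only an illustration with assumed names; the molar masses are rounded standard values, the composition is the 20% ozone and 80% oxygen inflow mixture of the ozone flame example, and the pressure and temperature are arbitrary illustrative values.

```python
import numpy as np

R = 8.314462618  # universal gas constant in J/(mol K)

def mean_molar_weight(y, m_k):
    """Mean molar weight m = (sum_k y_k / m_k)^(-1) of a mixture."""
    return 1.0 / np.sum(y / m_k)

def mole_fractions(y, m_k):
    """Mole fractions x_k = y_k * m / m_k from mass fractions y_k."""
    return y * mean_molar_weight(y, m_k) / m_k

def density(p, T, y, m_k):
    """Density from the ideal gas law rho = p * m / (R * T)."""
    return p * mean_molar_weight(y, m_k) / (R * T)

# Species O3, O2, O with molar masses in kg/mol (rounded standard values).
m_k = np.array([47.998e-3, 31.999e-3, 15.999e-3])
y = np.array([0.2, 0.8, 0.0])              # mass fractions of the inflow mixture
m = mean_molar_weight(y, m_k)
x = mole_fractions(y, m_k)
rho = density(101325.0, 300.0, y, m_k)     # illustrative pressure [Pa] and temperature [K]
```

By construction the mole fractions sum to one whenever the mass fractions do, which mirrors the closure $y_{n_s} = 1 - \sum_i y_i$ used for the last species.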
Partial integration of the multicomponent model (3) leads to the semi-linear
form

\[
d(u)(\phi) := \sum_{k=1}^{s} (-F_k^m, \nabla\tau_k).
\]

The residual of the problem will be denoted by
\[
\varrho(u)(\phi) := (f, \phi) - a(u)(\phi) - d(u)(\phi).
\]
Now, the solution $u$ fulfills the equation
\[
u \in \hat u + V: \quad \varrho(u)(\phi) = 0 \quad \forall \phi \in V,
\]
where non-homogeneous Dirichlet conditions for $u$ are described by $\hat u$.
Homogeneous Dirichlet conditions are already included in the choice of the space
$V$.

3 Discretization
The discrete counterpart of the equations is the basis for the a posteriori error
estimator we need later for the adaptive procedure.

In order to switch locally between the diffusion models (2) and (3), we
introduce a (symbolic) parameter $m$ and a subdomain $\Omega_m$ of $\Omega$. In $\Omega_m$, we
use multicomponent diffusion, and in $\Omega_f := \Omega \setminus \Omega_m$ the simpler Fick's law.
We define the diffusion part
\[
d_m(u)(\phi) := \sum_{k=1}^{s} -\{(F_k^m, \nabla\tau_k)_{\Omega_m} + (F_k^f, \nabla\tau_k)_{\Omega_f}\},
\]
and formulate a perturbed solution $u_m \in V$ solving
\[
a(u_m)(\phi) + d_m(u_m)(\phi) = (f, \phi) \quad \forall \phi \in V.
\]
The way to determine the subdomain $\Omega_m$ will be explained later on.
The discretization is done by conforming finite elements on a triangulation
$\mathcal{T}_h$ of $\Omega$. We denote the corresponding space by $V_h \subset V$. In order to handle
the convective terms and the stiff pressure-velocity coupling one should add
stabilization terms $s_h(\cdot,\cdot)$ to the discrete systems. Such stabilization can be
done in different ways. We use the stabilization concept for the $v,p$-coupling
proposed in [1] and for the convective terms in [11] and [2]. For this purpose, we
need certain restrictions on the meshes used. We assume that the triangulation
$\mathcal{T}_h$ is organized patch-wise: $\mathcal{T}_h$ results from a global refinement of a mesh $\mathcal{T}_{2h}$.
Note that $\mathcal{T}_h$ contains in two dimensions twice as many hanging nodes as $\mathcal{T}_{2h}$.
The same construction is possible in three dimensions.
By $i_{2h}^h : V_h \to V_{2h}$ we denote the nodal interpolation to the coarse grid, and
by $\pi_h := i - i_{2h}^h : V_h \to V_h$ the projection operator which filters the small-scale
fluctuations. The stabilization form reads now:
\[
\begin{aligned}
s_h(u)(\phi) ={}& (\nabla\pi_h p, \delta_p\nabla\pi_h\xi) + ((\beta_v\cdot\nabla)\pi_h v, \delta_v(\beta_v\cdot\nabla)\pi_h\psi) \\
&+ (\beta_T\cdot\nabla\pi_h T, \delta_T\beta_T\cdot\nabla\pi_h\sigma) + \sum_{k=1}^{s} (\beta_v\cdot\nabla\pi_h y_k, \delta_k\beta_v\cdot\nabla\pi_h\tau_k),
\end{aligned}
\]
with $\beta_v := \rho v$, $\beta_T := (\rho c_p v + \alpha)$ and piece-wise constant functions
$\delta_p, \delta_v, \delta_T, \delta_1, \dots, \delta_s$, depending on the mesh-size $h$. For further details, we refer
to [2].
This leads us to the definition of the discrete residual
\[
\varrho_{hm}(u)(\phi) := (f, \phi) - a(u)(\phi) - d_m(u)(\phi) - s_h(u)(\phi).
\]
The reduced discrete system to be solved reads
\[
u_{hm} \in V_h: \quad \varrho_{hm}(u_{hm})(\phi) = 0 \quad \forall \phi \in V_h.
\]
The difference between the continuous and the reduced discrete residual is
\[
\begin{aligned}
g_{hm}(u)(\phi) &:= \varrho(u)(\phi) - \varrho_{hm}(u)(\phi) \\
&= -d(u)(\phi) + d_m(u)(\phi) + s_h(u)(\phi) \\
&= \sum_{k=1}^{s} (F_k^m - F_k^f, \nabla\tau_k)_{\Omega_f} + s_h(u)(\phi).
\end{aligned}
\]
This is the difference of the two diffusion models in the domain $\Omega_f$, where
Fick's law is used, and the additional stabilization terms, which are usually
small. Since in this work we do not focus on adaptively chosen stabilization
terms, we neglect the contribution $s_h(u)(\phi)$ in the expression $g_{hm}(u)(\phi)$.

4 A posteriori control

For the local refinement of the mesh and the adaptivity with respect to the diffusion
model, we need an a posteriori error estimator which gives us information
on the two error contributions. We aim to measure the error with respect to an
arbitrary output functional $j : V \to \mathbb{R}$. We formulate the following dual
residuals
\[
\begin{aligned}
\varrho^*(u)(z, \phi) &:= j(\phi) - a'(u)(\phi, z) - d'(u)(\phi, z), \\
\varrho^*_{hm}(u)(z, \phi) &:= j(\phi) - a'(u)(\phi, z) - d'_m(u)(\phi, z) - s'_h(u)(\phi, z).
\end{aligned}
\]
We use the dual solution $z \in V$ to capture the influence of the error on the
functional:
\[
z \in V: \quad \varrho^*(u)(z, \phi) = 0 \quad \forall \phi \in V.
\]
The corresponding reduced discrete version reads:
\[
z_{hm} \in V_h: \quad \varrho^*_{hm}(u_{hm})(z_{hm}, \phi) = 0 \quad \forall \phi \in V_h.
\]
The errors will be denoted by $e_u = u - u_{hm}$ and $e_z = z - z_{hm}$. We recall the
error representation in [6], wherein a proof is also given.

Theorem 1. If the semi-linear forms $a(u)(\cdot)$, $d(u)(\cdot)$, $s_h(u)(\cdot)$ and the
functional $j(u)$ are sufficiently differentiable with respect to $u$, then it holds
\[
\begin{aligned}
j(u) - j(u_{hm}) ={}& g_{hm}(u_{hm})(z_{hm}) + \frac{1}{2}\{g_{hm}(u_{hm})(e_z) + g'_{hm}(u_{hm})(e_u, z_{hm})\} \\
&+ \frac{1}{2}\{\varrho_{hm}(u_{hm})(z - i_h z) + \varrho^*_{hm}(u_{hm}, z_{hm})(u - i_h u)\} + R,
\end{aligned}
\]
where $i_h : V \to V_h$ is an arbitrary interpolation operator and $R$ is a remainder
which is cubic in the error $e = \{e_u, e_z\}$.

For the specific form of the remainder $R$ and a proof of this Theorem, we refer
to [6].
The error representation of the Theorem stated above cannot be directly
used numerically, because it involves the unknown primal and dual solutions $u$
and $z$, respectively. However, the first term $g_{hm}(u_{hm})(z_{hm})$ can be easily
computed, because it depends only on the reduced discrete solutions $u_{hm}$ and $z_{hm}$.
Furthermore, the terms $g_{hm}(u_{hm})(e_z)$ and $g'_{hm}(u_{hm})(e_u, z_{hm})$ are quadratic in
the modeling error, since they involve both $e$ as an argument and the difference
of the two diffusion models in the expression $g_{hm}$. We neglect these
contributions in the estimator. However, if more accuracy of the estimator is required,
these terms can also be approximated. Evaluation of the remainder $R$ is not
worthwhile, because it is cubic in the error.
The terms $\varrho_{hm}(u_{hm})(z - i_h z)$ and $\varrho^*_{hm}(u_{hm}, z_{hm})(u - i_h u)$ describe the
discretization error and have to be approximated by a numerical evaluation
of the interpolation errors $z - i_h z$ and $u - i_h u$. An efficient way to
do this is the recovery of the computed quantities by higher-order
polynomials, see [4]. For instance, in the case of quadrilaterals and piecewise
bilinear elements (so-called $Q_1$ elements), the interpolation can be done on
biquadratic elements. Let $i_{2h}^{(2)} : V_h \to V_{2h}^{(2)}$ be the quadratic interpolation of
piecewise bilinears on $\mathcal{T}_h$ onto biquadratic finite elements on $\mathcal{T}_{2h}$.
The interpolation errors of $z$, for instance, will be numerically approximated
by
\[
z - i_h z \approx i_{2h}^{(2)} z_{hm} - z_{hm}.
\]

Taking into account that the residuals $\varrho_{hm}(u_{hm})(\phi)$ and $\varrho^*_{hm}(u_{hm}, z_{hm})(\phi)$
with respect to a discrete test function $\phi \in V_h$ vanish leads to the following
estimator $\eta$ consisting of two parts
\[
j(u) - j(u_{hm}) \approx \eta := \eta_h + \eta_m, \tag{4}
\]
\[
\eta_h := \frac{1}{2}\{\varrho_{hm}(u_{hm})(i_{2h}^{(2)} z_{hm}) + \varrho^*_{hm}(u_{hm}, z_{hm})(i_{2h}^{(2)} u_{hm})\}, \tag{5}
\]
\[
\eta_m := g_{hm}(u_{hm})(z_{hm}). \tag{6}
\]

The part $\eta_h$ of the estimator can be considered as the contribution of the
discretization, and the part $\eta_m$ measures the influence of the model. For
multicomponent diffusion, the evaluation of $\eta_m$ is expensive. However, the gain of an
adaptive algorithm with local model modification becomes substantial, since
we need to include the global detailed model neither in each residual
evaluation nor in the Jacobian for solving the equations.
On the basis of the estimators $\eta_h$ and $\eta_m$, local error indicators can be
derived. A standard method is partial integration of the diffusive parts and
the application of Cauchy-Schwarz on each element. We proceed in a different
way by filtering coarse grid contributions. We obtain nodal quantities $\eta_{h,i}$ and
$\eta_{m,i}$ for each node $N_i$ of the mesh:
\[ \eta_h = \sum_{i=1}^{n} \eta_{h,i}, \qquad \eta_m = \sum_{i=1}^{n} \eta_{m,i}. \]
A proof is given in [6]. Further details, especially concerning the adaptation
strategy for equilibrating both error contributions, are given in [5].

5 Computation of an ozone flame


We investigate the methodology presented above for an ozone decomposition
flame in a two-dimensional geometry with a moderate impact of the diffusion
model. Moreover, the complexity of the problem size is moderate enough to
obtain a reference solution on a very fine mesh. This reference solution is used
for validation of the error estimators. A more involved example can be found
in [5], where a hydrogen flame is investigated.
The flame under consideration is modeled with three chemical species,
namely $O_3$, $O_2$, and O-atoms. The reaction mechanism consists of six reactions,
see [3]. At the inflow (x = 0), we prescribe 20% mass fraction for ozone
and 80% mass fraction for oxygen. Furthermore, the inflow velocity profile is
Adaptive Computation of Reactive Flows 165

parabolic with maximum velocity $v = 35$ cm/s. The computational domain
corresponds to a Cartesian tube, $\Omega = [0, 2\,\mathrm{cm}] \times [0, 5\,\mathrm{mm}]$. The temperature
is fixed at the inflow, bottom, and top boundary according to
\[ T = T_{\min} + \Delta T \cdot \exp\{-\sigma (x - x_0)^2\}, \]
with $x_0 = 5$ mm, $\sigma = 10^5$, $T_{\min} = 298$ K, and $\Delta T = 502$ K. A homogeneous
Neumann condition is imposed at the outflow boundary.
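
The prescribed boundary temperature can be evaluated with a few lines of Python. This is only a sketch: the value of sigma is read here as 10^5 and interpreted in 1/m^2, which is an assumption about the garbled source, and the function name is illustrative.

```python
import math

T_MIN = 298.0     # K, from the text
DELTA_T = 502.0   # K, from the text
X0 = 5.0e-3       # m (5 mm), from the text
SIGMA = 1.0e5     # assumed reading of the source value, taken in 1/m^2


def boundary_temperature(x):
    """Gaussian temperature profile T_min + dT * exp(-sigma * (x - x0)^2), x in metres."""
    return T_MIN + DELTA_T * math.exp(-SIGMA * (x - X0) ** 2)


peak = boundary_temperature(X0)        # value at the centre of the hot spot
outflow = boundary_temperature(0.02)   # value at x = 2 cm
```

At the centre of the profile the exponential equals one, so the peak temperature is T_min + Delta T = 800 K, while far from x_0 the profile returns to T_min.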
Figure 1 shows the computed profile of the mass fractions of O-atoms
when the multicomponent diffusion model is used over the whole computa-
tional domain. For this type of flame, Fick's law is a good approximation of
the multicomponent diffusion model. Therefore, the difference between both
diffusion models is small, and the computation with Fick's law yields very
similar pictures. However, the difference in the model is in the same range as
the discretization error on the coarsest mesh.
Now, we report on the adaptive algorithm described before. The initial
mesh is an equidistant mesh with 585 nodes. We start with Fick's law (2)
for the species mass diffusion fluxes over the whole computational domain.
After computing the stationary solution $u_h$ on this mesh and with the crude
diffusion model, we compute the associated dual solution $z_h$ for the functional
$j(u) = c \int_\Omega y_O \, dx$, with a constant scaling factor $c$. This functional gives the
mean value of mass fractions of O-atoms in the computational domain $\Omega$. The
error indicators $\eta_h$ and $\eta_m$ are obtained from (5) and (6). On the basis of $\eta_h$ we
change the mesh-size locally by bisection of element edges. According to $\eta_m$,
we change the diffusion model locally from Fick's diffusion to multicomponent
diffusion.
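
One step of such a procedure needs a rule that turns nodal indicators into flags. The sketch below uses a fixed-fraction rule, which is an assumption made for illustration; the strategy actually used for equilibrating the two contributions is the one described in [5]. The function name and the sample values are hypothetical.

```python
def mark_nodes(eta_h, eta_m, fraction=0.25):
    """Flag nodes for mesh refinement (by eta_h) and for the detailed
    diffusion model (by eta_m), using a fixed-fraction rule (an assumption)."""

    def largest(values):
        # Number of nodes to flag: at least one, otherwise a fixed fraction.
        count = max(1, int(round(fraction * len(values))))
        order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
        return set(order[:count])

    return largest(eta_h), largest(eta_m)


# Hypothetical nodal indicators on a four-node mesh.
refine, enrich = mark_nodes([0.1, 0.5, 0.2, 0.05], [0.0, 0.01, 0.3, 0.02])
```

Here `refine` collects the nodes whose neighbourhood would be bisected and `enrich` those where Fick's law would be replaced by multicomponent diffusion.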
This procedure is iterated several times. Results are listed in Table 1. The
first column contains the number of nodes of the mesh, the second column
contains the relative amount of cells (in percent) where the multicomponent
diffusion model is used. The two parts of the estimator, $\eta_h$ and $\eta_m$, are listed
in columns 3 and 4, respectively. The sum of these two terms yields the error
estimator $\eta$ (column 5). The effectivity index $I_{eff} = \eta / j(u - u_{hm})$ is listed
in the last column and is obtained by using the reference solution computed
on a very fine mesh and with the accurate diffusion model. For an exact error
estimator, the effectivity index would be equal to 1. Our values are in the range
of 1.5 - 2.6, which means that the estimator $\eta$ slightly overestimates the error.

Fig. 1. O-atom mass fraction for the ozone flame. The maximum value is $1.24 \cdot 10^{-3}$.

Table 1. Results for the ozone flame: number of nodes (#nodes); fraction of cells
flagged for multicomponent diffusion (% of multi.); estimator of the discretization
error $\eta_h$; estimator of the model error $\eta_m$; their sum $\eta$; the true error $j(u - u_{hm})$;
the effectivity index $I_{eff}$

#nodes   % of multi.   $\eta_h$     $\eta_m$     $\eta$       $j(u - u_{hm})$   $I_{eff}$

   585        0        2.168      1.043      3.210      2.031          1.58
  1047     21.1        1.250      9.953e-2   1.350      5.385e-1       2.51
  2085     37.4        1.584e-1   7.729e-2   2.356e-1   1.378e-1       1.71
  4871     48.9        7.488e-2   3.830e-2   1.132e-1   5.351e-2       2.12
 12421     52.4        5.605e-2   1.602e-2   7.206e-2   3.186e-2       2.26
 30013     66.3        5.029e-2   9.372e-3   5.966e-2   2.757e-2       2.16
 81021     79.4        2.160e-2   6.017e-3   2.761e-2   1.065e-2       2.59

Fig. 2. Ozone flame: estimator $\eta_m$ for the modeling error; estimator $\eta_h$ for the
discretization error; their sum $\eta$; true error $j(u - u_{hm})$ as a function of the number of
nodes (mesh points). Logarithmic axes; horizontal axis: # nodes (500 to 64000);
vertical axis: 0.001 to 10.

However, the error estimator is reliable since it provides an upper bound for
the actual error.
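
The effectivity indices can be recomputed directly from the estimator and true-error columns of Table 1. The short Python check below is a sketch; the variable names are illustrative, and the data are the values listed in the table.

```python
# (number of nodes, estimator eta, true error j(u - u_hm)) taken from Table 1
rows = [
    (585, 3.210, 2.031),
    (1047, 1.350, 5.385e-1),
    (2085, 2.356e-1, 1.378e-1),
    (4871, 1.132e-1, 5.351e-2),
    (12421, 7.206e-2, 3.186e-2),
    (30013, 5.966e-2, 2.757e-2),
    (81021, 2.761e-2, 1.065e-2),
]


def effectivity(eta, true_error):
    """Effectivity index: estimator divided by the true error."""
    return eta / true_error


indices = [round(effectivity(eta, err), 2) for _, eta, err in rows]
# An index above 1 means the estimator bounds the true error from above.
is_upper_bound = all(index >= 1.0 for index in indices)
```

All seven recomputed indices agree with the last column of Table 1 and lie above 1, which is the sense in which the estimator acts as an upper bound on these meshes.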
The adaptive algorithm balances both types of errors by adapting the mesh-size
and the model. Figure 2 illustrates how the two sources of error are
equilibrated. We plot the estimators $\eta_h$ and $\eta_m$, their sum $\eta$, and the true error
$j(u - u_{hm})$ as a function of the number of mesh nodes. The estimator and the
true error clearly show the same asymptotic behavior.
The sequence of locally refined meshes with 1047, 2085, and 4871 nodes
is shown in Figure 3. The darker areas indicate the part of the computational
domain where the multicomponent diffusion model is used. In the remaining
(light) part, the simple Fick law is used. We observe that the estimator
detects quite well the reaction area where a difference in both diffusion models
influences the accuracy of the output functional $j(u)$.

Fig. 3. On the upper half of each picture, the areas where multicomponent diffusion
is used (dark/red areas) and where Fick's law is used (light areas) are indicated; the
lower half shows the corresponding locally refined mesh

References
1. R. Becker and M. Braack. A finite element pressure gradient stabilization for
the Stokes equations based on local projections. Calcolo, 38(4):173-199,2001.
2. R. Becker and M. Braack. A two-level stabilization scheme for the Navier-Stokes
equations. In Feistauer, editor, Enumath Proceedings, Berlin, submitted, 2003.
Springer.
3. R. Becker, M. Braack, and R. Rannacher. Numerical simulation of laminar flames
at low Mach number with adaptive finite elements. Combust. Theory Modelling,
3:503-534, 1999.
4. R. Becker and R. Rannacher. An optimal control approach to a posteriori error
estimation in finite element methods. In A. Iserles, editor, Acta Numerica 2001.
Cambridge University Press, 2001.
5. M. Braack and A. Ern. Coupling multimodeling with local mesh refinement for
the numerical solution of laminar flames. in preparation, 2003.
6. M. Braack and A. Ern. A posteriori control of modeling errors and discretization
errors. Multiscale Model. Simul., 1(2):221-238, 2003.
7. J. de Charentenay and A. Ern. Multicomponent transport impact on turbulent
premixed H2/O2 flames. Combust. Theory Modelling, 6:439-462, 2002.
8. A. Ern and V. Giovangigli. Multicomponent Transport Algorithms. Lecture Notes
in Physics, m24, Springer, 1994.
9. L. Fatone, P. Gervasio, and A. Quarteroni. Multimodels for incompressible flows:
iterative solutions for the Navier-Stokes/Oseen coupling. M2AN, Math. Model.
Numer. Anal., 35(3):549-574, 2001.
10. V. Giovangigli. Multicomponent Flow Modeling. Birkhäuser, Boston, 1999.
11. J.-L. Guermond. Stabilization of Galerkin approximations of transport equations
by subgrid modeling. Model. Math. Anal. Numer., 33(6):1293-1316, 1999.
12. J. O. Hirschfelder and C. F. Curtiss. Flame and Explosion Phenomena. Williams
and Wilkins Co., Baltimore, 1949.
13. J. T. Oden and S. Prudhomme. Estimation of modeling error in computational
mechanics. J. Comput. Phys., 182:496-515, 2002.
14. E. Stein and S. Ohnimus. Anisotropic discretization- and model-error estimation
in. Comput. Methods Appl. Mech. Engrg., 176:363-385, 1999.
An Alternative to the Least-Squares Mixed
Finite Element Method for Elliptic Problems

Jan Brandts 1 and Yanping Chen 2

1 Korteweg-de Vries Institute for Mathematics, Faculty of Science, University of


Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, Netherlands.
[email protected]
2 Department of Mathematics, Xiangtan University, Xiangtan 411105, China.
[email protected]. en

Summary. In this paper we derive a strengthened Cauchy-Schwarz inequality that
enables us to formulate a short and transparent proof of the coercivity of a Least
Squares Mixed Finite Element bilinear form. Also, it shows that the coupling between
$H_0^1(\Omega)$ and $H(\mathrm{div}; \Omega)$ is weak enough to be neglected. This results in an alternative
way to compute approximations of both the scalar variable and its gradient for second
order elliptic problems.

1 Least squares mixed finite elements

We consider the application of the least-squares mixed finite element method
to the following second order elliptic problem. Let $\Omega$ be a bounded domain of
arbitrary dimension with Lipschitz continuous boundary. Given $f \in H^{-1}(\Omega)$,
find $u \in H_0^1(\Omega)$ such that

\[ -\mathrm{div}(A\nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \tag{1} \]

where $A$ is uniformly symmetric positive definite with Lipschitz continuous
coefficients and all eigenvalues in the interval $[\beta^2, \beta^{-2}]$ for some $\beta \in (0, 1]$.
The least-squares mixed finite element method first writes the second order
differential equation as a system of two first order equations,

\[ p = -A\nabla u \ \text{in } \Omega, \qquad \mathrm{div}\, p = f \ \text{in } \Omega. \tag{2} \]

Denoting the $L^2$ inner product and norm by $(\cdot, \cdot)_0$ and $|\cdot|_0$, the following
quadratic functional $J : H_0^1(\Omega) \times H(\mathrm{div}; \Omega) \to \mathbb{R}$,

\[ J(v, q) = |f - \mathrm{div}\, q|_0^2 + (q + A\nabla v, A^{-1}(q + A\nabla v))_0, \tag{3} \]

is minimized over suitable subspaces $V_h \times \Gamma_h \subset H_0^1(\Omega) \times H(\mathrm{div}; \Omega)$. Setting
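the first variation in (3) to zero leads to the following discrete problem to solve:
find $(u_h, p_h) \in V_h \times \Gamma_h$ such that for all $(v_h, q_h) \in V_h \times \Gamma_h$,
\[ B(u_h, p_h; v_h, q_h) = (f, \mathrm{div}\, q_h)_0, \tag{4} \]
where the bilinear form $B : [H_0^1(\Omega) \times H(\mathrm{div}; \Omega)] \times [H_0^1(\Omega) \times H(\mathrm{div}; \Omega)] \to \mathbb{R}$
is defined by
\[ B(w, r; v, q) = (\mathrm{div}\, r, \mathrm{div}\, q)_0 + (r + A\nabla w, A^{-1}(q + A\nabla v))_0. \tag{5} \]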
It was proved in [4] that $B$ is continuous and coercive. The proof of coercivity
was, in our opinion, somewhat tedious; coercivity may also be derived as a simple
corollary of the following lemma. This lemma, which may be called a strengthened
Cauchy-Schwarz inequality, may also prove useful in a different context.
Firstly, define norms on $H(\mathrm{div}; \Omega)$ and $H_0^1(\Omega)$ and on their product space by
\[ \|q\|_{\mathrm{div},A}^2 = (A^{-1}q, q)_0 + |\mathrm{div}\, q|_0^2, \qquad |v|_{1,A}^2 = (A\nabla v, \nabla v)_0, \tag{6} \]
\[ \|(v, q)\|_{1\times\mathrm{div},A}^2 = |v|_{1,A}^2 + \|q\|_{\mathrm{div},A}^2. \tag{7} \]
If $A = I$, they reduce to the usual norms on those spaces. They are equivalent
to them in case $A \neq I$. In particular, we will explicitly use that for all
$v \in H_0^1(\Omega)$,
\[ \beta^2 |v|_1^2 \le |v|_{1,A}^2 \le \beta^{-2} |v|_1^2. \tag{8} \]
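Lemma 1. For all $q \in H(\mathrm{div}; \Omega)$ and $v \in H_0^1(\Omega)$ we have

\[ |(q, \nabla v)_0| \le \gamma\, |v|_{1,A}\, \|q\|_{\mathrm{div},A}, \tag{9} \]

where
\[ 0 < \gamma = \frac{d}{\sqrt{d^2 + \beta^2}} < 1 \quad\text{with}\quad 0 < d = \sup_{0 \neq v \in H_0^1(\Omega)} \frac{|v|_0}{|v|_1}. \tag{10} \]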

Proof. Let $q \in H(\mathrm{div}; \Omega)$ and $v \in H_0^1(\Omega)$ be given. Then

\[ |(q, \nabla v)_0| = |(A^{-1/2}q, A^{1/2}\nabla v)_0| \le |v|_{1,A} \sqrt{(A^{-1}q, q)_0}. \tag{11} \]

By definition of $d$ and in view of (8), we obtain by using Green's formula that
\[ |(q, \nabla v)_0| = |(\mathrm{div}\, q, v)_0| \le |v|_0\, |\mathrm{div}\, q|_0 \le \frac{d}{\beta}\, |v|_{1,A}\, |\mathrm{div}\, q|_0. \tag{12} \]

Multiply (11) by $d/\beta$ and square the result, and add it to the square of (12).
This gives

\[ \Big(1 + \frac{d^2}{\beta^2}\Big) |(q, \nabla v)_0|^2 \le \frac{d^2}{\beta^2}\, |v|_{1,A}^2 \big( (A^{-1}q, q)_0 + |\mathrm{div}\, q|_0^2 \big) = \frac{d^2}{\beta^2}\, |v|_{1,A}^2\, \|q\|_{\mathrm{div},A}^2. \tag{13} \]
This proves the statement. □
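
To see that the constant in Lemma 1 cannot be improved in general, one can test (9) in a one-dimensional model case. The sketch below assumes $\Omega = (0, 1)$ and $A = 1$ (so $\beta = 1$ and $d = 1/\pi$, the Poincaré constant of the unit interval); the test functions $v = \sin(\pi x)$ and $q = \cos(\pi x)$ are chosen for illustration and are not taken from the paper.

```python
import math


def integrate(f, a=0.0, b=1.0, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


# Assumed model case: Omega = (0, 1), A = 1, hence beta = 1.
beta = 1.0
d = 1.0 / math.pi                      # sup |v|_0 / |v|_1 over H_0^1(0, 1)
gamma = d / math.sqrt(d * d + beta * beta)


def v_prime(x):                        # derivative of v = sin(pi x)
    return math.pi * math.cos(math.pi * x)


def q(x):
    return math.cos(math.pi * x)


def q_prime(x):                        # div q in one dimension
    return -math.pi * math.sin(math.pi * x)


lhs = abs(integrate(lambda x: q(x) * v_prime(x)))
v_seminorm = math.sqrt(integrate(lambda x: v_prime(x) ** 2))
q_norm = math.sqrt(integrate(lambda x: q(x) ** 2 + q_prime(x) ** 2))
rhs = gamma * v_seminorm * q_norm
```

For this pair both sides of (9) equal $\pi/2$ up to rounding, so the inequality holds with equality in this model case.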
Remark 1. Instead of using $|v|_0 \le d\,|v|_1 \le (d/\beta)\,|v|_{1,A}$ as in (12), we might also
have chosen to define just one single constant $\eta$ giving rise to a possibly smaller
constant $\gamma$ in (9) as follows,

\[ 0 < \gamma = \frac{\eta}{\sqrt{1 + \eta^2}} < 1 \quad\text{with}\quad 0 < \eta = \sup_{0 \neq v \in H_0^1(\Omega)} \frac{|v|_0}{|v|_{1,A}}. \tag{14} \]
Remark 2. If we define $\|\cdot\|_{1,A}^2 = |\cdot|_0^2 + |\cdot|_{1,A}^2$, then following the same lines
as in the proof of Lemma 1 we get

(15)

This seems a more symmetric and in a sense even stronger result. However,
the norm $\|\cdot\|_{1,A}$ is too strong for the applications to be discussed below.

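Corollary 1. With $\gamma$ as in (10), for all non-zero $(v, q) \in H_0^1(\Omega) \times H(\mathrm{div}; \Omega)$
we have
\[ B(v, q; v, q) \ge (1 - \gamma)\, \|(v, q)\|_{1\times\mathrm{div},A}^2 > 0. \tag{16} \]
Proof. By definition of $B$ we find

\[ B(v, q; v, q) = \|(v, q)\|_{1\times\mathrm{div},A}^2 + 2(q, \nabla v)_0 \ge \|(v, q)\|_{1\times\mathrm{div},A}^2 - 2|(q, \nabla v)_0|. \tag{17} \]

Lemma 1 and the inequality $2|ab| \le a^2 + b^2$ give that

\[ 2|(q, \nabla v)_0| \le 2\gamma\, |v|_{1,A}\, \|q\|_{\mathrm{div},A} \le \gamma\, \|(v, q)\|_{1\times\mathrm{div},A}^2. \tag{18} \]

Since $\gamma < 1$, the statement follows. □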
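
The first equality in (17) is stated without the intermediate steps. A short expansion, written here as a sketch that uses only the definition (5), the norms (6)-(7) and the symmetry of $A$, reads:

```latex
\begin{align*}
B(v,q;v,q)
  &= |\mathrm{div}\,q|_0^2 + (q + A\nabla v,\; A^{-1}q + \nabla v)_0 \\
  &= |\mathrm{div}\,q|_0^2 + (A^{-1}q, q)_0 + (q, \nabla v)_0
     + (A\nabla v, A^{-1}q)_0 + (A\nabla v, \nabla v)_0 \\
  &= \|q\|_{\mathrm{div},A}^2 + |v|_{1,A}^2 + 2\,(q, \nabla v)_0
   \;=\; \|(v,q)\|_{1\times\mathrm{div},A}^2 + 2\,(q, \nabla v)_0 ,
\end{align*}
% the third line uses (A\nabla v, A^{-1}q)_0 = (\nabla v, q)_0, valid because A is symmetric.
```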


Remark 3. Using Lemma 1, it can also easily be shown that $B$ is continuous
in the sense that

\[ |B(w, r; v, q)| \le (1 + \gamma)\, \|(w, r)\|_{1\times\mathrm{div},A}\, \|(v, q)\|_{1\times\mathrm{div},A}, \]

which, in view of the constant for the coercivity in Corollary 1, is an appealing
result.

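By the continuity and coercivity of $B$, the Lax-Milgram lemma assures
the existence of a unique pair $(u_h, p_h) \in V_h \times \Gamma_h$ of approximations of $(u, p)$.
Céa's lemma now implies quasi-optimality in the sense that for all
$(v_h, q_h) \in V_h \times \Gamma_h$,
\[ \|(u - u_h, p - p_h)\|_{1\times\mathrm{div},A} \le \frac{1 + \gamma}{1 - \gamma}\, \|(u - v_h, p - q_h)\|_{1\times\mathrm{div},A}. \tag{19} \]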

From this perspective, it may seem as if it is not wise to employ spaces $V_h$ and
$\Gamma_h$ which have different approximation qualities, because the approximation
quality of the product space is not better than that of the worst of the two.
However, it can be shown that under rather weak conditions, there is still a
certain amount of independence present between the two approximations. In
terms of the linear algebraic problem that arises from (4) this means that the
two diagonal blocks of the system matrix, which correspond to the interactions
of $V_h$ with itself and of $\Gamma_h$ with itself, are only weakly coupled by the two
off-diagonal blocks, which represent the interactions between $V_h$ and $\Gamma_h$. In fact,
this coupling is so weak that solving just two well-chosen linear systems with
the diagonal blocks only leads to approximations $u_h^0$ and $p_h^0$ for which

\[ \|(u_h - u_h^0, p_h - p_h^0)\|_{1\times\mathrm{div},A} \le C h\, \|(u - u_h, p - p_h)\|_{1\times\mathrm{div},A}. \tag{20} \]

This defines an approximation method for $(u, p)$ in $V_h \times \Gamma_h$ that gives
approximations that are superclose to the least-squares mixed approximations
$(u_h, p_h)$, but which is computationally much more efficient.

2 Solving the system of linear equations

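Let $(v_j)_{j=1}^M$ be a basis for $V_h$ and $(q_j)_{j=1}^N$ a basis for $\Gamma_h$. Define matrices
$S = (s_{ij})$, $C = (c_{ij})$, $D = (d_{ij})$ and $G = (g_{ij})$ by means of their entries

\[ s_{ij} = (A\nabla v_i, \nabla v_j), \quad c_{ij} = (v_i, \mathrm{div}\, q_j), \quad d_{ij} = (A^{-1}q_i, q_j), \quad g_{ij} = (\mathrm{div}\, q_i, \mathrm{div}\, q_j). \]

Let $f_N \in \mathbb{R}^N$ have entries $(f, \mathrm{div}\, q_j)$, and $f_M \in \mathbb{R}^M$ entries $(f, v_j)$. Define
$p_N \in \mathbb{R}^N$ and $u_M, \hat{u}_M \in \mathbb{R}^M$ as the solutions of the systems

\[ \begin{pmatrix} D + G & C \\ C^* & S \end{pmatrix} \begin{pmatrix} p_N \\ u_M \end{pmatrix} = \begin{pmatrix} f_N \\ 0 \end{pmatrix} \quad\text{and}\quad S \hat{u}_M = f_M. \tag{21} \]

Let $e_j^*$ be the $j$-th row of the $k \times k$ identity matrix, then

\[ p_h = \sum_{j=1}^{N} (e_j^* p_N)\, q_j \in \Gamma_h \quad\text{and}\quad u_h = \sum_{j=1}^{M} (e_j^* u_M)\, v_j \in V_h \tag{22} \]

are the solution of (4), whereas

\[ \hat{u}_h = \sum_{j=1}^{M} (e_j^* \hat{u}_M)\, v_j \in V_h \]
yields the standard finite element approximation of $u$ in $V_h$.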
Now, given an approximation $u_M^0$ of $u_M$, we may substitute it in the first
(block)-equation of (21) to obtain an approximation $p_N^0$ of $p_N$ from

\[ (D + G)\, p_N^0 = f_N - C u_M^0. \tag{23} \]

Write $u_h^0$ and $p_h^0$ for the finite element functions corresponding to the vectors
$u_M^0$ and $p_N^0$, respectively. Then the counterpart of (23) in terms of the finite
element spaces is to write out (4) and substitute $u_h^0$ for $u_h$, resulting in

Notice that
\[ (A^{-1}q_h, q_h) + (\mathrm{div}\, q_h, \mathrm{div}\, q_h) = \|q_h\|_{\mathrm{div},A}^2, \tag{25} \]
hence $D + G$ is invertible and the approximation $p_N^0$ well-defined. It can be
used to compute a hopefully better approximation $u_M^1$ of $u_M$ by substituting
$p_N^0$ into the second (block)-equation in (21), and to solve $S u_M^1 = -C^* p_N^0$. If
we continue like this, we get the Block Gauss-Seidel iteration for the linear
system (21) as follows: given $u_M^0$, iterate

\[ (D + G)\, p_N^j = f_N - C u_M^j, \qquad S u_M^{j+1} = -C^* p_N^j. \tag{26} \]

If we write $u_h^j$ and $p_h^j$ for the finite element functions that correspond to the
vectors $u_M^j$ and $p_N^j$, respectively, this iteration reads as

We will now study its convergence.
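
Before turning to the analysis, the iteration (26) can be tried on a small dense example. The following Python sketch is illustrative only: the matrix `K` stands in for $D + G$, the 2-by-2 blocks are made-up numbers chosen so that the full block matrix is symmetric positive definite, and the helper names (`solve`, `matvec`, `transpose`, `block_gauss_seidel`) are not from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def block_gauss_seidel(K, S, C, f_N, u0, steps):
    """Iterate K p = f_N - C u, then S u = -C^T p (K plays the role of D + G)."""
    u = u0[:]
    p = None
    Ct = transpose(C)
    for _ in range(steps):
        p = solve(K, [fi - ci for fi, ci in zip(f_N, matvec(C, u))])
        u = solve(S, [-ci for ci in matvec(Ct, p)])
    return p, u


# Made-up blocks (assumption): the full matrix [[K, C], [C^T, S]] is SPD.
K = [[4.0, 1.0], [1.0, 3.0]]
S = [[2.0, 0.0], [0.0, 2.0]]
C = [[1.0, 0.0], [0.5, 1.0]]
f_N = [1.0, 2.0]
p, u = block_gauss_seidel(K, S, C, f_N, [0.0, 0.0], steps=60)
```

Each sweep solves once with the first diagonal block and once with the second, exactly as in (26); for these sample blocks the iterates settle to the solution of the coupled system within a few dozen sweeps.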

Corollary 2. With $\gamma$ as in (10), the iterates $(u_h^j, p_h^j)$ defined in (27) satisfy

\[ \gamma^{-1} |u_h - u_h^{j+1}|_{1,A} \le \|p_h - p_h^j\|_{\mathrm{div},A} \le \gamma\, |u_h - u_h^j|_{1,A}. \tag{28} \]

Proof. Subtracting (27) from the least-squares mixed discrete equations (4)
and testing with $q_h = p_h - p_h^j$ and with $v_h = u_h - u_h^{j+1}$ results in

(29)
and
\[ |u_h - u_h^{j+1}|_{1,A}^2 = -(u_h - u_h^{j+1}, \mathrm{div}(p_h - p_h^j)). \tag{30} \]
Applying Green's formula and Lemma 1 gives the statement. □

This proves convergence of the Block Gauss-Seidel iteration (26) with a
convergence factor that is independent of the dimension of the subspace. In each
step of the Block Gauss-Seidel iteration, a system with $S$ and one with $D + G$
should be solved. Even though for many finite element spaces this can be
done in optimal computational complexity with multigrid solvers [1], there is
no need to do so more than once if the right start vector is chosen. Indeed,
choosing $u_h^0 := \hat{u}_h$, the standard finite element approximation of the problem
defined via the linear system $S \hat{u}_M = f_M$, and computing $p_h^0$ is already
sufficient in most situations. This procedure costs only one solve with $S$ and one
with $D + G$.

Theorem 1. Let $u_h^0 = \hat{u}_h$ be the standard finite element approximation of
(1) resulting from solving the linear system $S \hat{u}_M = f_M$. Solve $p_N^0$ from
$(D + G)\, p_N^0 = f_N - C \hat{u}_M$. Then, under the assumption that $V_h$ and $\Gamma_h$ satisfy
- $(\Gamma_h, \mathrm{div}\, \Gamma_h)$ is a Babuška-Brezzi stable mixed finite element pair,
- for all $f \in L^2(\Omega)$, the solution $u$ of (1) satisfies $\|u\|_2 \le C |f|_0$,
- $\forall v \in H_0^1(\Omega) \cap H^2(\Omega)$, $\exists v_h \in V_h$, $|v - v_h|_1 \le C h |v|_2$,
- $\forall q \in [H^1(\Omega)]^2$, $\exists q_h \in \Gamma_h$, $|q - q_h|_0 \le C h |q|_1$,
- $\forall q \in [H^2(\Omega)]^2$, $\exists q_h \in \Gamma_h$, $|\mathrm{div}\, q - \mathrm{div}\, q_h|_0 \le C h |q|_2$,
we have that

Proof. The assumptions above imply that $|u_h - \hat{u}_h|_1 \le C h\, \|(u - u_h, p - p_h)\|_{1\times\mathrm{div}}$,
which is a supercloseness result proved in [2]. By equivalence of the
norms defined in terms of $A$ with their counterparts defined by taking $A = I$,
we may switch to $A$ norms. Corollary 2 then shows that $\|p_h - p_h^0\|_{\mathrm{div},A}$ shares
the same upper bound. Hence, the statement follows. □

3 Conclusions

The above shows that under rather weak conditions, $\hat{u}_h$, $p_h^0$ are higher order
perturbations of the least-squares mixed finite element solutions $u_h$, $p_h$. Since
the computation of $\hat{u}_h$, $p_h^0$ requires only one solve with $S$ and one with
$D + G$, they are much cheaper to compute than $u_h$, $p_h$, whereas they cannot be
distinguished from one another. Hence, it does not make sense to apply the
least-squares mixed method under these conditions.
In fact, instead of putting energy into solving for $p_h^0$, it is also possible¹ to
construct yet another approximation for $p_h$ in $\Gamma_h$ by means of a simple local
post-processing of $\hat{u}_h$. It is however not clear when or if such a postprocessed
approximation will also be a higher order perturbation of $p_h$, as $p_h^0$ is.

Acknowledgments

J.H. Brandts was supported by a Research Fellowship of the Royal Netherlands
Academy of Arts and Sciences. Y.P. Chen was supported by National
Science Foundation of China, the Foundation of China State Education Com-
mission and the Special Funds for Major State Basic Research Projects. Both
authors gratefully acknowledge this support, and thank the referee for valuable
comments on an earlier version of this paper.

References
1. Arnold, D.N., Falk, R.S., Winther, R. (1997). Preconditioning in H(div; Ω) and
applications. Math. Comp., 66, 957-984.
2. Brandts, J.H., Chen, Y., Yang, J. (2003). Analysis of least-squares mixed finite
elements in terms of standard and mixed elements. UvA Numerica Preprint 08,
University of Amsterdam, Netherlands. Submitted.
3. Pehlivanov, A.I., Carey, G.F., Lazarov, R.D. (1994). Least-squares mixed finite
elements for second order elliptic problems. SIAM J. Numer. Anal., 31, 1368-1377.
4. Pehlivanov, A.I., Carey, G.F., Vassilevski, P.S. (1996). Least-squares mixed finite
element methods for non-selfadjoint elliptic problems. I. Error estimates. Numer.
Math., 72, 501-522.

¹ We thank Prof. R. Rannacher for pointing this out.
Limit Analysis Method in Electrostatics

Igor A. Brigadnov

North-Western State Technical University, Millionnaya 5, St. Petersburg, 191186,
Russia. brigadnov@nwpi.ru

Summary. The limit analysis problem (LAP) for estimation of electric durability
for a dielectric in a powerful electric field is examined. The appropriate dual problem
is formulated. After the standard piecewise linear continuous finite-element approxi-
mation the dual LAP is transformed into the problem of mathematical programming
with linear equality constraints. This finite dimension problem is effectively solved
by the standard method of gradient projection.

1 Introduction
Investigation of the electrostatic boundary-value problems (BVPs) for di-
electrics in a powerful electric field is of particular interest in both theory
and practice. The current research is motivated by significance and practical
interests in Electrical Engineering and Microelectronics.
The electric state of a medium in a given domain $\Omega \subset \mathbb{R}^3$ is characterized by
the bulk and surface density of charges and by vectors of electric field intensity
$E = \{E_i\} \in \mathbb{R}^3$, electric induction $D = \{D_i\} \in \mathbb{R}^3$ and electric current density
$J = \{J_i\} \in \mathbb{R}^3$. Vector $D$ is introduced by the relation $D = \varepsilon_0 E + P$, where
$\varepsilon_0 \approx 8.85 \cdot 10^{-12}$ is the electric permittivity of a vacuum and $P \in \mathbb{R}^3$ is the
vector of polarization density [11, 14, 16]. For the electric field intensity the
electrostatic potential $u$ is introduced such that $E(x) = -\nabla u(x)$ for almost
every $x \in \Omega$, where $\nabla = \partial/\partial x$ is the differential vector-operator.
In weak electric fields the conductivity current in a dielectrical medium
is practically absent and the simplest linear constitutive relation $E \mapsto D$ is
used [11, 14, 16]. As a result, for the solution of the appropriate linear BVPs
various effective analytical and numerical methods have been worked out, for
example, in [15].
The classic method of estimation of puncture conditions is based on the
point criteria. Namely, it is assumed that the electric puncture sets in when
$\max\{|E(x)| : x \in \Omega\} \ge E_0$, where $E$ is the solution of the linear electrostatic
BVP and $E_0 > 0$ is the critical value, which is measured in physical experiments
on thin plates in a homogeneous electric field [15, 16]. Unfortunately,
for dielectrics with a complex shape in nonhomogeneous electric fields this
method introduces a large error.
In powerful electric fields the essentially nonlinear phenomena of polarization
saturation ($|P| \le P_* < +\infty$) and powerful growth of the electric current
($\partial|J|/\partial|E| \gg E_0$) must be taken into account [9, 11]. As a result, the
complementary physical parameter of the dielectrical medium $\lambda > 0$ always exists
such that $|D - J| \le \lambda < \infty$.
Within the framework of the variational method, the existence of the limit
electrostatic load (external charges for which the electrostatic BVP has no
solution) was pointed out in [6, 7]. From the physical point of view this effect is
treated as a loss of electrostatic balance, i.e. as the beginning of the electric
puncture of the dielectric. For calculation of the limit electrostatic load the original
variational limit analysis problem (LAP) was formulated. From the mathematical
point of view this problem needs a relaxation because its solution belongs
to the space $BV(\Omega)$ of scalar functions with bounded variation having the
generalized gradient as a bounded Radon measure [1, 13, 17].
Unfortunately no clear physical interpretation can be provided for the fully
relaxed LAP. Therefore, the original partial relaxation of LAP was proposed
in [6, 7]. This relaxation is based on the special discontinuous finite-element
approximation (FEA) , which was proposed earlier by the author for LAP in
non-linear elasticity [4, 5]. But after relaxation the appropriate finite dimen-
sional problem becomes ill-conditioned and thus needs special preconditioned
numerical methods as, for example, presented by the author in [3].
In this paper the dual LAP in electrostatics is formulated. It has a clear
physical interpretation and after the standard piecewise linear continuous FEA
is transformed into a problem of mathematical programming with linear equal-
ity constraints. This finite dimensional problem is effectively solved by the
standard method of gradient projection, which is easily adapted for parallel
computations.
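
For readers unfamiliar with gradient projection under linear equality constraints, the idea can be sketched in a few lines of Python. The example below is a toy and not the author's solver: it uses a single constraint $a \cdot x = b$, a smooth quadratic objective (whereas the dual LAP objective is a maximum norm), and hypothetical names and data.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_onto_null_space(g, a):
    """Remove from g its component along a, so that a . (result) = 0."""
    scale = dot(a, g) / dot(a, a)
    return [gi - scale * ai for gi, ai in zip(g, a)]


def gradient_projection(grad, a, b, steps=200, step_size=0.5):
    """Minimize a smooth function subject to the single constraint a . x = b."""
    # Feasible starting point: the minimum-norm solution of a . x = b.
    x = [b * ai / dot(a, a) for ai in a]
    for _ in range(steps):
        # Moving along the projected gradient keeps the iterate feasible.
        direction = project_onto_null_space(grad(x), a)
        x = [xi - step_size * di for xi, di in zip(x, direction)]
    return x


# Toy objective (hypothetical): f(x) = 0.5 * |x - c|^2 with gradient x - c.
c = [1.0, 2.0, 3.0]
a = [1.0, 1.0, 1.0]
b = 3.0
x_opt = gradient_projection(lambda x: [xi - ci for xi, ci in zip(x, c)], a, b)
```

Every iterate satisfies the constraint exactly, because the search direction lies in the null space of the constraint; the same projection step applies independently to each block of constraints, which is one reason such methods parallelize well.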
The numerical results show that the proposed limit analysis method has
a qualitative advantage over the classic technique of estimation of puncture
conditions. This method can be used in Electrical Engineering and Microelec-
tronics.

2 LAP in Electrostatics
Let a dielectrical medium occupy a domain $\Omega \subset \mathbb{R}^3$. The polarization and
ionization properties of the dielectrical medium are described by two constitutive
relations $D = \hat{D}(x, E) : \Omega \times \mathbb{R}^3 \to \mathbb{R}^3$ and $J = \hat{J}(x, E) : \Omega \times \mathbb{R}^3 \to \mathbb{R}^3$, as
shown, for example, in [11, 14, 16]. In practice the relations $D_i = \varepsilon_{ij}(x, E) E_j$
and $J_i = \sigma_{ij}(x, E) E_j$ are used, where $\{\varepsilon_{ij}\}$ and $\{\sigma_{ij}\}$ are the symmetric
second-order tensors of dielectric permittivity and conductivity, respectively.
Here and in what follows the summation rule applies over repeated indices.
For an isotropic medium $\varepsilon_{ij} = \varepsilon \delta_{ij}$ and $\sigma_{ij} = \sigma \delta_{ij}$, where $\varepsilon = \varepsilon(x, |E|)$ and
$\sigma = \sigma(x, |E|)$ are scalar functions and $\delta_{ij}$ is the Kronecker symbol. For a
homogeneous medium $\{\varepsilon_{ij}\}, \{\sigma_{ij}\} = \mathrm{const}(x)$.
Let the following quasi-static electric influences act on the dielectric: a bulk
charge with density $\rho$ in $\Omega$, a surface charge with density $g$ on a portion $\Gamma_2$
of the boundary, and a portion $\Gamma_1$ of the boundary is grounded, i.e. $u \equiv 0$ on
$\Gamma_1$. Here $\Gamma_1 \cup \Gamma_2 = \partial\Omega$, $\Gamma_1 \cap \Gamma_2 = \emptyset$ and $|\Gamma_1| > 0$. Point charges are absent.
In electrostatics it is assumed that an external source of the electric field
compensates the work of the electric current in the dielectric. In accordance
with the classical Thomson principle the true electrostatic potential is the
solution of the following variational problem:

\[ u^* = \arg\inf\{ \Pi(u) - A(u) : u \in V \}, \tag{1} \]

\[ \Pi(u) = \int_\Omega \varphi(x, \nabla u(x))\, d\Omega, \qquad A(u) = \int_\Omega \rho u\, d\Omega + \int_{\Gamma_2} g u\, d\gamma, \]
where $V = \{u : \Omega \to \mathbb{R};\ u(x) = 0,\ x \in \Gamma_1\}$ is the set of admissible electrostatic
potentials, $\varphi$ is the specific and $\Pi(u)$ is the full potential energy of the
electric field, $A(u)$ is the work of an external source on a transference of charges
from infinity to $\Omega$.
In compliance with the Thomson and Joule-Lenz laws the function
$\varphi(x, E)$ is calculated as

\[ \varphi(x, E) = \int_0^1 E_i \big[ \hat{D}_i(x, pE) - \hat{J}_i(x, pE) \big]\, dp. \]
In Fig. 1 experimental constitutive relations $|E| \mapsto |D|$ (line 1) and $|E| \mapsto |J|$
(line 2) for real isotropic dielectrical media are presented [9, 11, 16]. The
appropriate function of effective induction is shown by line 3. It is easily
seen that for every dielectrical medium the scalar $\lambda > 0$ always exists such
that for every $E \in \mathbb{R}^3$ and almost every $x \in \Omega$ the following estimation is true:

\[ \varphi(x, E) \le \lambda(x)\, |E|, \tag{2} \]

where
\[ \lambda(x) = \max\{ |\hat{D}(x, E) - \hat{J}(x, E)| : E \in \mathbb{R}^3 \}. \]

From the physical point of view $\lambda$ is the electric saturation. In what follows,
we consider a homogeneous dielectrical medium for which $\lambda = \mathrm{const}(x)$.
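
The linear bound on the specific energy can be checked numerically for a concrete saturating law. The sketch below assumes a scalar, isotropic effective induction $|D - J| = \lambda \tanh(|E|/E_{ref})$; this model, the constants and the function names are hypothetical and serve only to illustrate the energy integral and the estimate (2).

```python
import math

LAMBDA = 2.0   # hypothetical electric saturation
E_REF = 1.5    # hypothetical field scale of the model


def effective_induction(e):
    """Hypothetical scalar model for |D - J| that saturates at LAMBDA."""
    return LAMBDA * math.tanh(e / E_REF)


def phi(e, n=2000):
    """Specific energy int_0^1 E * (D - J)(p E) dp, by the composite midpoint rule."""
    h = 1.0 / n
    return e * h * sum(effective_induction((k + 0.5) * h * e) for k in range(n))


def phi_exact(e):
    """Closed form of the same integral for the tanh model."""
    return LAMBDA * E_REF * math.log(math.cosh(e / E_REF))


samples = [0.1, 1.0, 5.0, 20.0]
bound_holds = all(phi(e) <= LAMBDA * e for e in samples)
```

For this model the energy grows quadratically for small fields and only linearly, with slope $\lambda$, for large fields, which is the behaviour behind the estimate (2).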
From the estimation (2) it follows [12, 17] that the set of admissible
electrostatic potentials is defined as the following subspace:

\[ V = \{ u \in W^{1,1}(\Omega) : u(x) = 0,\ x \in \Gamma_1 \}. \tag{3} \]

From the mathematical point of view the variational problem (1) can have
no solution because the functional $\Pi(u) - A(u)$ can be unbounded from below
on the set $V$. In particular, after the point $E_0 = |E_*|$ (see Fig. 1) the full
potential energy of the electric field $\Pi(u)$ has growth in $\|u\|_{1,1}$ less than linear.
But the work of the electric field on the external charges $A(u)$ always has linear
growth in $\|u\|_{1,1}$. As a result, for an admissible minimizing sequence $u_m \in V$
with $\|u_m\|_{1,1} \to \infty$ we have $\Pi(u_m) - A(u_m) \to -\infty$ as $m \to \infty$ (for details see
[7]), i.e. the electrostatic variational problem (1) is not well-posed. Namely,
the limit electrostatic load exists, i.e. external charges $(\rho, g)$ with no solution
of the problem (1). From the physical point of view this effect can be treated
as the beginning of the electric puncture of the dielectric because it corresponds
to a loss of the electrostatic balance between an external source of charges and
the dielectrical medium. Here we have the full analogy with some problems of
global stability and fracture in Mechanics of Solids [2, 4, 5, 17].

Fig. 1. Experimental (lines 1, 2) and effective (line 3) constitutive relations

For a definition of the limit electrostatic load we introduce the set of
admissible external charges for which the minimized functional in problem (1) is
bounded from below on $V$ and, therefore, a solution of this problem exists:
\[ B = \{ (\rho, g) \in L^\infty(\Omega) \times L^\infty(\Gamma_2) : \inf(\Pi(u) - A(u) : u \in V) > -\infty \}. \]
The set is non-empty because for small external charges the problem (1) is
transformed into the classic variational problem of linear electrostatics, which
always has a solution [9].
For arbitrary external charges $(\rho_0, g_0) \in B$ we examine the sequence of
charges, which are proportional to the real parameter $t \ge 0$.
Definition 1. The number $t_* \ge 0$ is defined as the limit parameter of
electrostatic loading and $(t_* \rho_0, t_* g_0)$ is the limit electrostatic load, if $(t \rho_0, t g_0) \in B$
for $0 \le t \le t_*$ and $(t \rho_0, t g_0) \notin B$ for $t > t_*$.
As a result, the analysis of the electrostatic balance in a dielectric comes
to the investigation of the set of positive parameters $t$ for which the
one-parametric functional

\[ I_t(u) = \Pi(u) - t A(u) \]

is bounded from below on the set of admissible electrostatic potentials $V$ from
(3). The following basic result has been proven recently by the author in [6, 7].

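Theorem 1. The finite limit parameter of electrostatic loading exists. It is
calculated as the solution of the following limit analysis problem:

\[ t_* = \lambda \inf\Big\{ \int_\Omega |\nabla u(x)|\, d\Omega : u \in V,\ A(u) = 1 \Big\}, \tag{4} \]

where $\lambda$ is the electric saturation from (2).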

From Definition 1 it follows that for $t_* < 1$ the electrostatic variational
problem (1) has no solution. This phenomenon corresponds to the beginning of
the electric puncture of the dielectrical medium. Therefore, the limit analysis
problem (4) is the main problem for the estimation of puncture conditions for
dielectrics of complex shape in powerful nonhomogeneous electric fields, which
addresses one of the modern fundamental problems [11, 14, 16].
It was pointed out in [6, 7] that the solution of LAP (4) belongs to the
space of scalar functions with bounded variation $BV(\Omega) \supset W^{1,1}(\Omega)$. This
space consists of functions $u \in L^1(\Omega)$ having the bounded total variation

where $\partial u$ is the vector Radon measure denoting the gradient of the function $u$
in the sense of distribution theory [1, 13, 17]. In this case
$\int_\Omega |\partial u| = \int_\Omega |\nabla u(x)|\, d\Omega$
for every $u \in W^{1,1}(\Omega)$. Here and in what follows the point between vectors
denotes the scalar product of these vectors in $\mathbb{R}^3$.
The Banach space $BV(\Omega)$ is weak* sequentially compact with the norm
$\|u\|_{BV} = \|u\|_1 + \int_\Omega |\partial u|$; therefore, the mathematically correct and fully relaxed
LAP has the following form:

\[ t_* = \lambda \inf\Big\{ \int_\Omega |\partial u| : u \in BV(\Omega),\ u|_{\Gamma_1} = 0,\ A(u) = 1 \Big\}. \]


Unfortunately, at present this problem has no clear physical interpretation.
Within the framework of the classic approach the original partial relaxation
of LAP can be used [6,7]. This relaxation is based on the special discontinuous
FEA [4, 5]. But after relaxation the appropriate finite dimensional problem be-
comes ill-conditioned and thus needs special preconditioned numerical methods
as, for example, presented by the author in [3].

3 Dual LAP
We construct here the dual LAP for a homogeneous dielectric using methods
from duality theory [10]. Thus, we introduce the set of admissible fields of the
effective induction compensating for the electric field of external charges in the
weak sense [12],

$$ G = \Big\{ D \in L^\infty(\Omega, \mathbb{R}^3) : \int_\Omega (\nabla\cdot D + \varrho)\, u\, d\Omega = 0,\ \int_{\Gamma^2} (n\cdot D - g)\, u\, d\gamma = 0,\ \forall u \in V \Big\}, \tag{5} $$

where $n$ is the unit normal vector on the boundary $\Gamma^2$ and $\nabla\cdot D$ is the
distribution.
The dual LAP is formulated as the problem of finding an admissible effective
induction of minimal intensity

$$ T_* = \inf\{ \|D\|_\infty : D \in G \}, \tag{6} $$

where $\|D\|_\infty = \inf\{ \sup(|D(x)| : x \in \Omega\setminus\omega) : |\omega| = 0 \}$.


Theorem 2. For solutions of the problems (4) and (6) the relation $t_* T_* = \lambda$
is true.

Proof. It is easily verified that dual LAP (6) is equivalent to the problem

$$ T_*^{-1} = \sup\{ v > 0 : D \in G,\ \|vD\|_\infty \le 1 \}. \tag{7} $$

For every value $v > 0$ and field $D \in G$ the following equality is true:

$$ v = v + v \inf\Big\{ \int_{\Gamma^2} (n\cdot D - g)\, u\, d\gamma - \int_\Omega (\nabla\cdot D + \varrho)\, u\, d\Omega : u \in V \Big\}. $$

After integration by parts and taking into account the boundary condition on
$\Gamma^1$, we find

(8)

We introduce the bilinear functional $L(D,u) = \int_\Omega D\cdot\nabla u\, d\Omega$ on
$L^\infty(\Omega,\mathbb{R}^3)\times W^{1,1}(\Omega)$ [12], then from (8) it follows that the problem (7) has
the form

$$ T_*^{-1} = \sup\{ \inf( L(D,u) : u \in V,\ A(u) = 1 ) : D \in G,\ \|D\|_\infty \le 1 \}. $$

For the bilinear functional $L(D,u)$ the classic equality

$$ \sup_D \inf_u L(D,u) = \inf_u \sup_D L(D,u) $$

is fulfilled [10], therefore,

$$ T_*^{-1} = \inf\{ \sup( L(D,u) : D \in G,\ \|D\|_\infty \le 1 ) : u \in V,\ A(u) = 1 \}. $$

For all vectors $E, D \in \mathbb{R}^3$ the relation $|E| = \sup\{ E\cdot D : |D| \le 1 \}$ is
true. As a result, for every $u \in V$ we have the equality completing the proof:

$$ \sup\{ L(D,u) : D \in G,\ \|D\|_\infty \le 1 \} = \int_\Omega |\nabla u(x)|\, d\Omega. $$

We showed that the estimation of the electric durability of a homogeneous
dielectric is equivalent to finding an effective induction of minimal intensity,
which compensates the electric field of external charges. As a consequence
of Theorem 2, if $T_* > \lambda$ then the electrostatic variational problem (1) has
no solution. The problem (6) is well posed because the admissible effective
induction has three independent components satisfying only one differential
equation, i.e. a minimization in two independent components is possible.

4 Finite-element approximation of dual LAP


By the standard FEA for the domain $\Omega \subset \mathbb{R}^n$ ($n = 1,2,3$) the sets $\Omega_h = \cup T_h$
and $\Gamma_h = \partial\Omega_h$ are constructed such that $|\Omega\setminus\Omega_h| \to 0$ and $|\Gamma\setminus\Gamma_h| \to 0$ for
$h \to +0$, where $h$ is the characteristic step of approximation and $T_h$ is a simplex
[8]. Every FEA is described by the set of nodes $\{x^k\}_{k=1}^m$.
For the admissible effective induction the piecewise linear continuous
approximation is used [8]:

$$ D_h(x) = D^k \Psi_k(x) \quad (k = 1,2,\dots,m), $$

with summation over the repeated index $k$, where $D^k = \{D_i^k\} \in \mathbb{R}^3$ is the
admissible effective induction in the node $x^k$, and
$\Psi_k : \Omega_h \to \mathbb{R}$ is a scalar function, continuous and linear on every simplex, such
that $\Psi_k(x^r) = \delta_{kr}$ ($k,r = 1,2,\dots,m$). The support $\mathrm{supp}(\Psi_k)$ consists of the simplices
having $x^k$ as a common node.
After the standard piecewise linear continuous FEA of external charges
$(\varrho_h, g_h)$ and normal vector $n_h$ on the boundary $\Gamma^2$, the set of admissible
effective inductions (5) is approximated by the set

$$ G_h = \{ D^k \in \mathbb{R}^3 : D^k\cdot\nabla\Psi_k(x) + \varrho_h(x) = 0,\ x \in \Omega_h;\ \ n_h(x^r)\cdot D^k \Psi_k(x^r) = g_h(x^r),\ x^r \in \Gamma_h^2 \}, $$

which is a convex set with piecewise linear boundaries, i.e. it is a simplex in
the space of global variables $\mathbb{R}^{3m}$.
As a result, the dual LAP (6) is approximated by the problem of mathematical
programming with linear equality constraints

$$ T_h = \inf\Big\{ \max_{1 \le k \le m} |D^k| : \{D^k\} \in G_h \Big\}. \tag{9} $$

If the number of finite elements equals $m_1$ and the number of nodes on
$\Gamma_h^2$ equals $m_2$ then the number of free variables in the problem (9) equals
$3m - (m_1 + m_2)$. It is easily verified that the minimum number of variables
equals $2n + 1$ ($n = 1,2,3$), which is reached for the domain coinciding with
the simplest n-dimensional simplex because in this case $m = n + 1$, $m_1 = 1$,
$m_2 = n + 1$.
The objective functional in the problem (9) is a combination of convex
hypercones in the space $\mathbb{R}^{3m}$. Therefore, due to the linearity of the constraints in
the set $G_h$ this finite dimensional problem is effectively solved by the standard
method of gradient projection, which is easily adapted for parallel computations.
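
As an illustration of the kind of iteration meant here, the following is a minimal sketch, not the author's implementation. It assumes the discrete problem has the generic form "minimize $\max_k |D^k|$ subject to $Ax = b$", where $x$ stacks the nodal vectors $D^k$. Because this objective is not smooth, the sketch swaps the gradient for a subgradient (a projected subgradient method). The matrix `A`, the vector `b`, the starting point and the step rule are toy assumptions chosen only so that the example runs.

```python
import numpy as np

def project_affine(x, A, b):
    """Orthogonal projection of x onto {x : A x = b} (A assumed full row rank)."""
    y = np.linalg.solve(A @ A.T, A @ x - b)
    return x - A.T @ y

def max_block_norm(x):
    """Objective max_k |D^k| for x holding m stacked 3-vectors."""
    return np.linalg.norm(x.reshape(-1, 3), axis=1).max()

def subgradient(x):
    """One subgradient of max_k |D^k|: unit vector of the largest block."""
    blocks = x.reshape(-1, 3)
    norms = np.linalg.norm(blocks, axis=1)
    k = int(np.argmax(norms))
    g = np.zeros_like(blocks)
    if norms[k] > 0.0:
        g[k] = blocks[k] / norms[k]
    return g.ravel()

def projected_subgradient(A, b, x0, iters=5000, c=0.5):
    """Projected subgradient iteration with diminishing steps c/sqrt(t+1)."""
    x = project_affine(x0, A, b)
    best_x, best_f = x.copy(), max_block_norm(x)
    for t in range(iters):
        step = c / np.sqrt(t + 1.0)
        x = project_affine(x - step * subgradient(x), A, b)
        f = max_block_norm(x)
        if f < best_f:
            best_x, best_f = x.copy(), f
    return best_x, best_f

# Toy data (hypothetical): two nodes, one constraint D^1_x + D^2_x = 2.
A = np.array([[1.0, 0.0, 0.0, 1.0, 0.0, 0.0]])
b = np.array([2.0])
x0 = np.array([2.0, 1.0, 0.0, 0.0, 0.0, 1.0])
D, T_h = projected_subgradient(A, b, x0)
print(T_h)
```

For this toy constraint the minimum of the objective is 1, attained at $D^1 = D^2 = (1,0,0)$, so the printed value should approach 1 from above. The projection step only needs a linear solve with $AA^T$, which is the part that lends itself to parallel computation.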

5 Numerical results

In the numerical experiment the following electrostatic BVP was considered:
an isotropic and homogeneous dielectric has the form of a finite cylindrical rod
with the radius of section $a$ and length $2l$. The small round blocks of dielectric
are covered by conductors having charges $\pm Q$.
In view of the axial symmetry, the initial LAP (4) has the following form:

$$ t_* = \lambda \inf\{ I(u) : u \in V,\ u(r,1) = 1 \}, $$

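$$ I(u) = 2 \int_0^1\!\!\int_0^1 \Big[ \xi^2 \Big(\frac{\partial u}{\partial r}\Big)^2 + \Big(\frac{\partial u}{\partial z}\Big)^2 \Big]^{1/2} r\, dr\, dz, $$

$$ V = \Big\{ u \in W^{1,1}((0,1)\times(0,1)) : u(r,0) = 0,\ \frac{\partial u}{\partial r}(r,0) = 0,\ \frac{\partial u}{\partial r}(0,z) = 0 \Big\}, $$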
where $\lambda$ is the electric saturation from (2) (see Fig. 1), $\xi = l/a$ is the geometric
parameter and $r \in [0,1]$, $\varphi \in [0,2\pi)$, $z \in [0,1]$ are the reduced cylindrical
coordinates [4, 6]. Here $\Omega = (0,1)\times(0,1)$ and $\Gamma^2 = \{r \in [0,1],\ z = 1\}$. It is easily
verified that the puncturing charge is $Q_* = \pi a l\, t_*$.
From Theorem 2 we have $t_* = \lambda/T_*$, where the parameter $T_*$ is the solution
of the appropriate dual LAP (6) on the following set of admissible effective
inductions:

$$ G = \{ D \in L^\infty(\Omega,\mathbb{R}^3) : \nabla\cdot D = 0 \text{ in } \Omega,\ D_z = 1 \text{ on } \Gamma^2,\ D_\varphi = 0 \text{ in } \Omega \}. $$



In the computer experiment a uniform $N \times N$ triangulation of the domain
$\Omega$ was used. As a result, the problem of mathematical programming (9) was
solved for $3(N+1)^2$ variables satisfying $2N^2 + N + 1$ linear equality constraints.
The experimental results for $\lambda = 1$ are shown in Table 1. It is easily seen that
$T_h \searrow T_* = 1$ as $h \to +0$, which fully coincides with the results presented in [6, 7].
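
The variable and constraint counts quoted above can be checked with a short sketch. It assumes that each of the $N \times N$ square cells is split into two triangles and that one constraint is imposed per element and per node on the top boundary $z = 1$; the function names are illustrative and not taken from the paper.

```python
def free_variables(m, m1, m2):
    """Free variables 3m - (m1 + m2) for m nodes, m1 elements, m2 boundary nodes."""
    return 3 * m - (m1 + m2)

def dual_lap_problem_size(N):
    """Sizes for a uniform N x N triangulation of the unit square (assumed layout)."""
    nodes = (N + 1) ** 2       # m
    elements = 2 * N ** 2      # m1: two triangles per square cell
    top_nodes = N + 1          # m2: nodes on the boundary z = 1
    variables = 3 * nodes
    constraints = elements + top_nodes
    return variables, constraints, free_variables(nodes, elements, top_nodes)

for N in (1, 10, 100):
    print(N, dual_lap_problem_size(N))

# Single n-dimensional simplex: m = n + 1, m1 = 1, m2 = n + 1 gives 2n + 1.
print([free_variables(n + 1, 1, n + 1) for n in (1, 2, 3)])
```

The constraint count $2N^2 + N + 1$ is recovered as `elements + top_nodes`, and the single-simplex case reproduces the minimum $2n + 1$ stated in Section 4.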

Table 1. Numerical results.

The above analytical and numerical results are new. They are of practical
interest, but more theoretical and experimental research is desirable. For
example, the presented limit analysis method can be used for the design of
electric insulators.

References
1. Ambrosio, L., Fusco, N., Pallara, D. (2000): Functions of Bounded Variation
and Free Discontinuity Problems. Oxford Univ. Press, New York
2. Brigadnov, I.A. (1996): Existence theorems for boundary value problems of
hyperelasticity. Sbornik: Mathematics, 187, 1-14
3. Brigadnov, I.A. (1996): Numerical methods in non-linear elasticity. In: Desideri,
J.-A., Le Tallec, P., Onate, E., Periaux, J., Stein, E. (eds.) Numerical methods
in engineering. Wiley, Chichester, 158-163
4. Brigadnov, I.A. (1998): Discontinuous solutions and their finite element
approximation in non-linear elasticity. In: Van Keer, R., Verhegghe, B., Hogge,
M., Noldus, E. (eds.) Advanced computational methods in engineering. Shaker
Publishing B.V., Maastricht, 141-148
5. Brigadnov, I.A. (1999): The limited static load in finite elasticity. In: Dorfmann,
A., Muhr, A. (eds.) Constitutive models for rubber. A.A. Balkema, Rotterdam,
37-43
6. Brigadnov, I.A. (2001): Numerical analysis of dielectrics in powerful electrical
fields. Computer Assisted Mech. Eng. Sci., 8, 227-234
7. Brigadnov, I.A. (2002): Variational-difference method for estimation of electrical
durability of dielectrics. Mathematical Modeling, 14(4), 57-66 (in Russian)
8. Ciarlet, Ph.G. (1980): The Finite Element Method for Elliptic Problems.
North-Holland Publ. Co., Amsterdam
9. Duvaut, G., Lions, J.-L. (1972): Les Inéquations en Mécanique et en Physique.
Dunod, Paris
10. Ekeland, I., Temam, R. (1976): Convex Analysis and Variational Problems.
North-Holland Publ. Co., Amsterdam
11. Eringen, A.C., Maugin, G. (1989): Electrodynamics of Continua, Vol. I and II.
Springer, New York
12. Fučík, S., Kufner, A. (1980): Nonlinear Differential Equations. Elsevier Sci. Publ.
Co., Amsterdam-Oxford-New York
13. Giusti, E. (1984): Minimal Surfaces and Functions of Bounded Variation.
Birkhäuser, Boston
14. Landau, L.D., Lifshitz, E.M. (1957): Electrodynamics of Continuum Media.
Gostekhizdat, Moscow (in Russian)
15. Mirolyubov, N.M., Kostenko, M.V., Levinstein, M.L., Tikhodeev, N.N. (1963):
Solving Methods for Electrostatic Fields. Vischaya Shkola, Moscow (in Russian)
16. Tamm, I.E. (1989): Bases of the Electricity Theory. Nauka, Moscow (in Russian)
17. Temam, R. (1983): Problèmes Mathématiques en Plasticité. Gauthier-Villars,
Paris
Finite Element Mesh Adjusted to Singularities
Applied to Axisymmetric and Plane Flow

Pavel Burda¹, Jaroslav Novotný², Bedřich Sousedík³ and Jakub Šístek¹

¹ Department of Mathematics, Faculty of Mechanical Engineering, Czech Univ. of
Technology, Karlovo nam. 13, CZ-121 35 Praha 2, Czech Republic
[email protected]. cz, sistek@seznam.cz
² Institute of Thermomechanics AS CR, Dolejskova 5, CZ-18200 Prague 8, Czech
Republic [email protected]
³ Department of Mathematics, Faculty of Civil Engineering, Czech University of
Technology, Thakurova 7, CZ-166 29 Praha 6, Czech Republic
sousedik@seznam.cz

Summary. We consider the Navier-Stokes equations for incompressible flow in
axisymmetric tubes with abrupt changes of diameter. This paper is based on results for
the asymptotics of the solution in the vicinity of nonconvex internal angles where the
velocities possess an expansion $u(\rho,\vartheta) = \rho^{\gamma}\varphi(\vartheta) + \dots$, where $\rho, \vartheta$ are local spherical
coordinates. Problems with corners, edges, etc. are often successfully solved by the
finite element method using an adaptive strategy usually based on a posteriori error
estimates. In our paper we suggest an alternative approach for the mesh refinement
near the corners, which makes use of the above expansion. It gives very precise
results at low cost. We give numerical results and show the pros and cons of this
approach.

1 Introduction

We consider the Navier-Stokes equations for incompressible fluid flow in
axisymmetric tubes with abrupt changes of diameter. At present, problems with
corners, edges, etc. are often successfully solved by the finite element method
using an adaptive strategy based on a posteriori error estimates. Concerning
linear elliptic equations let us mention the work of I. Babuška and W. C.
Rheinboldt [2], concerning the Stokes problem the paper of M. Ainsworth and J.T.
Oden [1]. Other references can be found in [5]. In [5] we derived an a posteriori
error estimate for the Stokes problem in a 2D polygonal domain, and in [6]
also for the 3D case.
In this paper we apply an alternative approach for the mesh refinement
near the corners, which makes use of the knowledge of the local behaviour of
the solution near the corners.
For the stationary Navier-Stokes equations we proved in [4] that for
nonconvex internal angles the velocities near the corners possess an expansion
$u(\rho,\vartheta) = \rho^{\gamma}\varphi(\vartheta) + \dots$ (smoother terms), where $\rho, \vartheta$ are local spherical
coordinates. E.g. for the angle $\alpha = \frac{3}{2}\pi$ we have $\gamma = 0.5444837$. It is well known
that using the standard finite element method on triangles with polynomials
of degree p = 1,2,3 we have the a priori error estimate

which cannot be improved by increasing the degree of the approximating
polynomials.
In this paper the local behaviour of the solution near the singular point is
used to design, a priori, a mesh which is adjusted to the shape of the solution.
The first part of the paper is devoted to the behaviour of the singularity
near the corner. The second part deals with the impact of the singularity on
the refinement of the mesh. We show an example of the mesh with quadratic
polynomials for velocity. Then we use this mesh for the numerical solution of
flow in the channel with corners.

2 Steady Navier-Stokes Equations Near the Corner


The asymptotic behaviour of plane flow with corner singularities has been
studied e.g. by Kondratiev [11], Ladeveze, Peyret [12].
In this section we deal with pipe flow (axially symmetric). To study the
asymptotic behaviour of the solution of the Navier-Stokes equations for an
incompressible fluid, we utilize the stream function - vorticity formulation,
which in cylindrical geometry reads

(1)

(2)

(3)

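where $r, z$ are cylindrical coordinates, $u_1 = v_z$, $u_2 = v_r$ are velocity
components in the $z$, $r$ directions, respectively, $\omega$ is the vorticity, $\psi$ is the stream
function, and $\nu$ is the viscosity. We assume that all derivatives exist here at
least in the generalized sense.
In [4] we studied the stationary Stokes flow. There, substituting $\omega$, $u_1$, $u_2$
from (2) - (3) into (1) we got

$$ 0 = \nu\Big\{ -\frac{1}{r}\Big( \frac{\partial^4\psi}{\partial z^4} + 2\frac{\partial^4\psi}{\partial z^2\partial r^2} + \frac{\partial^4\psi}{\partial r^4} \Big) - \frac{1}{r^2}\Big( \frac{\partial^3\psi}{\partial z^3} + \frac{\partial^3\psi}{\partial z^2\partial r} + \frac{\partial^3\psi}{\partial z\partial r^2} + \frac{\partial^3\psi}{\partial r^3} \Big) + \frac{3}{r^3}\frac{\partial^2\psi}{\partial z^2} - \frac{3}{r^4}\frac{\partial\psi}{\partial z} \Big\}. \tag{4} $$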
We are interested in the asymptotic behaviour of the solution near the
corners. One example of our solution domain is shown in Figure 1, where the
corners are the points P, Q. The lower edge of the domain coincides with the
axis of symmetry.

Fig. 1. The solution domain $\Omega$
Investigating the equation (4) and using the technique of Kondratiev [11],
we showed in [4] that near the corners such as P or Q, the solution $\psi$ possesses
the expansion

$$ \psi(x,y) = \sum_j \sum_{s=0}^{p_j - 1} a_{js}\, \rho^{-i\lambda_j} \ln^s\!\rho \cdot \psi_{sj}(\vartheta) + w(x,y), \tag{5} $$

where $\rho, \vartheta$ are polar coordinates, $w$ is smooth, and $\lambda_j$ are the poles of
multiplicity $p_j$ of the corresponding resolvent $R(\lambda)$. More specifically, for the internal
angle $\alpha = \frac{3}{2}\pi$, we proved that the leading term of the expansion for the velocity
components is as follows:

$$ u_l(\rho,\vartheta) = \rho^{0.54448374}\varphi_l(\vartheta) + \dots, \quad l = 1,2. \tag{6} $$

Similar results have been proved for the Navier-Stokes equations.
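
The numerical value of the exponent can be cross-checked independently of the resolvent analysis. The sketch below assumes the classical characteristic equation for Stokes flow in a wedge of opening $\alpha$ with no-slip walls, $\sin^2(\lambda\alpha) = \lambda^2\sin^2\alpha$. This equation is not quoted in the paper and is used here only as an assumption. Its root in the bracket $[0.5, 0.6]$ is located by bisection; the function name and the bracket are illustrative.

```python
import math

def stokes_corner_exponent(alpha, lo=0.5, hi=0.6, tol=1e-12):
    """Root of sin(lam*alpha)**2 = (lam*sin(alpha))**2 in [lo, hi] by bisection.

    Assumption: classical characteristic equation of the Stokes (biharmonic)
    problem in a wedge with no-slip walls; not taken from the paper.
    """
    def f(lam):
        return math.sin(lam * alpha) ** 2 - (lam * math.sin(alpha)) ** 2

    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid     # root lies in [mid, hi]
    return 0.5 * (lo + hi)

gamma = stokes_corner_exponent(1.5 * math.pi)
print(f"gamma = {gamma:.7f}")
```

For $\alpha = \frac{3}{2}\pi$ the root agrees with the exponent in (6) to the digits quoted there. Since $\gamma < 1$, the radial derivative of the leading term behaves like $\rho^{\gamma - 1}$ and is unbounded at the corner.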

3 Finite Element Solution to Steady Navier-Stokes Equations

We are concerned with the finite element solution of the stationary Navier-
Stokes equations. We would like to use the information about the asymptotics
of the flow near the singular point, in order to suggest adequate local mesh
refinement.
In [4] we showed that the behaviour near the singular point is the same for the
axisymmetric flow and for the plane flow. So in what follows, for
simplicity we deal with plane flow, in the domain $\Omega$ which has the same shape
as in Figure 1. We investigate the Navier-Stokes equations in primitive
variables $u$ (velocity vector) and $p$ (pressure)

$$ (\nabla u)\cdot u - \nu\Delta u + \nabla p = 0 \quad \text{in } \Omega. \tag{7} $$

For the finite element approximation we take $\Omega$ to be a polygon in $\mathbb{R}^2$, for
simplicity. Let $\{T_h\}_{h\to 0}$ be a regular (cf. [10]) family of triangulations of $\Omega$.
Let $P^m$ be the set of all polynomials of degree at most $m$ ($m > 0$). Let $X_h$, $M_h$ be
the finite element spaces of Hood-Taylor elements (cf. e.g. F. Brezzi, M. Fortin
[3]), i.e.

$$ X_h = \{ u \in H_0^1(\Omega)^2 \cap C(\Omega) : u|_K \in P^2(K)^2,\ K \in T_h \}, \tag{8} $$
$$ M_h = \{ p \in L_0^2(\Omega) : p|_K \in P^1(K),\ K \in T_h \}, $$

cf. [7]. Velocity values are given in corner nodes and midside nodes of the
quadrilateral or the triangle, and pressure values only in corner nodes, in order
to satisfy the Babuška-Brezzi condition [3]. Then the velocity components and the
pressure are approximated as continuous functions of spatial variables.

4 Refinement of FEM Mesh Adjusted to Singularity


Near the corner P where the angle is $\frac{3}{2}\pi$, velocities have the leading term in
the expansion as given in (6). There $\rho$ is the distance from the corner, $\vartheta$ the
angle. Note that $\frac{\partial u_l(\rho,\vartheta)}{\partial\rho} \to \infty$ for $\rho \to 0$.
Now, in this section we assume the Stokes flow, for simplicity. An a priori
estimate of the finite element error is (cf. [10], [8])

(9)

where $k = 2$. Taking into account the expansion (6), and following Johnson's
idea [9], we derived in [4] the estimate

$$ |u|^2_{H^{k+1}(T)} \le C \int_{r_T - h_T}^{r_T} \rho^{2(\gamma - k - 1)} \rho\, d\rho \approx C\, r_T^{2(\gamma - k)}, \tag{10} $$

where $h_T$ is the diameter of the triangle $T$ of a triangulation $T_h$, and $r_T$ is the
distance of the element $T$ from the corner. So, in order to get an error estimate
of the order $O(h^k)$, we should guarantee

$$ h_T^{2k}\, r_T^{2(\gamma - k)} \approx h^{2k}. \tag{11} $$

This led us in [4] to an algorithm for generating the mesh near the corner:
Algorithm. Let $r_1$ be the distance of the large element from the corner. For
given auxiliary stepsize $h$ we compute recursively:
for $i = 1,2,\dots,N$:

5 Model Problem
Consider two-dimensional flow of a viscous, incompressible fluid described by
the Navier-Stokes equations in a domain with corner singularity, cf. Fig. 2.

Fig. 2. Geometry of the channel

Due to symmetry, we solve the problem only on the upper half of the
channel, cf. Figs. 1, 3. On the inflow we consider a parabolic velocity profile,
at the outflow the 'do nothing' boundary condition. On the upper wall the no-slip
condition is imposed and on the lower wall a condition of symmetry is assumed
(i.e. only the y-component of velocity equals zero). We consider the following
parameters: $\nu = 0.0001$ m²/s, $u_{in}$ (max.) = 1 m/s.
The algorithm for mesh refinement described in the previous section is applied to
the corner where the channel or tube suddenly decreases in diameter (forward step,
corresponding to point A in Fig. 1).
We start with $r_1 = 0.25$ mm, $h = 0.1732$ mm, $k = 2$, $\gamma = 0.5444837$. This
corresponds to a contribution of approximately 3% of individual elements to the
global error. This way we get ten diameters of elements, cf. Tab. 1.

Tab. 1. Resulting refinement

 i    r_i (mm)   h_i (mm)
 1    0.25000    0.06316
 2    0.18685    0.05110
 3    0.13575    0.04050
 4    0.09526    0.03129
 5    0.06396    0.02342
 6    0.04054    0.01681
 7    0.02374    0.01138
 8    0.01235    0.00707
 9    0.00528    0.00381
10    0.00147    0.00147
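
The recursion step of the algorithm is not reproduced in the text above. A minimal sketch follows, assuming that condition (11) is solved for the element size, $h_i = h\, r_i^{1 - \gamma/k}$, and that each element starts where the previous one ends, $r_{i+1} = r_i - h_i$. Both formulas are assumptions inferred from (11), not quoted from [4]. The function name and the clipping of the last element at the corner are illustrative choices.

```python
def corner_mesh_sizes(r1, h, k, gamma, n_max=50):
    """Distances r_i and element diameters h_i towards a corner.

    Assumed recursion (inferred from condition (11), not quoted from [4]):
        h_i = h * r_i**(1 - gamma/k),   r_{i+1} = r_i - h_i.
    The last element is clipped so that it ends exactly at the corner.
    """
    rows = []
    r = r1
    for _ in range(n_max):
        h_i = h * r ** (1.0 - gamma / k)
        if h_i >= r:              # element would overshoot the corner
            rows.append((r, r))   # clip: last element reaches the corner
            break
        rows.append((r, h_i))
        r -= h_i
    return rows

# Parameters of the model problem (lengths in mm).
table = corner_mesh_sizes(r1=0.25, h=0.1732, k=2, gamma=0.5444837)
for i, (r_i, h_i) in enumerate(table, start=1):
    print(f"{i:2d}  {r_i:.5f}  {h_i:.5f}")
```

With the starting values given above, the sketch produces ten elements whose sizes agree with Tab. 1 to about four decimal places, which supports this reading of the algorithm.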

6 Design of the mesh detail near the corner


Using the parameters from Table 1, J. Sistek [13] suggested three variants of
mesh refinement near the corner (Figures 3-5). Mesh No. 1 was a classical
mesh used before. He suggested two other variants in order to fit better to
the algorithm of mesh refinement, especially to its polar coordinate nature
(Figures 4-5).

Fig. 3. Mesh No. 1

Fig. 4. Mesh No. 2

Fig. 5. Mesh No. 3

In Figs. 6-7 we present the whole computational mesh and its detail, using
Mesh 3.

Fig. 6. The whole computational mesh

Fig. 7. The mesh - detail

7 Evaluation of the Approximation Error

To evaluate the error we use an a posteriori error estimate derived for the
Stokes problem e.g. in [6]. Here $\{u, p\} = (u, v, p)$ is the vector of the exact
solution, $\{u_h, p_h\} = (\bar u, \bar v, \bar p)$ is the approximate solution computed by the
FEM. The error is $e = (e_u, e_v, e_p) = (u - \bar u, v - \bar v, p - \bar p)$. The a posteriori
estimate is:
(12)
where
$$ U^2(u - \bar u, v - \bar v, p - \bar p, \Omega) = \sum_l \big( \|(e_u, e_v)\|^2_{1,\Omega_l} + \|e_p\|^2_{0,\Omega_l} \big), $$
$$ \mathcal{E}^2(\bar u, \bar v, \bar p, \Omega) = C \sum_l \Big( h_l^2 \int_{\Omega_l} \big( r_1^2(\bar u,\bar v,\bar p) + r_2^2(\bar u,\bar v,\bar p) \big)\, d\Omega + \int_{\Omega_l} r_3^2(\bar u,\bar v,\bar p)\, d\Omega \Big), $$
where $r_1$, $r_2$ and $r_3$ denote the residuals with respect to the 1st and the 2nd
N-S equation and the continuity equation (cf. [7] concerning the choice of the
constant C). To evaluate the error on elements we use the modified absolute
error, defined as

(13)

where $|\Omega|$ is the area of the whole domain and $|\Omega_l|$ is the mean area of elements
obtained as $|\Omega_l| = |\Omega|/n$. Here $n$ denotes the number of all elements in the
domain.

8 Numerical results

In Figures 8-11 we present the graphical output of quantities that characterize
the flow in the channel. In Figs. 8, 9 we observe how strong the singularity is
both for the velocity and for the pressure (note that here the flow is from the
right to the left, to give a better view). Figure 8 shows that the location of the
peak of the singularity of velocity is outside the patch where the refinement
was done. One can see that, again, it is the behaviour of the pressure that is
decisive, cf. Fig. 9.

Fig. 8. Velocity component v (Mesh 3)

Fig. 9. Pressure p (Mesh 3)

Fig. 10. Isolines of v (Mesh 1)

Fig. 11. Streamlines near the forward and backward step.

In Figures 12-13 we present errors on elements near the corner (forward
step) for Mesh 1 and Mesh 2.

9 Conclusions
Pros:
- the distribution of the error on elements is quite uniform (esp. for Mesh 2);
- the strength of the singularity (both for the velocity and the pressure) is well
captured;
- the algorithm of adjusted mesh refinement has been confirmed;
- the efficiency of the algorithm: to achieve the desired precision one needs to
carry out only one computation (with an adaptive approach, the
same precision would require approximately 10 successive refinements).
Cons:
- suitable only for singularities due to "geometry";
- the adaptive approach is much more robust.
Nevertheless, efficient refinement of the mesh near the corners still remains a
challenge.

Fig. 12. Errors on elements (Mesh 1)

Fig. 13. Errors on elements (Mesh 2)

References
1. Ainsworth, M., Oden, J.T. (1997): A posteriori error estimators for the Stokes
and Oseen problems. SIAM J. Numer. Anal., 34, 228-245
2. Babuška, I., Rheinboldt, W.C. (1978): A posteriori error estimates for the finite
element method. Internat. J. Numer. Meth. Engrg., 12, 1597-1615
3. Brezzi, F., Fortin, M. (1991): Mixed and Hybrid Finite Element Methods.
Springer, Berlin
4. Burda, P. (1998): On the F.E.M. for the Navier-Stokes equations in domains
with corner singularities. In: Krizek, M., et al. (eds) Finite Element Methods,
Superconvergence, Post-Processing and A Posteriori Estimates, Marcel Dekker,
New York, 41-52
5. Burda, P. (2000): An a posteriori error estimate for the Stokes problem in a
polygonal domain using Hood-Taylor elements. In: Neittaanmäki, et al. (eds)
ENUMATH 99, Proc. Conf. Numerical Mathematics and Advanced Applications.
World Scientific, Singapore, 448-455
6. Burda, P. (2001): A posteriori error estimates for the Stokes flow in 2D and 3D
domains. In: Neittaanmäki, P., Krizek, M. (eds) Finite Element Methods, 3D.
(GAKUTO Internat. Series, Math. Sci. and Appl., vol. 15) Gakkotosho, Tokyo,
pp. 34-44
7. Burda, P., Novotny, J., Sousedik, B. (2003): Adaptive mesh refinement based on
a posteriori error estimates for the Stokes flow in a 2D Problem. In: F. Brezzi et
al. (eds), ENUMATH 2001, Proc. Conf. Numerical Mathematics and Advanced
Applications, Springer Verlag, pp. 681-690
8. Douglas, J., Jr., Wang, J. (1989): An absolutely stabilized Finite Element
Method for the Stokes problem. Math. Comp., 52, 495-508
9. Johnson, C. (1994): Numerical Solution of Partial Differential Equations by the
Finite Element Method. Cambridge University Press
10. Girault, V., Raviart, P.-A. (1986): Finite Element Methods for Navier-Stokes
Equations. Springer, Berlin
11. Kondratiev, V. A. (1967): Asimptotika reshenija uravnienija Nav'je-Stoksa v
okrestnosti uglovoj tocki granicy. Prikl. Mat. i Mech., 1, 119-123
12. Ladeveze, J., Peyret, R. (1974): Calcul numérique d'une solution avec singularité
des équations de Navier-Stokes: écoulement dans un canal avec variation brusque
de section. Journal de Mécanique, 13, 367-396
13. Sistek, J. (2003): Solution of the fluid flow in a tube with singularity using
suitable mesh refinement (in Czech). Student Conference STČ, ČVUT Praha

Acknowledgements
This research has been supported partly by the GACR Grant No. GA 101/02/0391
and partly by the State Research Project No. J04/98/210000010.
The Edge Stabilization Method for Finite
Elements in CFD

Erik Burman¹ and Peter Hansbo²

¹ Department of Mathematics, Ecole Polytechnique Fédérale de Lausanne,
Switzerland, erik. [email protected]
² Department of Applied Mechanics, Chalmers University of Technology, SE-41296
Göteborg, Sweden, [email protected]

Summary. We give a brief overview of our recent work on the edge stabilization
method for flow problems. The application examples are convection-diffusion, with
small diffusion parameter, and a generalized Stokes model.

1 Introduction

In order for the standard Galerkin finite element method to be stable for
problems in CFD, some care must be taken. For convection-dominated flow
problems, stabilization must be introduced, while for mixed methods the ap-
proximations of velocities and pressure must either be carefully balanced or,
again, stabilized. Examples of stabilization methods are the SUPG/SD-method
[8], the discontinuous Galerkin method [9], the residual free bubbles [2],
subviscosity models [7], and pressure projection methods for the Stokes problem
[5]. The relation between the different approaches is also well understood in
most cases. However for complex flow problems, most of these methods have
drawbacks. The SUPG stabilization becomes non-symmetric and the formula-
tion does not permit lumped mass; the residual free bubbles and discontinuous
Galerkin method add additional degrees of freedom; the projection methods in-
troduce the need of hierarchical meshes for the projection or the sub-viscosity
model. In this paper we give a brief review of our recent work, [3, 4], on
an alternative method originally proposed by Douglas and Dupont [6]. This
method stabilizes convection-diffusion-reaction problems, as well as equal or-
der interpolation methods for the generalized Stokes problem, by adding a
least-squares term based on the jump in the gradient of the discrete solu-
tion over element boundaries. With this simple concept we obtain stability for
convection-reaction-diffusion problems also in the vanishing viscosity limit as
well as for the generalized Stokes problem with equal order interpolation.
The advantage of this method, in comparison with the others mentioned,
is that no additional degrees of freedom are added, no hierarchical meshes are
needed, the formulation remains symmetric, and the mass can be lumped for
efficient time marching and treatment of stiff source terms. The drawback is
an increased number of non-zero elements in the stiffness matrix due to the
fact that the gradient jump term couples neighboring elements. The
implementation also requires an element neighbor data structure that is not necessarily
available in standard finite element codes.

2 Model problems
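As a first model problem, we consider, in $\Omega \subset \mathbb{R}^d$, $d = 2,3$, the problem of
solving
$$ \sigma u + \beta\cdot\nabla u - \nabla\cdot(\varepsilon\nabla u) = f \quad \text{in } \Omega \tag{1} $$
with, for simplicity, $u = 0$ on $\partial\Omega$. Here, $f$ is a given source term, $\beta$ is a given
smooth velocity field, satisfying $\nabla\cdot\beta = 0$, and $\sigma$ and $\varepsilon$ are bounded positive
functions.
The weak form of this problem is to find $u \in H_0^1(\Omega)$ such that

$$ A(u,v) = (f,v) \quad \forall v \in H_0^1(\Omega), \tag{2} $$

where

$$ A(u,v) := \int_\Omega (\sigma u v + \varepsilon\nabla u\cdot\nabla v + \beta\cdot\nabla u\, v)\, dx \quad\text{and}\quad (f,v) := \int_\Omega f v\, dx. $$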

The second model problem that we consider is a generalized Stokes
problem, given by the partial differential equation

$$ \begin{aligned} \sigma u - \nu\Delta u + \nabla p &= f && \text{in } \Omega,\\ \nabla\cdot u &= g && \text{in } \Omega,\\ u\cdot n &= 0 && \text{on } \partial\Omega,\\ \nu\, u\cdot t &= 0 && \text{on } \partial\Omega, \end{aligned} \tag{3} $$

where $\Omega$ is a bounded polygonal domain in $\mathbb{R}^d$ with boundary $\partial\Omega$, $d = 2,3$,
$\sigma$ and $\nu$ are two nonnegative parameters that are not allowed to vanish
simultaneously, and $f$ is a force term. By $n$ we denote the outward pointing normal
to $\Omega$ and $t$ is a vector orthogonal to $n$. This problem can be written in weak
form as follows: Find

when $\nu > 0$, alternatively

for $\nu = 0$, and $p \in Q = L^2(\Omega)/\mathbb{R}$ when $\nu > 0$, alternatively $p \in H^1(\Omega)$ for
$\nu = 0$, such that

$$ a(u,v) + b(p,v) - b(q,u) = L(v,q), \quad \forall (v,q) \in V \times Q, \tag{4} $$

where

$$ a(u,v) = \int_\Omega \sum_{i=1}^{d} \big( \sigma u_i v_i + \nu\nabla u_i\cdot\nabla v_i \big)\, dx, \qquad b(p,v) = -\int_\Omega p\, \nabla\cdot v\, dx, $$

and

$$ L(v,q) = \int_\Omega f\cdot v\, dx - \int_\Omega g q\, dx. $$

In the following, we shall denote the $L^2$-scalar product by $(\cdot,\cdot)$ and the
corresponding norm by $\|\cdot\|$.

3 The finite element methods


For our first model problem, the finite element method consists of seeking
a piecewise polynomial approximation $u_h \in V_h \subset H_0^1(\Omega)$, and for our second
model problem $u_h$ and $p_h$, where $u_h \in W_h \subset V$ and $p_h \in Q_h \subset Q$, with $V_h$,
$W_h$, and $Q_h$ built from continuous functions.
Consider a partitioning of $\Omega$ into a conforming triangulation $T_h$ of affine
simplices $K$. We shall be concerned with the approximation

and a continuous pressure space,

It is well known that Galerkin methods, based on $V_h$, for convection
dominated problems produce severe oscillations in the presence of rapidly varying
features of the solution, and that the combination $W_h \times Q_h$ is unstable for the
generalized Stokes problem. The stabilization method is similar in both cases.
Our finite element methods are defined as follows.
Model problem I: find $u_h \in V_h$ such that
$$ A(u_h, v) + J(u_h, v) = (f, v) \quad \forall v \in V_h, \tag{5} $$
where
$$ J(u,v) = \sum_K \frac{1}{2} \int_{\partial K} \gamma h_{\partial K}^2\, [\nabla u]\cdot[\nabla v]\, ds = \sum_K \frac{1}{2} \int_{\partial K} \gamma h_{\partial K}^2\, [n_K\cdot\nabla u]\,[n_K\cdot\nabla v]\, ds. \tag{6} $$
Here, $h_{\partial K}$ is the size of $\partial K$, $[q]$ denotes the jump of $q$ across $\partial K$ for
$\partial K \cap \partial\Omega = \emptyset$, $[q] = 0$ on $\partial K \cap \partial\Omega \neq \emptyset$, $n_K$ is the outward pointing unit normal to $K$, and
$\gamma$ is a constant. We also introduce the local mesh size $h_K$ as the largest of the
$h_{\partial K}$ associated with element $K$, and we will assume that $h_K/h_{\partial K} < C$ where
$C$ is a fixed constant.
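
To make the structure of the jump term concrete, here is a minimal one-dimensional sketch. It is an analogue, not the method of the paper: in 1D the element boundaries are the interior nodes, the jump $[\nabla u]$ becomes the difference of the slopes of a piecewise linear function on the two neighboring elements, and each interior node is counted once. Taking the local size at a node as the mean of the two adjacent element lengths is an assumption made for this illustration.

```python
import numpy as np

def gradient_jump_penalty(x, u, v, gamma=1.0):
    """1D analogue of J(u, v): sum over interior nodes of gamma * h^2 * [u'][v'].

    x holds the node coordinates; u and v hold nodal values of piecewise
    linear functions. The local size h at an interior node is taken as the
    mean of the two adjacent element lengths (an illustrative assumption).
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    h = np.diff(x)                 # element lengths
    du = np.diff(u) / h            # slope of u on each element
    dv = np.diff(v) / h            # slope of v on each element
    jump_u = du[1:] - du[:-1]      # slope jumps at interior nodes
    jump_v = dv[1:] - dv[:-1]
    h_node = 0.5 * (h[1:] + h[:-1])
    return gamma * float(np.sum(h_node ** 2 * jump_u * jump_v))

x = np.linspace(0.0, 1.0, 5)
print(gradient_jump_penalty(x, 2 * x + 1, 2 * x + 1))              # smooth: no jumps
print(gradient_jump_penalty(x, np.abs(x - 0.5), np.abs(x - 0.5)))  # kink at x = 0.5
```

A globally linear function has no slope jumps and is not penalized, which mirrors the consistency property of the method. The kinked function $|x - 0.5|$ is penalized only at the node where its slope changes.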

Model problem II: Find (uh,ph) E W h X Qh such that

a(uh, v) + b(ph' v) + 3(Uh, v) = L(v, 0) in [2,


(7)
b(q, u) - j(Ph, q) = L(O, q) in [2,

for all (v,q) E W h X Qh, where

(8)

and
3(u, v) := r 'Yh~l [V' . u][V' . v] ds,
LK ~218K (9)

For these methods we have, under additional regularity conditions, con-


sistency in that the exact solution fulfills the discrete equations (because the
jumps in the continuous variables vanish). What we also need to show is sta-
bility. The stability estimates obtained using edge stabilization is less imme-
diate than that obtained in the case of streamline diffusion or discontinuous
Galerkin. For our first model problem, it is well known (cf. [8]) that the crucial
term to control is IlhiP,6· V'uhll. For a Galerkin method, no control whatsoever
of this term exists. In a discontinuous Galerkin method one exploits the fact
that hK,6 . V'Uh is in the finite element test space and hence can be chosen
as test function, while in a streamline diffusion method it is simply included
in the test space. In the case of edge stabilization we do not have hK,6 . V'Uh
in the finite element space. However, something which is close is there (e.g.,
its interpolant), and the point is that the difference is controlled by the edge
stabilization. Thus, thanks to the term J(Uh' v), we get the necessary control
of Ilh~P,6· V'uhll·
For the second model problem the jump terms similarly give us control of
Ilh~PV'Phll necessary in the Stokes case and also Ilh~PV' . uhll needed in the
Darcy case. We then need to separate the Darcy case, where we use s = 1,
and the Stokes problem, where we use $s = 2$. This is similar to streamline
diffusion stabilization in the Navier-Stokes case, where the streamline diffusion
stabilization parameter must change from $O(h_K^2)$ to $O(h_K)$ when going from
viscosity-dominated to convection-dominated situations.
The details are given in [3, 4]; here we just give an indication of where the
stability comes from, exemplified using model problem I. More precisely, we
shall show that there exists some $\gamma \ge \gamma_0 > 0$ such that

for $\beta$ constant. Extensions to $\beta \in W^{1,\infty}$ are straightforward. To this end, let $N_i$
be the set of all triangles $K_i$ containing node $i$ and assume that the cardinality
of $N_i$ is bounded uniformly in $i$. Let $\Gamma_K$ be the set of all test functions $\varphi_i$ such
that $K \subset \mathrm{supp}\,\varphi_i$ and $\Omega_i = \cup_{N_i} K_i$. We will consider a function $w \in P_0(K)$,
and its representation in the finite element basis $\tilde w$ defined by

$$ \tilde w|_K = w|_K \sum_{i\in\Gamma_K} \varphi_i. \tag{10} $$

It follows that $\tilde w = w$ everywhere except on elements adjacent to Dirichlet
boundaries where the boundary nodes are not included in the finite element
space. We have:
Lemma 1. Suppose that $K$ is an element with at least one node on a Dirichlet
boundary. Then
$$ \|w\|_K^2 = \frac{d+1}{n_i}\,(\tilde w, w)_K, \tag{11} $$
where $n_i$ denotes the number of interior nodes of the element.
Proof. The proof is immediate noting that

We will now proceed to prove that

The operator $\pi_h : P_0(K) \to V_h$, which denotes the lowest order Clément
operator, is constructed as follows:

$\pi_h w = \sum_i w_i \varphi_i$, where $w_i = \frac{1}{m(D_i)} \sum_{K_i \in N_i} w|_{K_i}\, m(K_i)$. (12)
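The nodal averaging in (12) can be illustrated by a minimal sketch in one space dimension, where the patch $D_i$ of a node consists of the one or two cells touching it and $m(\cdot)$ is the cell length. The function name `clement_nodal_values` and the sample mesh are hypothetical and only serve the illustration.

```python
import numpy as np

def clement_nodal_values(nodes, w):
    """Measure-weighted nodal averages of a piecewise constant function (1D analogue of (12)).

    nodes : sorted array of n+1 mesh points; cell k is [nodes[k], nodes[k+1]]
    w     : array of n cell values
    """
    nodes = np.asarray(nodes, dtype=float)
    w = np.asarray(w, dtype=float)
    m = np.diff(nodes)              # m(K_k): cell lengths
    num = np.zeros(len(nodes))      # sum of w|_K * m(K) over the patch of each node
    den = np.zeros(len(nodes))      # m(D_i): measure of the patch of each node
    num[:-1] += w * m               # each cell contributes to its left node
    den[:-1] += m
    num[1:] += w * m                # ... and to its right node
    den[1:] += m
    return num / den

nodes = np.array([0.0, 1.0, 3.0, 4.0])
w = np.array([2.0, 5.0, 5.0])
w_nodal = clement_nodal_values(nodes, w)
```

In this example the nodal value differs from the neighbouring cell values only at the node where $w$ jumps, which is the mechanism by which the jump term controls the projection error.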

In the following we will also write $w|_{K_i} - w|_K = \sum_{K_i}^{K} [w]$, with $[w]$ denoting
the jump across element boundaries, where the sum is taken over the shortest
"path" from element $K_i$ to element $K$.
It is now straightforward to show that the projection error is controlled by
the operator $J_s(w, w)$:

$\|h^{s/2}(\pi_h w - \tilde w)\|^2 = \sum_K \int_K h_K^s \Big( \sum_{i \in F_K} \Big( \frac{1}{m(D_i)} \sum_{K_i \in N_i} w|_{K_i}\, m(K_i) \Big) \varphi_i - \tilde w \Big)^2 dx$
$= \sum_K \int_K h_K^s \Big( \sum_{i \in F_K} \frac{1}{m(D_i)} \sum_{K_i \in N_i} (w|_{K_i} - w|_K)\, m(K_i)\, \varphi_i \Big)^2 dx$
$\le C \sum_K \int_K h_K^s \sum_{i \in F_K} \frac{1}{m(D_i)^2} \sum_{K_i \in N_i} \Big\{ \sum_{K_i}^{K} [w] \Big\}^2 m(K_i)^2 \, dx$
$\le c \sum_K \int_{\partial K} h_K^{s+1} [w]^2 \, ds \le c\, J_s(w, w)$,

where we used the upper bound on the number of triangles neighboring
a node and a scaling argument. We have proved the following:
Lemma 2. If $w$ is some piecewise constant function, $\tilde w$ is defined by (10) and
$\pi_h$ is the Clément interpolant on $V_h$, then the edge stabilization term satisfies

(13)

for some $\gamma \ge \gamma_0 > 0$ independent of $h_K$.


Note that by the construction of $\tilde w$ we get less stabilization in elements
adjacent to Dirichlet boundaries than in the interior of the domain. Hence
we expect to get poorer stabilizing properties close to sharp outflow layers
(when diffusion is present), something which is confirmed by the numerical
experiments.

4 Numerical examples

4.1 Convection-diffusion

We consider the domain $(0, 1) \times (0, 1)$, with $\sigma = 0$, $\varepsilon = 10^{-6}$, and $\beta = (1-y, x)$.
The boundary condition on the inflow boundary is $u = 0$ at $y = 0$ and $u = 0$
at $x = 0$ except for $0.3 < y < 0.5$, where $u = 1$. We compare the numerical
solutions using the present method and the streamline diffusion method in
Figure 1. The results are of comparable quality.
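That the sides $x = 0$ and $y = 0$ are indeed the inflow boundary for $\beta = (1-y, x)$ can be checked with a short sketch, using the usual definition of inflow as $\beta\cdot n < 0$ with $n$ the outward unit normal. The helper names below are hypothetical.

```python
import numpy as np

def beta(x, y):
    """Convection field of the test case, beta = (1 - y, x)."""
    return np.array([1.0 - y, x])

# Outward unit normals of the four sides of the unit square.
normals = {
    "x=0": np.array([-1.0, 0.0]),
    "x=1": np.array([1.0, 0.0]),
    "y=0": np.array([0.0, -1.0]),
    "y=1": np.array([0.0, 1.0]),
}

def side_points(side, n=9):
    """Sample points strictly inside a side (corners excluded)."""
    s = np.linspace(0.0, 1.0, n + 2)[1:-1]
    if side == "x=0":
        return [(0.0, t) for t in s]
    if side == "x=1":
        return [(1.0, t) for t in s]
    if side == "y=0":
        return [(t, 0.0) for t in s]
    return [(t, 1.0) for t in s]

def is_inflow(side):
    """A side counts as inflow if beta . n < 0 at every sampled point."""
    return all(beta(x, y) @ normals[side] < 0.0 for x, y in side_points(side))

inflow_sides = sorted(side for side in normals if is_inflow(side))
```

The two remaining sides, $x = 1$ and $y = 1$, have $\beta\cdot n > 0$ and form the outflow boundary, where the sharp layers mentioned above are expected.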

Fig. 1. Streamline diffusion (left) and gradient jump stabilization (right) for
convection-diffusion

4.2 Stokes problem

To illustrate the absence of boundary layers in the pressure using the gradient
jump method, we consider the Poiseuille flow problem on $(0,4) \times (0,1)$ on an
unstructured mesh. The inflow boundary velocity is $u = (y(y - 1), 0)$, and we
have set $\sigma = 0$ and $\nu = 1$. At $x = 4$, we apply a natural boundary condition.
We solve this problem using P1P1-approximations with the streamline diffusion
method and the gradient jump method on the mesh shown in Figure 2.
Also shown is the computed velocity field (indistinguishable for the different
methods). Finally, in Figure 3, we show the pressure isolines for the streamline
diffusion method (top) and the gradient jump method (bottom).

Fig. 2. Mesh and computed velocity field

References

1. Brezzi, F., Fortin, M. (1991): Mixed and Hybrid Finite Element Methods.
Springer, New York
2. Brezzi, F., Hughes, T.J.R., Marini, L.D., Russo, A., Süli, E. (1999): A priori
error analysis of residual-free bubbles for advection-diffusion problems. SIAM J.
Numer. Anal., 36, 1933-1948
3. Burman, E., Hansbo, P. (2002): Edge stabilization for Galerkin approximations of
convection-diffusion problems. Chalmers Finite Element Center Preprint 2002-17
4. Burman, E., Hansbo, P. (2003): Edge stabilization for the generalized Stokes
problem: a continuous interior penalty method. Chalmers Finite Element Center
Preprint 2003-16
5. Codina, R., Blasco, J. (2000): Analysis of a pressure-stabilized finite element
approximation of the stationary Navier-Stokes equations. Numer. Math., 87, 59-81

Fig. 3. Pressure isolines for streamline diffusion (top) and the gradient jump method
(bottom)

6. Douglas, J., Dupont, T. (1976): Interior penalty procedures for elliptic and
parabolic Galerkin methods, in: Glowinski, R., Lions, J.L. (eds) Computing
Methods in Applied Sciences. Springer, Berlin
7. Guermond, J.L. (1999): Stabilization of Galerkin approximations of transport
equations by subgrid modeling. M2AN Math. Model. Numer. Anal., 33, 1293-1316
8. Johnson, C., Nävert, U., Pitkäranta, J. (1984): Finite element methods for linear
hyperbolic equations. Comput. Methods Appl. Mech. Engrg., 45, 285-312
9. Johnson, C., Pitkäranta, J. (1986): An analysis of the discontinuous Galerkin
method for a scalar hyperbolic equation. Math. Comp., 46, 427-444
Analysis and Computation of Dendritic Growth
in Binary Alloys Using a Phase-Field Model

Eric Burman¹,², Marco Picasso¹ and Jacques Rappaz¹

¹ Institut d'Analyse et Calcul Scientifique, Ecole Polytechnique Fédérale de
Lausanne, 1015 Lausanne, Switzerland
² supported by the Swiss National Foundation

Summary. A solutal, anisotropic phase-field model for dendritic growth of an
isothermal binary alloy is considered. Existence of a weak solution is established
provided the physical anisotropy is small enough. A semi-implicit finite element method
is proposed to solve the problem. A priori and a posteriori error estimates are
derived when the physical anisotropy is small. An adaptive algorithm which aims at
producing meshes with large aspect ratio is proposed. Numerical results show that
accurate solutions can be obtained, even when the physical anisotropy is large.

1 Introduction

A phase-field model for the dendritic growth in a domain $\Omega$ of an isothermal
binary alloy is considered. The mathematical model contains the physical
descriptions of [22, 20] and consists in a parabolic system of nonlinear equations
set in $\Omega$. The unknowns are the phase-field $\phi$, which is an order parameter
taking the value 1 in the solid phase and 0 in the liquid phase (see Fig. 1),
and the concentration $c$ of the binary alloy. In the phase-field approach, $\phi$ is
regularized: both $\phi$ and $c$ vary rapidly but smoothly across the thin
solid-liquid interface. A typical profile of $\phi$ and $c$ across the horizontal abscissa $x_1$
is presented in Fig. 2.

Fig. 1. The phase-field $\phi$ (labels in the figure: thickness of the solid-liquid interface; $\phi = 0$ in the liquid; abscissa $x_1$)

Fig. 2. Typical profiles of the phase field (left) and concentration field (right). The
phase-field has values zero or one, except in the phase change region. The concentra-
tion field changes rapidly across the phase change region, but may also vary outside
the phase change region.

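A general mathematical formulation of this solidification problem is the
following. Given a bounded domain $\Omega$ of $\mathbb{R}^2$ with boundary $\partial\Omega$ and outer unit
normal $n$, two functions $c_0, \phi_0 : \Omega \to \mathbb{R}$ and a time interval $(0, T)$ we consider
the problem of finding $\phi, c : \Omega \times (0, T) \to \mathbb{R}$ such that

$\alpha \frac{\partial\phi}{\partial t} - \operatorname{div}(A(\nabla\phi)\nabla\phi) - S(c, \phi) = 0$ in $\Omega \times (0, T)$, (1)

$\frac{\partial c}{\partial t} - \operatorname{div}(D_1(\phi)\nabla c + D_2(c, \phi)\nabla\phi) = 0$ in $\Omega \times (0, T)$, (2)

$A(\nabla\phi)\nabla\phi \cdot n = 0$ on $\partial\Omega \times (0, T)$, (3)
$(D_1(\phi)\nabla c + D_2(c, \phi)\nabla\phi) \cdot n = 0$ on $\partial\Omega \times (0, T)$, (4)
$\phi(0) = \phi_0$, $c(0) = c_0$ in $\Omega$. (5)
Here $\alpha \in \mathbb{R}$ is a positive parameter and $A(\cdot)$ is the matrix defined for $\xi \in
\mathbb{R}^2 \setminus \{0\}$ by
(6)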

where $\theta(\xi)$ denotes the angle between the vector $\xi$ and a preferential direction, and
$a$ is the real-valued function defined by

$a(\theta) = 1 + \bar a \cos(\kappa\theta)$, (7)

with $\bar a \ge 0$ (the anisotropy parameter) and $\kappa$ a positive integer corresponding
to the number of dendritic branching directions. Without loss of generality, the
horizontal abscissa $x_1$ can be chosen as the preferential direction and we have

where $e_1$ is the unit vector in the horizontal direction. From the physical point
of view, (1) is nothing but a generalization of a mean curvature (or Allen-Cahn)
equation, see for instance [9, 21] for a general presentation. This equation
is obtained by taking the derivative of a free energy functional accounting
for phase transformation, double well barriers and an interfacial anisotropic
energy contribution

It can be noticed that the derivative $DE$ of $E$ with respect to $\nabla\phi$ is given by

$\langle DE(\nabla\phi), \nabla\psi \rangle = \int_\Omega A(\nabla\phi)\nabla\phi \cdot \nabla\psi \, dx$,

which is the weak form corresponding to the second term of (1). Due to the
double well barriers, the function $S(\cdot,\cdot)$ in (1) contains small parameters that
force the phase-field $\phi$ to take values 0 or 1, except in a small region, see again
Fig. 1. Finally, equation (2) corresponds to solute conservation. We refer to
[22, 20] for details about the physical derivation of the model.

2 Existence

Mathematical and numerical studies corresponding to (1)-(5) have already
been presented in the isotropic case, that is when $\bar a = 0$, thus $A(\cdot) = I$. More
precisely, existence has been proved in [17] using a Faedo-Galerkin method,
a finite element procedure together with a priori error estimates has been
proposed in [13] and adaptive finite elements have been presented in [14].
Existence of a weak solution in the anisotropic case $\bar a \neq 0$ has been proved in
[8]. More precisely, if $\phi_0, c_0 \in L^2(\Omega)$, if $S$, $D_1$ and $D_2$ are continuous, bounded
Lipschitz functions satisfying, for all $\phi$ in $\mathbb{R}$:

(8)

and if $\bar a < \frac{1}{\kappa^2-1}$, then problem (1)-(5) has a weak solution

such that

$\alpha \left\langle \frac{\partial\phi}{\partial t}, v \right\rangle + \int_\Omega A(\nabla\phi)\nabla\phi \cdot \nabla v - \int_\Omega S(c, \phi)\, v = 0$,

$\left\langle \frac{\partial c}{\partial t}, w \right\rangle + \int_\Omega D_1(\phi)\nabla c \cdot \nabla w + \int_\Omega D_2(c, \phi)\nabla\phi \cdot \nabla w = 0$,

for all $v, w \in H^1(\Omega)$, a.e. $0 < t \le T$. Here, $\langle \cdot, \cdot \rangle$ denotes the duality
pairing between $H^1(\Omega)$ and its dual. Moreover, if the functions $S$ and $D_2$
satisfy, for all $\phi$, $c$ in $\mathbb{R}$,

$S(c, 0) = S(c, 1) = 0$,
$D_2(0, \phi) = D_2(1, \phi) = 0$,

they can be extended by zero outside the interval (0,1), and a maximum
principle holds for $\phi$ and $c$, that is, if $0 \le \phi_0, c_0 \le 1$, then $0 \le \phi(t), c(t) \le 1$
for all $t$.
In order to prove the existence result, the implicit Euler scheme is considered.
Given an integer $N$, we set $\tau = T/N$ the time step, $t^n = n\tau$, $n = 0, \dots, N$,
$\phi^0 = \phi_0$ and $c^0 = c_0$. For $n = 1, 2, \dots, N$, $\phi^{n-1}$ and $c^{n-1}$ being known, let $\phi^n$
and $c^n$ be approximations of $\phi(t^n)$ and $c(t^n)$ such that

in $\Omega$, (9)

in $\Omega$. (10)

Existence of $\phi^n$ is obtained by rewriting the first equation as a minimization
problem. The functional to be minimized is defined by the potential

where the function $f$ is such that $\frac{\partial f}{\partial\phi}(c, \phi) = -S(c, \phi)$. The existence of a unique
minimizer is a consequence of the fact that the Ginzburg-Landau potential $E(\cdot)$
defined by

$E(\xi) = \frac{1}{2} \int_\Omega a^2(\theta(\xi))\, |\xi|^2 \, dx$

is strictly convex for all $\xi \in L^2(\Omega)^2$, whenever $\bar a < \frac{1}{\kappa^2-1}$.
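The threshold on the anisotropy parameter can be checked numerically with a small sketch. It relies on a standard criterion that is not derived in the text and is taken here as an assumption: the integrand $\xi \mapsto a^2(\theta(\xi))|\xi|^2/2$ is strictly convex when $a(\theta) + a''(\theta) > 0$ for all $\theta$. For $a(\theta) = 1 + \bar a\cos(\kappa\theta)$ this quantity equals $1 + \bar a(1 - \kappa^2)\cos(\kappa\theta)$. The names `a_bar`, `kappa` and `stiffness` are hypothetical.

```python
import numpy as np

def stiffness(theta, a_bar, kappa):
    """a(theta) + a''(theta) for a(theta) = 1 + a_bar * cos(kappa * theta)."""
    return 1.0 + a_bar * (1.0 - kappa**2) * np.cos(kappa * theta)

def is_strictly_convex(a_bar, kappa, n=3600):
    """Sample a + a'' on [0, 2*pi) and test positivity (assumed convexity criterion)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return bool(np.all(stiffness(theta, a_bar, kappa) > 0.0))

kappa = 4
threshold = 1.0 / (kappa**2 - 1)          # 1/15 for four branching directions
small = is_strictly_convex(0.04, kappa)   # anisotropy below the threshold
large = is_strictly_convex(0.1, kappa)    # anisotropy above the threshold
```

The minimum of $a + a''$ is attained at $\theta = 0$ and equals $1 - \bar a(\kappa^2 - 1)$, so the sampled test and the closed-form threshold agree.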

3 Finite elements and error estimates


3.1 A priori error estimates

For any $0 < h < 1$, let $\mathcal{T}_h$ be a conforming triangulation of $\Omega$ into triangles
$K$ with diameter $h_K$ less than $h$. Let $V_h$ be the usual finite element space of
continuous, piecewise linear functions on the triangles of $\mathcal{T}_h$. The finite element
scheme corresponding to (9)-(10) is considered. For each $n = 1, \dots, N$, we are
first looking for $\phi_h^n$ in $V_h$ such that

for all $v_h \in V_h$ and then for $c_h^n$ in $V_h$ such that

for all $w_h \in V_h$.
A priori error estimates in the $L^2(0, T; H^1(\Omega))$ norm have been proved in
[5] in the case when $\bar a < \frac{1}{\kappa^2-1}$.
The convergence proof strongly relies on the
strong monotonicity of the operator $A$. More precisely, the following result is
used to prove convergence.

Lemma 1. Let $A(\cdot)$ be the operator defined by (6) and let the convexity condition
$\bar a < \frac{1}{\kappa^2-1}$
hold. Then there exists $\mu$ (depending on $\bar a$) such that, for all
$\phi, \psi \in H^1(\Omega)$, we have

$\mu \|\nabla(\phi - \psi)\|^2_{L^2(\Omega)} \le (A(\nabla\phi)\nabla\phi - A(\nabla\psi)\nabla\psi, \nabla(\phi - \psi))$. (13)

Then, assuming that the solution $(\phi, c)$ is smooth, the following result is
proved in [5].
Theorem 1. Let $\phi$, $c$ be the weak solution of (1)-(5), let $\phi_h^n$, $c_h^n$ be defined by
(11)-(12). Assume that

$(\phi, c) \in \left( L^2(0, T; H^3(\Omega)) \cap L^\infty(0, T; H^2(\Omega) \cap W^{1,\infty}(\Omega)) \cap H^1(0, T; H^1(\Omega)) \right)^2$.

Then, there is a constant $C$ such that, for all $h, \tau > 0$, we have

$\sum_{n=1}^N \tau \|\nabla(\phi_h^n - \phi(t^n))\|^2_{L^2(\Omega)} + \sum_{n=1}^N \tau \|\nabla(c_h^n - c(t^n))\|^2_{L^2(\Omega)}$
3.2 A posteriori error estimates for meshes with high aspect ratio

As in [15, 16], our goal is to perform adaptive finite elements with high aspect
ratio for solving numerically (1)-(5) in the general case when $\bar a \neq 0$. The
general framework of [10, 11] is considered.
For any triangle $K$ of the mesh, let $T_K : \hat K \to K$ be the affine transformation
which maps the reference triangle $\hat K$ onto $K$. Let $M_K$ be the Jacobian of
$T_K$, that is
$x = T_K(\hat x) = M_K \hat x + t_K$.
Since $M_K$ is invertible, it admits a singular value decomposition
$M_K = R_K^T \Lambda_K P_K$, where $R_K$ and $P_K$ are orthogonal and where $\Lambda_K$ is diagonal with
positive entries. In the following we set

$\Lambda_K = \begin{pmatrix} \lambda_{1,K} & 0 \\ 0 & \lambda_{2,K} \end{pmatrix}$ and $R_K = \begin{pmatrix} r_{1,K}^T \\ r_{2,K}^T \end{pmatrix}$,

with the choice $\lambda_{1,K} \ge \lambda_{2,K}$. A simple example of such a transformation is
$x_1 = H \hat x_1$, $x_2 = h \hat x_2$, with $H \ge h$, thus
$\lambda_{1,K} = H$, $\lambda_{2,K} = h$, $r_{1,K} = (1, 0)^T$, $r_{2,K} = (0, 1)^T$,
see Fig. 3.

Fig. 3. A simple example of transformation from the reference element $\hat K$ to $K$
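The stretching values $\lambda_{1,K} \ge \lambda_{2,K}$ and directions $r_{1,K}$, $r_{2,K}$ can be computed from the vertices of a triangle with a singular value decomposition, as in the following minimal sketch. It assumes that the reference triangle has vertices $(0,0)$, $(1,0)$, $(0,1)$ (the text does not fix this choice); the function name is hypothetical.

```python
import numpy as np

def anisotropy_of_triangle(vertices):
    """Stretching values and directions of the affine map from the reference triangle.

    vertices : (3, 2) array; the reference triangle is assumed to be
               (0,0), (1,0), (0,1), so M_K has columns v1 - v0 and v2 - v0.
    Returns (lam, R) with lam[0] >= lam[1] > 0 and the rows of R the directions r_1, r_2.
    """
    v = np.asarray(vertices, dtype=float)
    M = np.column_stack((v[1] - v[0], v[2] - v[0]))
    # numpy returns M = U @ diag(s) @ Vt with s sorted in decreasing order,
    # so R_K^T = U, Lambda_K = diag(s), P_K = Vt.
    U, s, Vt = np.linalg.svd(M)
    return s, U.T

H, h = 1.0, 0.05
lam, R = anisotropy_of_triangle([(0.0, 0.0), (H, 0.0), (0.0, h)])
```

For this axis-aligned triangle the sketch recovers $\lambda_{1,K} = H$ and $\lambda_{2,K} = h$; the directions are only determined up to sign, which does not affect the quadratic forms in which they are used.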

In the framework of meshes with high aspect ratio, the classical minimum
angle condition must be avoided. However, it is required that, for each vertex,
the number of neighbouring vertices is bounded above, uniformly with respect
to the mesh size $h$. Also, for any patch $\Delta_K$ (the set of triangles having a vertex
in common with $K$), the diameter of the corresponding reference patch $\Delta_{\hat K}$, that
is $\Delta_{\hat K} = T_K^{-1}(\Delta_K)$, must be uniformly bounded independently of $h$. This latter
hypothesis excludes some distorted patches, see Fig. 4. Let $I_h : H^1(\Omega) \to V_h$
be a Clément or Scott-Zhang like interpolation operator. We now recall some
interpolation results due to [10, 11].

Lemma 2. There is a constant $\hat C$ depending only on the reference element $\hat K$
such that, for all $v \in H^1(\Omega)$, for all $K \in \mathcal{T}_h$, we have

Here we have set

$\omega_K^2(v) = \lambda_{1,K}^2 (r_{1,K}^T G_K(v) r_{1,K}) + \lambda_{2,K}^2 (r_{2,K}^T G_K(v) r_{2,K})$,

and where $G_K(v)$ denotes the $2 \times 2$ matrix defined by
For $0 \le t \le T$, we denote by $\phi_h(t)$ and $c_h(t) \in V_h$ the semi-discrete finite
element approximations corresponding to (11) and (12), respectively. In order
to prove a posteriori error estimates we need to assume that the error in the
$L^2(0, T; L^2(\Omega))$ norm converges faster than the error in the $L^2(0, T; H^1(\Omega))$
norm, that is, there are two constants $C > 0$ and $s \in\, ]0, 1]$ such that, for all
meshes $\mathcal{T}_h$, we have

$\int_0^T \left( \|\phi - \phi_h\|^2_{L^2(\Omega)} + \|c - c_h\|^2_{L^2(\Omega)} \right) \le C \left( \max_{K \in \mathcal{T}_h} \lambda_{2,K} \right)^{2s} \int_0^T \left( \|\nabla(\phi - \phi_h)\|^2_{L^2(\Omega)} + \|\nabla(c - c_h)\|^2_{L^2(\Omega)} \right)$. (15)

Fig. 4. Example of an acceptable patch (top): the size of the reference patch $\Delta_{\hat K}$ does
not depend on the aspect ratio $H/h$. Example of a nonacceptable patch (bottom):
the size of the reference patch $\Delta_{\hat K}$ now depends on the aspect ratio $H/h$.

Let us comment on this assumption in the frame of meshes with small aspect ratio,
that is to say when $\lambda_{1,K}$ and $\lambda_{2,K}$ are of order $h$, for all $K \in \mathcal{T}_h$. According to
the results of section 3.1, the error in the $L^2(0, T; H^1(\Omega))$ norm is shown to be
$O(h)$ whenever the time step is small. On the other hand, $O(h^2)$ convergence
in the $L^\infty(0, T; L^2(\Omega))$ norm is proved in [13], but only in the case when
$\bar a = 0$. Therefore, we expect the following assumption

$\int_0^T \left( \|\phi - \phi_h\|^2_{L^2(\Omega)} + \|c - c_h\|^2_{L^2(\Omega)} \right) \le C h^2 \int_0^T \left( \|\nabla(\phi - \phi_h)\|^2_{L^2(\Omega)} + \|\nabla(c - c_h)\|^2_{L^2(\Omega)} \right)$

to hold for meshes with small aspect ratio, provided optimal convergence results
hold in both $L^2(0, T; H^1(\Omega))$ and $L^2(0, T; L^2(\Omega))$ norms. Note that this
assumption has already been used in [14] to obtain a posteriori error estimates
in the case when $\bar a = 0$. When considering meshes with high aspect ratio, $h$
in the above estimate should be replaced by $\max_{K \in \mathcal{T}_h} \lambda_{2,K}$, which yields (15).
This assumption is checked numerically in section 4.

For each interior edge of $\mathcal{T}_h$, let us choose an arbitrary normal direction $n$ and
let $[\cdot]$ denote the jump of a quantity across the edge. For each edge of $\mathcal{T}_h$ lying on the
boundary $\partial\Omega$, we set $[\cdot]$ to twice the inner side value. The following result
is proved in [6].

Theorem 2. Let $\phi$, $c$ be the weak solution of (1)-(5), let $\phi_h$, $c_h$ be the
semi-discrete approximation corresponding to (11)-(12). Assume that
$\phi, c \in L^\infty(0, T; H^2(\Omega))$ and that (15) holds. Then, there is a constant $C$ depending
only on the interpolation constants of Lemma 2 such that, for all meshes $\mathcal{T}_h$ such
that $\max_{K \in \mathcal{T}_h} \lambda_{2,K}$ is sufficiently small, we have

$+ \frac{1}{2\lambda_{2,K}^{1/2}} \left\| \left[ D_1(\phi_h) \frac{\partial c_h}{\partial n} + D_2(c_h, \phi_h) \frac{\partial \phi_h}{\partial n} \right] \right\|_{L^2(\partial K)} \Big) \times \omega_K(c - c_h)$. (16)

Here $\mu$ is the constant of Lemma 1, $D_s$ is the constant of (8) and
$M_2 = \|D_2\|_{L^\infty(\mathbb{R}^2)}$. Moreover, $\omega_K(\cdot)$ is defined as in Lemma 2.

Estimate (16) is not a usual a posteriori error estimate since $\phi$ and $c$ are still
involved in the right hand side. We then proceed as in [15, 16] and introduce
an estimator based on superconvergent recovery, namely a Zienkiewicz-Zhu
(Z-Z) like estimator [23, 3, 24]. More precisely, we consider the simplest Z-Z
error estimator as defined in [19, 1]. The Z-Z error estimator corresponding
to $\nabla(c - c_h)$ is defined by the difference between $\nabla c_h$ and an approximate $L^2$
projection of $\nabla c_h$ onto $V_h^2$,
namely:
(17)

Here $\Pi_h : g \in L^2(\Omega) \to (\Pi_h g) \in V_h$ is defined by

$\int_\Omega r_h((\Pi_h g) v_h) = \int_\Omega g v_h$ for all $v_h \in V_h$,

where $r_h$ denotes the usual Lagrange interpolant. In other words, from constant
values of $\nabla c_h$ on triangles, we build values at vertices $P$ using the formula
From [2,19] we know that for a certain class of meshes (namely parallel meshes)
and for smooth solutions, Z-Z like error estimators are asymptotically exact
(i.e. the Z-Z error estimator converges to the true error when h goes to zero).
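The recovery step can be sketched as follows. The vertex formula itself is not legible in this copy, so the sketch assumes the common choice of an area-weighted average of the triangle gradients over the triangles sharing a vertex; all function names and the small test mesh are hypothetical.

```python
import numpy as np

def triangle_gradients(points, triangles, u):
    """Constant gradient and area of a P1 function on each triangle."""
    grads, areas = [], []
    for tri in triangles:
        p0, p1, p2 = points[tri]
        B = np.array([p1 - p0, p2 - p0])            # rows: edge vectors
        du = np.array([u[tri[1]] - u[tri[0]], u[tri[2]] - u[tri[0]]])
        grads.append(np.linalg.solve(B, du))        # B @ grad = du
        areas.append(0.5 * abs(np.linalg.det(B)))
    return np.array(grads), np.array(areas)

def recover_gradient(points, triangles, grads, areas):
    """Area-weighted average of the triangle gradients at each vertex (assumed formula)."""
    num = np.zeros((len(points), 2))
    den = np.zeros(len(points))
    for tri, g, a in zip(triangles, grads, areas):
        num[tri] += a * g
        den[tri] += a
    return num / den[:, None]

def zz_estimator(points, triangles, u):
    """Per-triangle squared L2 norm of (grad u_h - recovered gradient)."""
    grads, areas = triangle_gradients(points, triangles, u)
    G = recover_gradient(points, triangles, grads, areas)
    eta2 = np.empty(len(triangles))
    for k, tri in enumerate(triangles):
        d = grads[k] - G[tri]                       # nodal values of the linear difference
        # exact integral of |d|^2 for a vector field that is linear on the triangle
        eta2[k] = areas[k] / 12.0 * (np.sum(d**2) + np.sum(d.sum(axis=0)**2))
    return eta2

# Unit square split into four triangles around its centre.
points = np.array([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)])
triangles = np.array([(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)])
eta_linear = zz_estimator(points, triangles, 2.0 * points[:, 0] + 3.0 * points[:, 1])
eta_quad = zz_estimator(points, triangles, points[:, 0] ** 2)
```

For a globally linear function the recovered gradient coincides with the triangle gradients and the estimator vanishes, while for the interpolant of $x_1^2$ it is strictly positive.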
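Our error indicator corresponding to $c - c_h$ is then obtained by replacing the
matrix $G_K(c - c_h)$ present in the definition of $\omega_K(c - c_h)$ in (16) by the matrix
$\tilde G_K(c_h)$ defined by

(18)

Finally, our simplified error indicator corresponding to the concentration error
$c - c_h$ is defined on each triangle $K$ by

$\int_0^T \frac{1}{2\lambda_{2,K}^{1/2}} \left\| \left[ \frac{\partial c_h}{\partial n} \right] \right\|_{L^2(\partial K)} \times \left( \lambda_{1,K}^2 (r_{1,K}^T \tilde G_K(c_h) r_{1,K}) + \lambda_{2,K}^2 (r_{2,K}^T \tilde G_K(c_h) r_{2,K}) \right)^{1/2}$ (19)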

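4 Numerical study of the effectivity index for small time steps

In this section, the quality of our error indicator (19) is investigated numerically.
For details we refer to [6]. Let us consider $\phi_h^n$ and $c_h^n \in V_h$ the solutions
of (11) and (12), respectively. In practice, $\phi_h^n$ is obtained by performing only
one Newton iteration at each time step. Proceeding as in [16] we introduce $c_{h\tau}$,
the continuous, piecewise linear approximation in time defined by

$c_{h\tau}(x, t) = \frac{t - t^{n-1}}{\tau} c_h^n(x) + \frac{t^n - t}{\tau} c_h^{n-1}(x)$, $t^{n-1} \le t \le t^n$, $x \in \Omega$, (20)

and the simplified error indicator for each time interval $[t^{n-1}, t^n]$ and triangle
$K$ by

$\left( \eta_{n,K}(c_{h\tau}) \right)^2 = \int_{t^{n-1}}^{t^n} \frac{1}{2\lambda_{2,K}^{1/2}} \left\| \left[ \frac{\partial c_{h\tau}}{\partial n} \right] \right\|_{L^2(\partial K)} \times \left( \lambda_{1,K}^2 (r_{1,K}^T \tilde G_K(c_{h\tau}) r_{1,K}) + \lambda_{2,K}^2 (r_{2,K}^T \tilde G_K(c_{h\tau}) r_{2,K}) \right)^{1/2}$, (21)

where $\tilde G_K(c_{h\tau})$ is defined as in (18).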


We consider the model of [20], the notations being those of [12]. The source
term $S$ in (1) is defined by

$S(c, \phi) = -\frac{1}{\lambda^2} \phi (1 - \phi)(1 - 2\phi) + \frac{5 m_l}{\lambda \Gamma} \phi^2 (1 - \phi)^2 \left( \frac{c}{1 - \phi + k\phi} - c_l \right)$ if $0 \le \phi \le 1$,

whereas $S(c, \phi) = 0$ if $\phi < 0$ or $\phi > 1$. Here $\lambda$ is the thickness of the solid-liquid
interface, $m_l$ is the liquid slope in the phase diagram, $\Gamma$ is the Gibbs-Thomson
isotropic coefficient, $c_l$ is the liquid concentration in the phase diagram, and
$k$ is the phase diagram partition coefficient (thus $c_s = k c_l$, where $c_s$ is the
solid concentration in the phase diagram). The coefficient $\alpha$ in (1) equals $\frac{1}{\Gamma \mu_k}$,
where $\mu_k$ is the interface kinetic coefficient. The first term in the definition of
$S(\cdot,\cdot)$ is nothing but the derivative of a double well potential that forces $\phi$ to
values zero or one, except in the phase change region. An asymptotic expansion
of the phase-field equation at first order in $\lambda$ links the normal velocity of the
solid-liquid interface, some anisotropic measure of the interface curvature, and
the concentration field. The function $D_1$ in (2) is given by

if $0 \le \phi \le 1$,

whereas $D_1(\phi) = D_l$ if $\phi < 0$ and $D_1(\phi) = D_s$ if $1 < \phi$. Here $D_s$ and $D_l$
are the solid and liquid diffusion coefficients. Finally, the function $D_2$ in (2) is
given by

$D_2(c, \phi) = D_1(\phi) \frac{(1 - k)\, c\, (1 - c)}{1 - \phi + k\phi}$ if $0 \le c \le 1$,

whereas $D_2(c, \phi) = 0$ if $c < 0$ or $c > 1$. All the physical parameters are given
below in the international MKSA unit system.
Our first goal is to validate numerically assumption (15) in the context
of meshes with high aspect ratio. For this purpose, we set the computational
domain to $\Omega = [-0.0002, 0.0002]^2$ and we add source terms in (1)-(2) so that $\phi$
and $c$ are given by

$\phi(x_1, x_2, t) = c(x_1, x_2, t) = \frac{1 - \tanh\left( \frac{x_1 - vt}{\delta} \right)}{2}$,

where $v = 2 \cdot 10^{-4}$ and $\delta = 10^{-5}$. Dirichlet boundary conditions are prescribed
on the vertical sides of $\Omega$, homogeneous Neumann boundary conditions on
the horizontal sides. The physical parameters involved in the definition of $S$,
$D_1$ and $D_2$ are given in Tab. 1. The time step is $\tau = 5 \cdot 10^{-5}$, small
enough for the error due to time discretization to be negligible. Meshes with
high aspect ratio are used to validate assumption (15). In Tab. 2, errors in the
$L^2(0, T; L^2(\Omega))$ and $L^2(0, T; H^1(\Omega))$ norms (resp. $e_{L^2}$ and $e_{H^1}$) are reported
when using distorted meshes ($h_1 - h_2$ denotes the mesh size in horizontal and
vertical directions). Also, the effectivity indices (the ratio between the error
estimator and the true error) $ei^{ZZ}$ and $ei^{A}$ corresponding to the Zienkiewicz-Zhu
error estimator (17) and our simplified error indicator (21) are shown.
Clearly, when the mesh is refined in the horizontal direction, assumption (15)
holds with $s = 1$ since the $L^2(0, T; L^2(\Omega))$ error converges at order two and
the $L^2(0, T; H^1(\Omega))$ error at order one. However, when the mesh is refined in
the wrong (vertical) direction, then the error does not decrease and (15) does
not hold.
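The convergence orders quoted above can be recomputed from the reported errors. The sketch below uses the first block of Tab. 2, where both mesh sizes are halved from one row to the next, and estimates the observed order as the base-2 logarithm of the ratio of successive errors; the function name is hypothetical.

```python
import math

def observed_orders(errors):
    """Observed convergence orders between successive meshes, assuming h is halved each time."""
    return [math.log2(e_coarse / e_fine) for e_coarse, e_fine in zip(errors, errors[1:])]

# First block of Tab. 2: both h1 and h2 are halved from one row to the next.
e_L2 = [1.1e-6, 3.2e-7, 8.8e-8]
e_H1 = [0.29, 0.13, 0.066]

orders_L2 = observed_orders(e_L2)   # roughly 1.8 and 1.9
orders_H1 = observed_orders(e_H1)   # roughly 1.2 and 1.0
```

The observed orders are close to two for the $L^2(0, T; L^2(\Omega))$ error and close to one for the $L^2(0, T; H^1(\Omega))$ error, in line with $s = 1$ in (15).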

Table 1. Test case with exact solution: parameters used for the computations.

$\lambda$     $m_l$   $\Gamma$   $c_s$    $c_l$     $D_s$                $D_l$               $\mu_k$
$10^{-5}$     -260    0.1        0.015    0.0238    $5 \cdot 10^{-10}$   $5 \cdot 10^{-9}$   0.0015

Table 2. Various convergence results for the travelling wave solution.

Anisotropic meshes refined in both horizontal and vertical directions
$h_1 - h_2$               $e_{L^2}$             $e_{H^1}$   $ei^{ZZ}$   $ei^{A}$
0.000005 - 0.0001         $1.1 \cdot 10^{-6}$   0.29        1.01        1.85
0.0000025 - 0.00005       $3.2 \cdot 10^{-7}$   0.13        1.01        1.74
0.00000125 - 0.000025     $8.8 \cdot 10^{-8}$   0.066       1.01        1.74

Anisotropic meshes refined in horizontal direction only
$h_1 - h_2$               $e_{L^2}$             $e_{H^1}$   $ei^{ZZ}$   $ei^{A}$
0.000005 - 0.0001         $1.1 \cdot 10^{-6}$   0.29        1.01        1.85
0.0000025 - 0.0001        $4.4 \cdot 10^{-7}$   0.14        1.00        1.79
0.00000125 - 0.0001       $1.2 \cdot 10^{-8}$   0.061       1.00        1.79

Anisotropic meshes refined in vertical direction only
$h_1 - h_2$               $e_{L^2}$             $e_{H^1}$   $ei^{ZZ}$   $ei^{A}$
0.00005 - 0.00001         $4.0 \cdot 10^{-5}$   4.3         0.85        1.67
0.00005 - 0.000005        $3.2 \cdot 10^{-5}$   5.3         0.86        1.87
0.00005 - 0.0000025       $4.2 \cdot 10^{-5}$   8.48        0.86        1.54

5 An adaptive algorithm generating meshes with high aspect ratio

We now present the adaptive algorithm of [6]. Given a time step $\tau$, the goal
is to build triangulations $\mathcal{T}_h^n$, $n = 1, \dots, N$, having high aspect ratio such that
the relative estimated error in the $L^2(0, T; H^1(\Omega))$ norm is close to a preset
tolerance TOL. Our adaptive algorithm aims at building triangulations $\mathcal{T}_h^n$,
$n = 1, \dots, N$, such that

$0.75\, \mathrm{TOL} \le \left( \frac{\sum_{n=1}^N \sum_{K \in \mathcal{T}_h^n} (\eta_{n,K}(c_{h\tau}))^2}{\int_0^T \int_\Omega |\nabla c_{h\tau}|^2} \right)^{1/2} \le 1.25\, \mathrm{TOL}$, (22)

where $\eta_{n,K}(c_{h\tau})$ is defined by (21). A sufficient condition to satisfy (22) is to
build, for each $n = 1, \dots, N$, an anisotropic triangulation $\mathcal{T}_h^n$ such that

for all triangles $K \in \mathcal{T}_h^n$, where $N_{v_h^n}$ is the number of vertices of the mesh $\mathcal{T}_h^n$.
We then proceed as in [15, 16] to build such an anisotropic mesh, using the
BL2D mesh generator [4].
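The acceptance test behind (22) can be sketched as follows. The sketch assumes that the relative estimated error is the square root of the sum of the squared indicators divided by the squared gradient norm; the function names and the three action labels ("refine", "coarsen", "accept") are hypothetical and are not taken from [6].

```python
import math

def relative_estimated_error(eta_sq, grad_norm_sq):
    """Relative estimated error: sqrt(sum of squared indicators / squared gradient norm)."""
    return math.sqrt(sum(eta_sq) / grad_norm_sq)

def mesh_action(eta_sq, grad_norm_sq, tol):
    """Decide what to do with the current mesh, given the band [0.75 tol, 1.25 tol]."""
    rel = relative_estimated_error(eta_sq, grad_norm_sq)
    if rel > 1.25 * tol:
        return "refine"
    if rel < 0.75 * tol:
        return "coarsen"
    return "accept"

tol = 0.0625
action_ok = mesh_action([0.0625**2], 1.0, tol)       # estimated error equal to TOL
action_fine = mesh_action([0.01, 0.01], 1.0, tol)    # estimated error well above TOL
action_coarse = mesh_action([1e-4], 1.0, tol)        # estimated error well below TOL
```

Working with a band around TOL rather than a single threshold avoids remeshing when the estimated error is already close to the target.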

5.1 Computations with small anisotropy

We now consider the following physical situation. At initial time, the computational
domain is liquid, with homogeneous concentration 0.02. Then, a circular
solid seed of diameter $2.5 \cdot 10^{-6}$ and concentration 0.015 is placed at the center
of $\Omega$. The physical parameters are now given in Tab. 3 and are taken from
[12], table 1, column B, except $c_s$ and $c_l$.

Table 3. Parameters used for the computations.


ml r C s Cl Ds
0.510 -260510 0.015 0.0238 5 10 510 0.0015

We first present computations in the case when the anisotropy parameter
$\bar a$ is small. We set the number of dendrite arms $\kappa = 4$ and choose $\bar a = 0.04$ so
that $\bar a < \frac{1}{\kappa^2-1} \simeq 0.0667$. The time step is $\tau = 5 \cdot 10^{-4}$ and the final time is
$T = 1$, making the total number of time steps 2000.
In Fig. 5, the adapted meshes, concentration and phase fields corresponding
to an adaptive computation with tolerance TOL = 0.0625 (6.25% estimated
relative error) are reported. The concentration $c$ and phase $\phi$ appear to be
smooth, but exhibit strong gradients across the solid to liquid transition zone;
therefore the mesh is strongly refined in the neighbourhood of the solid-liquid
interface. Zooms of the results are shown in Fig. 6. The adaptive algorithm
generates 400 meshes from initial to final time. The computation takes about
4 hours on a Pentium III 1.2 GHz PC, with a required memory of less than
300 MB. The maximum aspect ratio of the generated meshes is approximately
30, without any a priori upper bound imposed by the adaptive method.

5.2 Computations with large anisotropy

We now choose the anisotropy parameter $\bar a > \frac{1}{\kappa^2-1} \simeq 0.0667$, namely
$\bar a = 0.1$. In this case there are no known existence results for the system
in $L^2(0, T; H^1(\Omega))$.
In Fig. 7, the concentration fields corresponding to an adaptive computation
with tolerance TOL = 0.0625 are reported. A zoom of the results at final
time, Fig. 8, shows that the gradient is discontinuous in some regions. This
phenomenon is explained in detail in [6].

6 Conclusions and perspectives

A phase-field model corresponding to the isothermal, dendritic growth of a binary
alloy is considered. Existence is proved when the physical anisotropy is
small. A finite element method is proposed and a priori error estimates are
obtained, again when the physical anisotropy is small. A posteriori error estimates
are derived for meshes with large aspect ratio and a numerical study of
the effectivity index is proposed. Finally, an adaptive algorithm that generates
successive meshes with high aspect ratio is presented.
Numerical results corresponding to dendritic growth are then presented.
When the physical anisotropy is small, the numerical solution is smooth
whereas, when the physical anisotropy exceeds the predicted value, then irregular
dendritic shapes are obtained. We are looking forward to obtaining
similar theoretical results for the multi phase-field model described in [18, 7].

Fig. 5. Computations with small anisotropy, $\bar a = 0.04$. Adapted meshes (left column),
concentration isovalues (middle column) and phase isovalues (right column),
from t = 0 to t = 1 s, with TOL = 0.0625 (6.25% estimated relative error). Row 1:
t = 0.05 s, 6874 vertices. Row 2: t = 0.5 s, 17170 vertices. Row 3: t = 1 s, 24441
vertices.

Fig. 6. Computations with small anisotropy, $\bar a = 0.04$. Zooms of the mesh at final
time.

Fig. 7. Computations with large anisotropy, $\bar a = 0.1$. Concentration isovalues from
t = 0 to t = 1 s, with TOL = 0.0625 (6.25% estimated relative error). Left:
t = 0.05 s, 5138 vertices. Middle: t = 0.5 s, 20580 vertices. Right: t = 1 s, 29971
vertices.

Fig. 8. Computations with large anisotropy, $\bar a = 0.1$. Zooms of the mesh at final
time.

References
1. M. Ainsworth and J. T. Oden. A posteriori error estimation in finite element
analysis. Comput. Methods Appl. Mech. Engrg., 142(1-2):1-88, 1997.
2. M. Ainsworth and J.T. Oden. A unified approach to a posteriori error estimation
using finite element residual methods. Numer. Math., 65:23-50, 1993.
3. M. Ainsworth, J. Z. Zhu, A. W. Craig, and O. C. Zienkiewicz. Analysis of
the Zienkiewicz-Zhu a posteriori error estimator in the finite element method.
Internat. J. Numer. Methods Engrg., 28(9):2161-2174, 1989.
4. H. Borouchaki and P. Laug. The BL2D mesh generator: Beginner's guide, user's
and programmer's manual. Technical Report RT-0194, Institut National de
Recherche en Informatique et Automatique (INRIA), Rocquencourt, 78153 Le
Chesnay, France, 1996.
5. E. Burman, D. Kessler, and J. Rappaz. Convergence of the finite element
method for an anisotropic phase-field model. Technical report, Département
de Mathématiques, Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne,
Switzerland, 2002.
6. E. Burman and M. Picasso. Anisotropic, adaptive finite elements for the
computation of a solutal dendrite. Interfaces Free Bound., 5(2):103-127, 2003.
7. E. Burman, M. Picasso, and A. Jacot. Adaptive finite elements with high aspect
ratio for the computation of coalescence using a phase-field model. J. Comput.
Phys., accepted, 2003.
8. E. Burman and J. Rappaz. Existence of solutions to an anisotropic phase-field
model. Math. Methods Appl. Sci., 26(13):1137-1160, 2003.
9. C. M. Elliott. Approximation of curvature dependent interface motion. In The
state of the art in numerical analysis (York, 1996), pages 407-440. Oxford Univ.
Press, New York, 1997.
10. L. Formaggia and S. Perotto. New anisotropic a priori error estimates. Numer.
Math., 89:641-667, 2001.
11. L. Formaggia and S. Perotto. Anisotropic error estimates for elliptic problems.
Numer. Math., 94(1):67-92, 2003.
12. A. Jacot and M. Rappaz. A pseudo front tracking technique for the modelling of
solidification microstructures in multicomponent alloys. Acta Materialia,
50:1909-1926, 2002.
13. D. Kessler and J.-F. Scheid. A priori error estimates for a phase-field model for
the solidification process of a binary alloy. IMA J. Numer. Anal., 22:281-305,
2002.
14. O. Krüger, M. Picasso, and J.-F. Scheid. A posteriori error estimates and adaptive
finite elements for a nonlinear parabolic problem arising from solidification.
Comput. Methods Appl. Mech. Engrg., 192:535-558, 2001.
15. M. Picasso. Numerical study of the effectivity index for an anisotropic error
indicator based on Zienkiewicz-Zhu error estimator. Comm. Numer. Methods
Engrg., 19:13-23, 2002.
16. M. Picasso. An anisotropic error indicator based on Zienkiewicz-Zhu error
estimator: application to elliptic and parabolic problems. SIAM J. Sci. Comp.,
24:1328-1355, 2003.
17. J. Rappaz and J.-F. Scheid. Existence of solutions to a phase-field model for
the isothermal solidification process of a binary alloy. Math. Meth. Appl. Sci.,
23:491-513, 2000.

18. M. Rappaz, A. Jacot, and W.J. Boettinger. Last stage solidification of alloys:
a theoretical study of dendrite arm and grain coalescence. Met. Trans. A,
34:467-479, 2003.
19. R. Rodriguez. Some remarks on Zienkiewicz-Zhu estimator. Numer. Methods
Partial Differential Equations, 10(5):625-635, 1994.
20. J. Tiaden, B. Nestler, H. J. Diepers, and I. Steinbach. The multiphase-field model
with an integrated concept for modelling solute diffusion. Physica D: Nonlinear
Phenomena, 115(1-2):73-86, 1998.
21. A. Visintin. Models of phase transitions. Birkhäuser Boston Inc., Boston, MA,
1996.
22. J. A. Warren and W. J. Boettinger. Prediction of dendritic growth and microseg-
regation patterns in a binary alloy using the phase-field model. Acta metall.
mater., 43(2):689-703, 1995.
23. O. Zienkiewicz and J. Zhu. A simple error estimator and adaptive procedure for
practical engineering analysis. Internat. J. Numer. Methods Engrg., 24(2):337-
357, 1987.
24. O. C. Zienkiewicz and J. Z. Zhu. The superconvergent patch recovery and a pos-
teriori error estimates. I. The recovery technique. Internat. J. Numer. Methods
Engrg., 33(7):1331-1364, 1992.
Discontinuous Galerkin Methods for
Timoshenko Beams

Fatih Celiker¹*, Bernardo Cockburn¹, Sukru Güzey², Ramdev Kanapady³,
Sew-Chew Soon², Henrik K. Stolarski², and Kumar Tamma³

1. School of Mathematics, University of Minnesota, 206 Church St SE Minneapolis,
MN 55455
2. Department of Civil Engineering, University of Minnesota, 500 Pillsbury Drive
SE Minneapolis, MN 55455
3. Department of Mechanical Engineering, University of Minnesota, 111 Church
Street S.E. Minneapolis, MN 55455
*Corresponding Author: e-mail: [email protected]

Summary. We devise a family of discontinuous Galerkin methods for the Timoshenko
beam problem. Sufficient conditions for the existence and uniqueness of the
approximation are given. The method allows arbitrary meshes and arbitrary polynomial
degrees within the mesh, and hence is suitable for hp adaptivity. Numerical
results showing optimal and exponential convergence are provided. These features of
the method render it appealing for other problems in structural mechanics, such as
plates and shells.

1 Introduction

In this paper, we introduce and numerically study discontinuous Galerkin (DG)
methods for Timoshenko beams. The Timoshenko beam model, see [1] and [2], can be
written as

$$\frac{dw}{dx} = \theta(x) - \frac{T(x)}{(GA)(x)}, \quad \frac{d\theta}{dx} = \frac{M(x)}{(EI)(x)}, \quad \frac{dM}{dx} = T(x), \quad \frac{dT}{dx} = q(x), \tag{1}$$

where $x \in \Omega = (0, L)$. Here, the unknowns are the transverse displacement
$w$, the rotation of the transverse cross-section of the beam $\theta$, the bending
moment $M$, and the shear force $T$. The material and geometrical properties of
the beam are characterized by the shear modulus $G$, the cross-section area $A$,
the Young modulus $E$, and the moment of inertia $I$. The transverse load, $q$, is
part of the data of the problem. To complete the model and ensure the existence
and uniqueness of its solution, we must impose suitable boundary conditions;
we take, for example,

$$w(0) = w_0, \quad M(0) = M_0, \quad w(L) = w_L, \quad M(L) = M_L. \tag{2}$$


Our long-term goal is to investigate the possible advantages of DG methods
in computational structural mechanics. In this paper, we begin our efforts by
studying how to properly devise DG methods for the Timoshenko beam. In
a forthcoming paper, we give a complete error analysis of these methods and
show that they can easily overcome the so-called shear locking.
In [4] Arnold analyzed the continuous version of the Galerkin method. He
proved error estimates which degenerate as the aspect ratio of the beam
tends to zero, and hence the method suffers from shear locking. In the same
paper he proved that locking is overcome if we use the so-called reduced inte-
gration technique. These findings are verified by numerical experiments. For
the relationship between mixed finite element methods and reduced integra-
tion techniques we refer to [5]. In [6], Li analyzed the p and hp versions of
the same method and proved error estimates independent of the aspect ra-
tio of the beam. This is consistent with the well known fact that locking can
be overcome by increasing the polynomial degree of the approximations. Our
preliminary results indicate that the DG methods overcome locking even if we
approximate all the unknowns with piecewise constants and do not use reduced
integration. This and other features render DG methods appealing for other problems
in structural mechanics, such as plates, shells and elasticity. Of course, there
are additional issues involved in these problems; Reissner-Mindlin plates have
boundary layers, elasticity problems have volumetric locking and shells exhibit
membrane locking. For a recent DG method for the Reissner-Mindlin plate see
[7]. Arnold and Falk provided a theoretical analysis of the boundary layers in
[8], and a uniformly accurate finite element method in [9]. For a locking-free
finite element method for shells see [10].
The paper is organized as follows. In section 2, we introduce the weak
formulation we are going to use to define the DG methods. Then, in section
3, we introduce their general form. In section 4, we introduce what we call
a discrete energy identity which we use, in section 5, to establish conditions
that ensure existence and uniqueness of the approximate solution. In section
6, we present numerical results showing that the method can achieve optimal
h-convergence as well as exponential convergence. We end in section 7 with
some concluding remarks.

2 The weak formulation for the continuous case


To display the weak formulation we use to define the DG methods, we need
to introduce some notation. Let $\mathcal{T} = \{I_j = (x_{j-1}, x_j),\ j = 1, \dots, N\}$ be a
triangulation of the computational domain $\Omega = (0, L)$; we assume that the
nodes $x_j$ are such that $0 = x_0 < x_1 < \dots < x_{N-1} < x_N = L$. Then, we write

$$(R, [\varphi n]) := \sum_{j=0}^{N} R(x_j)\, [\varphi n](x_j).$$

Here, $R$ is a function defined on the set of nodes $\mathcal{E}_h := \{x_0, x_1, \dots, x_N\}$.
The jump of the function $\varphi$, $[\varphi n]$, is defined as follows. If the node $e$ is in
$\mathcal{E}_h^\circ := \{x_1, x_2, \dots, x_{N-1}\}$, then we take $[\varphi n](e) = \varphi(e^+) n^+ + \varphi(e^-) n^-$, where
$\varphi(e^\pm) := \lim_{\epsilon \downarrow 0} \varphi(e - \epsilon n^\pm)$ and $n^\pm = \mp 1$. For the boundary nodes, we take
$[\varphi n](0) = -\varphi(0^+)$, $[\varphi n](L) = \varphi(L^-)$. These jumps are well defined for $\varphi$ in
$H^1(\Omega_h)$, where $\Omega_h = \bigcup_{j=1,\dots,N} I_j$.
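It is now easy to see that if we assume that $(T, M, \theta, w) \in [H^1(\Omega)]^4$, we
have

$$\begin{aligned}
-(w, \tfrac{d}{dx} v^1) + (w, [v^1 n]) &= (\theta, v^1) - (\tfrac{1}{GA} T, v^1), &\qquad \text{(3a)}\\
-(\theta, \tfrac{d}{dx} v^2) + (\theta, [v^2 n]) &= (\tfrac{1}{EI} M, v^2), &\qquad \text{(3b)}\\
-(M, \tfrac{d}{dx} v^3) + (M, [v^3 n]) &= (T, v^3), &\qquad \text{(3c)}\\
-(T, \tfrac{d}{dx} v^4) + (T, [v^4 n]) &= (q, v^4), &\qquad \text{(3d)}
\end{aligned}$$

for all $v^1, v^2, v^3, v^4 \in H^1(\Omega_h)$. This is the weak formulation we use to define
the DG methods.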
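The element-by-element integration by parts behind (3) can be checked numerically. The sketch below is only an illustration: the mesh, the piecewise-linear test function `v`, the choice `w = sin`, and the helper names `simpson` and `jump_v` are assumptions made for the example and are not part of the original text. It verifies that $-(w, \tfrac{d}{dx}v) + (w, [vn]) = (\tfrac{d}{dx}w, v)$ for a smooth $w$ and a discontinuous $v$.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3.0

# Uniform mesh 0 = x_0 < ... < x_N = L and a discontinuous piecewise-linear v:
# on I_j, v(x) = c0[j-1] + c1[j-1] * (x - x_{j-1}).  Values are arbitrary.
L, N = 1.0, 4
nodes = [L * j / N for j in range(N + 1)]
c0 = [0.3, -1.0, 2.0, 0.5]
c1 = [1.0, 0.5, -2.0, 3.0]

w, dw = math.sin, math.cos   # a smooth function playing the role of w

def v_minus(j):
    """Trace v(x_j^-), taken from element I_j (j = 1..N)."""
    return c0[j - 1] + c1[j - 1] * (nodes[j] - nodes[j - 1])

def v_plus(j):
    """Trace v(x_j^+), taken from element I_{j+1} (j = 0..N-1)."""
    return c0[j]

def jump_v(j):
    """[v n](x_j) with n^+ = -1, n^- = +1 and the boundary conventions of the text."""
    if j == 0:
        return -v_plus(0)
    if j == N:
        return v_minus(N)
    return v_minus(j) - v_plus(j)

lhs = sum(w(nodes[j]) * jump_v(j) for j in range(N + 1))   # (w, [v n])
rhs = 0.0
for j in range(1, N + 1):
    a, b = nodes[j - 1], nodes[j]
    v = lambda x, j=j, a=a: c0[j - 1] + c1[j - 1] * (x - a)
    lhs -= simpson(lambda x: w(x) * c1[j - 1], a, b)       # -(w, dv/dx) on I_j
    rhs += simpson(lambda x: dw(x) * v(x), a, b)           # (dw/dx, v) on I_j

print(abs(lhs - rhs))   # agreement up to quadrature and rounding error
```

The nodal sum collects exactly the boundary terms that integration by parts produces on each interval, which is why the jump convention $n^\pm = \mp 1$ appears.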

3 The DG Methods
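The approximate solution $(T_h, M_h, \theta_h, w_h)$ given by the DG method will be
sought in the finite dimensional space $V_h^{k_1} \times V_h^{k_2} \times V_h^{k_3} \times V_h^{k_4}$; here,

$$V_h^k := \{v : \Omega_h \to \mathbb{R} : v|_{I_j} \in P^k(I_j),\ j = 1, \dots, N\},$$

where $P^k(K)$ is the set of all polynomials on $K$ of degree not exceeding $k$. The
approximate solution is determined by requiring that

$$\begin{aligned}
-(w_h, \tfrac{d}{dx} v^1) + (\hat w_h, [v^1 n]) &= (\theta_h, v^1) - (\tfrac{1}{GA} T_h, v^1), &\qquad \text{(4a)}\\
-(\theta_h, \tfrac{d}{dx} v^2) + (\hat\theta_h, [v^2 n]) &= (\tfrac{1}{EI} M_h, v^2), &\qquad \text{(4b)}\\
-(M_h, \tfrac{d}{dx} v^3) + (\hat M_h, [v^3 n]) &= (T_h, v^3), &\qquad \text{(4c)}\\
-(T_h, \tfrac{d}{dx} v^4) + (\hat T_h, [v^4 n]) &= (q, v^4) &\qquad \text{(4d)}
\end{aligned}$$

hold for all $v^i \in V_h^{k_i}$ for $i = 1, 2, 3, 4$. To complete the definition of the method,
we have to define the numerical traces $(\hat T_h, \hat M_h, \hat\theta_h, \hat w_h)$ at the nodes. It is
through them that the interaction between the degrees of freedom of different
intervals is introduced and the boundary conditions are actually imposed.
Moreover, their choice is crucial as it affects both the stability and the
accuracy of the method; see [3] for a detailed discussion of this issue for some other
problems.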
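Extending to our framework what has already been done for fluid flow
problems, we assume that the form of these traces is as follows. For an interior
node $e \in \mathcal{E}_h^\circ$, we take

$$\begin{aligned}
\hat w_h &= \{\{w_h\}\} + C_{11}[w_h n] + C_{12}[\theta_h n] + C_{13}[M_h n] + C_{14}[T_h n], &\qquad \text{(5a)}\\
\hat\theta_h &= \{\{\theta_h\}\} + C_{21}[w_h n] + C_{22}[\theta_h n] + C_{23}[M_h n] + C_{24}[T_h n], &\qquad \text{(5b)}\\
\hat M_h &= \{\{M_h\}\} + C_{31}[w_h n] + C_{32}[\theta_h n] + C_{33}[M_h n] + C_{34}[T_h n], &\qquad \text{(5c)}\\
\hat T_h &= \{\{T_h\}\} + C_{41}[w_h n] + C_{42}[\theta_h n] + C_{43}[M_h n] + C_{44}[T_h n], &\qquad \text{(5d)}
\end{aligned}$$

where $\{\{\varphi\}\}(e) = \tfrac{1}{2}(\varphi(e^+) + \varphi(e^-))$. At $x = 0$, we take

$$\begin{aligned}
\hat w_h(0) &= w_0, &\qquad \text{(6a)}\\
\hat\theta_h(0) &= \theta_h(0^+) + C_{21}(0)(w_0 - w_h(0^+)) + C_{23}(0)(M_0 - M_h(0^+)), &\qquad \text{(6b)}\\
\hat M_h(0) &= M_0, &\qquad \text{(6c)}\\
\hat T_h(0) &= T_h(0^+) + C_{41}(0)(w_0 - w_h(0^+)) + C_{43}(0)(M_0 - M_h(0^+)), &\qquad \text{(6d)}
\end{aligned}$$

and at $x = L$,

$$\begin{aligned}
\hat w_h(L) &= w_L, &\qquad \text{(7a)}\\
\hat\theta_h(L) &= \theta_h(L^-) + C_{21}(L)(w_h(L^-) - w_L) + C_{23}(L)(M_h(L^-) - M_L), &\qquad \text{(7b)}\\
\hat M_h(L) &= M_L, &\qquad \text{(7c)}\\
\hat T_h(L) &= T_h(L^-) + C_{41}(L)(w_h(L^-) - w_L) + C_{43}(L)(M_h(L^-) - M_L). &\qquad \text{(7d)}
\end{aligned}$$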

Note how the boundary conditions are incorporated into the DG method
through the definition of the numerical traces at the border. Note also that
the parameters C ij defining the numerical traces can have different values at
different nodes. In the next section, we investigate the role of these parame-
ters. In particular, we show that out of these sixteen parameters, six can be
expressed in terms of the remaining ten and that only four of them have an
impact on the "energy" of the discretization.

4 The Discrete Energy Identity

To see this, we consider a classical energy argument. It is not difficult to see
that if we take $v^1 = T$, $v^2 = -M$, $v^3 = -\theta$, and $v^4 = w$ in the equations (3),
and add them, we obtain the energy identity

$$(\tfrac{1}{EI} M, M) + (\tfrac{1}{GA} T, T) = (q, w) + \Theta_{bc}(T, M, \theta, w),$$

where
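One way to obtain the boundary term $\Theta_{bc}$ explicitly is to carry out the sum directly. The computation below is a sketch under the assumption that $(T, M, \theta, w)$ is smooth on $[0, L]$, so that all interior jumps vanish; it is a reconstruction from (3) and the jump conventions, not a quotation of the authors' formula.

```latex
\begin{align*}
% nodal terms for continuous functions: only the boundary nodes contribute
(w,[Tn]) + (T,[wn]) &= 2\,\bigl[w\,T\bigr]_0^L, \qquad
(\theta,[Mn]) + (M,[\theta n]) = 2\,\bigl[\theta\,M\bigr]_0^L, \\
% volume terms: product rule
-(w,\tfrac{d}{dx}T) - (T,\tfrac{d}{dx}w) &= -\bigl[w\,T\bigr]_0^L, \qquad
(\theta,\tfrac{d}{dx}M) + (M,\tfrac{d}{dx}\theta) = \bigl[\theta\,M\bigr]_0^L, \\
% sum of the left-hand sides of (3a)-(3d) with v^1 = T, v^2 = -M, v^3 = -\theta, v^4 = w
\text{sum of left-hand sides} &= \bigl[w\,T\bigr]_0^L - \bigl[\theta\,M\bigr]_0^L, \\
\text{sum of right-hand sides} &= (q,w) - (\tfrac{1}{GA}T,T) - (\tfrac{1}{EI}M,M), \\
\Theta_{bc}(T,M,\theta,w) &= \bigl[\theta\,M\bigr]_0^L - \bigl[w\,T\bigr]_0^L
 = \theta(L)\,M_L - \theta(0)\,M_0 - w_L\,T(L) + w_0\,T(0).
\end{align*}
```

The last equality uses the boundary conditions (2). In particular, for homogeneous data $w_0 = w_L = M_0 = M_L = 0$ the boundary term vanishes.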

Since this identity captures an essential feature of the problem under
consideration, we would like to obtain a similar energy identity for the DG method.
Such an identity is obtained in the following result.

Proposition 1 (Discrete Energy Identity). Assume that $(T_h, M_h, \theta_h, w_h)$
is a solution of the DG method given by (4), with numerical traces given by
(5), (6), and (7). Moreover, assume that for all nodes $e \in \mathcal{E}_h^\circ$ we have

$$C_{21} = C_{43}, \quad -C_{22} = C_{33}, \quad C_{24} = C_{13}, \quad C_{31} = C_{42}, \quad C_{34} = C_{12}, \quad -C_{11} = C_{44}. \tag{8}$$

Then, we have

Here, setting $C_{14} = C_{32} = 0$ at the boundary nodes, we have

$$\begin{aligned}
\Theta_{bc} ={}& w_0\,[C_{41}\, w(0^+) - C_{21}\, M(0^+)] + M_0\,[C_{43}\, w(0^+) - C_{23}\, M(0^+)] \\
&+ w_L\,[C_{41}\, w(L^-) - C_{21}\, M(L^-)] + M_L\,[C_{43}\, w(L^-) - C_{23}\, M(L^-)].
\end{aligned}$$

Proof. The proof of the above result follows by mimicking what was done for
the continuous case, that is, by taking $v^1 = T_h$, $v^2 = -M_h$, $v^3 = -\theta_h$, and
$v^4 = w_h$ in the definition of the DG method (4), adding the resulting equations,
and carrying out some simple algebraic manipulations. □
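The interior traces (5) and the compatibility condition (8) are easy to evaluate for a concrete parameter choice. The following sketch uses illustrative names (`interior_traces`, `satisfies_condition_8`) and arbitrary one-sided values; all of these are assumptions for the example. The unknowns are ordered as $(w, \theta, M, T)$, so that `C[i][j]` stores $C_{i+1,j+1}$. For the choice $C_{11} = C_{22} = -C_{33} = -C_{44} = 1/2$ with all other entries zero, the traces reduce to one-sided values.

```python
def jump(minus, plus):
    """[phi n](e) = phi(e+) n+ + phi(e-) n-, with n+ = -1 and n- = +1."""
    return minus - plus

def average(minus, plus):
    """{{phi}}(e) = (phi(e+) + phi(e-)) / 2."""
    return 0.5 * (minus + plus)

def interior_traces(minus, plus, C):
    """Evaluate (5a)-(5d); minus and plus hold (w, theta, M, T) at e- and e+."""
    jumps = [jump(m, p) for m, p in zip(minus, plus)]
    return [average(minus[i], plus[i]) + sum(C[i][j] * jumps[j] for j in range(4))
            for i in range(4)]

def satisfies_condition_8(C, tol=1e-14):
    """Check the six relations in (8) for a 4x4 parameter table."""
    def c(i, j):
        return C[i - 1][j - 1]
    pairs = [(c(2, 1), c(4, 3)), (-c(2, 2), c(3, 3)), (c(2, 4), c(1, 3)),
             (c(3, 1), c(4, 2)), (c(3, 4), c(1, 2)), (-c(1, 1), c(4, 4))]
    return all(abs(a - b) <= tol for a, b in pairs)

# Diagonal choice with C11 = C22 = 1/2 and C33 = C44 = -1/2.
C_diag = [[0.5, 0.0, 0.0, 0.0],
          [0.0, 0.5, 0.0, 0.0],
          [0.0, 0.0, -0.5, 0.0],
          [0.0, 0.0, 0.0, -0.5]]

minus = (1.0, 2.0, 3.0, 4.0)   # hypothetical values (w, theta, M, T) at e-
plus = (1.5, 2.5, 3.5, 4.5)    # hypothetical values at e+
traces = interior_traces(minus, plus, C_diag)
print(traces, satisfies_condition_8(C_diag))
```

With these parameters the traces of $w_h$ and $\theta_h$ are taken from the left of the node and those of $M_h$ and $T_h$ from the right.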

It is now clear that if we take

(10)

then each of the terms of $\Theta_{jumps}$ can be considered to be an energy associated
with the discontinuous nature of the discretization. Thus, the above condition
ensures that the appearance of the jumps in the DG approximation is accom-
panied by an increase of the total energy of the system. Since this can also
be thought of as being a stabilizing effect, the above parameters are called
the stabilization parameters. None of the remaining parameters appear in the
expression for the energy of the approximation, as we can see in the above
result.

5 Existence and uniqueness of the DG approximation

Our main theoretical result is the following.


Theorem 1 (Existence and uniqueness of the DG approximation).
Consider the DG method defined by the weak formulation (4) and the numerical
traces (5), (6) and (7). Assume that the constants $C_{ij}$ satisfy (8) and (10).
Then the method has a unique solution in the following cases:
Case 1: $C_{41} > 0$ on $\mathcal{E}_h$, $-C_{32} > 0$ on $\mathcal{E}_h$, $k_2 \ge k_3 - 1$, and $k_1 \ge k_4 - 1$.
Case 2: $C_{ij} = 0$ on $\mathcal{E}_h$, except $C_{11} = C_{22} = -C_{33} = -C_{44} = 1/2$,
$C_{41}(L) > 0$, $k_2 \ge k_3$ and $k_1 \ge k_4$.
Case 3: $k_2 \ge k_3 + 1$ and $k_1 \ge k_4 + 1$.

From the first case, we see that the stabilization parameters associated
with $\theta_h$ and $w_h$, namely, $C_{32}$ and $C_{41}$, respectively, have a stronger influence
on the existence and uniqueness of the solution of the method than the ones
associated with $T_h$ and $M_h$, namely, $C_{14}$ and $C_{23}$, respectively.
From the second and third cases, we see that when the stabilization effect
of a jump is turned off (by setting the corresponding stabilization parameter
equal to zero), the existence and uniqueness of the approximate solution can
still be guaranteed by a suitable definition of the other parameters and/or
by modifying the polynomial degree of the approximate solutions. Roughly
speaking, the more stabilization parameters are equal to zero, the smaller the
spaces for $\theta_h$ and $w_h$ have to be in relation to the spaces of $M_h$ and $T_h$,
respectively.

Proof. Due to the linearity of the problem, it is enough to show that the only
solution to (4) with $q = 0$, $w_0 = w_L = M_0 = M_L = 0$ is $w_h = \theta_h = M_h =
T_h = 0$. In this case, (9) takes the form

(11)

By the assumptions on the parameters $C_{ij}$, this implies $T_h = 0$, $M_h = 0$. On
the other hand, taking $v^1 = 1$ in (4a), we get that $(\theta_h, 1) = 0$.
Case 1. In the first case, from the discrete energy identity (11) we see
that $[\theta_h n] = 0$ on $\mathcal{E}_h$, and $[w_h n] = 0$ on $\mathcal{E}_h$. As a consequence, if $k_3 = 0$,
$\theta_h$ is a constant and since $(\theta_h, 1) = 0$, it is equal to zero. If $k_3 > 0$, a similar
conclusion can be reached as follows: We have, by (5b), that $\hat\theta_h = \theta_h$ and, by
(4b), that $(\tfrac{d}{dx}\theta_h, v^2) = 0$ for all $v^2 \in V_h^{k_2}$. Since $\theta_h \in V_h^{k_3}$ and $k_2 \ge k_3 - 1$, this
implies that $\theta_h$ is a constant and since $(\theta_h, 1) = 0$, that it is equal to zero. A
similar argument shows that $w_h$ is also zero if $k_1 \ge k_4 - 1$.
Case 2. In the second case, by the definition of the numerical trace of
$\theta_h$, we have that $\hat\theta_h(0) = \theta_h(0^+)$ and that $\hat\theta_h(e) = \theta_h(e^-)$ for all the remaining
nodes $e$. Taking $v^2$ with support $(x_0, x_1)$ we get, by equation (4b), that
$-\int_{x_0}^{x_1} \theta_h \tfrac{d}{dx} v^2 + (\theta_h v^2)(x_1^-) - (\theta_h v^2)(x_0^+) = \int_{x_0}^{x_1} v^2 \tfrac{d}{dx}\theta_h = 0$. Since $\theta_h \in V_h^{k_3}$
and $k_2 \ge k_3$ this implies that $\theta_h = 0$ on that interval. Now, if we take $v^2$
with support on $(x_1, x_2)$, equation (4b) becomes $\int_{x_1}^{x_2} \theta_h \tfrac{d}{dx} v^2 = \theta_h(x_2^-)\, v^2(x_2^-)$.
Taking $v^2 = 1$, we get that $\theta_h(x_2^-) = 0$. Also, since $k_2 \ge k_3$, we can take
$v^2 = \theta_h$ and obtain that $\theta_h(x_1^+) = 0$. Hence, $\int_{x_1}^{x_2} v^2 \tfrac{d}{dx}\theta_h = 0$, and so $\theta_h = 0$
in $(x_1, x_2)$. By repeating this argument, we obtain that $\theta_h = 0$. Similarly, it is
easy to show that $w_h = 0$ outside the last interval $(x_{N-1}, x_N)$. By the
equation (4a), we have $\int_{x_{N-1}}^{x_N} v^1 \tfrac{d}{dx} w_h = 0$, and hence $w_h$ is a constant on the last
interval since $w_h \in V_h^{k_4}$ and $k_1 \ge k_4$. Finally, since $C_{41}(L) > 0$, by the discrete
energy identity (11), we have that $[w_h n](L) = 0$ and so $w_h(L^-) = 0$. This
implies that $w_h = 0$.
Case 3. Finally, consider the third case. Since $k_2 \ge 1$, taking $v^2 = x/L$
in (4b), we get that $\hat\theta_h(L) = \tfrac{1}{L}(\theta_h, 1) = 0$. Then, taking $v^2 = 1$ on $(x_j, x_{j+1})$
and $v^2 = 0$ for the rest of the domain, the equation (4b) yields $\hat\theta_h(x_j) =
\hat\theta_h(x_{j+1})$. This implies that $\hat\theta_h = 0$ on all the nodes. By (4b), this implies that
$(\theta_h, \tfrac{d}{dx} v^2) = 0$ for all $v^2 \in V_h^{k_2}$, and since $\theta_h \in V_h^{k_3}$ and $k_2 \ge k_3 + 1$, this
implies that $\theta_h = 0$. A similar argument shows that $w_h = 0$ if $k_1 \ge k_4 + 1$.
This completes the proof. □

6 Numerical Results

In this section, we display preliminary numerical experiments showing that
DG methods for the Timoshenko beam can be constructed so as to achieve
optimal rates of convergence as well as exponential convergence. We consider
two test problems. In both problems, same-degree polynomial approximations
are used for all the unknowns, that is, we take $k_1 = k_2 = k_3 = k_4 = k$.
Test problem 1. We solve (1)-(2) in $\Omega = (0,1)$ with the boundary
conditions $w_0 = w_L = 0$ and $M_0 = M_L = 0$. We take $q(x) = \sin(\pi x)$, for all $x \in \Omega$.
The transverse cross-section of the beam at the point $x$, with area $A(x)$, is assumed
to be rectangular with width and height $a(x)$ and $b(x)$; the moment of inertia
is given by the formula $I(x) = a(x)\, b^3(x)/12$. We take $a(x) = b(x) = 0.01$. The
Young modulus is $E = 10^7$ whereas the shear modulus is $G = (0.35)E$.
We pick the numerical traces by setting $C_{ij}(x) = 0$ for $i \ne j$ except
$-C_{23}(0) = C_{41}(L) = 1/h$, and $C_{11}(x) = C_{22}(x) = -C_{33}(x) = -C_{44}(x) = 0.5$.
The existence and uniqueness of the approximate solution is guaranteed by
the second case of Theorem 1.
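Because the coefficients in test problem 1 are constant, a closed-form solution can be obtained by integrating (1) four times. The sketch below is an illustration and not the authors' reference solution: the closed form in `candidate_solution`, the helper names, and the assumption that the area of the rectangular section is $A = ab$ are all introduced here. The code checks the candidate against the four equations in (1) with central differences and against the boundary conditions (2).

```python
import math

# Data of test problem 1 (constant coefficients on Omega = (0, 1)).
E = 1.0e7
G = 0.35 * E
a = b = 0.01
A = a * b                # area of the rectangular cross-section (assumed A = a*b)
I = a * b**3 / 12.0      # moment of inertia
EI, GA = E * I, G * A

def q(x):
    return math.sin(math.pi * x)

def candidate_solution(x):
    """Closed form from integrating (1) four times; returns (T, M, theta, w)."""
    s, c = math.sin(math.pi * x), math.cos(math.pi * x)
    T = -c / math.pi
    M = -s / math.pi**2
    theta = c / (math.pi**3 * EI)
    w = s * (1.0 / (math.pi**4 * EI) + 1.0 / (math.pi**2 * GA))
    return T, M, theta, w

def residuals(x, d=1.0e-5):
    """Residuals of the four equations in (1), using central differences."""
    Tp, Mp, thp, wp = candidate_solution(x + d)
    Tm, Mm, thm, wm = candidate_solution(x - d)
    T, M, th, w = candidate_solution(x)
    return (
        (wp - wm) / (2 * d) - (th - T / GA),
        (thp - thm) / (2 * d) - M / EI,
        (Mp - Mm) / (2 * d) - T,
        (Tp - Tm) / (2 * d) - q(x),
    )

max_residual = max(abs(r) for k in range(1, 10) for r in residuals(k / 10.0))
boundary_values = (candidate_solution(0.0)[3], candidate_solution(1.0)[3],
                   candidate_solution(0.0)[1], candidate_solution(1.0)[1])
print(max_residual, boundary_values)
```

A closed form of this kind is what makes it possible to measure the $L^2$ errors reported for this test problem.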
Test problem 2. The purpose of this test problem is to show that the
method can easily handle other boundary conditions and discontinuities in the
material properties and the load. Thus, we take the boundary conditions $w_0 =
0$, $\theta_0 = 0$, $M_L = 0$ and $T_L = 0$. We also take $q(x) = \sin(\pi x)$ if $0 \le x \le 1/2$ and
$q(x) = -16(x - 3/4)^2 + 1$ if $1/2 < x \le 1$, and $a(x) = b(x) = 0.02$ if $0 \le x \le 1/2$
and $0.05$ if $1/2 < x \le 1$.
We use the following numerical traces. To capture the boundary
conditions, we take $\hat w_h(0) = 0$, $\hat\theta_h(0) = 0$, $\hat M_h(L) = 0$, and $\hat T_h(L) = 0$. To
define the remaining numerical traces, we simply take $C_{ij} = 0$ except for
$C_{11} = C_{22} = -C_{33} = -C_{44} = 1/2$. The existence and uniqueness of the
approximate solution is not guaranteed by Theorem 1, but can be easily proven.
In Fig. 1, we see that in both test problems, exponential convergence is
achieved. In Figs. 2 and 3, the solid line represents the exact solution and
"+" represents the numerical trace of the approximate solution at the nodes
of the mesh. We display the results obtained on a uniform mesh of 10 elements.
In Tables 1 and 2, we display convergence orders up to polynomial degree
$k = 5$. Mesh number $i$ means a uniform mesh of $2^i$ elements. We can see that
optimal orders of convergence are achieved for all the unknowns even in the
case of piecewise constant approximation.
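The convergence orders in such tables are usually computed from errors on consecutive meshes. A minimal sketch, assuming that mesh $i$ has $2^i$ elements so that the mesh size halves from one mesh to the next (the function name `observed_orders` is illustrative):

```python
import math

def observed_orders(errors):
    """Orders log2(e_{i-1} / e_i) between consecutive uniformly refined meshes."""
    return [math.log2(prev / cur) for prev, cur in zip(errors, errors[1:])]

# L2 errors of T_h for k = 0 on meshes 4, 5, 6, as listed in Table 1.
errors_T_k0 = [1.6e-02, 8.0e-03, 4.0e-03]
orders = observed_orders(errors_T_k0)
print(orders)
```

For a method of order $k + 1$ the printed values approach $k + 1$ as the mesh is refined; here they are close to 1, matching the piecewise constant case.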

Fig. 1. Exponential convergence: Test problem 1 (left) and 2 (right)

Fig. 2. Piecewise constant (top) and linear (bottom) approximations for test problem 1

Table 1. Convergence rates for same-degree approximation for test problem 1.
All errors are measured in the L2(Ω) norm.

degree  mesh     ||e_T||           ||e_M||           ||e_θ||           ||e_w||
  k     number   error    order    error    order    error    order    error    order
  0       4      1.6e-02  0.99     9.3e-03  0.98     1.6e-01  1.00     9.5e-02  0.99
          5      8.0e-03  1.00     4.7e-03  1.00     7.9e-02  1.00     4.8e-02  1.00
          6      4.0e-03  1.00     2.3e-03  1.00     3.9e-02  1.00     2.4e-02  1.00
  1       4      5.3e-04  2.00     2.5e-04  2.70     6.4e-03  1.99     2.0e-03  2.01
          5      1.3e-04  2.00     4.6e-05  2.47     1.6e-03  1.99     5.1e-04  2.00
          6      3.3e-05  2.00     1.1e-05  2.11     4.0e-04  2.00     1.3e-04  2.00
  2       4      8.3e-06  3.00     2.7e-06  3.24     1.0e-04  2.99     3.2e-05  3.00
          5      1.0e-06  3.00     3.3e-07  3.04     1.3e-05  3.00     4.0e-06  3.00
          6      1.3e-07  3.00     4.1e-08  3.00     1.6e-06  3.00     5.0e-07  3.00
  3       4      1.0e-07  4.00     4.9e-08  4.92     1.2e-06  3.97     3.9e-07  4.00
          5      6.3e-09  4.00     2.2e-09  4.49     7.6e-08  3.99     2.4e-08  4.00
          6      3.9e-10  4.00     1.3e-10  4.11     4.8e-09  4.00     1.5e-09  4.00
  4       4      9.8e-10  5.00     3.2e-10  5.31     1.2e-08  5.00     3.8e-09  5.00
          5      3.1e-11  5.00     9.7e-12  5.04     3.7e-10  5.00     1.2e-10  5.00
          6      9.5e-13  5.00     3.0e-13  5.00     1.2e-11  5.00     3.7e-12  5.00
  5       3      5.1e-10  5.99     4.9e-10  6.83     6.1e-09  5.93     2.0e-09  6.01
          4      7.9e-12  6.00     3.9e-12  6.99     9.6e-11  5.98     3.1e-11  6.00
          5      1.2e-13  5.99     4.3e-14  6.50     1.5e-12  6.00     4.8e-13  6.00

Fig. 3. Piecewise constant (top) and linear (bottom) approximations for test problem 2

Table 2. Convergence rates for same-degree approximation for test problem 2.
All errors are measured in the L2(Ω) norm.

degree  mesh     ||e_T||           ||e_M||           ||e_θ||           ||e_w||
  k     number   error    order    error    order    error    order    error    order
  0       2      9.7e-2   0.89     1.1e-1   1.04     5.7e-1   1.14     4.7e-1   1.32
          3      5.1e-2   0.94     5.6e-2   1.04     2.7e-1   1.09     2.1e-1   1.19
          4      2.6e-2   0.98     2.7e-2   1.03     1.3e-1   1.05     9.6e-2   1.10
  1       2      1.3e-2   1.25     2.7e-3   1.94     1.1e-2   1.99     4.2e-3   2.04
          3      3.4e-3   1.89     6.7e-4   1.99     2.8e-3   2.00     1.1e-3   1.98
          4      8.6e-4   1.98     1.7e-4   1.99     6.9e-4   2.00     2.7e-4   1.98
  2       2      1.8e-3   3.00     2.6e-4   2.61     2.8e-4   2.92     2.1e-4   2.95
          3      2.2e-4   3.00     3.4e-5   2.93     3.5e-5   2.97     2.7e-5   2.98
          4      2.8e-5   3.00     4.3e-6   2.99     4.5e-6   2.99     3.4e-6   2.99
  3       2      1.8e-5   4.01     2.7e-5   3.99     1.4e-5   3.96     4.5e-6   4.06
          3      1.1e-6   4.01     1.7e-6   4.00     8.6e-7   3.99     2.8e-7   4.02
          4      7.1e-8   4.00     1.1e-7   4.00     5.4e-8   4.00     1.7e-8   4.01
  4       2      6.9e-7   4.94     2.3e-7   5.05     5.3e-7   4.96     1.7e-7   4.89
          3      2.2e-8   4.98     7.1e-9   5.02     1.7e-8   4.99     5.3e-9   4.97
          4      6.9e-10  4.99     2.2e-10  5.01     5.3e-10  5.00     1.7e-10  4.99
  5       2      2.3e-8   6.01     7.1e-9   5.91     1.7e-8   5.97     5.7e-9   6.04
          3      3.6e-10  6.00     1.1e-10  5.98     2.7e-10  5.99     8.9e-11  6.01
          4      5.6e-12  6.00     1.8e-12  5.99     4.3e-12  6.00     1.4e-12  6.01

7 Conclusion

We devised a family of DG methods for the Timoshenko beam problem and
provided sufficient conditions for the existence and uniqueness of its solution.
We then displayed numerical results showing that the method converges to
the exact solution with optimal order even in the case of discontinuities of the
material and the load. We also provided numerical results indicating
exponential convergence. In a forthcoming paper, we provide a complete
error analysis of the methods and show that they are free from shear locking.
Acknowledgement. The authors are pleased to acknowledge support
of this research by the Army High Performance Computing Research Center
(AHPCRC) under the auspices of the Department of the Army, Army Research
Laboratory (ARL) under cooperative agreement DAAD19-01-2-0014. The
content does not necessarily reflect the position or the policy of the
government, and no official endorsement should be inferred.

References
1. S. P. Timoshenko (1921) On the correction for shear of the differential equation for
transverse vibrations of prismatic bars, Philosophical Magazine, 41, 744-746.
2. S. P. Timoshenko (1922) On the transverse vibrations of bars of uniform cross
section, Philosophical Magazine, 43, 125-131.
3. B. Cockburn (2003) Discontinuous Galerkin Methods, ZAMM Z. Angew. Math.
Mech., 83, 731-754.
4. D. N. Arnold (1981) Discretization by Finite Elements of a Model Parameter
Dependent Problem, Numer. Math., 37, 405-421.
5. D. S. Malkus, T. J. R. Hughes (1978) Mixed finite element methods - reduced
integration and selective integration techniques: a unification of concepts, Comput.
Methods Appl. Mech. Engrg., 15, 63-81.
6. Likang Li (1990) Discretization of the Timoshenko Beam Problem by the p and the
h-p Versions of the Finite Element Method, Numer. Math., 57, 413-420.
7. D. N. Arnold, F. Brezzi and D. Marini, A family of discontinuous Galerkin finite
elements for the Reissner-Mindlin plate, submitted to Journal of Scientific
Computing.
8. D. N. Arnold, R. Falk (1990) The Boundary Layer for the Reissner-Mindlin Plate
Model, SIAM J. Math. Anal., Vol. 21, No. 2, pp. 281-312.
9. D. N. Arnold, R. Falk (1989) A Uniformly Accurate Finite Element Method for the
Reissner-Mindlin Plate, SIAM J. Numer. Anal., Vol. 26, No. 6, pp. 1276-1290.
10. D. N. Arnold, F. Brezzi (1997) Locking-free Finite Element Methods for Shells,
Math. Comp., Vol. 66, Number 217, pp. 1-14.
Numerical Algorithms for Solving
Elliptic-Parabolic Problems

Raimondas Ciegis

Vilnius Gediminas Technical University, Sauletekio Str. 11, LT-2040 Vilnius,
Lithuania rc@fm.vtu.lt

Summary. This paper deals with numerical algorithms for solving elliptic-parabolic
problems. An example of such a problem is given by the Richards equation for modeling
the saturated-unsaturated water flow in porous media. We consider a linear model
problem and investigate the convergence of two finite-volume schemes. The first
one uses the implicit approximation in the whole domain, and the second scheme is
constructed using the splitting method. Results of numerical experiments are also
given.

1 Introduction

In $(x, t) \in \Omega_l \times (0, T]$ we solve a problem describing the two-phase flow in
a layered porous medium (for a more detailed description we refer to the book
of Helmig [Hel97]):

$$\begin{aligned}
c_l \rho_l \frac{\partial S_l}{\partial t} &= \nabla \cdot \bigl(\lambda_l(S_l) K_l \nabla (p_l(S_l) - x_l)\bigr), \\
p_l(x, t) &= q_D, \quad (x, t) \in \partial\Omega_D \times [0, T], \\
\nabla (p_l - x_l) &= 0, \quad (x, t) \in \partial\Omega \setminus \partial\Omega_D \times [0, T], \\
S_l(x, 0) &= S_{init}(x),
\end{aligned}$$

where $S_l$ is the water saturation in the $l$-th layer, $p_l$ is the pressure, $\lambda_l(S_l) K_l$,
$c_l$, $\rho_l$ are the permeability, porosity of the porous media, and density of the
fluid, respectively.
The equation becomes elliptic in the region of saturation, where $p_l > 0$ and
$S_l = S_l^s$:
$$\nabla \cdot \bigl(\lambda_l(S_l) K_l \nabla (p_l - x_l)\bigr) = 0,$$
here the pressure $p_l$ is the primary unknown, and $S_l^s$ denotes the water content
of a water-saturated medium.
The actual determination of the discrete solution may require large
computational resources due to the strong nonlinearity of the problem, the
dominance of the convective process, and the discontinuity of the porous medium
properties. The formulated problem looks like a system of parabolic partial
differential equations, but its type can become either nonlinear hyperbolic or
degenerate parabolic, depending on the influence of capillary pressure (see, e.g.
Helmig [Hel97], Eymard et al. [EGH00]).
An additional difficulty arises due to degeneracy of the parabolic problem
in the unsaturated region to the elliptic problem in the saturation region of
the porous medium. Fully implicit discrete finite volume and finite difference
schemes are usually used for the approximation of the flow equations in the
whole region (see the papers of Alt and Luckhaus [ALu83], Chen and Ewing
[ChE97], Eymard et al. [EGH00], Jager and Kacur [JaK91]).
It is well-known that the most effective numerical algorithms for solving
multi-dimensional parabolic problems are based on splitting methods. New time
splitting schemes for the time integration of the problems describing the
two-phase flow in the porous medium are proposed by Ciegis et al. [CPZ00]. The
parallel version of this algorithm is considered in [CCZ99].
In Section 2 we formulate the model linear 3D elliptic-parabolic initial-
boundary value problem and define its approximation in space by the finite-
volume method. In Section 3 we formulate and investigate the fully implicit
scheme for integration in time. The additive integration scheme, which is based
on the Douglas algorithm, is presented and investigated in Section 4. Finally,
the results of numerical experiments are presented in Section 5.

2 Problem Formulation

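We consider a linear elliptic-parabolic initial-boundary value problem, which
is defined in the region $Q = [0,1]^3 \times [0, T]$:

$$\begin{aligned}
\frac{\partial u}{\partial t} &= \sum_{j=1}^{3} \frac{\partial}{\partial x_j}\Bigl(k_j \frac{\partial u}{\partial x_j}\Bigr) - q(t)u + f(X, t), && (X, t) \in Q_{par}, \\
-\sum_{j=1}^{3} \frac{\partial}{\partial x_j}\Bigl(k_j \frac{\partial u}{\partial x_j}\Bigr) + q(t)u &= f(X, t), && (X, t) \in Q_{ell}, \qquad (1) \\
u(X, t) &= \mu_0, && (X, t) \in \partial\Omega \times [0, T], \\
u(X, 0) &= u_0(X), && X \in [0,1]^3.
\end{aligned}$$

We assume that:

$$Q = Q_{par} \cup Q_{ell}, \quad Q_{ell} = [a, b]^3 \times [0, T].$$

Let $Q_{h\tau} = \Omega_h \times \Omega_\tau$ be the discrete uniform mesh:

$$\Omega_h = \{X_{ijk} = (x_{1i}, x_{2j}, x_{3k}) : x_{lm} = mh,\ 0 \le m \le M\},$$

$$\Omega_\tau = \{t^n : t^n = n\tau,\ 0 \le n \le N\}.$$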

We use the notation $U^n = U(X_{ijk}, t^n)$. Using the finite-difference method we
approximate a part of the differential operator (1) by the following discrete
operators

$$\tilde A_j U^n = (k_j U^n_{\bar x_j})_{x_j} - q_j(t^n) U^n, \quad A_j U^n = \tilde A_j U^n + f_j(X, t^n), \quad j = 1, 2, 3,$$

here the discrete difference operators are defined by

$$U^n_{x_j} = \frac{U^n(x_j + h) - U^n}{h}, \quad U^n_{\bar x_j} = \frac{U^n - U^n(x_j - h)}{h}.$$

The selection of a time-stepping procedure is a nontrivial task, since the
stability and robustness of the algorithm on the one hand must be balanced
with the computational efficiency on the other hand. In the following section
we investigate two different numerical integration schemes.
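To make the difference notation concrete, the following sketch applies a one-dimensional operator of the form $(k\,U_{\bar x})_x - qU$ at the interior nodes of a uniform mesh. It is an illustration only: the function name `apply_operator`, the evaluation of $k$ at the nodes, and the test data are assumptions, and the backward difference is taken in the standard form $U_{\bar x}(x) = (U(x) - U(x - h))/h$.

```python
def apply_operator(U, k, q, h):
    """Return (k U_xbar)_x - q U at the interior nodes of a uniform 1D mesh.

    U and k are lists of nodal values, q is a constant, h is the mesh size.
    """
    result = []
    for i in range(1, len(U) - 1):
        flux_right = k[i + 1] * (U[i + 1] - U[i]) / h   # k * U_xbar at x_{i+1}
        flux_left = k[i] * (U[i] - U[i - 1]) / h        # k * U_xbar at x_i
        # forward difference of the fluxes, then the reaction term
        result.append((flux_right - flux_left) / h - q * U[i])
    return result

M = 10
h = 1.0 / M
x = [m * h for m in range(M + 1)]
U = [xi**2 for xi in x]          # U = x^2, so the exact second derivative is 2
k = [1.0] * (M + 1)              # constant coefficient k = 1
values = apply_operator(U, k, q=0.0, h=h)
print(values)
```

With $k = 1$ and $q = 0$ the operator reduces to the usual three-point second difference, which is exact for quadratic data.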

3 Fully Implicit Difference Scheme


We approximate the differential problem by the following modification of the
backward Euler scheme:

(2)

At each time level $t^n$ we get a system of linear equations:

or
(3)
We note that even if the problem is parabolic in the whole region of its
definition (i.e., the flow in the porous media is unsaturated) the 3D parabolic
problem is approximated by the backward Euler scheme.
The system of linear equations (3) is solved by some iterative method,
e.g. the Conjugate Gradient method. In the case of the nonlinear two-phase
flow problem the matrix of the obtained system is non-symmetric, thus some
special methods such as GMRES should be used to solve a system of linearized
equations.
The convergence rate of iterative methods depends essentially on the
stiffness of the matrix $A$ and on the distribution of its eigenvalues. It is easy to
prove that the following spectral estimates are valid:

$$\Bigl(\frac{\mathrm{meas}(Q_{par})}{\tau} + \lambda_{A,\min}\Bigr) I \le A \le \Bigl(\frac{\mathrm{meas}(Q_{par})}{\tau} + \lambda_{A,\max}\Bigr) I,$$

here $\lambda_{A,\min}$, $\lambda_{A,\max}$ are the smallest and largest eigenvalues of the matrix $A$,
respectively, and $\mathrm{meas}(Q)$ denotes the volume of $Q$.
Now we can investigate the stiffness number of the matrix $A$:

$$\kappa(A) = \frac{\mathrm{meas}(Q_{par})/\tau + \lambda_{A,\max}}{\mathrm{meas}(Q_{par})/\tau + \lambda_{A,\min}}.$$

Taking into account that $\lambda_{A,\min} = O(1)$ and $\lambda_{A,\max} = O(h^{-2})$, and assuming
that $\mathrm{meas}(Q_{par}) > 0$, we obtain the following asymptotic estimates:

$$\kappa(A) = \begin{cases}
\dfrac{c}{\mathrm{meas}(Q_{par})\, h^{3/2}} & \text{if } \tau = O(\sqrt{h}), \\[2mm]
\dfrac{c}{\mathrm{meas}(Q_{par})\, h} & \text{if } \tau = O(h), \\[2mm]
\dfrac{c}{\mathrm{meas}(Q_{par})} & \text{if } \tau = O(h^2).
\end{cases} \tag{4}$$

Thus for sufficiently large time steps $\tau$ the number of iterations of the CG
method will be approximately the same as for solving the pure elliptic problem.
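The three regimes in (4) can be illustrated numerically. The sketch below assumes the standard seven-point discrete Laplacian on the unit cube with $k_j = 1$ and $q = 0$, whose extreme eigenvalues are known in closed form; the function name `stiffness_number` and the value `meas_par = 0.9` are hypothetical choices for the example.

```python
import math

def stiffness_number(h, tau, meas_par):
    """kappa = (meas/tau + lam_max) / (meas/tau + lam_min) for the 3D discrete Laplacian."""
    lam_min = 3.0 * (4.0 / h**2) * math.sin(0.5 * math.pi * h) ** 2   # O(1)
    lam_max = 3.0 * (4.0 / h**2) * math.cos(0.5 * math.pi * h) ** 2   # O(h^-2)
    shift = meas_par / tau
    return (shift + lam_max) / (shift + lam_min)

h = 1.0 / 80
meas_par = 0.9   # hypothetical volume of the parabolic region
kappas = {
    "tau = sqrt(h)": stiffness_number(h, math.sqrt(h), meas_par),
    "tau = h": stiffness_number(h, h, meas_par),
    "tau = h^2": stiffness_number(h, h**2, meas_par),
}
for label, value in kappas.items():
    print(label, value)
```

The larger the time step, the closer the stiffness number is to that of the pure elliptic problem, which is the behaviour stated after (4).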

4 Finite Difference Scheme 2

We approximate the differential problem by the following modification of the
stability-correction scheme

(5)

$$\frac{U_s^{n+1} - U_s^{n+2/3}}{\tau} = A_3 U_s^{n+1} - A_3 U_s^{n}.$$

Here $s$ denotes the elliptic iteration number, and the initial condition at time
level $t^n$ is recalculated after each iteration as

$$U^n_{s+1} = \begin{cases} U^n, & \text{if } X \in \Omega_{h,par}, \\ U_s^{n+1}, & \text{if } X \in \Omega_{h,ell}, \end{cases}
\qquad U_0^n = U^n, \ \text{for } (x, t^n) \in Q_{h\tau}.$$


The proposed algorithm coincides with the Douglas splitting method if the
problem (1) is parabolic in the whole region of definition. The stability analysis
of the Douglas method is done by Hundsdorfer [Hun96]. It is interesting to note
that the Douglas splitting method is unconditionally stable if all operators $A_j$
have negative real parts of all eigenvalues and only one operator can have
complex valued eigenvalues (see also [CPZ00]).

Stability analysis of SCS

In this section we will investigate the stability of the proposed iterative
algorithm (5). Let $\lambda_1$, $\lambda_2$, $\lambda_3$ be eigenvalues of the discrete operators $A_1$, $A_2$, $A_3$,
respectively:
$$\lambda_j < 0, \quad j = 1, 2, 3.$$
Let us denote the error of the iterative solution by

$$Z_s^{n+1} = U_s^{n+1} - U^{n+1}, \qquad
Z_s^n = \begin{cases} 0, & \text{if } x \in \Omega_{h,par}, \\ U_s^n - U^{n+1}, & \text{if } x \in \Omega_{h,ell}. \end{cases}$$

Theorem 1. The stability-correction scheme is unconditionally stable for the
three dimensional linear elliptic-parabolic problem and the following
convergence estimate is valid:

Proof. Let us consider the Fourier series:

$$Z_s^{n+1} = \sum_{i,j,m} q_{i,j,m}\, c_{i,j,m}\, X_{i,j,m},$$

where $q_{i,j,m}$ are the stability functions, or the growing factors

$$q_{i,j,m} = \frac{1 + \tau^2(\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3) - \tau^3 \lambda_1\lambda_2\lambda_3}{(1 - \tau\lambda_1)(1 - \tau\lambda_2)(1 - \tau\lambda_3)}.$$

Thus for $\lambda_j < 0$ the inequalities
$$|q_{i,j,m}| \le q_1 < 1$$
hold unconditionally. Recall that we have the following initial condition:

$$Z_s^n = \begin{cases} 0, & \text{if } x \in \Omega_{h,par}, \\ Z_{s-1}^{n+1}, & \text{if } x \in \Omega_{h,ell}. \end{cases}$$

Hence $\|Z_s^n\|$ can be bounded as follows

$$\|Z_s^n\| \le q_2 \|Z_{s-1}^{n+1}\|, \quad q_2 \le 1,$$

which implies that

This completes the proof.
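The growth factor used in the proof can be reproduced for the purely parabolic case, where the text states that the algorithm coincides with the Douglas splitting method. The sketch below applies the classical three-stage Douglas (stability-correction) step to the scalar test equation $u' = (\lambda_1 + \lambda_2 + \lambda_3)u$. The stage formulas are the standard ones for that method and are an assumption here, since the full scheme (5) is not reproduced in the text; the function names and sample values are illustrative.

```python
def douglas_step(u, tau, lams):
    """One Douglas (stability-correction) step for u' = sum(lams) * u."""
    total = sum(lams)
    # Stage 1: implicit in direction 1, explicit in the remaining directions.
    v = (u + tau * (total - lams[0]) * u) / (1.0 - tau * lams[0])
    # Correction stages: (v_j - v_{j-1}) / tau = lam_j * v_j - lam_j * u.
    for lam in lams[1:]:
        v = (v - tau * lam * u) / (1.0 - tau * lam)
    return v

def growth_factor(tau, lams):
    """Closed-form growth factor q as given in the stability analysis."""
    l1, l2, l3 = lams
    num = 1.0 + tau**2 * (l1 * l2 + l1 * l3 + l2 * l3) - tau**3 * l1 * l2 * l3
    den = (1.0 - tau * l1) * (1.0 - tau * l2) * (1.0 - tau * l3)
    return num / den

tau = 0.5
lams = (-1.0, -10.0, -1000.0)   # sample negative eigenvalues
q_step = douglas_step(1.0, tau, lams)
q_formula = growth_factor(tau, lams)
print(q_step, q_formula)
```

For negative eigenvalues the numerator and denominator are both positive and the denominator exceeds the numerator by $\tau|\lambda_1 + \lambda_2 + \lambda_3|$, so the factor lies strictly between 0 and 1 for every step size.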



5 Numerical Experiments

In this section we present some results of numerical simulations for problem
(1) with the following discrete operators

$$A_j U^n = (U^n_{\bar x_j})_{x_j} + f_j(x, t^n), \quad j = 1, 2, 3.$$

Two cases of elliptic regions are investigated:

$$Q^1_{ell} = [0.4, 0.7] \times [0.4, 0.7] \times [0.4, 0.7],$$

$$Q^2_{ell} = [0.4, 0.8] \times [0.4, 0.8] \times [0.4, 0.8].$$

The function $f(x, t)$ is defined using an exact solution of the differential
problem
$$u(X, t) = \exp(t) \sin(\pi x_i) \sin(\pi x_j) \sin(\pi x_k).$$
A stopping condition of the iterative algorithm is defined as:

$$\Bigl\| \sum_{j=1}^{3} A_j U_s^{n+1} \Bigr\|_{\Omega_{h,ell}} \le \varepsilon.$$

In Table 1 we present the averaged numbers of iterations at one time level,
obtained by the CG method applied for the realization of the fully implicit
finite-difference scheme (2) with T = 0.8.

Table 1. The averaged numbers of iterations for the CG method

N= 20 N=40 N=80
T
Q;ll Q~ll Q;ll Q~ll Q!ll Q~ll
0.1 21.8 7.8 48.8 20.8 102.8 53.8
0.05 19.8 7.0 44.1 18.9 96.1 49.9
0.025 17.6 6.15 38.3 16.9 83.5 43.7
0.0125 14.95 5.02 32.32 14.65 68.45 37.0

In Table 2 we present the averaged numbers of iterations at one time level,
obtained by the splitting scheme (5). Here $\tau_0$ denotes the value of parameter
$\tau$ used in the elliptic region $Q^1_{ell}$.

Table 2. The averaged number of iterations for one time step

τ        N = 20   τ_0      N = 40   τ_0
0.1      15.8     0.005    26.2     0.002
0.05     14.35    0.006    26.5     0.002
0.025    12.85    0.007    24.3     0.0025

6 Conclusions
In this paper we have discussed two numerical algorithms for solving a
three-dimensional elliptic-parabolic problem. The main difference between these
methods is that in the first one the differential problem is treated as an elliptic
problem in the whole region of the definition and it is integrated in time by
the backward Euler scheme, whereas the second method treats the problem as
parabolic and the integration is done by the splitting-type method. The
advantage of the latter approach is that the linear algebra algorithm is reduced
to simple one-dimensional subproblems. The advantage of the first method is
that the fully implicit approximation leads to a very robust algorithm.

References

[ALu83] Alt, H.W., Luckhaus, S.: Quasilinear elliptic-parabolic differential
equations. Math. Z., 183(4), 311-341 (1983)
[ChE97] Chen, Z., Ewing, R.: Fully discrete finite element analysis of multiphase flow
in groundwater hydrology. SIAM J. Numer. Anal., 34, 2228-2253 (1997)
[CPZ00] Ciegis, R., Papastavrou, A., Zemitis, A.: Additive splitting methods for
elliptic-parabolic problems. Annali dell'Università di Ferrara, Sez. VII,
Sc. Mat. Vol. XLVI, 291-306 (2000)
[CCZ99] Ciegis, Raim., Ciegis, Rem., Zemitis, A.: Parallel numerical methods for the
elliptic-parabolic problem, Progress in Industrial Mathematics at ECMI98,
B.G.Teubner, Stuttgart, Leipzig, 206-213 (1999)
[EGH00] Eymard, R., Gallouet, T., Gutnic, M., Herbin, R., Hilhorst, D.: Numerical
approximation of an elliptic-parabolic equation arising in environment.
Comput. Visual Sci., 3, 33-38 (2000)
[Hel97] Helmig, R.: Multiphase Flow and Transport Processes in the Subsurface.
Modelling of Hydrosystems. Springer-Verlag (1997)
[Hun96] Hundsdorfer, W.: A note on stability of the Douglas splitting method, CWI
Report NM-R9606, Amsterdam, 1996.
[JaK91] Jager, W., Kacur, J.: Solution of porous medium systems by linear
approximation scheme, Numer. Math., 60, pp. 407-427 (1991)
Stochastic Relaxation of Variational Integrals
with Non-attainable Infima

Dennis D. Cox 1, Petr Kloucek 2, Daniel R. Reynolds 3 and Pavel Solin 2

1 Department of Statistics, Rice University, 6100 Main Street, Houston, TX 77005,
USA [email protected]
2 Department of Computational and Applied Mathematics, Rice University, 6100
Main Street, Houston, TX 77005, USA [email protected]
3 Center for Applied Scientific Computing, Lawrence Livermore National
Laboratory, P.O. Box 808, L-551, Livermore, CA 94551, USA [email protected]

Summary. We provide an example of a stochastic approach to the relaxation of
variational integrals with non-attainable infima in one dimension. We provide an
approximation for the coefficients of the Laplace transformation of the Probability
Density Function. This approximation yields the relaxing microstructures.

1 Variational Formulation of Non-attainable Differential Inclusions

We reported in [6] an application of the Subgrid Projection Method to the
problem of approximating solutions of non-attainable differential inclusions.
In this contribution, we describe how this approach leads to a stochastic
variational formulation of this problem. Consequently, we can approach
solutions to such problems by stochastic gradient flows. We recall that
we consider the following

Problem 1.1 Let $f \in W^{1,\infty}(0,1)$, $|f'(x)| < 1$ for a.a. $x \in [0,1]$.
Find a function $u \in W^{1,\infty}(0,1)$ such that

\[ u'(x) \in \{\pm 1\} \ \text{for almost all } x \in (0,1), \quad \text{and} \quad
   u(x) = f(x) \ \text{for all } x \in (0,1). \]

This problem cannot be solved in $W^{1,1}(0,1)$, much less in $W^{1,\infty}(\Omega)$,
but if we relax either of the two contradictory requirements in (1.1) a bit, the
set of solutions is enormous. In fact, it is dense in the sense of Baire category [8].
Namely, for any $\varepsilon > 0$ there exists $u_\varepsilon \in W^{1,\infty}(0,1)$ such that

\[ u_\varepsilon' \in \{\pm 1\} \ \text{a.e. in } (0,1), \quad \text{and} \quad
   \|u_\varepsilon - f\|_{L^\infty(0,1)} < \varepsilon. \tag{1} \]

Moreover, for any continuous function $h = h(x)$ such that
$h(u_\varepsilon') \rightharpoonup g$ weakly in $L^1(0,1)$ as $\varepsilon \to 0^+$, we have

\[ g(x) = \int_{-\infty}^{\infty} h(y)\, d\mu_{x,u_\varepsilon'}(y), \tag{2} \]

where $\mu_{x,u_\varepsilon'} = \lambda(x)\,\delta_{-1} + (1 - \lambda(x))\,\delta_{+1}$;
$\delta_{\pm 1}$ denotes the Dirac measures on $\mathbb{R}$ giving unit mass to the
points $\pm 1$; and $\lambda(x) = \frac{1}{2}(1 - f'(x))$ a.e. in $[0,1]$.
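
As a quick consistency check (a worked step added here for illustration, not part of the original argument), taking $h(y) = y$ in (2) recovers the slope of $f$ as the mean of the two-point measure. Here $\lambda(x)$ denotes the weight placed on the Dirac mass at $-1$:

```latex
\[
\int_{-\infty}^{\infty} y \, d\mu_{x}(y)
  = \lambda(x)\cdot(-1) + \bigl(1-\lambda(x)\bigr)\cdot(+1)
  = 1 - 2\lambda(x)
  = 1 - \bigl(1 - f'(x)\bigr)
  = f'(x).
\]
```

In other words, the derivatives $u_\varepsilon'$ take only the values $\pm 1$, yet on average they reproduce $f'$.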
A functional with non-attainable infimum compatible with Problem 1.1 is the
popular potential

\[ I(u) \stackrel{\text{def}}{=} \int_0^1 \frac{1}{4}\,\bigl|\,|u'(x)|^2 - 1\,\bigr|^2
   + \frac{1}{2}\,|u(x) - f(x)|^2 \, dx. \tag{3} \]

Of course, the choice of this particular form is somewhat arbitrary. It is obvious
that

\[ \inf_{u_h \in V_h} I(u_h) > 0 \tag{4} \]

for any conforming approximation $V_h$ of $W^{1,4}(\Omega)$. Hence it follows from (1)
that the set of local minimizers of (4) is large. In particular, it is shown in [4]
that if $E_h = \inf_{V_h} I(u_h)$ then there exists a family $K_h$ consisting of
$(1/h)^{1/h}$ local minimizers of the same discretized problem such that

I(vh) :::; (1 + 24Vh)Eh' for any Vh E K h,


sup I(tv~ + (1 - t)v~) > ~ Eh, vL v~ E K h. (5)
tE[O,l]

Moreover, we have shown in [7] that if we consider a problem similar to
Problem 1.1, corresponding to tetragonal-to-cubic phase transformations, and if we
apply any descent algorithm with a pseudo-gradient as the descent direction,
then such an algorithm converges strongly regardless of the initial guess
(provided it is a weakly differentiable function), even if the infimum cannot be
attained! The situation is further complicated by the fact that this class of
problems suffers from the so-called Lavrentiev phenomenon, which makes the
minimizers dependent on the choice of the function space [3]. In summary,
classical minimization algorithms applied to (3)-like potentials with
non-attainable infima fail: there exists a huge number of local minimizers on the
discrete level; computationally, the minimizers depend on the polynomial
approximation; and the large energy barriers make the solution dependent on the
initial guess. We conclude that, in light of these difficulties, it is hopeless
to expect a reasonable outcome by including in the variational formulation just
the information about the averaged states and the crystallographic constraints
for the derivatives. We have shown in [5] and [7] that the variational
formulation should have the form

\[ \int W_{\mathrm{macro}} + W_{\mathrm{micro}} + W_{\mathrm{stochastic}}. \tag{6} \]

Our conclusion is motivated by the observation that we can achieve very good
numerical results by building minimizing sequences which become asymptotically
(weak) white noise in their derivatives, cf. [5], [7]. Note that such
sequences cannot be periodic! Hence, we conclude that the appropriate
microscale dynamical system for approaching the solutions of (1.1) ought to be
given by a Langevin dynamics, i.e. by a stochastic gradient flow. Assuming
that the Helmholtz free energy has the form (6), we obtain the Langevin
system as an Euler-Lagrange equation corresponding to this functional. We
present such an approach in this paper. We refer to [5], [6] and [7] for two
possible constructions of $W_{\mathrm{stochastic}}$.

2 Two Microscale Finite Dimensional Langevin Models

We consider two models: one with a fixed potential and one with a time varying
potential. Our objective is to find appropriate local minima of the potential

\[ I(u) = W_{\mathrm{macro}}(u) + \lambda\, W_{\mathrm{micro}}(u), \tag{7} \]

where

\[ W_{\mathrm{macro}}(u) = \int_0^1 |f(x) - u(x)|^2 \, dx, \tag{8} \]

\[ W_{\mathrm{micro}}(u) = \int_0^1 \bigl|u'(x)^2 - 1\bigr|^2 \, dx. \tag{9} \]

As $\lambda \to \infty$, the minimizer of $I$ will converge to the solution of the
constrained minimization problem

\[ \text{minimize } W_{\mathrm{macro}}(u) \text{ subject to } |u'| = 1. \]


We will generally consider piecewise affine approximate solutions to this
minimization problem. Thus

\[ u'(x) = \sum_{i=1}^{n} v_i\, \phi_{A_i}(x), \tag{10} \]

where $v = (v_1, \dots, v_n)$,

\[ \phi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A, \end{cases} \]

is the characteristic function of $A$, and $A_i = [(i-1)/n,\, i/n)$. For a given
vector $v$ of values of $u'$ on the intervals $A_i$, we will recover $u$ via the
formula (11).

With this convention, we may write $u = Hv$, where $H$ is the linear operator
implicit in (11), and think of $I$ as a function of $v$ rather than $u$, say

\[ \tilde I(v) = I(Hv). \]
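
The discretization above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the reconstruction of $u$ from $v$ is an assumption (the antiderivative of $u'$, shifted so that $u$ and $f$ have the same mean), and the names `recover_u`, `discrete_energy` and `pts_per_cell` are hypothetical.

```python
import numpy as np

def _trapezoid(y, x):
    """Composite trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def recover_u(v, f, pts_per_cell=8):
    """Rebuild a piecewise affine u from its cell slopes v.

    Assumption (not taken from the text): u is the antiderivative of u',
    shifted so that u and f have the same mean over (0, 1).
    """
    v = np.asarray(v, dtype=float)
    n = v.size
    x = np.linspace(0.0, 1.0, n * pts_per_cell + 1)
    dx = x[1] - x[0]
    slopes = np.repeat(v, pts_per_cell)          # slope on every fine sub-interval
    u = np.concatenate(([0.0], np.cumsum(slopes * dx)))
    u += _trapezoid(f(x) - u, x)                 # match the means of u and f
    return x, u

def discrete_energy(v, f, lam):
    """Return (W_macro, W_micro, W_macro + lam * W_micro) for cell slopes v."""
    v = np.asarray(v, dtype=float)
    x, u = recover_u(v, f)
    w_macro = _trapezoid((f(x) - u) ** 2, x)
    w_micro = float(np.mean((v ** 2 - 1.0) ** 2))  # exact: each cell has width 1/n
    return w_macro, w_micro, w_macro + lam * w_micro
```

Any slope vector with entries $\pm 1$ makes the micro term vanish, so for large `lam` the competition between candidate states is decided by the macro term alone.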

2.1 Time Independent Energy Density

Our first approach is to consider a Stochastic Differential Equation (SDE) for
$v$ of the Langevin form

\[ dv(t) = -\nabla \tilde I(v)\,dt + d\eta(t), \qquad v(0) = 0, \tag{12} \]

where $\eta$ is an $n$-dimensional Wiener process satisfying

\[ E[\eta(t)] = 0, \qquad E[\eta(t) \otimes \eta(s)] = A \min\{s,t\}. \]

The matrix $A$ is a given $n \times n$ positive definite matrix, usually taken as an
identity matrix. Note that $d\eta(t)$ is a white noise process. In other words, we
assume that the energy corresponding to Problem 1.1 has the form (6), and
(12) is the underlying stochastic gradient flow enforcing the competition in the
weak topology to provide the bridge between the atomic nano-scale and the
specimen's microscale ($\sim \mu$m).
The choice of the initial condition is somewhat arbitrary. We will have more
to say on this point in our second approach. This SDE describes a particle
moving under a force (defined by the gradient of the potential) subject to
"thermal" agitation (defined by the white noise term). The resulting stochastic
process is a diffusion.
For large values of $\lambda$, the diffusion process defined by (12) wanders for a
short period of time until it "falls" into one of the "wells"
corresponding to the local minima of $W_{\mathrm{micro}}$. Our Monte Carlo experiments
indicate that there is a slight preference for the particle to be initially attracted
to the deeper wells (which correspond to smaller local minima of $W$, because the
corresponding $u$ is a better approximation to $f$ in that it makes the
$W_{\mathrm{macro}}$ term smaller), but this preference is very weak. For this reason,
a second approach was considered, which we believe better reflects the physical
process of the material phase changes we are modeling.

2.2 Time Dependent Energy Density

Consider a time dependent potential

\[ I(u,t) = [1 - \alpha(t)]\, W_{\mathrm{macro}}(u) + \alpha(t)\, \lambda\, W_{\mathrm{micro}}(u). \tag{13} \]

The SDE remains of the same Langevin form. Namely,

\[ dv(t) = -\nabla \tilde I(v,t)\,dt + d\eta(t). \tag{14} \]

Here, the gradient is taken only with respect to the first variable $v$. The strategy
for specifying $\alpha$ is to set

\[ \alpha(t) = 0, \qquad 0 \le t \le t_1, \tag{15} \]

where $t_1$ is chosen large enough that $v(t)$ has converged to a steady state
solution of the Langevin equation similar to (13) but containing only
$W_{\mathrm{macro}}$. Then, $\alpha(t)$ is increased linearly to its maximum value of 1

\[ \alpha(t) = \min\{a\,(t - t_1),\ \alpha_{\max}\}, \tag{16} \]

until the particle drops into one of the wells of $W_{\mathrm{micro}}$.
The actual implementation requires selection of a time interval $\Delta t$, and then
we approximate the SDE with the Stochastic Difference Equation (S$\Delta$E):

\[ v((k+1)\Delta t) - v(k\Delta t) = -\nabla W(v(k\Delta t), k\Delta t)\,\Delta t
   + Z_{k+1}\,\Delta t^{1/2}, \tag{17} \]

where $Z_1, Z_2, \dots$ are independent and identically distributed Gaussian random
vectors having mean 0 and covariance $A$. For given $W_{\mathrm{macro}}$ and
$W_{\mathrm{micro}}$ the simulation requires the following inputs:
1. $\lambda$, which controls the weight given to the $W_{\mathrm{micro}}$ term in the potential.
2. $A$, which is the covariance matrix for the noise; we have taken this to be
a multiple of the identity, $A = \sigma^2 I$, so only specification of $\sigma^2$ is necessary.
3. tl which is the "burn in time" so that the process V is in a steady state
determined by W macro. This requires some experimentation depending on

4. $a$, which controls how quickly the potential transforms from $W_{\mathrm{macro}}$ to
$\lambda W_{\mathrm{micro}}$. This was controlled by choosing the maximum number of time
steps and choosing $a$ so that $\alpha_{\max}$ was achieved at the maximum number
of time steps.
5. $\alpha_{\max}$, the maximum change in $\alpha$.
6. $\Delta t$, the time step.
The actual numerical values we used are given in Table 1.
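
The time stepping described above can be sketched as an Euler-Maruyama loop. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes $u = \sum_i v_i \beta_i + \int_0^1 f$ with zero-mean antiderivatives $\beta_i$ of the cell indicators, a noise covariance $\sigma^2 I$, $\alpha = 0$ during burn-in, and a linear ramp capped at `alpha_max`. The names `beta_matrix`, `alpha` and `simulate` are hypothetical.

```python
import numpy as np

def beta_matrix(n, x):
    """Row i: zero-mean antiderivative of the indicator of cell i (assumed form of H)."""
    i = np.arange(1, n + 1)[:, None]
    ramp = np.clip(x[None, :] - (i - 1) / n, 0.0, 1.0 / n)
    return ramp - (n - i + 0.5) / n**2

def alpha(t, t1, a, alpha_max):
    """Assumed schedule: zero during burn-in, then a linear ramp capped at alpha_max."""
    return 0.0 if t <= t1 else min(a * (t - t1), alpha_max)

def simulate(f, n, lam, sigma, dt, steps, burn_in, alpha_max, rng):
    """Euler-Maruyama steps for dv = -grad I(v, t) dt + sigma dW."""
    m = 16 * n
    x = (np.arange(m) + 0.5) / m                 # midpoint quadrature nodes, weight 1/m
    beta = beta_matrix(n, x)
    f_tilde = f(x) - f(x).mean()
    t1 = burn_in * dt
    a = alpha_max / ((steps - burn_in) * dt)     # ramp reaches alpha_max at the last step
    v = np.zeros(n)
    for k in range(steps):
        al = alpha(k * dt, t1, a, alpha_max)
        resid = v @ beta - f_tilde               # u - f at the quadrature nodes
        grad_macro = 2.0 * (beta @ resid) / m    # gradient of int |f - u|^2 dx
        grad_micro = 4.0 * v * (v**2 - 1.0) / n  # gradient of int |u'^2 - 1|^2 dx
        grad = (1.0 - al) * grad_macro + al * lam * grad_micro
        v = v - grad * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
    return v
```

A usage example with small, purely illustrative parameters (far smaller than those in Table 1): `simulate(lambda x: x * (1 - x), n=8, lam=50.0, sigma=0.02, dt=0.01, steps=4000, burn_in=1000, alpha_max=0.5, rng=np.random.default_rng(0))` returns a slope vector whose entries have settled near $\pm 1$.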

3 Meso-scale Fokker-Planck Equation

The Probability Density Function (PDF) $g : \mathbb{R}_+ \times \mathbb{R}^{\dim v} \to \mathbb{R}$
can be used to obtain any statistical information contained in the microscale
Langevin systems for $v$ at a fixed time point $t$. Namely,

\[ E[h(t,v)] = \int h(t,v)\, g(t,v)\, dv. \tag{18} \]

The PDF $g$ is obtained by solving the meso-scale deterministic Fokker-Planck
equation [10]. The Fokker-Planck equation for the Langevin system in Section
2.1 has the form

\[ \frac{\partial g(t,v)}{\partial t} = -\operatorname{div}_v \bigl[ D W_{\mathrm{density}}(v)\, g(t,v) \bigr]
   + \frac{\sigma^2}{2}\, \Delta_v g(t,v), \qquad v \in \mathbb{R}^n,\ t > 0, \tag{19} \]

where $\sigma$ is the white noise standard deviation. Hence, for our specific form of
the energy density given by (7), the Fokker-Planck Equation (19) becomes

ag~~ v) = (1 1
j(x)j3(x) dX) .V g(t, v) - (1 j3(x)
1
Q9 j3(x) dX) v . V g(t, v)
\, v J

{Bi] }~j=l

where

\[ \tilde f(x) = f(x) - \int_0^1 f(y)\, dy, \qquad
   B_{ij} = \int_0^1 \beta_i(x)\, \beta_j(x)\, dx, \]

\[ \beta_i(x) = \begin{cases}
   -\dfrac{n-i+1/2}{n^2}, & x \in (0,\ (i-1)/n), \\[1ex]
   x - \dfrac{i-1}{n} - \dfrac{n-i+1/2}{n^2}, & x \in [(i-1)/n,\ i/n), \\[1ex]
   \dfrac{1}{n} - \dfrac{n-i+1/2}{n^2} = \dfrac{i-1/2}{n^2}, & x \in (i/n,\ 1).
   \end{cases} \]

Since the microscale process given by the Langevin equation is a diffusion, we
expect, in the sense of convergence in measure,

\[ \lim_{\sigma^2 \to 0^+}\ \lim_{\lambda \to +\infty}\ \lim_{t \to +\infty} g(t,v)
   = \sum_{i=1}^{2^n} q_i\, \delta_{v_i^*}(v) := g(v), \qquad
   \sum_{i=1}^{2^n} q_i = 1, \quad q_i \ge 0, \tag{21} \]

where $\{v_i^*\}_{i=1}^{2^n}$ represent all $2^n$ states with $v_{ij}^* \in \{\pm 1\}$. Note that
1. $t \to +\infty$ corresponds to finding the equilibrium distribution of the states,
which is given by the Gibbs distribution [9],
2. $\lambda \to +\infty$ represents imposing the $\pm 1$ constraint,
3. $\sigma^2 \to 0^+$ corresponds to cooling,
4. $n \to \infty$ gives the continuum case.
The Laplace transform of $g(v)$ has the form

\[ (Lg)(y) = \sum_{j=1}^{2^n} q_j\, e^{y^T v_j^*}, \tag{22} \]

where $v_j^*$ denotes one of the $2^n$ possible distributions of $\pm 1$, having the
probability $q_j$. Our goal now is to compute the coefficients $q_j$ in (22). In
principle, we have two options:
1. Take the Laplace transform of the Fokker-Planck equation (20) with the
aim of obtaining a dense linear $2^n \times 2^n$ system for the unknown coordinates
$q_j$. The system can be obtained by inserting $2^n$ various vectors $y$ into
(20) together with the representation (22). One may then use a reduction
technique [2], [1] to obtain a sparse system and solve it.
2. Enumerate all the $2^n$ states $v_j^*$, determine their probabilities $q_j$, and pick
the state with the highest probability as the most likely one to which the
system will relax.
Here, we show only the second approach, which is based on the following
conjecture.
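Conjecture 3.1

\[ q_i = \prod_{j=1}^{n} q^{\frac{1}{2}(1 + v_{ij}^*)}\, (1 - q)^{1 - \frac{1}{2}(1 + v_{ij}^*)},
   \quad \text{where} \quad
   q = \frac{1}{2} \left( 1 - \frac{f(j/n) - f((j-1)/n)}{1/n} \right). \tag{23} \]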

This conjecture has the important application that it can be used to compute
the volume fraction directly. We believe that it is probably not exact, but in
fact provides an excellent approximation. It is in part based on our previous
work [5], where we found that the computed approximate solutions of (1.1) had
a white noise property. Since

\[ f'(x) = \int_{-\infty}^{\infty} y\, d\mu_x(y), \tag{24} \]

where $\mu_x = a(x)\,\delta_{-1} + (1 - a(x))\,\delta_{+1}$ and $a(x) = \frac{1}{2}(1 - f'(x))$,
if we let

\[ a_j \stackrel{\text{def}}{=} \sum_{i:\ v_{ij}^* = -1} q_i, \tag{25} \]

and if $a_n$ is defined by $a_n \in C^0(0,1)$ such that $a_n(x_j) = a_j$, then

\[ a_n \to a \quad \text{as } n \to \infty \text{ in } C^0(0,1). \tag{26} \]
4 Monte Carlo Simulations Based on the Langevin Equation

We use the model (17) and we apply Monte Carlo simulations to obtain
sequences which come close to approximating Problem 1.1. The target function
for this simulation is $f(x) = x(1 - x)$ with $\int_0^1 f(x)\,dx = 1/6$. The volume
fraction is computed by averaging over the replications. Namely,

(27)

Here, $m$ is the number of replications and $v_i$ is the vector of the derivatives for
the $i$-th replication. The index $j$ refers to the interval $A_j$, i.e., $[(j-1)/n,\, j/n)$,
where $n$ is the number of intervals in $(0,1)$. The result shown in this section
is based on twenty independent simulations for the mesh with $h = 1/200$.
To deduce the macroscopic shape, we average over all twenty replications, cf.
Figure 1.

Table 1. Data for the model (17) used in this section

Variable                              Value
spatial resolution                    1/200
number of independent replications    20
max no. of steps per replication      10^7
no. of steps for burn in              10^5
time increment per step               0.05
standard deviation for white noise    0.02
maximum change in alpha               0.5

Fig. 1. Macroscopic approximations. The left picture shows all twenty replications
(each replication has a different color) and the averaged state (thick smoother line).
The right picture shows the difference between the target function $f = x(1 - x)$ and
its computed shape. The spatial resolution is $h = 1/200$.

Fig. 2. Volume fractions. The volume fraction on the left is computed with $\sigma = 0.02$.
The volume fraction on the right is computed with the data given by Table 1 but
with a much smaller deviation, namely $\sigma = 0.005$. This shows that particles need
higher values of the deviation to discover states with lower energy.

5 Simulations Based on the Analysis of the Fokker-Planck Equation

In this section we use the formulae (23) and (25) to investigate the
approximation properties of this approach. We chose the
spatial resolution to be $h = 1/16$. With 16 elements on the $(0,1)$ segment
we have $2^{16} = 65536$ coefficients in the formula (23) and each state vector
has 16 components. We evaluate the values of $q_i$ and we select the state for
which $q_i$ is the biggest. In other words, we select the state with the highest
probability to exhibit the microscopic structure of the solution. The volume
fraction is computed using the formula (25). It seems that the discrete solution
corresponding to the maximum probability state minimizes the $L^2$-distance
to the target function. We chose f(x) = 113 sin(13x) for the calculations in this
section. The results are plotted in Figure 3.
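
The enumeration described in this section can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes that each interval independently takes the value $-1$ with probability $\frac{1}{2}(1 - \text{slope})$, the weight that the measure in (24) puts on $-1$, and that state probabilities are products over intervals as in Conjecture 3.1. The names `state_probabilities` and `volume_fraction` are hypothetical.

```python
import itertools
import numpy as np

def state_probabilities(f, n):
    """Return (states, probs): all 2**n sign vectors and their product-form probabilities."""
    x = np.arange(n + 1) / n
    slope = np.diff(f(x)) * n                  # (f(j/n) - f((j-1)/n)) / (1/n)
    p_minus = 0.5 * (1.0 - slope)              # assumed probability of -1 on interval j
    states = np.array(list(itertools.product((-1, 1), repeat=n)))
    probs = np.where(states == -1, p_minus, 1.0 - p_minus).prod(axis=1)
    return states, probs

def volume_fraction(states, probs):
    """a_j: total probability of the states whose j-th entry is -1, as in (25)."""
    return ((states == -1) * probs[:, None]).sum(axis=0)
```

With `n = 16` the arrays hold the 65536 states mentioned above; `states[np.argmax(probs)]` then picks the most probable state. Because the probabilities factorize over intervals under this assumption, the volume fraction returned by `volume_fraction` coincides with the per-interval weights.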

Fig. 3. The picture on the left shows the distribution of the coefficients of the
Probability Density Function given by (23). It is clear that the more the target
function deviates from a constant, the more isolated is the maximum among the rest
of the coordinates. The right picture shows the reconstruction of the volume fraction
using the formula (25). The darker line corresponds to the computed volume fraction,
the lighter line corresponds to the P1-projection of the exact function representing
the volume fraction.

6 Acknowledgment
Dennis Cox was supported in part by the grant NSF DMS-0204723. Petr
Kloucek was supported in part by the grant NSF DMS-0107539, by the grant
from TRW Foundation and by the grant NASA SECTP. The work of Daniel
Reynolds was performed in part under the auspices of the U.S. Department of
Energy by the University of California, Lawrence Livermore National
Laboratory under Contract No. W-7405-Eng-48. Pavel Solin was supported in part
by the Grant Agency of the Czech Republic under Grant No. GP102/01/D114.
This material is based upon work supported in part by Contract No. 74837-
001-0349 from the Regents of University of California (Los Alamos National
Laboratory) to William Marsh Rice University.

References
1. A.C. Antoulas and D.C. Sorensen. Approximation of large-scale dynamical
systems: An overview. Special Issue in Numerical Analysis and System Theory,
edited by S.L. Campbell, International J. of Applied Mathematics and
Computational Science, 11:1093-1121, 2001.
2. A.C. Antoulas and D.C. Sorensen. The Sylvester equation and approximate
balanced reduction. Special Issue in Numerical Analysis and System Theory, edited
by V. Blondel, D. Hinrichsen, J. Rosenthal, and P.M. van Dooren, Linear Algebra
and Its Applications, 2002. To appear.
3. J. M. Ball. Singularities and computation of minimizers for variational problems.
Oxford FoCM, Lecture Notes, 1999.
4. C. Carstensen. Numerical analysis of non-convex minimization problems
allowing microstructures. Zeitschrift für Angewandte Mathematik und Mechanik,
76(S2):437-438, 1996.
5. D. Cox, P. Kloucek, and D. R. Reynolds. The non-local relaxation of nonattainable
differential inclusions using a subgrid projection method: One dimensional
theory and computations. Technical Report 10, École Polytechnique Fédérale de
Lausanne, July 2001. To appear in: SIAM J. Sci. Comp., 2003.
6. D. Cox, P. Kloucek, and D. R. Reynolds. A subgrid projection method for
nonattainable differential inclusions. In: ENUMATH 2001, Numerical Mathematics
and Advanced Applications, F. Brezzi, A. Buffa, S. Corsaro, A. Murli (Eds.),
pages 575-584, 2001.
7. D. Cox, P. Kloucek, and D. R. Reynolds. On the asymptotically stochastic
computational modeling of microstructures. Future Generation Computer Systems,
1080:1-16, 2003.
8. B. Dacorogna and P. Marcellini. Implicit Partial Differential Equations.
Birkhäuser, 2000.
9. R. Jordan, D. Kinderlehrer, and F. Otto. The variational formulation of the
Fokker-Planck equation. SIAM J. Math. Anal., 29:1-17, 1998.
10. H. Risken. The Fokker-Planck equation: methods of solution and applications.
Springer-Verlag, Berlin-New York, 1984. Springer series in synergetics: 18.
A Pressure-Weighted Upwind Scheme in
Unstructured Finite-Element Grids

Masoud Darbandi 1, Kiumars Mazaheri-Body 2 and Shidvash Vakilipour 3

1 Sharif University of Technology, Tehran, P.O. Box 11365-8639, Iran
[email protected]
2 Tarbiat-Modares University, Tehran, P.O. Box 14115-111, Iran
[email protected]
3 Sharif University of Technology, Tehran, Iran [email protected]

Summary. The finite element method is a powerful tool for solving complex flows
in complex geometries, and unstructured grid topologies complement it by
increasing computational efficiency. Finite element volume methods, in turn, have
the advantage of conserving the conserved quantities within elements. Accurate
conservation statements, however, require suitable approximations at the cell
faces. Upwind-based schemes are widely used in convection-dominated flows, but
they do not incorporate the details of the pressure field in the approximation.
A pressure-weighted upwind scheme is therefore a better choice for flow fields
with high pressure gradients. In this work, a pressure-weighted upwind scheme is
extended to incompressible flow on unstructured grids. A remedy is then given
for the problem associated with equal-order pressure and velocity
interpolations. Finally, the extended formulations are validated against
benchmark problems involving small- and large-scale recirculation zones. The
current results compare very well with the benchmark solutions.

1 Introduction

The rapid progress in unstructured grid generation techniques has encouraged
CFD code developers to extend their formulations in order to solve fluid
flow and heat transfer problems on unstructured grids [1]. Computational
methods are always required to improve their accuracy in solving more complex,
realistic configurations at lower computational cost. Limited computer memory
forces computational methods to use a limited number of nodes, which can in
turn degrade the achieved accuracy. An unstructured grid can compensate for
the degraded accuracy through suitable grid clustering within the zones of high
flow field gradients.
Basically, there are two potential numerical instabilities associated with
Galerkin-based methods [2]. The first is spurious oscillation in the
flow field due to the presence of advection terms. The second is
due to using inappropriate combinations of velocity and pressure
interpolations. Generally speaking, the standard Galerkin-based formulations need to
be modified when convection is the dominant physics. The Petrov-Galerkin
[3], Taylor-Galerkin [4], and Galerkin/least-squares [5] methods are a number of
effective remedies for this problem.
Although the advantages of using unstructured grids in finite element volume
methods are many [6, 7], the instability issues still persist. In fact, the key point
is the correct approximation of the cell face velocities, which is still
challenging. Normally, upwind-biased interpolations do not take into account
the weight of the pressure field in flow acceleration/deceleration. Therefore, the
pressure-weighted upwinding scheme is used as an important alternative [6].
Unfortunately, this scheme cannot be taken as a solution to the consequences
of using equal-order interpolations [8]. In this work, a physically based
pressure-weighted upwind scheme is introduced and extended for use with
appropriate combinations of velocity and pressure interpolations. In addition, a
robust prescription for accommodating equal-order interpolations is given.

2 Governing Equations
In the present study, we are concerned with two-dimensional incompressible
steady flow. The governing equations consist of the conservation statements
for mass and momentum. The non-dimensional vector form of the governing
equations is given by

\[ \nabla^* \cdot \mathbf{V}^* = 0 \tag{1} \]

\[ Re \left[ \nabla^* \cdot (\mathbf{V}^* \mathbf{V}^*) + \nabla^* p^* \right] = \nabla^{*2} \mathbf{V}^* \tag{2} \]

where the lengths $x$ and $y$, the velocity $\mathbf{V}$, and the pressure $p$ are
nondimensionalized with respect to a characteristic length $L_\infty$ (e.g.,
$x^* = x/L_\infty$, $y^* = y/L_\infty$), a reference velocity $V_\infty$ (e.g.,
$\mathbf{V}^* = \mathbf{V}/V_\infty$), and a reference density $\rho_\infty$ (e.g.,
$p^* = p/(\rho_\infty V_\infty^2)$).

Fig. 1. A part of unstructured grid representing the elements and a constructed cell

The solution domain is broken into a large number of triangular elements
which are distributed in an unstructured form. The elements fully cover the
solution domain without overlapping. Figure 1 shows a small part of the
solution domain. Nodes are located at the triangle vertices; they are the
locations of the unknown variables. Each triangle has three main neighbors,
see element ABC in Fig. 1. There is no limit on the number of elements that
meet at a node. For example, the shaded area in Fig. 1 shows six triangles
which surround node P. Therefore, to utilize the benefits of cell-centered
schemes, each element is divided into three quadrilaterals with the help of its
three medians. The medians are shown by dashed lines in Fig. 1. The cells
are then constructed from the proper assemblage of these sub-quadrilaterals.
As is seen, irrespective of the shape and distribution of the elements, each
node is surrounded by a number of sub-quadrilaterals. The proper assemblage
of neighboring sub-quadrilaterals around any non-boundary node creates a
polygonal cell. On an unstructured grid, it is possible to have a hybrid mesh
composed of polygons with different numbers of sides, because the number of
elements that share a specific node is not fixed.
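
The cell construction can be illustrated with a short sketch. It is a minimal example under the assumption that each sub-quadrilateral is formed by a vertex, the midpoints of its two adjacent edges, and the element centroid (where the medians meet); the function names are hypothetical.

```python
import numpy as np

def polygon_area(pts):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    x, y = np.asarray(pts, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def sub_quadrilaterals(tri):
    """Split a triangle into three quadrilaterals, one per vertex.

    Each quadrilateral is: vertex, midpoint of the next edge, centroid,
    midpoint of the previous edge.
    """
    tri = np.asarray(tri, dtype=float)
    centroid = tri.mean(axis=0)
    quads = []
    for k in range(3):
        a, b, p = tri[k], tri[(k + 1) % 3], tri[(k - 1) % 3]
        quads.append(np.array([a, 0.5 * (a + b), centroid, 0.5 * (a + p)]))
    return quads
```

Each sub-quadrilateral carries one third of the element area, and it contributes two segments (midpoint to centroid) to the boundary of the cell around its vertex, so six elements meeting at a node give a cell with twelve faces.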

3 Computational Modelling

To utilize the advantages of finite element volume methods, the governing
equations are first integrated over the shaded area, i.e., the cell shown in
Fig. 1. Applying the Gauss divergence theorem to the dimensional form
of the governing equations leads to

\[ \oint \mathbf{V} \cdot d\mathbf{A} = 0 \tag{3} \]

\[ \oint u (\rho \mathbf{V}) \cdot d\mathbf{A} = -\oint p \, dA_x + \oint (\mu \nabla u) \cdot d\mathbf{A} \tag{4} \]

\[ \oint v (\rho \mathbf{V}) \cdot d\mathbf{A} = -\oint p \, dA_y + \oint (\mu \nabla v) \cdot d\mathbf{A} \tag{5} \]

where $\mathbf{V} = u\mathbf{i} + v\mathbf{j}$, $p$, $\rho$, and $\mu$ represent velocity,
pressure, density, and molecular viscosity, respectively. The above integrals are
evaluated over the surface which encloses each cell. The cell area is indicated
by $A$. The above equations are discretized using a finite difference scheme and
finite element interpolations. In the above expressions,
$d\mathbf{A} = dA_x \mathbf{i} + dA_y \mathbf{j} = \Delta y\, \mathbf{i} - \Delta x\, \mathbf{j}$
is calculated on each cell face. Using this definition, the above integrals can be
evaluated by summation over the faces that enclose the cell, i.e.,
\[ \sum_{i=1}^{ns} \left[ \rho (u \, dA_x + v \, dA_y) \right]_i = 0 \tag{6} \]

\[ \sum_{i=1}^{ns} \left[ \rho \bar{u} (u \, dA_x + v \, dA_y) \right]_i
   = -\sum_{i=1}^{ns} (p \, dA_x)_i
   + \sum_{i=1}^{ns} \left[ \mu \left( \frac{\partial u}{\partial x} dA_x + \frac{\partial u}{\partial y} dA_y \right) \right]_i \tag{7} \]

\[ \sum_{i=1}^{ns} \left[ \rho \bar{v} (u \, dA_x + v \, dA_y) \right]_i
   = -\sum_{i=1}^{ns} (p \, dA_y)_i
   + \sum_{i=1}^{ns} \left[ \mu \left( \frac{\partial v}{\partial x} dA_x + \frac{\partial v}{\partial y} dA_y \right) \right]_i \tag{8} \]

where $i$ counts the cell faces from 1 to $ns$. The number of cell faces
around node P in Fig. 1 is 12. The bars over $u$ and $v$ indicate that these variables
are approximated from the known values of the preceding iteration. These
estimates are necessary in order to linearize the nonlinear convection terms.
The rest of the procedure is to relate the cell face values (identified by lower
case letters such as $u$, $v$, and $p$) directly to the nodal values
(identified by upper case letters such as $U$, $V$, and $P$), where the
unknown variables are located. A simple idea for treating the right-hand-side
terms is to use the finite element shape functions $N_{j=1 \ldots 3}$, i.e.,

\[ p_i = \sum_{j=1}^{3} N_{ij} P_j \tag{9} \]

\[ \left. \frac{\partial \phi}{\partial z} \right|_i = \sum_{j=1}^{3} \frac{\partial N_{ij}}{\partial z} \Phi_j \tag{10} \]

where $p_i$ is the value of $p$ at the mid-point of the $i$th cell face. The index
$j$ counts the nodes of the element that contains the $i$th cell face.
Additionally, the variable $z$ represents either the $x$ or $y$ coordinate
and $\phi$ represents either the $u$ or $v$ velocity component. In the above
expressions, lower and upper case letters represent cell face and nodal values,
respectively.

Fig. 2. The velocity upwinding strategy within an element
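
The shape-function interpolation in (9) and (10) can be sketched as below. This is an illustrative example that assumes linear (three-node) triangular shape functions; the function name and the sample data are hypothetical.

```python
import numpy as np

def shape_functions(tri, point):
    """Linear shape functions N_1..N_3 at `point` and their (constant) gradients.

    tri: three (x, y) node coordinates; returns N with shape (3,) and
    grad with shape (3, 2), where row j holds (dN_j/dx, dN_j/dy).
    """
    tri = np.asarray(tri, dtype=float)
    a = np.column_stack((np.ones(3), tri))     # rows: [1, x_j, y_j]
    coeffs = np.linalg.inv(a)                  # column j: coefficients of N_j = c0 + c1 x + c2 y
    n = np.array([1.0, point[0], point[1]]) @ coeffs
    grad = coeffs[1:, :].T
    return n, grad
```

For nodal pressures `P` and nodal velocities `Phi` on one element, `n @ P` gives the face value as in (9) and `grad.T @ Phi` gives the two derivatives as in (10). Because the shape functions are linear, any linear field is reproduced exactly.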
The above treatment completes the evaluation of the pressure and diffusion
terms at cell face $i$. However, more sophisticated expressions are required to
treat the convection terms. In fact, the treatment should not disregard the
convection-diffusion physics. To mimic the correct physics of convection,
the convection term on the left-hand side is upwinded. Considering the $i$th cell
face in Fig. 2, one general form is given by

\[ \phi_i = \phi_k + \left( \frac{\partial \phi}{\partial s} \right)_i \Delta s_{ki} \tag{11} \]

which has been written in the streamwise direction at the mid-point of the $i$th cell
face. The length $\Delta s$ is a geometry-sensitive parameter, shown in Fig. 2 as $s$.
Then, we need to determine the gradient of $\phi$ along the streamline. We try
to approximate this gradient using the original governing PDEs. In other
words, one meaningful approximation can be obtained by writing the revised
momentum equations in the streamwise direction, i.e.,

(12)

where $V = \sqrt{u^2 + v^2}$ is the total velocity at the cell mid-point and the source
term $S_\phi$ represents either $\partial p/\partial x$ in treating the $x$-momentum or
$\partial p/\partial y$ in treating the $y$-momentum. The substitution of Eq. (12) in
Eq. (11) results in

(13)

As is observed, the influence of pressure is now included in the
correction part of Eq. (11). In the finite-element context, this statement
can be revised to

(14)

where $L_i$ is an appropriate diffusion length scale [6]. This length can be
estimated in a specified triangle by discretizing the diffusion terms using central
differencing.
Equation (14) shows that $\phi_i$ appears on both sides of the equation. Treating
$\phi_i$ in the diffusion term as a lagged quantity would give the diffusion term
a passive role in the formulation. To give it an active role, it is not
lagged, and its contribution is moved to the left-hand side of Eq. (14). A
suitable rearrangement of the new equation in terms of our major dependent
variables, i.e., $\Phi_j$ and $P_j$, yields

\[ \phi_i = \sum_{j=1}^{3} \alpha_{ij} \Phi_j + \sum_{j=1}^{3} \beta_{ij} P_j + \gamma_i \tag{15} \]

where $\alpha$, $\beta$, and $\gamma$ represent matrix, matrix, and vector coefficients,
respectively. The above statement indicates that $\phi$ ($\equiv u, v$) at a cell face
can be approximated by the proper assemblage of $\Phi$ and $P$ influences. In fact,
this approximation can be regarded as a pressure-weighted upwind scheme.

One major difficulty with the continuity equation is that it contains no
explicit pressure term, even though it has to govern the pressure field.
Past investigation has shown that ignoring this point can result in a
non-physical wavy solution [9]. The most important reason for this
non-physical wavy solution is the use of equal-order pressure and
velocity interpolations in Eqs. (6)-(8). The use of unequal-order interpolations is
a general remedy to suppress the non-physical solution. However,
the current idea is to use more sophisticated interpolations,
such as Eq. (15), which enforce a direct role of the pressure field in the continuity
equation. Therefore, using Eq. (15) in the continuity equation may eliminate the
need for unequal-order interpolations. In other words, Eq. (3) should no longer
permit the occurrence of a non-physical solution in the domain. Although this
strategy seems to work well in theory, some deficiencies have been encountered
in practice. For example, defining $m = \rho (u \, dA_x + v \, dA_y)$, Eqs. (6)-(8) for
Euler flow can be rewritten as

\[ \sum_{i=1}^{ns} m_i = 0 \tag{16} \]

(17)

(18)

In a one-dimensional context, Reference [9] shows that the above equations
can still result in an undesirable wavy solution under special circumstances.
However, the wavy solution can be eliminated if Eq. (14) is suitably modified.
Therefore, instead of using unequal-order interpolation functions, we introduce
and utilize new $\phi$'s at cell faces. Considering the generic form of upwinding
given by Eq. (11), the new velocity gradients are given by
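
\[ \rho V \frac{\partial \tilde{\phi}}{\partial s} = \nabla \cdot \mu \nabla \phi + S_\phi
   - \phi \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) \tag{19} \]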

where the new additional gradient terms in the parentheses provide new
statements for the velocity components. Considering the new definitions, Eq. (13)
is revised to

The new additional gradient terms are approximated using the approach
employed in Eq. (10). The rest of the equation is treated similarly to Eq. (14).
The tilde over $\phi$ indicates that these velocity statements are used in the
continuity equation.

4 Results and Discussion

The derived formulations are tested using the standard square cavity [10] and
triangular cavity [11] benchmark problems, which involve many complex
recirculation zones. The first problem is the flow in a square cavity driven by
its upper lid. The problem is tested at Re = 3200. Figure 3(left) shows
a typical non-uniform unstructured grid distribution in the cavity. The grid
generator has been developed by the current authors. It is capable of generating
different types of unstructured grid topology. As is observed, the grid has
been properly refined in regions with high flow field gradients. Figure 3(right)
depicts the streamlines in the cavity. The recirculation zones in the bottom corners
and the top-left corner reflect the complexity of this flow field. They have been
detected successfully and accurately.
To investigate the advantages of using the extended formulations on an
unstructured grid, the cavity is tested on both uniform and non-uniform
unstructured grids. The grid resolution is 71 x 71 for the uniform grid. The total
number of nodes in the non-uniform grid is 5508, which is close to that of the
uniform grid, i.e., 5684. The numbers of cells are 10734 and 11086, respectively.

Fig. 3. A typical unstructured grid and the obtained streamlines in the cavity

[Figure 4 plots. Left: centerline velocity profiles (legend: Nonuniform Grid 71x71, Uniform Grid 71x71, Nonuniform Grid 51x51, Uniform Grid 51x51, Benchmark). Right: U and P residuals versus iteration.]

Fig. 4. The centerline velocities in the cavity and the convergence histories

Figure 4(left) shows the centerline u and v velocities for both uniform
and non-uniform grid types. The current results are compared with each other
and with those of the benchmark [10]. The figure shows that the results of the
non-uniform grid distribution are similar to those of the uniform grid. Additionally,
they are in good agreement with the benchmark solution. Figure 4(right) shows the
residual histories for both the U velocity component and the P field. The V velocity
component history is very similar to that of U. The histories are presented for
both uniform and non-uniform grids. Although the two grids give equal accuracy in
Fig. 4(left), their convergence behaviours differ. As is seen, convergence
instabilities are dominant on the uniform grid. However, the non-uniform unstructured
grid exhibits a smooth and stable residual reduction.
The second test problem is the steady recirculating viscous flow in an
equilateral triangular cavity. A primary eddy and several secondary eddies in
different regions indicate the complexity of the flow field. Figure 5 shows the
geometry of the cavity with two typical unstructured grid distributions. The
top horizontal wall slides with a constant velocity. The non-uniform topology
suitably clusters the grid around the corners. This refinement helps to achieve
more accurate results with a limited number of mesh nodes.

Fig. 5. Using uniform and non-uniform unstructured grid topologies in triangular cavity

Fig. 6. The streamlines and iso-bars in triangular cavity

Figure 6 shows the streamline patterns (left) and isobar lines (right) using the
non-uniform grid distribution. The Reynolds number is 500. The total number of
nodes is 1135. As is observed, there is one primary eddy on top and two
secondary eddies under it. The eddies shrink at the lower levels. Additionally,
there is one irregular recirculation on the left edge which has been successfully
detected. Unfortunately, there is no quantitative report of velocity magnitudes
for this benchmark case. However, a comparison has been made
with other references such as Refs. [11, 12]. Table 1 shows that the x and
y coordinates of the eddy centers are in good agreement with those
obtained from the benchmark solutions.

Table 1. A comparison of the (x, y) locations of the vortices in the triangular cavity

                  top vortex      mid vortex       bottom vortex   edge vortex
Current results   0.038, 0.639    -0.047, 0.298    0.000, 0.064    -0.389, 0.706
Guermond [12]     0.042, 0.632    -0.049, 0.294    0.000, 0.062    -0.395, 0.701
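
The agreement reported in Table 1 can be quantified with a short calculation. The sketch below is only an illustration: it copies the eddy-center coordinates from Table 1 and computes the Euclidean distance between the current and the reference centers. The function name `center_distance` is an assumption introduced here, not part of the original work.

```python
import math

# Eddy-center coordinates (x, y) copied from Table 1.
current = {
    "top": (0.038, 0.639),
    "mid": (-0.047, 0.298),
    "bottom": (0.000, 0.064),
    "edge": (-0.389, 0.706),
}
guermond = {
    "top": (0.042, 0.632),
    "mid": (-0.049, 0.294),
    "bottom": (0.000, 0.062),
    "edge": (-0.395, 0.701),
}


def center_distance(a, b):
    """Euclidean distance between two eddy centers (hypothetical helper)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


distances = {name: center_distance(current[name], guermond[name]) for name in current}
max_distance = max(distances.values())

for name, d in distances.items():
    print(f"{name:>6} vortex: distance = {d:.4f}")
```

With these values every center lies within 0.01 length units of the reference location.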

References
1. Mavriplis, D.J. (1997): Unstructured Grid Techniques, Annu. Rev. Fluid Mech.,
29, 473-514.
2. Tezduyar, T.E. (1992): Finite Element Computation of Unsteady Incompressible
Flows Involving Moving Boundaries and Interfaces and Iterative Solution
Strategies, AGARD-R-787, France, Chap. 3.
3. Hughes, T.J.R., Franca, L.P., and Balestra, M. (1986): A New Finite Element
Formulation for Computational Fluid Dynamics: V, Comput. Methods Appl. Mech.
Eng., 59, 85-99.
4. Donea, J. (1984): A Taylor-Galerkin Method for Convective Transport Problems,
Int. J. Numer. Methods Eng., 20, 101-119.
5. Hansbo, P., and Szepessy (1990): A Velocity-Pressure Streamline Diffusion Finite
Element Method for the Incompressible Navier-Stokes Equations, Comput.
Methods Appl. Mech. Eng., 84, 175-192.
6. Darbandi, M., and Schneider, G.E. (1999): Application of an All-Speed Flow
Algorithm to Heat Transfer Problems, Numer. Heat Trans. A, 35, 695-715.
7. Baliga, B.R., and Patankar, S.V. (1987): Elliptic Systems: Finite-Element Method
II, Handbook of Numerical Heat Transfer, Edited by W.J. Minkowycz, E.M. Sparrow,
G.E. Schneider, R.H. Pletcher, John Wiley, New York, 421-461.
8. Prakash, C. (1986): An Improved Control Volume Finite-Element Method for
Heat and Mass Transfer, and for Fluid Flow Using Equal-Order Velocity-Pressure
Interpolation, Numer. Heat Trans., 9, 253-276.
9. Darbandi, M., Schneider, G.E., and Bostandoost, S.M. (2003): Improvement of
Velocity Role in Coupling of Mass and Momentum Equations, AIAA Paper 2003-0856,
41st AIAA Aero. Sciences Meeting, Reno, NV, Jan. 6-9.
10. Ghia, U., Ghia, K.N., and Shin, C.T. (1982): High-Re Solutions for Incompressible
Flow Using the Navier-Stokes Equations and a Multigrid Method, J. Comput.
Physics, 48, 387-411.
11. Ribbens, C.J., Watson, L.T., and Wang, C.Y. (1994): Steady Viscous Flow in a
Triangular Cavity, J. Comput. Physics, 112, 173-181.
12. Guermond, J.L., and Quartapelle, L. (1997): Calculation of Incompressible Viscous
Flows by an Unconditionally Stable Projection FEM, J. Comput. Phys., 132,
12-33.
Discontinuous Galerkin Finite Element Method
for the Numerical Solution of Viscous
Compressible Flows

Vít Dolejší

Charles University Prague, Faculty of Mathematics and Physics, Sokolovská 83,
18675 Prague, Czech Republic [email protected]

Summary. We deal with the numerical solution of the compressible Navier-Stokes
equations with the aid of the discontinuous Galerkin finite element (DG FE)
approach with the nonsymmetric interior penalty terms. The linearization of diffusive
terms and the treatment of the boundary conditions are discussed. Several numerical
examples demonstrating the efficiency of the numerical method are presented.

1 Introduction
We deal with the numerical solution of the compressible Navier-Stokes equations
with the aid of the discontinuous Galerkin finite element method
(DGFEM). This method has become quite popular and it is discussed in a number
of papers. For a review of DG methods, see [4] or [5]. Let us mention the
papers [1] and [2] dealing with the numerical simulation of compressible flows,
where the mixed formulation is applied to the treatment of the viscous terms.
We develop the so-called DGFEM with nonsymmetric interior penalty
terms. This method was applied to the solution of a scalar nonlinear
convection-diffusion equation in [8] where a complete numerical analysis is
presented. The extension of DGFEM to the system of the Navier-Stokes equa-
tions is straightforward (see preliminary results in [6]) but some suitable lin-
earization of diffusive terms has to be performed. That is the subject of this
paper.
In Section 2 the continuous problem describing compressible flow is formu-
lated. DGFE discretization is introduced in Section 3, where also the lineariza-
tion of diffusive terms is discussed. Several numerical examples demonstrating
the efficiency of the method are presented in Section 4.

2 Continuous problem
Let $\Omega \subset \mathbb{R}^2$ be a bounded plane domain and $T > 0$. We set
$Q_T = \Omega \times (0, T)$ and by $\partial\Omega$ we denote the boundary of $\Omega$,
which consists of several disjoint parts. We distinguish inlet $\Gamma_I$, outlet
$\Gamma_O$ and impermeable walls $\Gamma_W$ on $\partial\Omega$.
We want to find a vector-valued function $w : Q_T \to \mathbb{R}^4$ such that

$$\frac{\partial w}{\partial t} + \sum_{s=1}^{2} \frac{\partial f_s(w)}{\partial x_s} = \sum_{s=1}^{2} \frac{\partial R_s(w, \nabla w)}{\partial x_s} \quad \text{in } Q_T, \qquad (1)$$

where

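$$w = w(x, t), \quad x \in \Omega, \ t \in (0, T), \qquad (2)$$
$$w = (\rho, \rho v_1, \rho v_2, E)^T \in \mathbb{R}^4,$$
$$f_i(w) = (\rho v_i, \ \rho v_1 v_i + \delta_{1i}\, p, \ \rho v_2 v_i + \delta_{2i}\, p, \ (E + p)\, v_i)^T, \quad i = 1, 2,$$
$$R_i(w, \nabla w) = \Big(0, \ \tau_{i1}, \ \tau_{i2}, \ \tau_{i1} v_1 + \tau_{i2} v_2 + \frac{\gamma}{Re\, Pr} \frac{\partial \theta}{\partial x_i}\Big)^T, \quad i = 1, 2,$$
$$\tau_{ij} = \frac{1}{Re} \Big[ \Big( \frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i} \Big) - \frac{2}{3}\, \mathrm{div}\, v \, \delta_{ij} \Big], \quad i, j = 1, 2.$$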

To system (1) we add the thermodynamical relations

(3)

We use the following notation: $v = (v_1, v_2)^T$ - velocity vector, $\rho$ - density,
$p$ - pressure, $\theta$ - temperature, $E$ - total energy, $\gamma$ - Poisson adiabatic
constant, $Re$ - Reynolds number, $Pr$ - Prandtl number.
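
To make the structure of the inviscid fluxes $f_i(w)$ concrete, the following sketch evaluates them for a single state vector. It is an illustration only: the pressure is passed in as an argument because the thermodynamical relations (3) are not reproduced here, and the function name `inviscid_flux` and the sample numbers are assumptions made for this example.

```python
import numpy as np


def inviscid_flux(w, p, i):
    """Flux f_i(w) for w = (rho, rho*v1, rho*v2, E), pressure p, direction i in {1, 2}."""
    rho, m1, m2, E = w
    v = np.array([m1, m2]) / rho          # velocity components v1, v2
    vi = v[i - 1]
    delta = np.zeros(2)
    delta[i - 1] = 1.0                    # Kronecker deltas (delta_1i, delta_2i)
    return np.array([
        rho * vi,
        rho * v[0] * vi + delta[0] * p,
        rho * v[1] * vi + delta[1] * p,
        (E + p) * vi,
    ])


# Illustrative state: rho = 1, v = (2, 0), E = 5, with an assumed pressure p = 1.
w_sample = np.array([1.0, 2.0, 0.0, 5.0])
f1 = inviscid_flux(w_sample, p=1.0, i=1)
f2 = inviscid_flux(w_sample, p=1.0, i=2)
```

For this state the flow is aligned with the first axis, so `f2` contains only the pressure contribution in the second momentum equation.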
System (1) is equipped with the initial condition

$$w(x, 0) = w^0(x), \quad x \in \Omega, \qquad (4)$$


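and the following set of boundary conditions on appropriate parts of the
boundary:

$$\text{a)}\ \rho|_{\Gamma_I \times (0,T)} = \rho_D, \qquad \text{b)}\ v|_{\Gamma_I \times (0,T)} = v_D = (v_{D1}, v_{D2})^T, \qquad (5)$$
$$\text{c)}\ \sum_{j=1}^{2} \Big( \sum_{i=1}^{2} \tau_{ij} n_i \Big) v_j + \frac{\gamma}{Re\, Pr} \frac{\partial \theta}{\partial n} = 0 \quad \text{on } \Gamma_I \times (0, T);$$

$$\text{a)}\ v|_{\Gamma_W \times (0,T)} = 0, \qquad \text{b)}\ \frac{\partial \theta}{\partial n}\Big|_{\Gamma_W \times (0,T)} = 0; \qquad (6)$$

$$\text{a)}\ \sum_{i=1}^{2} \tau_{ij} n_i = 0, \ j = 1, 2, \qquad \text{b)}\ \frac{\partial \theta}{\partial n} = 0 \quad \text{on } \Gamma_O \times (0, T). \qquad (7)$$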

The problem of solving the compressible Navier-Stokes equations, equipped
with the above initial and boundary conditions, will be denoted by (CFP)
(compressible flow problem).

3 DGFE discretization

3.1 Triangulation

By $\Omega_h$ we denote a polygonal approximation of the domain $\Omega$. Let
$\mathcal{T}_h$ ($h > 0$) denote a standard triangulation of the closure
$\overline{\Omega}_h$ of the domain $\Omega_h$ into a finite number of closed triangles.
We set $h_K = \mathrm{diam}(K)$, $h = \max_{K \in \mathcal{T}_h} h_K$. All elements of
$\mathcal{T}_h$ will be numbered so that $\mathcal{T}_h = \{K_i\}_{i \in I}$, where
$I \subset Z^+ = \{0, 1, 2, \dots\}$ is a suitable index set. If two elements
$K_i, K_j \in \mathcal{T}_h$ have a common edge, we call them neighbours and put
$\Gamma_{ij} = \partial K_i \cap \partial K_j$. For $i \in I$ we set
$s(i) = \{j \in I;\ K_j \text{ is a neighbour of } K_i\}$. The boundary
$\partial\Omega_h$ is formed by a finite number of edges of elements $K_i$ adjacent to
$\partial\Omega_h$. We denote all these boundary faces by $S_j$, where
$j \in I_b \subset Z^- = \{-1, -2, \dots\}$, and set
$\gamma(i) = \{j \in I_b;\ S_j \text{ is an edge of } K_i\}$, $\Gamma_{ij} = S_j$ for
$K_i \in \mathcal{T}_h$ such that $S_j \subset \partial K_i$, $j \in I_b$.
For $K_i$ not containing any boundary edge $S_j$ we set $\gamma(i) = \emptyset$. Obviously,
$s(i) \cap \gamma(i) = \emptyset$ for all $i \in I$. Now, if we write
$S(i) = s(i) \cup \gamma(i)$, we have

$$\partial K_i = \bigcup_{j \in S(i)} \Gamma_{ij}, \qquad \partial K_i \cap \partial\Omega_h = \bigcup_{j \in \gamma(i)} \Gamma_{ij}. \qquad (8)$$
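
The index sets $s(i)$ and $\gamma(i)$ are simple to build from an element list. The sketch below is one possible way to do so and is not taken from the paper: it assumes triangles are given as triples of vertex numbers, and the function name `edge_topology` and the order in which boundary edges receive their negative indices are choices made only for this illustration.

```python
from collections import defaultdict


def edge_topology(triangles):
    """Return (s, gamma): neighbour sets s(i) and boundary-edge index sets gamma(i)."""
    edge_to_elems = defaultdict(list)
    for i, tri in enumerate(triangles):
        for a, b in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
            edge_to_elems[frozenset((a, b))].append(i)

    s = {i: set() for i in range(len(triangles))}
    gamma = {i: set() for i in range(len(triangles))}
    next_boundary_index = -1              # boundary faces S_j use j = -1, -2, ...
    for edge, elems in sorted(edge_to_elems.items(), key=lambda kv: sorted(kv[0])):
        if len(elems) == 2:               # interior edge shared by K_i and K_j
            i, j = elems
            s[i].add(j)
            s[j].add(i)
        else:                             # boundary edge belongs to one element only
            gamma[elems[0]].add(next_boundary_index)
            next_boundary_index -= 1
    return s, gamma


# Unit square split into two triangles along a diagonal (vertices 0..3).
triangles = [(0, 1, 2), (0, 2, 3)]
s, gamma = edge_topology(triangles)
```

Here each triangle has one neighbour and two boundary edges, and the interior and boundary index sets of an element never overlap because they use non-negative and negative indices respectively.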

Moreover, for $i \in I$, by $\gamma_D(i)$ we denote the subset of $\gamma(i)$ formed by
such indexes $j$ that the faces $\Gamma_{ij}$ approximate the parts of $\partial\Omega$
where the Dirichlet boundary condition is prescribed for at least one component of $w$.
Then, for the Navier-Stokes equations, with respect to (5)-(7) we have

$$\bigcup_{i \in I} \bigcup_{j \in \gamma_D(i)} \Gamma_{ij} = \Gamma_I \cup \Gamma_W. \qquad (9)$$

Moreover, we set
$$\gamma_N(i) = \gamma(i) \setminus \gamma_D(i), \qquad (10)$$
where a Neumann type of boundary condition is prescribed for all components
of $w$.
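Furthermore, we use the following notation: $n_{ij} = ((n_{ij})_1, (n_{ij})_2)$ = unit
outer normal to $\partial K_i$ on the edge $\Gamma_{ij}$ ($n_{ij}$ is a constant vector on
$\Gamma_{ij}$) and $|\Gamma_{ij}|$ = length of the edge $\Gamma_{ij}$. Over the triangulation
$\mathcal{T}_h$ we define the broken Sobolev space
$$H^k(\Omega, \mathcal{T}_h) = \{v;\ v|_K \in H^k(K)\ \forall K \in \mathcal{T}_h\}. \qquad (11)$$
For $v \in H^1(\Omega, \mathcal{T}_h)$ we set

$$\langle v \rangle_{\Gamma_{ij}} = \frac{1}{2} \big( v|_{\Gamma_{ij}} + v|_{\Gamma_{ji}} \big) \quad \text{and} \quad [v]_{\Gamma_{ij}} = v|_{\Gamma_{ij}} - v|_{\Gamma_{ji}}, \qquad (12)$$

denoting the average and jump of the traces of $v$ on $\Gamma_{ij} = \Gamma_{ji}$, respectively.
Obviously, $\langle v \rangle_{\Gamma_{ij}} = \langle v \rangle_{\Gamma_{ji}}$, $[v]_{\Gamma_{ij}} = -[v]_{\Gamma_{ji}}$, and $[v]_{\Gamma_{ij}} n_{ij} = [v]_{\Gamma_{ji}} n_{ji}$.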
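
The symmetry properties stated above are easy to check numerically. The sketch below assumes the usual definitions (average as half the sum of the two traces, jump as the trace from $K_i$ minus the trace from $K_j$); the trace values and the normal vector are illustrative numbers, not data from the paper.

```python
def average(v_ij, v_ji):
    """Average of the two traces on the edge shared by K_i and K_j."""
    return 0.5 * (v_ij + v_ji)


def jump(v_ij, v_ji):
    """Jump of the traces as seen from element K_i."""
    return v_ij - v_ji


v_i, v_j = 3.0, 1.0                  # traces from K_i and K_j (illustrative values)
n_ij = (0.6, 0.8)                    # unit outer normal to K_i on the shared edge
n_ji = (-0.6, -0.8)                  # unit outer normal to K_j, opposite direction

jump_normal_ij = tuple(jump(v_i, v_j) * c for c in n_ij)
jump_normal_ji = tuple(jump(v_j, v_i) * c for c in n_ji)
```

The product of jump and normal is the same from either side, which is why sums over interior edges can be restricted to $j < i$.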

3.2 Approximate solution

We derive the discretization of problem (CFP) with the aid of the DGFEM.
The approximate solution $w_h$ as well as the test functions $\varphi_h$ are elements of the
finite-dimensional space of vector-valued functions

(13)

where
(14)
$p \in Z^+$ and $P^p(K)$ denotes the space of all polynomials on $K$ of degree $\le p$.
Assuming that $w$ is a classical sufficiently regular solution of problem
(CFP) and $\varphi \in [H^2(\Omega, \mathcal{T}_h)]^4$, we multiply equation (1) by $\varphi$,
integrate over $K_i \in \mathcal{T}_h$, apply Green's theorem, sum over all
$K_i \in \mathcal{T}_h$ and with the aid of (8) we arrive at the identity

(15)

j
2
- L L LRs(w, "ilw)(nij)s' 'P dS = O.
iEI jE'"Y(i) r" 8=1

In order to obtain a stable numerical method we add to (15) some stabilization
terms which vanish for a smooth solution $w$. In order to define a
well-posed scheme we have to linearize the viscous terms $R_s(w, \nabla w)$. From
(2) we obtain

R 1 (w, "ilw) (16)


o
3'2 Re1WI [2 (~
8XI -
~
WI
£!£1.) (
aXI -
8W 3
8X2 -
W3
WI
£!£1.)]
8X2

3'2 ReWl
1 [2 (awaX23 - aX2 - (~
Wl £.!!!.J..)
!£l!.
aXl - ~
Wl aWl)]
aXl

where $R_s^{(r)} = R_s^{(r)}(w, \nabla w)$ denotes the $r$-th component of $R_s$ ($s = 1, 2$, $r = 2, 3$).
Now for $w = (w_1, \dots, w_4)^T$ and $\varphi = (\varphi_1, \dots, \varphi_4)^T$ we define the
vector-valued functions
D 1(w, Vw, ep, Vep) (17)


o
~3 _1_
Re Wl [2 (0£2
aXl _ '£l.
Wl £.!!!.J..)
aXl - (~ ~ £.!!!.J..)]
aX2 - Wl aX2

Re Wl
1 [(~
aXl - ~ aXl + aX2 - '£l.
Wl £.!!!.J..) Wl aWl)]
aX2(0£2
W2 D(2)
Wl 1
+ !!'.!l.
Wl
D(3) +
1
"( [£<.e.i _ .'£±
RePrwl aXl
£.!!!.J.. _
Wl aXl
-..L
Wl
(w 2
012
aXl
+ w3 ~)
aXl

Re Wl
1 [(~
aXl - ~ aWl)
Wl aXl
+ (0£2
aX2 - '£l.
Wl
aWl)]
aX2

3'2 ReWl
1 [2 (~
aX2 - Wl aWl)
'£l.
aX2 - (0£2
aXl - Wl £.!!!.J..)]
'£l.
aXl
W2 D(2)
Wl 2
+ W3Wl
D(3)
2
+ RePrWl
'Y [£<.e.i
aX2
_ .'£± aWl
Wl aX2
_ -..L
Wl
(w 2
012
aX2
+ w2 ~)
aX2

+~ (W2 + 'P2 + W3'P2) ~ ]


where $D_s^{(r)}$ denotes the $r$-th component of $D_s$ ($s = 1, 2$, $r = 2, 3$). Obviously,
$D_1$ and $D_2$ are linear with respect to $\varphi$ and $\nabla\varphi$ and

$$D_s(w, \nabla w, w, \nabla w) = R_s(w, \nabla w), \quad s = 1, 2. \qquad (18)$$

The definition of $D_s$, $s = 1, 2$, can be given in other forms. We only require
that they are linear with respect to $\varphi_h$ and satisfy (18). The natural way to
perform the linearization of the diffusion terms follows from (16), where the
space derivatives of $w$ are simply replaced by the derivatives of $\varphi$. This
linearization also gives a dependence on the gradient of the first component of
$\varphi$. However, numerical experiments carried out with the aid of this
linearization do not yield satisfactory results. Therefore we use (17), which gives terms
$D_s$, $s = 1, 2$, independent of $\nabla\varphi_1$. For more detail see [7].
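
The two requirements on $D_s$ (linearity in $\varphi$, $\nabla\varphi$ and the consistency property (18)) can be checked numerically for a single component. The sketch below is an illustration under stated assumptions: `R1_second` is $\tau_{11}$ from (2) rewritten in conservative variables, and `D1_second` is a candidate linearized form with no $\nabla\varphi_1$ term, written here only to demonstrate the idea. Both function names and all numerical values are hypothetical.

```python
import math


def R1_second(w, dw_dx1, dw_dx2, Re):
    """Second component of R_1: tau_11 written in conservative variables."""
    w1, w2, w3, _ = w
    dv1_dx1 = (dw_dx1[1] - w2 / w1 * dw_dx1[0]) / w1
    dv2_dx2 = (dw_dx2[2] - w3 / w1 * dw_dx2[0]) / w1
    return 2.0 / (3.0 * Re) * (2.0 * dv1_dx1 - dv2_dx2)


def D1_second(w, dw_dx1, dw_dx2, phi, dphi_dx1, dphi_dx2, Re):
    """Assumed linearized form: linear in phi and grad(phi), no grad(phi_1) term."""
    w1 = w[0]
    a = dphi_dx1[1] - phi[1] / w1 * dw_dx1[0]
    b = dphi_dx2[2] - phi[2] / w1 * dw_dx2[0]
    return 2.0 / (3.0 * Re * w1) * (2.0 * a - b)


# Illustrative state and gradients (not data from the paper).
w = (1.2, 0.6, -0.3, 2.5)
dw1 = (0.1, 0.4, -0.2, 0.3)
dw2 = (-0.05, 0.2, 0.5, 0.1)
Re = 500.0

consistent = math.isclose(D1_second(w, dw1, dw2, w, dw1, dw2, Re),
                          R1_second(w, dw1, dw2, Re))

# Changing only the gradient of phi_1 must leave the linearized term unchanged.
phi = (0.7, -0.2, 0.9, 1.1)
dphi1_a = (0.3, 0.1, 0.2, 0.0)
dphi1_b = (9.9, 0.1, 0.2, 0.0)
dphi2_a = (0.4, 0.0, -0.1, 0.0)
dphi2_b = (-5.0, 0.0, -0.1, 0.0)
independent = (D1_second(w, dw1, dw2, phi, dphi1_a, dphi2_a, Re)
               == D1_second(w, dw1, dw2, phi, dphi1_b, dphi2_b, Re))
```

Inserting $\varphi = w$ recovers the nonlinear viscous term, while the form stays linear in the test function.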
We add the following terms to the left-hand side of (15):

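$$\sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sum_{s=1}^{2} \langle D_s(w, \nabla w, \varphi, \nabla\varphi) \rangle (n_{ij})_s \cdot [w] \, dS
+ \sum_{i \in I} \sum_{j \in \gamma(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} D_s(w, \nabla w, \varphi, \nabla\varphi) (n_{ij})_s \cdot w \, dS. \qquad (19)$$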
In the second term we use the zero natural Neumann boundary conditions
(7), a)-b) and the Dirichlet conditions are taken into account with the aid of
additional terms on the right-hand side of (15).
Moreover, to the left-hand side of (15) we add the vanishing interior penalty
terms
$$\sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} C_I\, [w] \cdot [\varphi] \, dS \qquad (20)$$
with $C_I|_{\Gamma_{ij}} = (Re\, |\Gamma_{ij}|)^{-1}$ and boundary penalty terms balanced by additional
right-hand side terms containing the Dirichlet boundary data.
We arrive at the definition of the following form

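$$\dots + \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sum_{s=1}^{2} \langle D_s(w, \nabla w, \varphi, \nabla\varphi) \rangle (n_{ij})_s \cdot [w] \, dS$$
$$- \sum_{i \in I} \sum_{j \in \gamma_D(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} R_s(w, \nabla w) (n_{ij})_s \cdot \varphi \, dS$$
$$+ \sum_{i \in I} \sum_{j \in \gamma_D(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} D_s(w, \nabla w, \varphi, \nabla\varphi) (n_{ij})_s \cdot (w - w_B) \, dS$$
$$+ \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} C_I\, [w] \cdot [\varphi] \, dS + \sum_{i \in I} \sum_{j \in \gamma_D(i)} \int_{\Gamma_{ij}} C_I\, (w - w_B) \cdot \varphi \, dS.$$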

The boundary state $w_B$ will be defined later. The convective terms are
represented by the form

$$B_h(w_h, \varphi_h) = - \sum_{i \in I} \int_{K_i} \sum_{s=1}^{2} f_s(w_h) \cdot \frac{\partial \varphi_h}{\partial x_s} \, dx
+ \sum_{i \in I} \sum_{j \in S(i)} \int_{\Gamma_{ij}} H(w_h|_{\Gamma_{ij}}, w_h|_{\Gamma_{ji}}, n_{ij}) \cdot \varphi_h \, dS, \quad w_h, \varphi_h \in H^1(\Omega, \mathcal{T}_h)^4, \qquad (21)$$

where $H$ is a suitable numerical flux commonly used in the finite volume
method. We use the numerical flux based on the direct solution of the local
Riemann problem, see [10]. If $\Gamma_{ij} \subset \partial\Omega_h$, then there is no neighbour
$K_j$ of $K_i$ adjacent to $\Gamma_{ij}$ and the values of $w_h|_{\Gamma_{ji}}$ must be determined
on the basis of "inviscid" boundary conditions, see [9].
The boundary state $w_B = (w_{B1}, \dots, w_{B4})^T$ is determined in the following
way: We set
(22)
if the $r$-th component $w_r$ of $w$ is prescribed on $\Gamma_{ij}$. Here $w_r^*$
is the $r$-th component of $w^* : Q_T \to \mathbb{R}^4$, which is a function satisfying the
boundary conditions (5)-(7). Otherwise, we set

(23)
which means that we use the "extrapolation" of $w_r$ onto $\Gamma_{ij}$ from $K_i \in \mathcal{T}_h$. In
particular, we have
$$w_B = (\rho_{ij}, 0, 0, \rho_{ij} \theta_{ij}) \quad \text{on } \Gamma_W, \qquad (24)$$
$$w_B = (\rho_D, \rho_D v_{D1}, \rho_D v_{D2}, \rho_{ij} \theta_{ij} + \tfrac{1}{2} \rho_D |v_D|^2) \quad \text{on } \Gamma_I,$$

where $\rho_D$ and $v_D = (v_{D1}, v_{D2})$ are the given density and velocity from the
boundary conditions (5)-(7) and $\rho_{ij}$, $\theta_{ij}$ are the values of the density and
absolute temperature extrapolated from $K_i$ onto $\Gamma_{ij}$.
Now the discrete DGFE Navier-Stokes problem reads:
Definition 1. An approximate DGFE solution of the compressible Navier-Stokes
problem (CFP) is defined as a vector-valued function $w_h$ such that

a) $w_h \in C^1([0, T]; S_h)$, (25)
b) $\frac{d}{dt} (w_h(t), \varphi_h) + B_h(w_h(t), \varphi_h) + A_h(w_h(t), \varphi_h) = 0 \quad \forall \varphi_h \in S_h, \ t \in (0, T)$,
c) $w_h(0) = w_h^0$,

where $w_h^0$ is an $S_h$-approximation of $w^0$.

The problem (25) represents a system of ordinary differential equations which
can be solved with the aid of a suitable ODE solver.

4 Numerical examples
We implemented the numerical scheme (25), a)-c) with the aid of a piecewise
linear approximation on regular triangular grids. The system of ODEs was
solved by the explicit Euler method.
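
The explicit Euler time stepping can be sketched for a generic semi-discrete system $M\, dW/dt + r(W) = 0$, where $W$ collects the coefficients of $w_h$, $M$ is the mass matrix and $r$ stands for the contribution of the forms in (25). The code below is a minimal illustration with a toy residual; the function name `explicit_euler`, the mass matrix and the residual are assumptions for this example and not the author's implementation.

```python
import numpy as np


def explicit_euler(mass, residual, w0, dt, n_steps):
    """Advance M dW/dt + r(W) = 0 with forward Euler and return the final vector."""
    w = np.array(w0, dtype=float)
    for _ in range(n_steps):
        # One step: W_new = W - dt * M^{-1} r(W)
        w = w - dt * np.linalg.solve(mass, residual(w))
    return w


# Toy stand-in for the discrete forms: linear decay r(W) = W, identity mass matrix.
mass = np.eye(2)
w_final = explicit_euler(mass, lambda w: w, [1.0, 2.0], dt=0.1, n_steps=10)
```

For the toy problem each step multiplies the solution by $1 - \Delta t$, so ten steps with $\Delta t = 0.1$ scale the initial vector by $0.9^{10}$. In a real computation the time step would have to satisfy a stability restriction.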
Now we consider four cases of viscous flow around the NACA0012 profile
with the following data, see [3]:

case   M_in   alpha   Re
C1     0.80   10°     500
C2     2.00   10°     106
C3     0.85   0°      500
C4     0.85   0°      2000

where $M_{in}$ is the far field Mach number, $\alpha$ the angle of attack and $Re$ the
Reynolds number. We compare our results with the numerical results presented
in [3], where ten methods were applied. The following table contains
our computed lift $C_L$ and drag $C_D$ coefficients in comparison with [3]
($\#\mathcal{T}_h$ denotes the number of elements of the mesh $\mathcal{T}_h$).

        computed values           reference values from [3]: (range); mean value
case    #T_h   C_L      C_D       C_L                          C_D
C1      4563   0.4985   0.1938    (0.4199 - 0.5170); 0.4526    (0.1597 - 0.2868); 0.2559
C2      5640   0.3969   0.4172    (0.3063 - 0.4059); 0.3443    (0.4120 - 0.4910); 0.4660
C3      4946   0.0003   0.2304    (0.0000 - 0.0007); 0.0001    (0.1790 - 0.2420); 0.2192
C4      4946   0.0001   0.1179    (0.0000 - 0.0002); 0.0001    (0.1012 - 0.1360); 0.1171

The computed values of drag and lift lie within the ranges of the reference values
from [3]. Figure 1 shows the employed triangulation and the computed isolines
of the Mach number for the case C2.

Fig. 1. Viscous flow along NACA 0012, case C2, triangulation (top), isolines of the
Mach number (bottom)

References
1. Bassi, F., Rebay, S. (1997): A high-order accurate discontinuous finite element
method for the numerical solution of the compressible Navier-Stokes equations.
J. Comput. Phys., 131, 267-279
2. Bassi, F., Rebay, S. (2000): A high order discontinuous Galerkin method for
compressible turbulent flow. In Cockburn, B., Karniadakis, G. E., Shu, C.-W. (eds)
Discontinuous Galerkin Method: Theory, Computations and Applications, Lecture
Notes in Computational Science and Engineering 11, pages 113-123. Springer-Verlag
3. Bristeau, M. O., Glowinski, R., Periaux, J., Viviand, H. (eds) (1987): Numerical
Simulation of Compressible Navier-Stokes Flows, volume 18 of Notes on Numerical
Fluid Mechanics. Vieweg, Braunschweig
4. Cockburn, B. (1999): Discontinuous Galerkin methods for convection dominated
problems. In Barth, T.J., Deconinck, H. (eds), High-Order Methods for
Computational Physics, Lecture Notes in Computational Science and Engineering 9,
pages 69-224. Springer, Berlin
5. Cockburn, B., Karniadakis, G. E., Shu, C.-W. (eds) (2000): Discontinuous Galerkin
methods. Lecture Notes in Computational Science and Engineering 11. Springer,
Berlin
6. Dolejší, V. (2002): A higher order scheme based on the finite volume approach.
In R. Herbin and D. Kröner (eds), Finite Volumes for Complex Applications III
(Problems and Perspectives), pages 333-340. Hermes
7. Dolejší, V.: On the discontinuous Galerkin method for the numerical solution of
the Navier-Stokes equations. Int. J. Numer. Methods Fluids (submitted)
8. Dolejší, V., Feistauer, M., Sobotíková, V.: A discontinuous Galerkin method for
nonlinear convection-diffusion problems. Comput. Methods Appl. Mech. Eng.
(submitted)
9. Feistauer, M., Felcman, J., Straškraba, I. (2003): Mathematical and Computational
Methods for Compressible Flow. Oxford University Press, Oxford
10. Toro, E.F. (1997): Riemann Solvers and Numerical Methods for Fluid Dynamics.
Springer-Verlag
A Finite Volume Scheme on General Meshes
for the Steady Navier-Stokes Equations in Two
Space Dimensions

Robert Eymard 1 and Raphaèle Herbin 2

1 Université de Marne-la-Vallée, Paris, France [email protected]
2 Université de Provence, Marseille, France [email protected]

Summary. We introduce a new finite volume scheme for the discretization of the
incompressible Navier-Stokes equations on general meshes, for which we prove con-
vergence without any condition on the regularity of the solution. Numerical results
are presented.

1 The incompressible Navier-Stokes equations

Numerical schemes for the Navier-Stokes equations (1) have been extensively
studied: see [7, 11, 12, 13, 8, 6, 15] and references therein. An advantage of
the finite volume schemes is that the unknowns are approximated by piece-
wise constant functions: this makes it easy to take into account additional
nonlinear phenomena or the coupling with algebraic or differential equations,
for instance in the case of reactive flows. In [11] is presented the classical finite
volume scheme on rectangular meshes, which is the basis of many industrial
applications. A convergence proof of the so-called MAC scheme is given in
[10] in the case of a uniform rectangular grid. However, the use of rectangular
grids limits the type of domain which can be gridded, and more recently, finite
volume schemes for the Navier-Stokes equations on triangular grids have been
presented: see for example [9] where the vorticity formulation is used, and [2]
where primal variables are used with a Chorin type projection method (but
no proof of convergence is known). Here, we propose a new method using the
primitive variables and enforcing the divergence condition directly, using quite
general meshes such as mixed rectangular-triangular or Voronoï meshes, and
for which we are able to prove convergence under general conditions (in par-
ticular, no regularity of the exact solution is required). An error estimate in
the case of the linear Stokes equations was presented in [4].
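We seek an approximation of $u = (u^{(1)}, u^{(2)})^t \in H_0^1(\Omega) \times H_0^1(\Omega)$ and
$p \in L^2(\Omega)$, weak solution to the incompressible generalized Navier-Stokes
equations:

$$\eta u^{(i)} - \nu \Delta u^{(i)} + \partial_i p + u^{(1)} \partial_1 u^{(i)} + u^{(2)} \partial_2 u^{(i)} = f^{(i)} \ \text{in } \Omega, \ \text{for } i = 1, 2,$$
$$\partial_1 u^{(1)} + \partial_2 u^{(2)} = 0 \ \text{in } \Omega, \qquad (1)$$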

where $\eta \ge 0$, $u^{(1)}$ and $u^{(2)}$ are the two components of the velocity, $p$ denotes
the pressure and $\nu$ the viscosity of the fluid, under the following assumptions:

$\Omega$ is a polygonal open bounded connected subset of $\mathbb{R}^2$, (2)

$\nu \in (0, +\infty)$, $\eta \in [0, +\infty)$, (3)

$f^{(i)} \in L^2(\Omega)$, for $i = 1, 2$. (4)

The terms $\eta u^{(i)}$ appear when considering an implicit time discretization of
the unsteady Stokes or Navier-Stokes equations (with $\eta$ as the inverse of the
time step); $\eta = 0$ yields the steady-state case.
We prescribe for both problems a homogeneous Dirichlet boundary condition
on the velocity $(u^{(1)}, u^{(2)})$. Let us denote by $x = (x^{(1)}, x^{(2)})$ any point of
$\Omega$ and by $dx$ the 2-dimensional Lebesgue measure $dx = dx^{(1)} dx^{(2)}$.

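Definition 1 (Weak solution). Under hypotheses (2)-(4), $u = (u^{(1)}, u^{(2)})^t$
is called a weak solution of (1) if and only if

$$u = (u^{(1)}, u^{(2)})^t \in E(\Omega),$$
$$\eta \sum_{i=1,2} \int_\Omega u^{(i)}(x) v^{(i)}(x) \, dx + \nu \sum_{i=1,2} \int_\Omega \nabla u^{(i)} \cdot \nabla v^{(i)}(x) \, dx + b(u, u, v) = \sum_{i=1,2} \int_\Omega f^{(i)}(x) v^{(i)}(x) \, dx, \quad \forall v = (v^{(1)}, v^{(2)})^t \in E(\Omega), \qquad (5)$$

where the trilinear form $b$ is defined for all $u, v, w \in (H_0^1(\Omega))^2$ by

$$b(u, v, w) = \sum_{k=1,2} \sum_{i=1,2} \int_\Omega u^{(i)}(x) \, \partial_i v^{(k)}(x) \, w^{(k)}(x) \, dx, \qquad (6)$$

which classically satisfies, for all $u \in E(\Omega)$,

$$b(u, v, w) = \sum_{k=1,2} \sum_{i=1,2} \int_\Omega \partial_i (u^{(i)} v^{(k)})(x) \, w^{(k)}(x) \, dx.$$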

2 The finite volume scheme

Definition 2. [Admissible discretization] Let $\Omega$ be an open bounded polygonal
subset of $\mathbb{R}^2$, and $\partial\Omega = \overline{\Omega} \setminus \Omega$ its boundary. An admissible
finite volume discretization of $\Omega$, denoted by $\mathcal{D}$, is given by
$\mathcal{D} = (\mathcal{M}, \mathcal{E}, \mathcal{P}, \mathcal{V})$, where:

- $\mathcal{M}$ is a finite family of non empty open polygonal convex disjoint subsets of
  $\Omega$ (the "control volumes") such that $\overline{\Omega} = \cup_{K \in \mathcal{M}} \overline{K}$.
  For any $K \in \mathcal{M}$, let $\partial K = \overline{K} \setminus K$ be the boundary of $K$
  and $m(K) > 0$ denote the area of $K$.

[Figure 1 labels: point $x_K$, point $z_\sigma$, point $x_L$]

Fig. 1. Example of an admissible triangular discretization

- $\mathcal{E}$ is a finite family of disjoint subsets of $\overline{\Omega}$ (the "edges" of the
  mesh), such that, for all $\sigma \in \mathcal{E}$, there exists a hyperplane $E$ of
  $\mathbb{R}^2$ and $K \in \mathcal{M}$ with $\overline{\sigma} = \partial K \cap E$ and $\sigma$ is a
  non empty open subset of $E$. We then denote by $m_\sigma > 0$ the 1-dimensional measure
  of $\sigma$. We assume that, for all $K \in \mathcal{M}$, there exists a subset
  $\mathcal{E}_K$ of $\mathcal{E}$ such that $\partial K = \cup_{\sigma \in \mathcal{E}_K} \overline{\sigma}$.
  It then results from the previous hypotheses that, for all $\sigma \in \mathcal{E}$, either
  $\sigma \subset \partial\Omega$ or there exists $(K, L) \in \mathcal{M}^2$ with $K \ne L$ such
  that $\overline{K} \cap \overline{L} = \overline{\sigma}$; we denote in the latter case
  $\sigma = K|L$.
- $\mathcal{P}$ is a family of points of $\Omega$ indexed by $\mathcal{M}$, denoted by
  $\mathcal{P} = (x_K)_{K \in \mathcal{M}}$. The coordinates of $x_K$ are denoted by
  $x_K^{(i)}$, $i = 1, 2$. The family $\mathcal{P}$ is such that, for all $K \in \mathcal{M}$,
  $x_K \in K$. Furthermore, for all $\sigma \in \mathcal{E}$ such that there exists
  $(K, L) \in \mathcal{M}^2$ with $\sigma = K|L$, it is assumed that the straight line
  $(x_K, x_L)$ going through $x_K$ and $x_L$ is orthogonal to $K|L$. For all
  $K \in \mathcal{M}$ and all $\sigma \in \mathcal{E}_K$, let $z_\sigma$ be the orthogonal
  projection of $x_K$ on $\sigma$. We suppose that $z_\sigma \in \sigma$.
- $\mathcal{V}$ is a finite family of non empty open polygonal disjoint subsets of $\Omega$
  (constituting the "dual mesh" of $\mathcal{M}$), which are centered around the vertices
  $(x_s)_{s=1,N_V}$ in the following way ($N_V$ is the number of vertices):
  for $1 \le s \le N_V$, let $\mathcal{M}_s \subset \mathcal{M}$ be the set of control volumes of
  which $x_s$ is a vertex. For $K \in \mathcal{M}_s$, denote by $\sigma_{K,s,1}$ and
  $\sigma_{K,s,2} \in \mathcal{E}_K$ the two edges of $K$ with vertex $x_s$. Define $K_s$ as the
  convex hull of the four points

The dual cell around $x_s$, denoted by $S$, is then defined as (also see Figure
1):

$$S = \cup_{K \in \mathcal{M}_s} K_s.$$

Since there is a one-to-one mapping between the set $\{1, \dots, N_V\} \subset \mathbb{N}$ and
the set $\mathcal{V}$, we shall replace all subscripts $s$ by $S$ when dealing with the dual
mesh. Let $\mathcal{V}_K$ denote the set of vertices of a given control volume $K$. Note
that:

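The size of the discretization is defined by:

$$\mathrm{size}(\mathcal{D}) = \sup\{\mathrm{diam}(K),\ K \in \mathcal{M}\}.$$

The regularity of the mesh is defined by

$$\mathrm{angle}(\mathcal{D}) = \inf\{|\widehat{z_\sigma x_K x_S}|,\ |\widehat{z_\sigma x_S x_K}|,\ K \in \mathcal{M},\ S \in \mathcal{V}_K,\ \sigma \in \mathcal{E}_K \cap \mathcal{E}_S\}, \qquad (7)$$

where $|\widehat{xyz}|$ designates the absolute value of the measure of the angle $\widehat{xyz}$
(note that $\widehat{z_\sigma x_K x_S} = \frac{\pi}{2} - \widehat{z_\sigma x_S x_K}$).
For all $K \in \mathcal{M}$ and $\sigma \in \mathcal{E}_K$, we denote by $n_{K,\sigma}$ the unit vector
normal to $\sigma$ outward to $K$. We denote by $d_{K,\sigma}$ the Euclidean distance between
$x_K$ and $\sigma$. We then define
$$\tau_{K,\sigma} = \frac{m_\sigma}{d_{K,\sigma}}.$$
The set of interior (resp. boundary) edges is denoted by $\mathcal{E}_{int}$ (resp.
$\mathcal{E}_{ext}$), that is $\mathcal{E}_{int} = \{\sigma \in \mathcal{E};\ \sigma \not\subset \partial\Omega\}$
(resp. $\mathcal{E}_{ext} = \{\sigma \in \mathcal{E};\ \sigma \subset \partial\Omega\}$). For any
$\sigma \in \mathcal{E}_{int}$, $\sigma = K|L$ (resp. $\sigma \in \mathcal{E}_{ext}$, $\sigma \in \mathcal{E}_K$),
let $x_\sigma$ be the center point of the line segment $[x_K x_L]$ (resp. $[x_K z_\sigma]$),
and $x_\sigma^{(1)}$ and $x_\sigma^{(2)}$ its coordinates.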
For all K E M and all S E VK, let 0'1 and 0'2 E EK n Es numbered such
that (x~~) - x~))(x~;) - x~l)) - (x~21- x~))(x~;) - x~l)) > 0.
Let $A_{K,S}^{(1)} = x_{\sigma_1}^{(2)} - x_{\sigma_2}^{(2)}$ and $A_{K,S}^{(2)} = x_{\sigma_2}^{(1)} - x_{\sigma_1}^{(1)}$.

Definition 3. Let $\Omega$ be an open bounded polygonal subset of $\mathbb{R}^N$, with
$N \in \mathbb{N}^*$. Let $\mathcal{D} = (\mathcal{M}, \mathcal{E}, \mathcal{P}, \mathcal{V})$ be an admissible
finite volume discretization of $\Omega$ in the sense of Definition 2. We denote by
$H_\mathcal{D}(\Omega) \subset L^2(\Omega)$ the space of functions which are piecewise constant on each
control volume $K \in \mathcal{M}$. For all $w \in H_\mathcal{D}(\Omega)$ and for all $K \in \mathcal{M}$,
we denote by $w_K$ the constant value of $w$ in $K$ and we define $(w_\sigma)_{\sigma \in \mathcal{E}}$ by:

$$w_\sigma = 0, \quad \forall \sigma \in \mathcal{E}_{ext} \qquad (8)$$

and

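Let $L_\mathcal{D}(\Omega)$ be the space of functions which are piecewise constant on the
domains $S$, for all $S \in \mathcal{V}$. Let $\mathrm{div}_\mathcal{D} : (H_\mathcal{D}(\Omega))^2 \to L_\mathcal{D}(\Omega)$ be defined by:

$$\mathrm{div}_\mathcal{D}(u)(x) = \frac{1}{m(S)} \sum_{K \in \mathcal{M}_S} \sum_{i=1,2} A_{K,S}^{(i)} u_K^{(i)}, \quad \text{for a.e. } x \in S, \ \forall S \in \mathcal{V}.$$

We then set $E_\mathcal{D}(\Omega) = \{u \in (H_\mathcal{D}(\Omega))^2,\ \mathrm{div}_\mathcal{D}(u) = 0\}$. For $(v, w) \in (H_\mathcal{D}(\Omega))^2$,
we denote by

$$[v, w]_\mathcal{D} = \sum_{K \in \mathcal{M}} \sum_{\sigma \in \mathcal{E}_K} \tau_{K,\sigma} (v_\sigma - v_K)(w_\sigma - w_K). \qquad (10)$$
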
Remark that thanks to (9), one has:

[v,Wlv =

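where $K_\sigma$ denotes the control volume of which $\sigma$ is an edge. We define a norm
in $H_\mathcal{D}(\Omega)$ (thanks to the discrete Poincaré inequality (11) given below) by

$$|w|_\mathcal{D} = ([w, w]_\mathcal{D})^{1/2}.$$

Similarly, for $u = (u^{(1)}, u^{(2)})^t \in (H_\mathcal{D}(\Omega))^2$, $v = (v^{(1)}, v^{(2)})^t \in (H_\mathcal{D}(\Omega))^2$ and
$w = (w^{(1)}, w^{(2)})^t \in (H_\mathcal{D}(\Omega))^2$, we define: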

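and
$$[v, w]_\mathcal{D} = \sum_{i=1,2} [v^{(i)}, w^{(i)}]_\mathcal{D}.$$

The discrete Poincaré inequality (see [3]) reads:

$$\|w\|_{L^2(\Omega)} \le \mathrm{diam}(\Omega)\, |w|_\mathcal{D}, \quad \forall w \in H_\mathcal{D}(\Omega). \qquad (11)$$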

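We only present here a centered finite volume scheme, and refer to [5] for the
upstream version. Under hypotheses (2)-(4), let $\mathcal{D}$ be an admissible discretization
of $\Omega$. Let $\lambda \in (0, +\infty)$. The finite volume scheme for the approximation
of the solution of (1) reads: find $u$ such that

$$u \in E_\mathcal{D}(\Omega), \quad \eta \int_\Omega u(x) \cdot v(x) \, dx + \nu [u, v]_\mathcal{D} + b_\mathcal{D}(u, u, v) = \int_\Omega f(x) \cdot v(x) \, dx, \quad \forall v \in E_\mathcal{D}(\Omega), \qquad (12)$$

where, for $u$, $v$ and $w \in H_\mathcal{D}(\Omega)$,

$$b_\mathcal{D}(u, v, w) = \sum_{K \in \mathcal{M}} \sum_{k=1,2} w_K^{(k)} \sum_{S \in \mathcal{V}_K} v_S^{(k)} \sum_{i=1,2} A_{K,S}^{(i)} u_K^{(i)},$$
$$v_S^{(k)} = \frac{1}{m(S)} \sum_{K \in \mathcal{M}_S} m(K \cap S)\, v_K^{(k)}, \quad \forall S \in \mathcal{V}, \ k = 1, 2. \qquad (13)$$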
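
The vertex value $v_S^{(k)}$ in (13) is an area-weighted average of the surrounding cell values. The sketch below illustrates this with made-up numbers; it assumes that the dual cell is the union of the pieces $K \cap S$, so that $m(S)$ equals the sum of the overlap areas. The function name `dual_cell_value` is hypothetical.

```python
def dual_cell_value(cell_values, overlap_areas):
    """Area-weighted average: v_S = (1/m(S)) * sum over K of m(K ∩ S) * v_K."""
    m_S = sum(overlap_areas)              # assumed: m(S) is the sum of the overlaps
    return sum(a * v for a, v in zip(overlap_areas, cell_values)) / m_S


# Three control volumes meeting at one vertex (illustrative values).
v_K = [1.0, 2.0, 4.0]
areas = [0.5, 0.25, 0.25]
v_S = dual_cell_value(v_K, areas)
```

For these numbers the weighted sum is 0.5 + 0.5 + 1.0 and the total area is 1.0, so the vertex value is 2.0. A constant cell field is reproduced exactly, whatever the overlap areas.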

The trilinear form $b_\mathcal{D}(u, v, w)$ satisfies some continuity properties in $(H_\mathcal{D}(\Omega))^3$
(see [5] for the proof).

Lemma 1. [Continuity of the trilinear form in discrete $H^1$ space] Under
Hypothesis (2), let $\mathcal{D}$ be an admissible discretization in the sense of Definition 2,
let $\alpha > 0$ be such that $\mathrm{angle}(\mathcal{D}) \ge \alpha$, let $H_\mathcal{D}(\Omega)$ be the space of piecewise
constant functions defined in Definition 3 and let $b_\mathcal{D}$ be the trilinear form defined by (13).
Then there exists $C_1 > 0$, only depending on $\alpha$, such that:

(14)

As in the case of the linear problem (see [4]), we use the following penalized
approximation of (12):

-1 1
(u,p) E (Hv(D))2 X Lv(D),
v ([u, v]v) p(x)divv(v)(x)dx + bv(u, u, v) = f(x) . v(x)dx,
(15)
Vv E (Hv (D))2,
divv(u) = -A size(V) p,

3 Convergence of the scheme


The following proposition gives a sufficient condition for the existence and
uniqueness of a solution to the scheme (with or without penalization), un-
der the classical assumption that the data are small, or the viscosity is large
enough (see [14] Theorem 1.3 page 167 for the continuous case). Note that
in the continuous case, the "small data" assumption is only required to prove
uniqueness, not existence. Here, however, this assumption is also required for
the existence of a discrete solution. Moreover, uniqueness is only proven for
"small enough" solutions.

Proposition 1 [Existence and uniqueness of small discrete solutions
in the small data case, with or without a penalization] Under hypotheses
(2)-(4), let $\mathcal{D}$ be an admissible discretization of $\Omega$ in the sense of Definition
2 and let $\alpha > 0$ with $\mathrm{angle}(\mathcal{D}) \ge \alpha$. Let $C_1$ be the real value which only depends
on $\alpha$, given by (14) of Lemma 1. Assume that the condition

1 ('"
v2 ~211f (i) IIL2(!2) ) < C 2 ._ 1
.- 4diam(D)C 1
(16)

is fulfilled. Then there exists one and only one function u E (Hv(D))2 such
that

1"lv <: C, 2~' [v - (v' - ~,llf")III,(n»)


4( diam(It)C, ) V'] ,
(17)
and $u$ is solution to (12) and (13) (no penalization), or $u$ is such that there
exists a function $p$ with $(u, p)$ solution to (15) and (13) for a given $\lambda \in (0, +\infty)$.
Furthermore, in the latter case, the following inequality holds:

), eize(V) IlpIlI,(o) <: diam(lt) ( ~,II f(;) I p(n») + C, c, '. (18)

and the function p is unique too.

Proposition 2 [Convergence of the centered penalized scheme in the
nonlinear case] Under Hypotheses (2)-(4), let $\alpha > 0$ be given and let $C_2 > 0$
be given by Proposition 1. We assume that the property (16) holds. Let
$\lambda \in (0, +\infty)$ be given and let $(\mathcal{D}^{(n)})_{n \in \mathbb{N}}$ be a sequence of admissible
discretizations of $\Omega$ in the sense of Definition 2, such that
$\lim_{n \to \infty} \mathrm{size}(\mathcal{D}^{(n)}) = 0$ and $\mathrm{angle}(\mathcal{D}^{(n)}) \ge \alpha$, for all $n \in \mathbb{N}$.
Let $(u^{(n)}, p^{(n)}) \in (H_{\mathcal{D}^{(n)}}(\Omega))^2 \times L_{\mathcal{D}^{(n)}}(\Omega)$ be a solution to
(15), (13), (17). Then there exists a subsequence of the sequence $(u^{(n)})_{n \in \mathbb{N}}$
which converges in $L^2(\Omega)^2$ to $u$, weak solution of the Navier-Stokes problem
in the sense of (5). If $C_2$ is taken small enough, the uniqueness property of
the solution entails the convergence of the whole sequence.

4 Numerical results

Experiments with an analytical solution were performed. For the centered
scheme, the results indicate a rate of convergence of $h^2$ for the velocities, and
better than $h^{0.5}$ for the pressures in the case of unstructured triangular meshes.
In the case of rectangular meshes or structured triangular meshes, we obtain
an order $h^2$ for the velocities and better than $h$ for the pressures. For the
upstream weighting scheme on structured meshes, we obtain an order $h^{0.5}$ for
the velocities.
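As an illustration of how such rates are usually read off from experiments with an analytical solution, the following minimal sketch estimates the observed order $p$ in $\text{err} \approx C h^p$ from errors measured on successively refined meshes. The function names and the sample data are assumptions made for this illustration; the numbers are synthetic and are not results from the paper.

```python
import math


def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Estimate p in err ~ C * h**p from the errors on two meshes."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


def observed_orders(sizes, errors):
    """Observed order between each pair of consecutive meshes."""
    return [
        observed_order(sizes[k], errors[k], sizes[k + 1], errors[k + 1])
        for k in range(len(sizes) - 1)
    ]


# Synthetic data (hypothetical, not taken from the paper): errors behaving like 3 * h**2.
sizes = [0.1, 0.05, 0.025]
errors = [3.0 * h ** 2 for h in sizes]
orders = observed_orders(sizes, errors)
```

With measured errors in place of the synthetic ones, values of `orders` close to 2 would correspond to the $h^2$ behaviour reported for the velocities.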
Some experiments were also carried out for the classical example of the
lid-driven cavity, using triangular meshes. We refer to [5] for these, and only
give here some results on the backward-facing step, for a Reynolds number
equal to 800. This is a well-documented case in the literature (see e.g. [1]), and
makes it possible to test the performance of methods with respect to the precision
on the zones of recirculating flow. The geometrical data of the backward step are taken
from [1]. We computed the streamlines using a reconstruction of a discrete
potential $\phi_\sigma$ located at the edges $\sigma \in \mathcal E$ of the mesh (see [5]). We present
in Figure 2 the streamlines in three different cases: starting from the top, the
first figure is obtained with the centered scheme, using a mesh of 25200 rectangular
grid blocks, the second one with the centered scheme using a mesh of 2800 rectangular
grid blocks, the third one with the upstream scheme using a mesh of 2800
rectangular grid blocks, and the last two with the centered and the upstream
scheme, respectively, for 847 cells. It is clear from these figures that the
centered scheme is, as one could expect, more precise, but that it becomes
unstable for coarser meshes. In fact, for a mesh of 700 cells, the Newton iterations
do not converge, even when using an under-relaxation procedure.

[Figure 2, five streamline panels, from top to bottom: Centered scheme, 25200 cells;
Centered scheme, 2800 cells; Upstream scheme, 25200 cells; Centered scheme, 847 cells;
Upstream scheme, 847 cells]

Fig. 2. Streamlines for the backward step

The numerical solution obtained with the centered scheme, using a 25200
rectangular grid blocks mesh seems to be precise enough (comparing the sep-
aration and reattachment lengths with those of the literature, see [5]) to be
used as a reference solution for experiments carried out on coarser meshes.
This makes it possible to compute a rate of convergence of $h^2$.
We conclude from these numerical tests that the upstream scheme is too
diffusive and cannot be used for accurate results, although it has the advantage
of remaining stable even on coarse meshes. The centered scheme yields accurate
results for a reasonable number of Newton iterations (typically between 5
and 15).
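The under-relaxation procedure mentioned for the Newton iterations is not detailed in this text. As a generic illustration only, the sketch below applies a Newton iteration with a constant relaxation factor `omega` to a scalar equation; the function name, the value of `omega` and the test equation are assumptions for this example and are not the authors' solver for the discrete Navier-Stokes system.

```python
def relaxed_newton(F, dF, x0, omega=0.8, tol=1e-10, max_iter=100):
    """Newton iteration with a constant under-relaxation factor 0 < omega <= 1.

    Returns (x, number_of_iterations, converged)."""
    x = x0
    for k in range(max_iter):
        residual = F(x)
        if abs(residual) < tol:
            return x, k, True
        # Full Newton step scaled by omega (omega = 1 gives the classical method).
        x = x - omega * residual / dF(x)
    return x, max_iter, abs(F(x)) < tol


# Hypothetical scalar test problem (not from the paper): x**3 - 2x - 5 = 0.
def F(x):
    return x ** 3 - 2.0 * x - 5.0


def dF(x):
    return 3.0 * x ** 2 - 2.0


root, iterations, converged = relaxed_newton(F, dF, x0=2.0)
```

A relaxation factor below one trades the quadratic convergence of Newton's method for a more cautious, linearly convergent update, which is why more iterations are needed than with `omega=1.0`.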
Future developments will concentrate on the extension to three-dimensional
meshes and to the time-dependent case.

References
1. B.F. Armaly, F. Durst, J.C.F. Pereira and B. Schonung, "Experimental and The-
oretical investigation of backward-facing step flow" J. Fluid Mech. (1983) vol. 127,
pp 473-496.
2. S. Boivin, F. Cayre, J.M. Herard, A finite volume method to solve the Navier-
Stokes equations for incompressible flows on unstructured meshes, Int. J. Therm.
Sci., 38, 806-825, 2000.
3. R. Eymard, T. Gallouet and R. Herbin, Finite Volume Methods, Handbook of
Numerical Analysis, Vol. VII, pp. 713-1020. Edited by P.G. Ciarlet and J.L. Lions
(North Holland).
4. R. Eymard and R. Herbin, A cell-centered finite volume scheme on general meshes
for the Stokes equations in two dimensions, 125-128, t.337, 2, 2003. CRAS,
Mathematiques.
5. R. Eymard and R. Herbin, A finite volume scheme on general meshes for the
steady Navier-Stokes problem in two space dimensions, LAPT Report nO , sub-
mitted
6. J.H. Ferziger, M. Peric, Computational Methods for Fluid Dynamics. Springer,
Berlin, 1996.
7. V. Girault, P.-A. Raviart, Finite element methods for the Navier-Stokes equa-
tions: Theory and algorithms, Springer, Berlin, 1986.
8. M.D. Gunzburger, Finite element methods for viscous incompressible flows, A
guide to theory, practice, and algorithms, Computer Science and Scientific Com-
puting, Academic Press 1989.
9. M.D. Gunzburger and R.A. Nicolaides, Incompressible computational fluid dy-
namics, Cambridge University Press, 1993.
10. R.A. Nicolaides and X. Wu, Analysis and convergence of the MAC scheme II,
Navier-Stokes equations, Math. Comp. 65 (1996), 29-44.
11. S.V. Patankar, (1980), Numerical Heat Transfer and Fluid Flow, Series in Com-
putational Methods in Mechanics and Thermal Sciences, Minkowycz and Sparrow
Eds. (Mc Graw Hill).
12. R. Peyret and T. Taylor, Computational methods for fluid flow, Springer,
New York, 1983.
13. O. Pironneau, Finite element methods for fluids, John Wiley and sons, 1989.
14. R. Temam, Navier-Stokes Equations, Studies in mathematics and its applica-
tions, J.L. Lions, G. Papanicolaou, R.T. Rockafellar Editors, North-Holland,
1977.
15. P. Wesseling, Principles of Computational Fluid Dynamics, Springer, Berlin,
2001.
Existence and Uniqueness of a Weak Solution
to a Stratigraphic Model

Robert Eymard¹, Thierry Gallouët², Véronique Gervais³ and Roland Masson³

¹ Dépt de Mathématiques, Université de Marne-la-Vallée, Marne-la-Vallée,
France; eymard@math.univ-mlv.fr
² LATP, Université de Provence, Marseille, France;
Thierry. [email protected]
³ Institut Français du Pétrole, Rueil-Malmaison, France; Veronique. [email protected],
Roland. M [email protected]

Summary. In this paper, we study a multi-lithology diffusion model used to
simulate the evolution through time of a sedimentary basin composed of several lithologies
such as sand or shale. It is a simplified model for which the surficial flux in lithology
$i$ is taken proportional to the slope and to a lithology fraction $c_i^s$ in lithology $i$ at
the top of the basin with a unitary diffusion coefficient. Thus, the sediment thickness
variable satisfies a linear parabolic problem and decouples from the other unknowns.
The remaining equations couple, for each lithology, a first order linear equation for
the surface concentration $c_i^s$ with a linear advection equation for the basin
concentration, for which $c_i^s$ appears as an input boundary condition at the top of the basin
in case of sedimentation. The existence and uniqueness of a weak solution in $L^\infty$ is
proved for this problem.

1 Introduction

Thanks to recent progress in geosciences, the process of sedimentary basin infill
is generally well understood today, and is often considered as the response to
the interaction between three main processes: the available space created in
the basin (by sea level variations, tectonics, compaction, ...); the sediment supply
(boundary fluxes, sediment production); and the transport of the sediments at
the surface of the basin. This interaction is essentially treated in a qualitative
manner using field data, such as seismic or well data, but this information
can be difficult and expensive to obtain. Thus, a 4D numerical model of the basin
appears as a powerful tool to solve the problem, and stratigraphic models are
developed to answer the need for quantifying the sedimentary basin infill.
A stratigraphic model describes the evolution through time of sedimentary
basins in terms of geometry and rock properties. These models are suited
for large scales in time and space (greater than 10 km and 10,000 yr), and
thus average several geological processes such as transport processes (river
transport, creep, slumps, ...). Descriptions of such models are given in [4], [6],
[7] and [8].
We consider here the stratigraphic model detailed in [2], in which sediments
are modeled as a mixture of several lithologies characterized by their grain
size population (sand or shale for example). The surficial flux of lithology $i$,
$i = 1, \dots, L$, is taken as in [7] proportional to the slope of the topography
$h$ and to a lithology fraction $c_i^s$ defined at the surface of the basin. In this
paper, the diffusion coefficients are taken equal to one, leading to a simplified
model in the sense that the sediment thickness variable $h$ decouples from the
other unknowns, the $L$ concentrations $u_i$ in lithology $i$ inside the basin and
the $L$ surface concentrations $c_i^s$ in lithology $i$ at the top of the basin, and
satisfies a linear parabolic problem. The remaining equations, accounting for
the mass conservation of the lithologies, couple for all $i = 1, \dots, L$ a first order
linear equation for the surface concentration variable $c_i^s$ and a linear advection
equation for the basin concentration variable $u_i$, for which $c_i^s$ appears as an
input boundary condition at the top of the basin. A weak formulation has been
introduced for this coupled problem (see Definition 1).
The aim of this paper is then to study the problem satisfied by the
concentration variables, and more specifically to state the existence and uniqueness of
a weak solution in $L^\infty$. This result is given below in Theorem 1. The proof of
existence has already been achieved in [1] and will be briefly recalled in
section 3. It derives from the convergence of an implicit finite volume scheme. The
uniqueness will be obtained using the linearity of the coupled problem in the
concentration variables, the existence of a weak solution to the adjoint system
and two integration by parts formulae for the non smooth solutions of the direct
and adjoint problems. The paper is organized as follows: the mathematical
model and its weak formulation are described in section 2, and the proof of
existence and uniqueness of a weak solution is achieved in section 3.

2 Mathematical Model

We consider in this paper the model defined by Eymard et al. in [2] in a sim-
plified case for which the diffusion coefficients of the lithologies are taken equal
to one. Furthermore, the sea level variations and the ground distortions are
not taken into account.
Let us denote by $h(x,t)$ the sediment thickness variable, a function of time
$t > 0$ and of $x \in \Omega \subset \mathbb R^d$, $d = 1$ or $2$, the horizontal extension of the
basin. The sediments are modeled as a mixture of $L$ immiscible lithologies,
such as sand or shale, characterized by their grain size population, and
considered as incompressible materials of constant grain density and null
porosity. Inside the basin, the mixture is described by its composition given by
the $L$ concentrations $c_i(x,z,t) \ge 0$ in lithology $i$ defined on the domain
$B = \{(x,z,t) \mid x \in \Omega,\ t > 0,\ z < h(x,t)\}$, and satisfying $\sum_{i=1}^{L} c_i = 1$. The
sediments transported by the surficial fluxes, i.e. deposited at the surface in
case of sedimentation and passing through it in case of erosion, are
characterized by their concentrations $c_i^s(x,t) \ge 0$, defined on $\Omega \times \mathbb R_+^*$, and also satisfying
$\sum_{i=1}^{L} c_i^s = 1$.
Since the sediment fluxes are non zero only at the surface of the basin, no
change of the sediment composition occurs inside the basin: $\partial_t c_i = 0$ on $B$.
The evolution of $c_i$ is then only governed by the boundary condition at the
top of the basin stating that $c_i|_{z=h} = c_i^s$ in case of sedimentation ($\partial_t h > 0$).
An initial condition on the basin concentrations is also prescribed: $c_i|_{t=0} = c_i^0$
on $\{(x,z) \mid x \in \Omega,\ z < h^0(x)\}$.
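Let us now consider these equations in the new coordinate system $(x,\xi,t) =
(x', h(x',t') - z, t')$ in which the vertical position of a point is measured
downward from the top of the basin, and let us define $u_i(x,\xi,t) = c_i(x, h(x,t) - \xi, t)$
on $\Omega \times \mathbb R_+^* \times \mathbb R_+^*$, $u_i^0(x,\xi) = c_i^0(x, h^0(x) - \xi)$ on $\Omega \times \mathbb R_+^*$. Then, we get the
new problem:

$$
\begin{cases}
\partial_t u_i + \partial_t h\,\partial_\xi u_i = 0 & \text{on } \Omega \times \mathbb R_+^* \times \mathbb R_+^*,\\
u_i|_{\xi=0} = c_i^s & \text{on } \mathcal D^+ = \{(x,t) \in \Omega \times \mathbb R_+^* \mid \partial_t h(x,t) > 0\},\\
u_i|_{t=0} = u_i^0 & \text{on } \Omega \times \mathbb R_+^*.
\end{cases}
\tag{1}
$$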
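To make the role of the sign of $\partial_t h$ in problem (1) concrete, here is a minimal sketch of one explicit upwind step for the advection equation along a single column (fixed $x$). This is a swapped-in illustration: the scheme studied in the paper is implicit, and the function name, the uniform grid in $\xi$ and the constant extension at depth are assumptions of this example.

```python
def upwind_step(u, a, dt, dxi, c_s):
    """One explicit upwind step for d_t u + a d_xi u = 0 on a column xi > 0.

    u   : cell values, u[0] is the cell adjacent to the top of the basin (xi = 0)
    a   : value of d_t h for this column during the step
    c_s : surface concentration, used as inflow value only when a > 0
    Requires the CFL condition |a| * dt / dxi <= 1.
    """
    cfl = a * dt / dxi
    if abs(cfl) > 1.0:
        raise ValueError("CFL condition violated")
    n = len(u)
    new = list(u)
    if a > 0:
        # Sedimentation: information enters through xi = 0 with value c_s.
        for j in range(n):
            left = c_s if j == 0 else u[j - 1]
            new[j] = u[j] - cfl * (u[j] - left)
    elif a < 0:
        # Erosion: information leaves through xi = 0, no boundary value is used there.
        for j in range(n):
            right = u[j + 1] if j + 1 < n else u[j]  # assumption: constant extension at depth
            new[j] = u[j] - cfl * (right - u[j])
    return new


column = [0.0, 0.0, 1.0, 1.0]
deposited = upwind_step(column, a=1.0, dt=0.5, dxi=0.5, c_s=1.0)
eroded = upwind_step(column, a=-1.0, dt=0.5, dxi=0.5, c_s=1.0)
```

In the sedimentation case the column is shifted downward and the surface value enters at the top; in the erosion case the column is shifted upward and `c_s` plays no role, which mirrors the fact that the boundary condition in (1) is only imposed where $\partial_t h > 0$.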
The surficial transport process is the multi-lithology diffusive model introduced
in [7] for which the flux of lithology $i$ is taken proportional to the slope and
to the surface concentration: $\Phi_i = -c_i^s\,k_i\,\nabla h$. The coefficient $k_i > 0$ is the
diffusion coefficient of the lithology $i$, chosen equal to one in this paper for
all $i = 1, \dots, L$. Therefore, the sediment thickness variable decouples from
the other unknowns and satisfies a linear parabolic equation as we shall see
later. Then, the model accounts for the conservation of the fraction
$M_i(x,t) = \int_0^{h(x,t)} u_i(x,\xi,t)\,d\xi$ in lithology $i$, stating that

$$
\begin{cases}
u_i|_{\xi=0}\,\partial_t h + \mathrm{div}(-c_i^s\,\nabla h) = 0 & \text{on } \Omega \times \mathbb R_+^*,\\
\sum_{i=1}^{L} c_i^s = 1 & \text{on } \Omega \times \mathbb R_+^*.
\end{cases}
\tag{2}
$$

In this equation, $u_i|_{\xi=0}\,\partial_t h$ is formally equal to $\partial_t M_i$ thanks to (1). A
Neumann boundary condition is imposed on $h$ on $\partial\Omega \times \mathbb R_+^*$: $\nabla h \cdot n|_{\partial\Omega \times \mathbb R_+^*} = g$,
with $n$ the unit normal vector to $\partial\Omega$ outward to $\Omega$, as well as the initial
condition $h|_{t=0} = h^0$ on $\Omega$. Finally, Dirichlet input boundary conditions are
prescribed for the surface concentrations: $c_i^s|_{\Sigma^+} = \tilde c_i$ on $\Sigma^+ = \{(x,t) \in
\partial\Omega \times \mathbb R_+^* \mid g(x,t) > 0\}$.
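Summing the first equation of (2) for all $i = 1, \dots, L$, it appears that, for
this simplified model, the sediment thickness variable decouples from the other
unknowns and satisfies the linear parabolic equation

$$
\begin{cases}
\partial_t h - \Delta h = 0 & \text{on } \Omega \times \mathbb R_+^*,\\
\nabla h \cdot n|_{\partial\Omega \times \mathbb R_+^*} = g & \text{on } \partial\Omega \times \mathbb R_+^*,\\
h|_{t=0} = h^0 & \text{on } \Omega.
\end{cases}
\tag{3}
$$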
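For orientation, problem (3) in one space dimension can be discretized by a cell-centered finite volume method with implicit Euler time stepping. The sketch below is only an illustration under stated assumptions (uniform 1D grid with at least two cells, constant Neumann data over a step, hypothetical function names); it is not the scheme analyzed in [3] on general admissible meshes.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_fv_step(h, dt, dx, g_left=0.0, g_right=0.0):
    """One implicit Euler / finite volume step for d_t h - d_xx h = 0.

    g_left and g_right are the outward normal derivatives prescribed at the
    two ends of the interval (Neumann data); len(h) must be at least 2.
    """
    n = len(h)
    r = dt / dx ** 2
    lower = [-r] * n
    upper = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    diag[0] = diag[-1] = 1.0 + r   # boundary cells exchange with one neighbour only
    upper[-1] = 0.0
    rhs = list(h)
    rhs[0] += dt * g_left / dx     # boundary flux entering the first cell
    rhs[-1] += dt * g_right / dx   # boundary flux entering the last cell
    return solve_tridiagonal(lower, diag, upper, rhs)


h0 = [0.0, 0.0, 1.0, 0.0, 0.0]
dx, dt = 0.2, 0.01
h_closed = implicit_fv_step(h0, dt, dx)             # homogeneous Neumann data
h_inflow = implicit_fv_step(h0, dt, dx, g_left=1.0)  # sediment supply on the left
```

Because the fluxes between neighbouring cells cancel in the sum, the total volume `dx * sum(h)` changes only through the boundary data, which is the discrete counterpart of integrating (3) over $\Omega$.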
The solution of this problem is then used in the remaining equations (4)
accounting for the mass conservation of the lithologies. They couple for each
lithology a first order linear equation for the surface concentration $c_i^s$ with a
linear advection equation for the concentration $u_i$, for which $c_i^s$ appears as an
input boundary condition at the top of the basin in case of sedimentation:

In the sequel, the following hypotheses are made on the data:

Hypothesis 1
(i) $\Omega$ is an open bounded subset of $\mathbb R^d$, of class $C^\infty$,
(ii) $h^0 \in C^2(\bar\Omega)$, $g \in C^1(\partial\Omega \times \mathbb R_+) \cap L^2(\partial\Omega \times \mathbb R_+)$, and $g$, $h^0$ are chosen so
that the unique solution $h$ of (3) is in $C^2(\bar\Omega \times [0,T])$ for all $T > 0$,
(iii) $\tilde c_i \in L^\infty(\Sigma^+)$ with $\tilde c_i \ge 0$ for $i = 1, \dots, L$, and $\sum_{i=1}^{L} \tilde c_i = 1$,
(iv) $u_i^0 \in L^\infty(\Omega \times \mathbb R_+)$, $u_i^0 \ge 0$ for $i = 1, \dots, L$, and $\sum_{i=1}^{L} u_i^0 = 1$,
(v) For all $T > 0$, the boundaries $\partial\Sigma_T^+$ and $\partial\Sigma_T^-$ of the sets $\Sigma_T^+ = \{(x,t) \in
\partial\Omega \times (0,T) \mid g(x,t) > 0\}$ and $\Sigma_T^- = \{(x,t) \in \partial\Omega \times (0,T) \mid g(x,t) < 0\}$ are the
union of a finite number of $C^1$ manifolds of dimension at most $d - 1$,
(vi) For all $T > 0$, the boundaries $\partial\mathcal D_T^+$ and $\partial\mathcal D_T^-$ of the sets $\mathcal D_T^+ = \{(x,t) \in
\Omega \times (0,T) \mid \partial_t h(x,t) > 0\}$ and $\mathcal D_T^- = \{(x,t) \in \Omega \times (0,T) \mid \partial_t h(x,t) < 0\}$ are
the union of a finite number of $C^1$ manifolds of dimension at most $d$.
We shall also denote by $\mathcal L$ the operator $\mathcal L = \partial_t + \partial_t h\,\partial_\xi$, and by $C_c^\infty(\mathbb R^n)$
the space of real valued functions $\{\varphi \in C^\infty(\mathbb R^n) \mid \mathrm{Supp}(\varphi) \text{ bounded in } \mathbb R^n\}$.
To cope with the difficulty of defining the trace of the basin concentration
at the top of the basin, we look for weak solutions defined as follows:
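Definition 1. Let us assume that Hypothesis 1 holds, and let $h$ denote the
solution of (3). Then $(u_i, c_i^s) \in L^\infty(\Omega \times \mathbb R_+ \times \mathbb R_+) \times L^\infty(\Omega \times \mathbb R_+)$ is said to
be a weak solution of problem (4) if it satisfies:
(i) for all $\varphi \in \{\phi \in C_c^\infty(\mathbb R^{d+2}) \mid \phi(\cdot,0,\cdot) = 0 \text{ on } \Omega \times \mathbb R_+ \setminus \mathcal D^+\}$,

$$
\begin{aligned}
&\int_\Omega\int_{\mathbb R_+}\int_{\mathbb R_+} \big[\partial_t\varphi(x,\xi,t) + \partial_t h(x,t)\,\partial_\xi\varphi(x,\xi,t)\big]\,u_i(x,\xi,t)\,dt\,d\xi\,dx\\
&\quad + \int_\Omega\int_{\mathbb R_+} u_i^0(x,\xi)\,\varphi(x,\xi,0)\,d\xi\,dx + \int_\Omega\int_{\mathbb R_+} \partial_t h(x,t)\,c_i^s(x,t)\,\varphi(x,0,t)\,dt\,dx = 0,
\end{aligned}
\tag{5}
$$

(ii) for all $\psi \in \{\phi \in C_c^\infty(\mathbb R^{d+2}) \mid \phi(\cdot,0,\cdot) = 0 \text{ on } \partial\Omega \times \mathbb R_+ \setminus \Sigma^+\}$,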

(6)

Then, the main result of this paper is the following Theorem. Its proof will be
developed in the next section.
Theorem 1. Assuming that Hypothesis 1 holds, there exists a weak solution
$(u_i, c_i^s) \in L^\infty(\Omega \times \mathbb R_+ \times \mathbb R_+) \times L^\infty(\Omega \times \mathbb R_+)$ for all $i \in \{1, \dots, L\}$ to problem
(4) in the sense of Definition 1, and $u_i$ is unique.

3 Existence and Uniqueness of a Weak Solution

The aim of this section is to prove Theorem 1. The proofs of Lemmas 1 and 2
used in the sequel are technical and will be detailed in a forthcoming paper.

The existence of a weak solution $(u_i, c_i^s) \in L^\infty(\Omega \times \mathbb R_+ \times \mathbb R_+) \times L^\infty(\Omega \times \mathbb R_+)$
to (4) in the sense of Definition 1 is obtained by convergence of an implicit
finite volume scheme for the model, and has already been proved in [1]. Let us
just recall the main stages of this proof.
In the sequel, we shall consider admissible finite volume meshes defined as
follows:
Definition 2. Let $\Omega$ be a bounded domain of $\mathbb R^d$, $d = 1$ or $2$. An admissible
finite volume mesh of $\Omega$ for the discretization of problem (3)-(4) is given by a
family of "control volumes", denoted by $\mathcal K$, which are open disjoint subsets of
$\Omega$, and a family of points of $\Omega$, denoted by $\mathcal P$, satisfying the following
properties:
1. The closure of the union of all the control volumes of $\mathcal K$ is $\bar\Omega$.
2. For any $\kappa, \kappa' \in \mathcal K$ with $\kappa \ne \kappa'$, either the $(d-1)$-dimensional measure of
$\bar\kappa \cap \bar\kappa'$, denoted by $m(\bar\kappa \cap \bar\kappa')$, is null, or it is strictly positive and $\bar\kappa \cap \bar\kappa'$ is
included in a hyperplane of $\mathbb R^d$. In the following, we will denote by $\mathcal E_{int}$ the
family of subsets $\sigma$ of $\Omega$ contained in hyperplanes of $\mathbb R^d$ with strictly positive
measures, and such that there exist $\kappa, \kappa' \in \mathcal K$ with $m(\bar\kappa \cap \bar\kappa') > 0$ and $\bar\sigma = \bar\kappa \cap \bar\kappa'$.
We shall also denote by $\kappa|\kappa' \in \mathcal E_{int}$ the edge between the cells $\kappa$ and $\kappa'$.
3. The family $\mathcal P = (x_\kappa)_{\kappa \in \mathcal K}$ is such that $x_\kappa \in \bar\kappa$ (for any $\kappa \in \mathcal K$), and, if
$\sigma = \kappa|\kappa'$, it is assumed that $x_\kappa \ne x_{\kappa'}$ and that the straight line going through
$x_\kappa$ and $x_{\kappa'}$ is orthogonal to the edge $\sigma$. We shall denote by $d(\kappa,\kappa')$ the distance
between the points $x_\kappa$ and $x_{\kappa'}$.
4. For any $\kappa \in \mathcal K$, there exists a subset $\mathcal E_\kappa$ of $\mathcal E_{int}$ such that
$\partial\kappa \setminus \partial\Omega = \bar\kappa \setminus (\kappa \cup \partial\Omega) = \bigcup_{\sigma \in \mathcal E_\kappa} \bar\sigma$.
We shall denote by $(\mathcal K, \mathcal E_{int}, \mathcal P)$ this mesh, and by $\delta\mathcal K = \sup\{\mathrm{diam}(\kappa),\ \kappa \in \mathcal K\}$
its size.

Let $(\mathcal K, \mathcal E_{int}, \mathcal P)$ be an admissible mesh of $\Omega$. The time discretization is denoted
by $t^n$, $n \in \mathbb N$, such that $t^0 = 0$ and $\Delta t^{n+1} = t^{n+1} - t^n > 0$. The superscript
$n$, $n \in \mathbb N$, will be used to denote that the unknowns are considered at time
$t^n$. For each control volume $\kappa \in \mathcal K$ and each time $t^n$, $n \ge 0$, $h_\kappa^n$ shall denote
the approximation of the sediment thickness on $\kappa$ at time $t^n$, $c_{i,\kappa}^n(z)$, $z \in
(-\infty, h_\kappa^n)$, the approximation of the basin concentration $c_i$ in lithology $i$ in
the column $\{(x,z) \mid x \in \kappa,\ z < h(x,t^n)\}$, and $c_{i,\kappa}^{s,n+1}$ the approximation of
the surface concentration in lithology $i$ at time $t^{n+1}$ on $\kappa$. Then, (3)-(4) is
discretized by a fully implicit time integration and a finite volume method
with cell centered variables. For the computation of the fluxes at the edges of
the control volumes, the discretization uses an upstream weighted evaluation of
the surface concentrations. The approximate concentration $c_{i,\kappa}^{n+1}$ is the solution
at time $t^{n+1}$ of the conservation equation

and $u_{i,\kappa}^n(\xi) = c_{i,\kappa}^n(h_\kappa^n - \xi)$ for all $\xi > 0$. One can refer to [2] or [1] for the
complete numerical scheme.
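Then, the approximate sediment thickness $(h_\kappa^n)_{\kappa \in \mathcal K}$ satisfies an implicit
finite volume numerical scheme for the parabolic problem (3) for which
existence, uniqueness and error estimates have already been proved in [3].
Concerning the concentration variables, we show the existence of solutions to the
discrete problem bounded in the interval $[0,1]$, which are unique except for the
surface concentration variables $c_{i,\kappa}^{s,n+1}$. For any admissible mesh $(\mathcal K, \mathcal E_{int}, \mathcal P)$ of
$\Omega$, any time step $\Delta t > 0$, and $i = 1, \dots, L$, let us define the piecewise constant
functions $h_{\mathcal K,\Delta t}$, $c^s_{i,\mathcal K,\Delta t}$ on $\Omega \times \mathbb R_+^*$, and $u_{i,\mathcal K,\Delta t}$ on $\Omega \times \mathbb R_+^* \times \mathbb R_+^*$ by

$$
h_{\mathcal K,\Delta t}(x,t) = h_\kappa^{n+1}, \qquad
c^s_{i,\mathcal K,\Delta t}(x,t) = c_{i,\kappa}^{s,n+1}, \qquad
u_{i,\mathcal K,\Delta t}(x,\xi,t) = u_{i,\kappa}^{n+1}(\xi),
\tag{7}
$$

for all $x \in \kappa$, $\kappa \in \mathcal K$, $t \in (t^n, t^{n+1}]$, $n \ge 0$, $\xi \in \mathbb R_+^*$, where $h_\kappa^{n+1}$, $u_{i,\kappa}^{n+1}$ and $c_{i,\kappa}^{s,n+1}$
are any given solution of the discrete problem with $c_{i,\kappa}^{s,n+1}$ bounded in the
interval $[0,1]$. For all $m \in \mathbb N$, let $(\mathcal K_m, \mathcal E_{int}^m, \mathcal P_m)$ be an admissible mesh of $\Omega$ and
$\Delta t_m > 0$, and let us assume that $\Delta t_m \to 0$, $\delta\mathcal K_m / \sqrt{\Delta t_m} \to 0$ as $m \to \infty$ and
that there exists $\alpha > 0$ such that, for all $m \in \mathbb N$,
$\max_{\sigma = \kappa|\kappa' \in \mathcal E_{int}^m} \delta\mathcal K_m / d(\kappa,\kappa') < \alpha$. Let
$h_{\mathcal K_m,\Delta t_m}$, $u_{i,\mathcal K_m,\Delta t_m}$, $c^s_{i,\mathcal K_m,\Delta t_m}$ be defined by (7) with $\mathcal K = \mathcal K_m$ and $\Delta t = \Delta t_m$.
Then the stability of the discrete concentrations gives the convergence, up to
a subsequence, of $(u_{i,\mathcal K_m,\Delta t_m}, c^s_{i,\mathcal K_m,\Delta t_m})_{m \in \mathbb N}$ in $L^\infty$ for the weak-* topology
as $m \to \infty$. To prove that the limit $(u_i, c_i^s)$ is a weak solution of (4), we finally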
use an interpolation in time of the approximate basin concentration $u_{i,\mathcal K_m,\Delta t_m}$
which converges towards $u_i$ in $L^\infty$, and a weak-BV estimate for the flux terms,
which is an adaptation to the coupling of a parabolic and a hyperbolic
equation of the result proved in [3] for the coupling of an elliptic and a hyperbolic
equation in the case of a two-phase Darcy flow. The detailed proof is given in
[1].

Let us now show the uniqueness of $u_i$. It is achieved using mainly the
linearity of (4) in the concentration variables and the adjoint problem on
$\Omega \times (0,T)$, $T > 0$.
For any given surface concentration $c_i^s \in L^\infty(\Omega \times \mathbb R_+^*)$, we have first
studied the weak formulation (5) of the linear advection equation $\mathcal L u_i = 0$ with
input boundary condition $c_i^s$ on $\mathcal D^+$ and initial condition $u_i^0$. Using the
characteristic solution of this problem (see [5]), we have proved the following Lemma,
in which an integration by parts formula for the solutions of the advection
equation and its adjoint problem is stated:
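Lemma 1. Hypothesis 1 is assumed to hold. Then, for any time $T > 0$, any
functions $f \in L^\infty(\Omega \times \mathbb R_+^* \times (0,T))$, $v^s \in L^\infty(\mathcal D_T^+)$, and $v^0 \in L^\infty(\Omega \times \mathbb R_+^*)$,
the equation

$$
\begin{cases}
\mathcal L v = f & \text{on } \Omega \times \mathbb R_+^* \times (0,T),\\
v|_{\xi=0} = v^s & \text{on } \mathcal D_T^+,\\
v|_{t=0} = v^0 & \text{on } \Omega \times \mathbb R_+^*,
\end{cases}
\tag{8}
$$

has a unique weak solution in $L^\infty(\Omega \times \mathbb R_+^* \times (0,T))$ in the sense that for
all $\varphi \in \{\phi \in C_c^\infty(\mathbb R^{d+2}) \mid \phi(\cdot,0,\cdot) = 0 \text{ on } \Omega \times (0,T) \setminus \mathcal D_T^+ \text{ and } \phi(\cdot,\cdot,T) =
0 \text{ on } \Omega \times \mathbb R_+^*\}$, one has

$$
\begin{aligned}
&\int_\Omega\int_{\mathbb R_+}\int_0^T \big((\mathcal L\varphi)(x,\xi,t)\,v(x,\xi,t) + f(x,\xi,t)\,\varphi(x,\xi,t)\big)\,dt\,d\xi\,dx\\
&\quad + \int_\Omega\int_{\mathbb R_+} v^0(x,\xi)\,\varphi(x,\xi,0)\,d\xi\,dx + \int_\Omega\int_0^T \partial_t h(x,t)\,v^s(x,t)\,\varphi(x,0,t)\,dt\,dx = 0.
\end{aligned}
\tag{9}
$$

The weak solution $v$ of (8) has a trace on $t = T$ in $L^\infty(\Omega \times \mathbb R_+^*)$, and the
function $v\,\partial_t h$ has a trace on $\xi = 0$ in $L^\infty(\Omega \times (0,T))$, such that for any
$\varphi \in C_c^\infty(\mathbb R^{d+2})$ one has

$$
\begin{aligned}
&\int_\Omega\int_{\mathbb R_+}\int_0^T \big((\mathcal L\varphi)\,v + f\,\varphi\big)(x,\xi,t)\,dt\,d\xi\,dx + \int_\Omega\int_{\mathbb R_+} \big(v(x,\xi,0)\,\varphi(x,\xi,0)\\
&\quad - v(x,\xi,T)\,\varphi(x,\xi,T)\big)\,d\xi\,dx + \int_\Omega\int_0^T \partial_t h(x,t)\,v(x,0,t)\,\varphi(x,0,t)\,dt\,dx = 0.
\end{aligned}
\tag{10}
$$

Let $T > 0$ and $w$ be the weak solution in $L^\infty(\Omega \times \mathbb R_+^* \times (0,T))$ of the adjoint
equation

$$
\begin{cases}
-\mathcal L w = r & \text{on } \Omega \times \mathbb R_+^* \times (0,T),\\
w|_{\xi=0} = q^s & \text{on } \mathcal D_T^-,\\
w|_{t=T} = w^T & \text{on } \Omega \times \mathbb R_+^*,
\end{cases}
\tag{11}
$$

defined in a similar way as above with $r \in L^\infty(\Omega \times \mathbb R_+^* \times (0,T))$ a compactly
supported function on $\bar\Omega \times \mathbb R_+ \times [0,T]$, $w^T \in L^\infty(\Omega \times \mathbb R_+^*)$ a compactly
supported function on $\bar\Omega \times \mathbb R_+$, and $q^s \in L^\infty(\Omega \times (0,T))$. Then, one has

$$
\begin{aligned}
&\int_\Omega\int_{\mathbb R_+}\int_0^T \big(v\,(\mathcal L w) + (\mathcal L v)\,w\big)(x,\xi,t)\,dt\,d\xi\,dx - \int_\Omega\int_{\mathbb R_+} \big(v(x,\xi,T)\,w(x,\xi,T)\\
&\quad - v(x,\xi,0)\,w(x,\xi,0)\big)\,d\xi\,dx + \int_\Omega\int_0^T \partial_t h(x,t)\,v(x,0,t)\,w(x,0,t)\,dt\,dx = 0.
\end{aligned}
\tag{12}
$$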
Let us denote by $(v_i, d_i^s)$ the difference between any two weak solutions of
(4). From the linearity of the set of equations (4) in the concentration variables,
the functions $(v_i, d_i^s)$ satisfy the weak formulation (5)-(6) with homogeneous
boundary and initial conditions.
Let $T > 0$. From Lemma 1, the function $v_i\,\partial_t h$ has a trace at $\xi = 0$ in
$L^\infty(\Omega \times (0,T))$ denoted by $v_i|_{\xi=0}\,\partial_t h$. Then, from the integration by parts
formula (10) of Lemma 1 and the weak formulation (6), it results that for all
$\varphi \in \{\phi \in C_c^\infty(\mathbb R^{d+1}) \mid \phi(x,t) = 0 \text{ on } \partial\Omega \times (0,T) \setminus \Sigma_T^+ \text{ and } \phi(x,T) = 0 \text{ on } \Omega\}$,
one has

$$
\int_\Omega\int_0^T \big(v_i(x,0,t)\,\partial_t h(x,t)\,\varphi(x,t) + d_i^s(x,t)\,\nabla h(x,t)\cdot\nabla\varphi(x,t)\big)\,dt\,dx = 0.
\tag{13}
$$

We easily deduce, using (13) and $\partial_t h - \Delta h = 0$, that

$$
\mathrm{div}(-d_i^s\,\nabla h) = -v_i|_{\xi=0}\,\partial_t h \in L^\infty(\Omega \times (0,T)),
\tag{14}
$$
$$
\nabla h\cdot\nabla d_i^s = (v_i|_{\xi=0} - d_i^s)\,\partial_t h \in L^\infty(\Omega \times (0,T)).
\tag{15}
$$
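Let us now consider the adjoint system

$$
\begin{cases}
-w_i|_{\xi=0}\,\partial_t h + \mathrm{div}(q_i^s\,\nabla h) = 0 & \text{on } \Omega \times (0,T),\\
q_i^s|_{\Sigma_T^-} = 0 & \text{on } \Sigma_T^-,\\
-\mathcal L w_i = v_i & \text{on } \Omega \times \mathbb R_+^* \times (0,T),\\
w_i|_{\xi=0} = q_i^s & \text{on } \mathcal D_T^-,\\
w_i|_{t=T} = v_i|_{t=T} & \text{on } \Omega \times \mathbb R_+^*.
\end{cases}
\tag{16}
$$

The direct and adjoint problems are very close, apart from the non vanishing
right hand side $v_i \in L^\infty(\Omega \times \mathbb R_+^* \times \mathbb R_+^*)$ in the advection equation of (16). Then,
the existence of a weak solution $(w_i, q_i^s) \in L^\infty(\Omega \times \mathbb R_+^* \times \mathbb R_+^*) \times L^\infty(\Omega \times \mathbb R_+^*)$
to the adjoint problem, defined similarly as in Definition 1, can be obtained
in much the same way as in [1] by the convergence of a finite volume numerical
scheme, adapted to the non vanishing right hand side in $L^\infty$ in the advection
equation.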
Considering such a weak solution, the following equation is derived as above:

$$
\mathrm{div}(q_i^s\,\nabla h) = w_i|_{\xi=0}\,\partial_t h \in L^\infty(\Omega \times (0,T)).
\tag{17}
$$

From (15) and (17), the function $\mathrm{div}(q_i^s\,d_i^s\,\nabla h)$ is defined in $L^\infty(\Omega \times (0,T))$
and hence the vector field $q_i^s\,d_i^s\,\nabla h$ has a normal trace in $L^\infty(0,T; H^{-1/2}(\partial\Omega))$.
As formally $d_i^s$ vanishes on $\Sigma_T^+$, $q_i^s$ vanishes on $\Sigma_T^-$, and the normal trace $g$ of
$\nabla h$ vanishes on $\partial\Omega \times (0,T) \setminus (\Sigma_T^+ \cup \Sigma_T^-)$, the normal trace of $q_i^s\,d_i^s\,\nabla h$ vanishes
on the boundary $\partial\Omega \times (0,T)$. We can prove this result, stated in the following
Lemma:
Lemma 2. Hypothesis 1 is assumed to hold. Then, for any $T > 0$, any weak
solutions $(w_i, q_i^s)$ of the adjoint problem (16), and $(v_i, d_i^s)$ of problem (4) with
homogeneous boundary and initial conditions, one has

$$
\int_\Omega\int_0^T \mathrm{div}(q_i^s\,d_i^s\,\nabla h)\,dt\,dx = 0.
$$
According to the definition of the characteristic solution of (11) (see [5]) and
since the velocity $\partial_t h$ is uniformly bounded on $\bar\Omega \times [0,T]$ for any time $T > 0$,
the function $v_i$ (resp. its trace $v_i|_{t=T}$) is compactly supported in $\bar\Omega \times \mathbb R_+ \times [0,T]$
(resp. in $\bar\Omega \times \mathbb R_+$). Applying the integration by parts formula (12) of Lemma 1
to $v = v_i$ and $w = w_i$, we get that for any time $T > 0$

$$
\begin{aligned}
&\int_\Omega\int_{\mathbb R_+}\int_0^T |v_i|^2(x,\xi,t)\,dt\,d\xi\,dx + \int_\Omega\int_{\mathbb R_+} |v_i|^2(x,\xi,T)\,d\xi\,dx\\
&\quad = \int_\Omega\int_0^T \partial_t h(x,t)\,v_i(x,0,t)\,w_i(x,0,t)\,dt\,dx.
\end{aligned}
\tag{18}
$$
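From Lemma 2 and the integration over $\Omega \times (0,T)$ of (17) multiplied by $d_i^s$,
we obtain

$$
\int_\Omega\int_0^T \big[d_i^s(x,t)\,w_i(x,0,t)\,\partial_t h(x,t) + q_i^s(x,t)\,\nabla d_i^s(x,t)\cdot\nabla h(x,t)\big]\,dt\,dx = 0.
\tag{19}
$$

Also, multiplying (15) by $q_i^s$ and integrating over $\Omega \times (0,T)$, we get

$$
\int_\Omega\int_0^T \big[(v_i(x,0,t) - d_i^s(x,t))\,q_i^s(x,t)\,\partial_t h(x,t) - q_i^s(x,t)\,\nabla d_i^s(x,t)\cdot\nabla h(x,t)\big]\,dt\,dx = 0.
\tag{20}
$$

Summing (19) and (20) and taking into account the boundary conditions
$w_i|_{\xi=0} = q_i^s$ on $\mathcal D_T^-$, $v_i|_{\xi=0} = d_i^s$ on $\mathcal D_T^+$, and that $\partial_t h = 0$ on $\Omega \times (0,T) \setminus
(\mathcal D_T^+ \cup \mathcal D_T^-)$, we obtain

$$
\begin{aligned}
&\int_\Omega\int_0^T \big(w_i(x,0,t)\,d_i^s(x,t) + v_i(x,0,t)\,q_i^s(x,t) - d_i^s(x,t)\,q_i^s(x,t)\big)\,\partial_t h(x,t)\,dt\,dx\\
&\quad = \int_\Omega\int_0^T v_i(x,0,t)\,w_i(x,0,t)\,\partial_t h(x,t)\,dt\,dx = 0.
\end{aligned}
\tag{21}
$$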
Equation (21) together with (18) gives $v_i = 0$, which concludes the proof of Theorem 1.

This theorem, together with the convergence result on the solutions of the
implicit finite volume numerical scheme seen previously, also gives the
convergence of the full sequence of approximate solutions $(u_{i,\mathcal K_m,\Delta t_m})_{m \in \mathbb N}$ towards
the weak solution $u_i$ of (4).
The proof will be detailed in a forthcoming paper, particularly Lemmas
1 and 2 and the existence of a weak solution to the adjoint problem.

References
[1] Eymard, R., Gallouet, T., Gervais, V., Masson, R.(submitted 2003) : Con-
vergence of a numerical scheme for stratigraphic modeling. SIAM J. Num.
Anal.
[2] Eymard, R., Gallouet, T., Granjeon, D., Masson, R., Tran, Q.(2003) : Multi-
lithology stratigraphic model under maximum erosion rate constraint. Int.
J. of Num. Methods in Eng., (to appear)
[3] Eymard, R., Gallouet, T., Herbin, R.(2000) : The Finite Volume Method. In:
Ciarlet P. and Lions J.(eds.) Handbook of Numerical Analysis, 7. Elsevier
[4] Flemings, P.B., Jordan, T.E.(1989) : A Synthetic Stratigraphic Model of
Foreland Basin Development. J. of Geophysical Research, Vol 94, B4, 3851-
3866
[5] Godlewski, E., Raviart, P.(1996) : Numerical Approximation of Hyperbolic
Systems of Conservation Laws. Springer
[6] Granjeon, D.(1997) : Modélisation stratigraphique déterministe; conception
et applications d'un modèle diffusif 3D multilithologique. Ph. D. disserta-
tion, Géosciences Rennes, Rennes, France
[7] Rivenaes, J.C.(1992) : Application of a dual lithology, depth-dependent dif-
fusion equation in stratigraphic simulation. Basin Research, 4, 133-146
[8] Tucker, G.E., Slingerland, R.L.(1994) : Erosional dynamics, flexural isostasy,
and long-lived escarpments: A numerical modeling study. J. of Geophysical
Research, Vol 99, B6, 12,229-12,243
Combined Nonconforming/Mixed-hybrid Finite
Element-Finite Volume Scheme for Degenerate
Parabolic Problems

Robert Eymard¹, Danielle Hilhorst² and Martin Vohralík²,³

¹ Département de Mathématiques, Université de Marne-la-Vallée, 5, boulevard
Descartes Champs-sur-Marne, 77454 Marne-la-Vallée, France
[email protected]
² Laboratoire de Mathématiques, Analyse Numérique et EDP, Université de
Paris-Sud et CNRS, Bât. 425, 91405 Orsay, France
danielle. [email protected]
³ Department of Mathematics, Faculty of Nuclear Sciences and Physical
Engineering, Czech Technical University in Prague, Trojanova 13, 120 00 Prague
2, Czech Republic [email protected]

Summary. We propose and analyze an efficient numerical scheme for nonlinear
degenerate parabolic convection-reaction-diffusion equations. We discretize the
diffusion term, which generally involves a full matrix diffusion tensor, by means of
piecewise linear nonconforming (Crouzeix-Raviart) finite elements over a
triangulation of the space domain, or using the stiffness matrix of the hybridization of the
lowest order Raviart-Thomas mixed finite element method. The other terms are
discretized by means of a finite volume scheme on a dual mesh, where the dual
volumes are constructed around the sides of the original triangulation. Checking the
local Péclet number, we set up the exact necessary amount of upstream weighting
to avoid spurious oscillations in the velocity dominated case. Under the regularity
condition for the triangulation, using a priori estimates and Kolmogorov's relative
compactness theorem, the convergence of the scheme is proved.

1 Introduction

The contaminant transport equation can be written in the form

$$
\frac{\partial \beta(c)}{\partial t} - \nabla\cdot(D\,\nabla c) + \nabla\cdot(c\,v) + F(c) = q,
\tag{1}
$$

where $c$ is the unknown concentration of the contaminant, the function $\beta(\cdot)$
represents time evolution and equilibrium adsorption reaction, $v$ is the velocity
field, $D$ is the diffusion-dispersion tensor, the function $F(\cdot)$ represents the
changes due to chemical reactions, and $q$ stands for the sources. The main
features of equation (1) are its degeneracy since $\beta'$ may be unbounded, the
possible dominance of the convection term, and the presence of a heterogeneous
and anisotropic diffusion-dispersion tensor.
The convergence of a finite volume scheme for the equation (1) with $D = \mathrm{Id}$
and $F = 0$ has been shown in [7]. Finite volumes with upstream weighting
techniques are unconditionally stable; however, there are geometrical restrictions
on the mesh for the discretization of the diffusion term and there is no
general prescription for how to discretize full tensors. The finite element method for
degenerate parabolic problems has been studied e.g. in [2]. One can discretize
full tensors and there are no restrictions on the mesh. However, spurious
oscillations may appear in the velocity dominated case or in the presence of
a reaction term. Hence a quite intuitive idea is to combine finite volume and
finite element methods, trying to use the "best of both worlds". In [1], the
authors introduce a combined scheme for a convection-diffusion equation with
a nonlinear convection term in two space dimensions. In the present
paper, we prove the convergence of this scheme for the equation (1) in two or
three space dimensions. We extend the techniques used in [7] for a scheme
with negative transmissibilities, general meshes satisfying only the regularity
assumption, and cases when the discrete maximum principle is not satisfied.

2 The degenerate parabolic problem

We consider equation (1) in a polygonal domain $\Omega \subset \mathbb{R}^d$, $d = 2, 3$, and on
a time interval $(0, T)$, $0 < T < \infty$. We set $Q_T = \Omega \times (0, T)$. We impose the
initial condition by

$$c(x, 0) = c_0(x), \quad x \in \Omega, \qquad (2)$$

and a homogeneous Dirichlet boundary condition by

$$c(x, t) = 0, \quad x \in \partial\Omega, \ t \in (0, T). \qquad (3)$$

We make the following assumption on the data:

Assumption (A)
(A1) $\beta \in C(\mathbb{R})$, $\beta(0) = 0$, is a strictly increasing function such that

$$|\beta(a) - \beta(b)| \ge c_\beta |a - b| \quad \forall a, b \in \mathbb{R}, \ c_\beta > 0,$$

or
(A2) (A1) is satisfied and there in addition exists $P \in \mathbb{R}$, $P > 0$, such that
$|\beta(x)| \le C_\beta$ in $[-P, P]$, $C_\beta > 0$, and $\beta$ is Lipschitz continuous with a constant
$L_\beta$ on $(-\infty, -P]$ and $[P, +\infty)$;
(A3) $D_{ij} \in L^\infty(Q_T)$, $|D_{ij}| \le C_D$ a.e. in $Q_T$, $1 \le i, j \le d$, $C_D > 0$, $D$ is
a symmetric and uniformly positive definite tensor for almost all $t \in (0, T)$
with a constant $c_D > 0$,

$$D(x, t)\eta \cdot \eta \ge c_D\, \eta \cdot \eta \quad \forall \eta \in \mathbb{R}^d, \ \text{for a.e. } (x, t) \in Q_T;$$


290 R. Eymard et al.

(A4) $v \in L^2(0, T; H(\mathrm{div}, \Omega)) \cap L^\infty(Q_T)$ satisfies $\nabla \cdot v = q_s \ge 0$ a.e. in $Q_T$,
$|v \cdot n| \le C_v$, $C_v > 0$, a.e. on $\varrho \times (0, T)$ for each hyperplane $\varrho \subset \Omega$ with
normal vector $n$;
(A5) $F(0) = 0$, $F$ is a nondecreasing, Lipschitz continuous function with
a constant $L_F$
or
(A6) $F(0) = 0$, $F$ is a Lipschitz continuous function with a constant $L_F$ and
$x F(x) \ge 0$ for $x < 0$ and $x > M$, $M > 0$;
(A7) $q \in L^2(Q_T)$, where $q = q_s c_s$ with $c_s \in L^\infty(Q_T)$, $0 \le c_s \le M$ a.e. in
$Q_T$;
(A8) $c_0 \in L^\infty(\Omega)$, $0 \le c_0 \le M$ a.e. in $\Omega$.
We now give the definition of a weak solution of the problem (1)-(3).

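Definition 1. (Weak solution) We say that a function $c$ is a weak solution
of (1)-(3), if $c \in L^2(0, T; H_0^1(\Omega))$, $\beta(c) \in L^\infty(0, T; L^2(\Omega))$, and

$$-\int_0^T\!\!\int_\Omega \beta(c)\,\varphi_t \,dx\,dt - \int_\Omega \beta(c_0)\,\varphi(\cdot, 0)\,dx + \int_0^T\!\!\int_\Omega D\nabla c \cdot \nabla\varphi \,dx\,dt$$
$$- \int_0^T\!\!\int_\Omega c\,v \cdot \nabla\varphi \,dx\,dt + \int_0^T\!\!\int_\Omega F(c)\,\varphi \,dx\,dt = \int_0^T\!\!\int_\Omega q\,\varphi \,dx\,dt \qquad (4)$$

for all $\varphi \in L^2(0, T; H_0^1(\Omega))$ with $\varphi_t \in L^\infty(Q_T)$, $\varphi(\cdot, T) = 0$.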

3 The combined finite element-finite volume scheme

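We suppose a family of triangulations $\{\mathcal{T}_h\}_h$ of the domain $\Omega$, where each $\mathcal{T}_h$
consists of closed simplices (triangles in the case $d = 2$, tetrahedra when
$d = 3$) such that $\overline{\Omega} = \bigcup_{K \in \mathcal{T}_h} K$. We define $h := \max_{K \in \mathcal{T}_h} \mathrm{diam}(K)$ and suppose
that $\{\mathcal{T}_h\}_h$ is regular:

Assumption (B)
(B1) There exists a positive constant $\kappa_{\mathcal{T}}$ such that

$$\max_{K \in \mathcal{T}_h} \frac{\mathrm{diam}(K)}{\rho_K} \le \kappa_{\mathcal{T}} \quad \forall h > 0,$$

where $\rho_K$ is the diameter of the largest ball inscribed in the simplex $K$.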
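The regularity ratio in Assumption (B) can be evaluated directly for a given two-dimensional mesh. The following is a minimal sketch, assuming triangles are given as triples of vertex coordinates; the function names are illustrative and not part of the paper. It uses the fact that the inscribed circle of a triangle has radius area divided by semi-perimeter.

```python
import math


def shape_regularity_ratio(p0, p1, p2):
    """Return diam(K) / rho_K for the triangle K with vertices p0, p1, p2.

    rho_K is the diameter of the inscribed circle of K.
    """
    # Edge lengths; the diameter of a triangle is its longest edge.
    a = math.dist(p1, p2)
    b = math.dist(p0, p2)
    c = math.dist(p0, p1)
    diameter = max(a, b, c)
    # Heron's formula for the area, then inradius = area / semi-perimeter.
    s = 0.5 * (a + b + c)
    area = math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))
    rho = 2.0 * area / s
    return diameter / rho


def mesh_regularity_constant(triangles):
    """Largest ratio diam(K) / rho_K over all triangles of one mesh."""
    return max(shape_regularity_ratio(*tri) for tri in triangles)
```

A family of meshes is regular in the sense of (B1) when this quantity stays bounded as the mesh is refined; the sketch only reports the value for one mesh.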


We also use a dual partition $\mathcal{D}_h$ of $\Omega$, such that $\overline{\Omega} = \bigcup_{D \in \mathcal{D}_h} D$. The
dual volume $D$ associated with the side $\sigma_D$ is constructed by connecting the
barycentres of every $K \in \mathcal{T}_h$ that contains $\sigma_D$ through the vertices of $\sigma_D$. For
$\sigma_D$ on the boundary, the contour is completed by $\sigma_D$ itself, see Fig. 1. We
denote by $Q_D$ the barycentre of $\sigma_D$, by $\mathcal{D}_h^{int}$ the set of all interior and by $\mathcal{D}_h^{ext}$
the set of all boundary dual volumes, and by $\mathcal{N}(D)$ the set of all adjacent
volumes to the volume $D$. For $E \in \mathcal{N}(D)$, we finally set $\sigma_{D,E} = \partial D \cap \partial E$.
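The construction of a dual volume around an interior edge can be sketched in two dimensions as follows. This is an illustration under the assumption that the two triangles sharing the edge are given by their vertex coordinates; the helper names are hypothetical. The dual volume is the quadrilateral through the two edge vertices and the two triangle barycentres.

```python
def barycentre(points):
    """Arithmetic mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def polygon_area(vertices):
    """Unsigned area of a simple polygon by the shoelace formula."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def interior_dual_volume(edge, tri_k, tri_l):
    """Vertices of the dual volume around the edge shared by tri_k and tri_l.

    The contour runs from one edge vertex to the barycentre of tri_l,
    then to the other edge vertex, and back through the barycentre of tri_k.
    """
    a, b = edge
    return [a, barycentre(tri_l), b, barycentre(tri_k)]
```

For the two halves of the unit square split along a diagonal, the dual volume of the diagonal has area 1/3, that is one third of the total area of the two triangles.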
We suppose a partition of the time interval $(0, T)$ such that $0 = t_0 <
\ldots < t_n < \ldots < t_N = T$ and define $\Delta t_n := t_n - t_{n-1}$, $\Delta t := \max_{1 \le n \le N} \Delta t_n$. When
Assumption (A5) is satisfied, we do not impose any restriction on $\Delta t$. When
only (A6) holds, we suppose:

Assumption (C)
(Cl) The maximum time step condition 6t < Z is satisfied.

Fig. 1. Triangles $K, L \in \mathcal{T}_h$ and dual volumes $D, E \in \mathcal{D}_h$ associated with edges
$\sigma_D$, $\sigma_E$

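We define the following finite-dimensional spaces:

$$X_h := \{\varphi_h \in L^2(\Omega);\ \varphi_h|_K \text{ is linear } \forall K \in \mathcal{T}_h,\ \varphi_h \text{ is continuous at } Q_D,\ D \in \mathcal{D}_h^{int}\},$$
$$X_h^0 := \{\varphi_h \in X_h;\ \varphi_h(Q_D) = 0 \ \forall D \in \mathcal{D}_h^{ext}\}.$$

A basis of $X_h$ is given by the shape functions $\varphi_D$, $D \in \mathcal{D}_h$, such that
$\varphi_D(Q_E) = \delta_{DE}$, $E \in \mathcal{D}_h$, $\delta$ being the Kronecker delta. We equip $X_h^0$ with the
norm

$$\|c_h\|_{X_h}^2 := \sum_{K \in \mathcal{T}_h} \int_K |\nabla c_h|^2 \,dx. \qquad (5)$$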
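Definition 2. (Combined scheme) The fully implicit combined nonconforming/mixed-hybrid
FE-FV scheme reads: find the values $c_D^n$, $n \in \{0, 1, \ldots, N\}$,
$D \in \mathcal{D}_h$, such that

$$c_D^0 = \frac{1}{|D|} \int_D c_0(x)\,dx, \qquad (6)$$

$$c_D^n = 0, \quad D \in \mathcal{D}_h^{ext}, \ n \in \{0, 1, \ldots, N\}, \qquad (7)$$

$$\frac{\beta(c_D^n) - \beta(c_D^{n-1})}{\Delta t_n} |D| - \sum_{E \in \mathcal{N}(D)} \mathbb{D}_{D,E}^n (c_E^n - c_D^n) + \sum_{E \in \mathcal{N}(D)} v_{D,E}^n\, \bar{c}_{D,E}^n + F(c_D^n) |D| = q_D^n |D|,$$
$$D \in \mathcal{D}_h^{int}, \ n \in \{1, 2, \ldots, N\}. \qquad (8)$$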


In (6)-(8),

$$v_{D,E}^n = \frac{1}{\Delta t_n} \int_{t_{n-1}}^{t_n} \int_{\sigma_{D,E}} v(x, t) \cdot n_{D,E} \,d\gamma(x)\,dt$$

with $n_{D,E}$ the unit normal vector of the side $\sigma_{D,E}$, outward to $D$, and

$$q_D^n = \frac{1}{\Delta t_n |D|} \int_{t_{n-1}}^{t_n} \int_D q(x, t) \,dx\,dt.$$

Finally,

$$\bar{c}_{D,E}^n = \begin{cases} c_D^n + \alpha_{D,E}^n (c_E^n - c_D^n) & \text{if } v_{D,E}^n \ge 0, \\ c_E^n + \alpha_{D,E}^n (c_D^n - c_E^n) & \text{if } v_{D,E}^n < 0. \end{cases} \qquad (9)$$

Here, $\alpha_{D,E}^n$ is the coefficient of the amount of upstream weighting, defined by

$v_{D,E}^n \ne 0$. (10)

Remark 1. (Numerical flux) We can easily see that $0 \le \alpha_{D,E}^n \le 1/2$, i.e. the
numerical flux defined by (9) ranges from the centered scheme to the full
upstream weighting.
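The interface value in (9) is easy to state in code. The sketch below takes the weighting coefficient as an input, since its defining formula (10) is not reproduced here; the function name and argument names are illustrative assumptions.

```python
def weighted_interface_value(c_d, c_e, v_de, alpha_de):
    """Interface concentration between dual volumes D and E, following (9).

    c_d, c_e  : concentrations in D and in the neighbour E
    v_de      : flux of the velocity through the common side, outward to D
    alpha_de  : amount of upstream weighting, expected in [0, 1/2]
    """
    if not 0.0 <= alpha_de <= 0.5:
        raise ValueError("alpha_de must lie in [0, 1/2]")
    if v_de >= 0.0:
        # Flow leaves D: start from the upstream value c_d.
        return c_d + alpha_de * (c_e - c_d)
    # Flow enters D: start from the upstream value c_e.
    return c_e + alpha_de * (c_d - c_e)
```

With `alpha_de = 0` the function returns the upstream value (full upstream weighting), and with `alpha_de = 0.5` it returns the arithmetic mean of the two values (centered scheme), which matches the range described in Remark 1.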

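Diffusion matrix from the nonconforming method We set

$$\mathbb{D}_{D,E}^n = -\sum_{K \in \mathcal{T}_h} (D^n \nabla\varphi_E, \nabla\varphi_D)_{0,K}, \quad D, E \in \mathcal{D}_h, \ n \in \{1, 2, \ldots, N\},$$

where

$n \in \{1, 2, \ldots, N\}$, $x \in \Omega$. (11)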

Diffusion matrix from the mixed-hybrid method Let us consider the
problems

$$-\nabla \cdot (D^n \nabla p) = g \ \text{ in } \Omega, \qquad p = 0 \ \text{ on } \partial\Omega,$$

at each discrete time $t_n$, with $g \in L^2(\Omega)$. Then using the hybridization of the
lowest order Raviart-Thomas mixed finite element method, one ends up with a
linear system $\mathbb{M}^n \Lambda = G$ for the Lagrange multipliers $\Lambda$ located in barycentres
of sides, see [4, Section V.1.2]. Using the analytic form of $\mathbb{M}^n$, we define

$$\mathbb{D}_{D,E}^n = -\mathbb{M}_{D,E}^n = -\sum_{K \in \mathcal{T}_h} (\tilde{D}^n \nabla\varphi_E, \nabla\varphi_D)_{0,K}, \quad D, E \in \mathcal{D}_h, \ n \in \{1, 2, \ldots, N\},$$

where

$x \in K$, $K \in \mathcal{T}_h$, $n \in \{1, 2, \ldots, N\}$. (12)

In the sequel, we shall treat separately the following special case, satisfied
e.g. when $D = Id$ and when there is a maximal angle condition:

Assumption (D)
(D1) All non-diagonal terms of the diffusion matrix are nonnegative, i.e.

$$\mathbb{D}_{D,E}^n \ge 0 \quad \forall D, E \in \mathcal{D}_h^{int}, \ D \ne E, \ \forall n \in \{1, 2, \ldots, N\}.$$

4 Existence, uniqueness, and discrete properties


Lemma 1. (Conservativity of the scheme) The scheme (6)-(8) is conser-
vative with respect to the dual mesh.

One can easily verify that $\mathbb{D}_{D,D}^n = -\sum_{E \in \mathcal{N}(D)} \mathbb{D}_{D,E}^n$ for all $D \in \mathcal{D}_h$ and
$n \in \{1, 2, \ldots, N\}$. Adding the finite volume discretization of the other terms, the
assertion follows.

Lemma 2. (Coercivity of the bilinear diffusion form) We have

$$-\sum_{D \in \mathcal{D}_h} c_D \sum_{E \in \mathcal{D}_h} \mathbb{D}_{D,E}^n c_E \ge c_D \|c_h\|_{X_h}^2 \quad \forall c_h = \sum_{D \in \mathcal{D}_h} c_D \varphi_D \in X_h, \ \forall n \in \{1, 2, \ldots, N\}.$$

The assertion follows immediately from Assumption (A3) and the subsequent
uniform positive definiteness of the diffusion tensors (11) or (12).

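Lemma 3. (Estimate on the convection term) We have

$$\sum_{D \in \mathcal{D}_h^{int}} c_D \sum_{E \in \mathcal{N}(D)} v_{D,E}^n\, \bar{c}_{D,E} \ge 0 \quad \forall c_h = \sum_{D \in \mathcal{D}_h} c_D \varphi_D \in X_h^0, \ \forall n \in \{1, 2, \ldots, N\}.$$

The proof is similar to that in [6] for pure finite volume schemes.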

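Lemma 4. (A priori estimate for an extended scheme) Let $u \in [0, 1]$.
We define an extended scheme by

$$c_D^0 = \frac{1}{|D|} \int_D c_0(x)\,dx, \quad D \in \mathcal{D}_h^{int}, \qquad (13)$$

$$c_D^n = 0, \quad D \in \mathcal{D}_h^{ext}, \ n \in \{0, 1, \ldots, N\}, \qquad (14)$$

$$u\,\frac{\beta(c_D^n) - \beta(c_D^{n-1})}{\Delta t_n} |D| - \sum_{E \in \mathcal{D}_h^{int}} \mathbb{D}_{D,E}^n c_E^n + u \sum_{E \in \mathcal{N}(D)} v_{D,E}^n\, \bar{c}_{D,E}^n + u F(c_D^n) |D| = u\, q_D^n |D|,$$
$$D \in \mathcal{D}_h^{int}, \ n \in \{1, 2, \ldots, N\}. \qquad (15)$$

Then $\sum_{D \in \mathcal{D}_h} (c_D^n)^2 |D| < C_{es}$ for all $n \in \{1, 2, \ldots, N\}$ with $C_{es} > 0$.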

PROOF:

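We multiply (15) by $\Delta t_n c_D^n$ and sum over $D \in \mathcal{D}_h^{int}$ and $n \le k$ to have

$$u \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D| + c_D \sum_{n=1}^{k} \Delta t_n \|c_h^n\|_{X_h}^2 \le$$
$$\le u \sum_{n=1}^{k} \Delta t_n \sum_{D \in \mathcal{D}_h^{int}} c_D^n q_D^n |D| + u L_F M^2 \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} \Delta t_n |D|, \qquad (16)$$

considering $u \ge 0$, Lemmas 2 and 3, the fact that for $c_D^n < 0$ or $c_D^n > M$,
$F(c_D^n) c_D^n \ge 0$ follows from Assumption (A5) or (A6), and that when $0 \le c_D^n \le M$,
$-F(c_D^n) c_D^n \le |F(c_D^n)||c_D^n| \le L_F M^2$. Let us now introduce a function $B$,
$B(s) := \beta(s) s - \int_0^s \beta(\tau)\,d\tau$, $s \in \mathbb{R}$. One then can derive

$$B(c_D^n) - B(c_D^{n-1}) = [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n - \int_{c_D^{n-1}}^{c_D^n} [\beta(\tau) - \beta(c_D^{n-1})]\,d\tau. \qquad (17)$$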

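Using that $\beta$ is nondecreasing, one can easily show that
$\int_{c_D^{n-1}}^{c_D^n} [\beta(\tau) - \beta(c_D^{n-1})]\,d\tau \ge 0$. In view of this and (17), one has

$$\sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [B(c_D^n) - B(c_D^{n-1})] |D| \le \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D|,$$

which yields

$$\sum_{D \in \mathcal{D}_h^{int}} B(c_D^k) |D| - \sum_{D \in \mathcal{D}_h^{int}} B(c_D^0) |D| \le \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D|.$$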

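Using the growth condition on $\beta$ from Assumption (A1), one can derive $B(s) \ge
\frac{c_\beta}{2} s^2$ for all $s \in \mathbb{R}$. Thus, using in addition Assumption (A8),

$$\frac{c_\beta}{2} \sum_{D \in \mathcal{D}_h^{int}} (c_D^k)^2 |D| - M \beta(M) |\Omega| \le \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D|.$$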
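The lower bound on the function B used in this step can be checked numerically for a concrete choice of the function beta. The sketch below assumes a hypothetical sample, beta(s) = s + sign(s) sqrt(|s|), which is continuous, strictly increasing, vanishes at zero, satisfies the growth condition of (A1) with constant 1, and has an unbounded derivative at zero. The integral is approximated by the midpoint rule, so the values are approximate.

```python
import math


def beta(s):
    """Hypothetical sample adsorption function (not taken from the paper).

    It is strictly increasing with beta(0) = 0 and satisfies
    |beta(a) - beta(b)| >= |a - b|, i.e. the constant in (A1) equals 1.
    """
    return s + math.copysign(math.sqrt(abs(s)), s)


def big_b(s, n=2000):
    """B(s) = beta(s) * s - integral of beta from 0 to s (midpoint rule)."""
    h = s / n
    integral = h * sum(beta((i + 0.5) * h) for i in range(n))
    return beta(s) * s - integral


def lower_bound(s, c_beta=1.0):
    """Right-hand side of the estimate B(s) >= c_beta * s**2 / 2."""
    return 0.5 * c_beta * s * s
```

For this sample the integral can also be computed by hand, giving B(s) = s^2/2 + |s|^(3/2)/3, so the bound holds with room to spare.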
Using the Cauchy-Schwarz inequality, extending the summation over all $n \in
\{1, 2, \ldots, N\}$ and $D \in \mathcal{D}_h$ in the first right term of (16), and using the Young
inequality, we have

$$\sum_{n=1}^{k} \Delta t_n \sum_{D \in \mathcal{D}_h^{int}} c_D^n q_D^n |D| \le \Big( \sum_{n=1}^{N} \Delta t_n \sum_{D \in \mathcal{D}_h} (c_D^n)^2 |D| \Big)^{1/2} \|q\|_{0,Q_T} \le$$
$$\le \frac{\varepsilon}{2} \sum_{n=1}^{N} \Delta t_n \sum_{D \in \mathcal{D}_h} (c_D^n)^2 |D| + \frac{1}{2\varepsilon} \|q\|_{0,Q_T}^2.$$

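Substituting all the above estimates into (16), we obtain

$$u \frac{c_\beta}{2} \max_{n \in \{1, 2, \ldots, N\}} \sum_{D \in \mathcal{D}_h} (c_D^n)^2 |D| + c_D \sum_{n=1}^{k} \Delta t_n \|c_h^n\|_{X_h}^2 \le u M \beta(M) |\Omega| +$$
$$+ u \frac{\varepsilon}{2} \sum_{n=1}^{N} \Delta t_n \sum_{D \in \mathcal{D}_h} (c_D^n)^2 |D| + \frac{u}{2\varepsilon} \|q\|_{0,Q_T}^2 + u L_F M^2 T |\Omega|, \qquad (18)$$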

considering also (14) and the fact that $k$ was arbitrarily chosen. We now choose
$\varepsilon = \frac{c_\beta}{2T}$. When $u \ne 0$, this already leads to the assertion of the lemma. When
$u = 0$, it follows from (18) that $c_D^n = 0$ for all $D \in \mathcal{D}_h$ and all $n \in \{1, 2, \ldots, N\}$,
since $\|\cdot\|_{X_h}$ is a norm on $X_h^0$. Thus the assertion of the lemma is trivially
satisfied in this case. □
Theorem 1. (Existence of the solution to the discrete problem) The
problem (6)-(8) has at least one solution.
The proof makes use of an induction argument. At each time level, Lemma 4
is employed. Consequently, one can use the (Brouwer) topological degree argument
(see [5]).
Theorem 2. (Uniqueness of the solution to the discrete problem) The
solution to the problem (6)-(8) is unique.
The assertion follows from Assumption (A1) and (A5) or (A6).
Theorem 3. (Discrete maximum principle) Under Assumption (D1), the
solution of the problem (6)-(8) satisfies, for all $D \in \mathcal{D}_h$ and $n \in \{1, 2, \ldots, N\}$,

$$0 \le c_D^n \le M. \qquad (19)$$

One sets a transmissibility $\mathbb{T}_{D,E}^n := \mathbb{D}_{D,E}^n - |v_{D,E}^n|\, \alpha_{D,E}^n$, $E \in \mathcal{N}(D)$. In view
of Assumption (D1) and (10), one has $\mathbb{T}_{D,E}^n \ge 0$ for all $D \in \mathcal{D}_h^{int}$, $E \in \mathcal{N}(D)$,
and hence one can prove the assertion as in [6].
5 A priori estimates
Theorem 4. (A priori estimates) The solution of the scheme (6)-(8) satisfies

$$c_\beta \max_{n \in \{1, 2, \ldots, N\}} \sum_{D \in \mathcal{D}_h} (c_D^n)^2 |D| \le C_{ae}, \qquad (20)$$

$$\max_{n \in \{1, 2, \ldots, N\}} \sum_{D \in \mathcal{D}_h} [\beta(c_D^n)]^2 |D| \le C_{ae}, \qquad (21)$$

$$c_D \sum_{n=1}^{N} \Delta t_n \|c_h^n\|_{X_h}^2 \le C_{ae} \qquad (22)$$

with $c_h^n = \sum_{D \in \mathcal{D}_h} c_D^n \varphi_D$ and $C_{ae}$ a constant independent of $h$ and $\Delta t$.

The a priori estimates follow from (18) with $\varepsilon = \frac{c_\beta}{2T}$ and $u = 1$ and using
Assumption (A1) or Assumption (A2).

Definition 3. (Approximate solution) As the approximate solution of (1)-(3)
by means of the combined nonconforming/mixed-hybrid FE-FV scheme, we
understand:
(i) a function $c_{h,\Delta t}$ given by $c_h^n \in X_h^0$, $c_h^n = \sum_{D \in \mathcal{D}_h} c_D^n \varphi_D$ with $c_D^n$, $D \in \mathcal{D}_h$,
$n \in \{0, 1, \ldots, N\}$, solutions to (6)-(8), such that $c_{h,\Delta t}$ is piecewise constant in
time;
(ii) a function $\tilde{c}_{h,\Delta t}$ given by the values $c_D^n$, $D \in \mathcal{D}_h$, $n \in \{0, 1, \ldots, N\}$,
solutions to (6)-(8), and piecewise constant on the dual volumes $D \in \mathcal{D}_h$ and in
time.

The function $c_{h,\Delta t}$ is piecewise linear on $\mathcal{T}_h$ and continuous at the barycentres
of the interior sides, whereas $\tilde{c}_{h,\Delta t}$ is piecewise constant on $\mathcal{D}_h$. From the
a priori estimate (22), it follows immediately that

(23)

Hence, to show the convergence, we can work with $\tilde{c}_{h,\Delta t}$ as in finite volume
methods.

Lemma 5. (Time translate estimate) There exists a constant $C_{tt} > 0$
such that

$\forall \tau \in (0, T)$.

The proof is an adaptation of the technique used in [8] for degenerate parabolic
equations discretized by a finite volume scheme. It uses equation (8) and
the a priori estimate (22).

Lemma 6. (Space translate estimate) Let us define $\tilde{c}_{h,\Delta t}$ by zero outside
of $\Omega$. Then there exists a constant $C_{st} > 0$ such that

$\forall \xi \in \mathbb{R}^d$.

The proof is again an adaptation of a technique used to investigate finite
volume schemes.

6 Convergence

Theorem 5. (Strong convergence in $L^2(Q_T)$) There exist subsequences
of $c_{h,\Delta t}$ and $\tilde{c}_{h,\Delta t}$ which converge strongly in $L^2(Q_T)$ to some function $u \in
L^2(0, T; H_0^1(\Omega))$.
PROOF:

From Lemmas 5 and 6 and (20), the sequence $\tilde{c}_{h,\Delta t}$ verifies the assumptions of
Kolmogorov's theorem [3, Theorem IV.25], and thus $\tilde{c}_{h,\Delta t}$ converges strongly
in $L^2(Q_T)$ to some function $u \in L^2(Q_T)$. Moreover, due to Lemma 6, [6,
Theorem 3.10] gives that this $u \in L^2(0, T; H_0^1(\Omega))$. Finally, considering (23),
$c_{h,\Delta t}$ converges to the same $u$. □

Theorem 6. (Convergence to a weak solution) There exist subsequences
of $c_{h,\Delta t}$ and $\tilde{c}_{h,\Delta t}$ which converge strongly in $L^2(Q_T)$ to a weak solution given
by (4). If the weak solution is unique, then the whole sequences $c_{h,\Delta t}$, $\tilde{c}_{h,\Delta t}$
converge to the weak solution.

The strong convergence of Theorem 5 makes it possible to pass to the limit in the
nonlinear terms. For slightly more restrictive assumptions than (A), the uniqueness
follows from [9].

References
1. Angot, P., Dolejší, V., Feistauer, M., Felcman, J. (1998): Analysis of a combined
barycentric finite volume-nonconforming finite element method for nonlinear
convection-diffusion problems. Appl. Math. Praha, 43, 263-310
2. Barrett, J. W., Knabner, P. (1997): Finite element approximation of transport
of reactive solutes in porous media. Part II: Error estimates for the equilibrium
adsorption processes. SIAM J. Numer. Anal., 34, 455-479
3. Brezis, H. (1983): Analyse fonctionnelle, théorie et applications. Masson, Paris
4. Brezzi, F., Fortin, M. (1991): Mixed and Hybrid Finite Element Methods.
Springer-Verlag, New York
5. Deimling, K. (1985): Nonlinear Functional Analysis. Springer-Verlag, Berlin-Heidelberg
6. Eymard, R., Gallouët, T., Herbin, R. (2000): The finite volume method. In: Ciarlet
P.G., Lions J.L. (eds) Handbook of numerical analysis, vol. 7, 715-1022,
Elsevier Science Publishers B.V. (North-Holland), Amsterdam
7. Eymard, R., Gallouët, T., Herbin, R., Michel, M. (2002): Convergence of a finite
volume scheme for nonlinear degenerate parabolic equations. Numer. Math., 92,
41-82
8. Eymard, R., Gallouët, T., Hilhorst, D., Naït Slimane, Y. (1998): Finite volumes
and nonlinear diffusion equations. Model. Math. Anal. Numer., 32, 747-761
9. Knabner, P., Otto, F. (2000): Solute transport in porous media with equilibrium
and nonequilibrium multiple-site adsorption: uniqueness of weak solutions.
Nonlinear Anal., 42, 381-403
Discrete Maximum Principle for Galerkin
Finite Element Solutions to Parabolic Problems
on Rectangular Meshes

István Faragó¹, Róbert Horváth² and Sergey Korotov³

¹ Department of Applied Analysis, Eötvös Loránd University, H-1518, Budapest,
Pf. 120, Hungary, [email protected], supported by the National Scientific
Research Fund (OTKA) N. T043765
² Institute of Mathematics and Statistics, University of West-Hungary, Erzsébet
u. 9, H-9400, Sopron, Hungary, [email protected], supported by the National
Scientific Research Fund (OTKA) N. T043765
³ Department of Mathematical Information Technology, University of Jyväskylä,
P.O. Box 35, FIN-40014 Jyväskylä, Finland, [email protected], supported by
the Agora Center under Grant InBCT of TEKES.

Summary. One of the most important problems in numerical simulation is the
preservation of qualitative properties of solutions of mathematical models. For
problems of parabolic type, one such property is the maximum principle. In [5], Fujii
analyzed the discrete analogue of the (continuous) maximum principle for linear
parabolic problems, and derived sufficient conditions guaranteeing its validity for the
Galerkin finite element approximations built on simplicial meshes. In our paper, we
present sufficient conditions for the validity of the discrete maximum principle
for the case of bilinear finite element space approximations on rectangular meshes.

1 Introduction

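Consider a two-dimensional linear parabolic problem in the classical setting:
Find a function $u \in C^{1,2}((0, T) \times \Omega) \cap C([0, T) \times \overline{\Omega})$ such that

$$\frac{\partial u}{\partial t} - \alpha \sum_{k=1}^{2} \frac{\partial^2 u}{\partial x_k^2} = f \quad \text{in } (0, T) \times \Omega, \qquad (1)$$

$$u = g \ \text{ on } [0, T) \times \partial\Omega, \quad \text{and} \quad u|_{t=0} = u_0 \ \text{ in } \Omega, \qquad (2)$$

where $\Omega$ is a polygonal domain in $\mathbb{R}^2$ with a boundary $\partial\Omega$, $T > 0$ and $\alpha$
is a positive constant. In order to guarantee the existence and uniqueness of
the classical solution $u = u(t, x_1, x_2) = u(t, x)$, we assume that the functions
$u_0 : \Omega \to \mathbb{R}$, $f : (0, T) \times \Omega \to \mathbb{R}$ and $g : [0, T) \times \partial\Omega \to \mathbb{R}$ are sufficiently
smooth.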
The problem (1)-(2) serves as the mathematical model of various physical,
chemical or even ecological phenomena. It is well known that the estimate

$$\min\{0; \min_{\Gamma_t} u\} + t \min\{0; \min_{Q_t} f\} \le u(t, x) \le \max\{0; \max_{\Gamma_t} u\} + t \max\{0; \max_{Q_t} f\} \qquad (3)$$

is valid for the solution $u$ (see [7, Theorem 2.1] and [9, p. 79]). Here $Q_t$ with $t \in
[0, T]$ stands for the cylinder $(0, t) \times \Omega$ and $\Gamma_t$ is the union of its lateral surface
and its bottom. Formula (3) is called the continuous maximum principle.
To solve problem (1)-(2) numerically, we use certain discretizations, both
in spatial and in time coordinates. It is obvious that the validity of the discrete
analogue of the maximum principle, the so-called discrete maximum principle
(DMP), is a natural requirement for having an adequate numerical solution.
The topic of validity of various discrete maximum principles arose already
30 years ago. Thus, in the works [3] and [4], DMP was formulated and proved
for the finite difference and finite element approximations, respectively, for the
second order linear elliptic equations. In particular, in [4], DMP was proved
in the 2D case for the continuous piecewise linear finite element approximations
under the following geometrical conditions: the angles of triangles in the used
triangular meshes are not greater than $\pi/2$ (nonobtuse type condition), or less
than $\pi/2$ (acute type condition). More results on DMP for the elliptic problems
can be found in [6] and [8].
In [5], the validity of the DMP for the linear parabolic problem is analyzed:
the finite element discretization was performed with linear elements on triangular
(simplicial) meshes and the so-called $\theta$-method was applied to the time
discretization. The discrete analogue of (3) and sufficient conditions guaranteeing
its validity were obtained, where one of such conditions was the acuteness
of the triangulations used.
To the authors' knowledge, there is no similar result on the validity of
DMP for parabolic problems solved with the help of bilinear finite elements
in space. As far as the validity of DMP on rectangular meshes is concerned,
we mention the only work [2] in this respect, where the authors considered
the simplest elliptic problem and showed that the corresponding DMP may
not hold if the rectangular elements are chosen arbitrarily. They also derived
sufficient conditions for the validity of DMP.
In our paper, we give sufficient conditions for the validity of the discrete
maximum principle for the Galerkin finite element solutions based on the bi-
linear elements on rectangular meshes for parabolic problems. Similarly to
the case of triangular meshes, we obtain certain geometrical condition on the
shape of elements. Namely, we introduce the notion of non-narrow rectangular
element, which represents an analogue of nonobtuse triangular element for the
case of triangular meshes.
300 1. Farago et al.

2 Discretization

2.1 Galerkin Finite Element Discretization with θ-method

Let $\overline{\Omega}$ be a rectangular domain covered by the rectangular mesh $\mathcal{R}_h$, where
$h$ stands for the discretization parameter. Let $P_1, \ldots, P_N$ denote the interior
nodes, and $P_{N+1}, \ldots, P_{\bar{N}}$ the boundary ones in $\mathcal{R}_h$. We also define $N_\partial :=
\bar{N} - N$.
Let $\varphi_1, \ldots, \varphi_{\bar{N}}$ be basis functions defined as follows: each $\varphi_i$ is required to
be continuous piecewise bilinear such that $\varphi_i(P_j) = \delta_{ij}$, $i, j = 1, \ldots, \bar{N}$, where
$\delta_{ij}$ is the Kronecker symbol. It is obvious that these basis functions have the
properties

$$\text{(a)} \ \varphi_i \ge 0, \ i = 1, \ldots, \bar{N}, \qquad \text{(b)} \ \sum_{i=1}^{\bar{N}} \varphi_i \equiv 1 \ \text{ in } \overline{\Omega}. \qquad (4)$$

We denote the space of all possible linear combinations of the basis functions
by $V_h$, and define its subspace $V_0^h = \{v \in V_h \mid v|_{\partial\Omega} = 0\}$. Based on
the usual weak formulation of the original problem, the semidiscrete form for
(1)-(2) reads: Find a function $u_h = u_h(t, x)$ such that

$$u_h(0, x) = u_h^0(x), \quad x \in \Omega, \qquad (5)$$

$$u_h(t, x) - g_h(t, x) \in V_0^h, \quad t \in (0, T), \qquad (6)$$

and

$$\int_\Omega \frac{\partial u_h}{\partial t} v_h \,dx + B(u_h, v_h) = \int_\Omega f v_h \,dx, \quad \forall v_h \in V_0^h, \ t \in (0, T), \qquad (7)$$

where $B(u_h, v_h) = \alpha \int_\Omega \mathrm{grad}\, u_h \cdot \mathrm{grad}\, v_h \,dx$. In the above, $u_h^0(x)$ and $g_h(t, x)$
(for any fixed $t$) are interpolants in $V_h$, i.e.,

$$u_h^0(x) = \sum_{i=1}^{\bar{N}} u_0(P_i) \varphi_i(x) \quad \text{and} \quad g_h(t, x) = \sum_{i=1}^{N_\partial} g(t, P_{N+i}) \varphi_{N+i}(x).$$

From the consistency of the initial and the boundary conditions
($g(0, s) = u_0(s)$, $s \in \partial\Omega$), we observe that $g(0, P_{N+i}) = u_0(P_{N+i})$, $i =
1, \ldots, N_\partial$.
We search for the semidiscrete solution in the form

$$u_h(t, x) = \sum_{i=1}^{N} u_i^h(t) \varphi_i(x) + g_h(t, x),$$

and notice that it is sufficient that $u_h$ satisfies (7) for $v_h = \varphi_i$, $i = 1, \ldots, N$,
only.
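Introducing the denotation

$$\mathbf{v}^h(t) = [u_1^h(t), \ldots, u_N^h(t), g(t, P_{N+1}), \ldots, g(t, P_{N+N_\partial})]^T,$$

we arrive at the Cauchy problem for the system of ordinary differential equations

$$M \frac{d\mathbf{v}^h}{dt} + K \mathbf{v}^h = \mathbf{f}, \qquad (8)$$
$$\mathbf{v}^h(0) = [u_0(P_1), \ldots, u_0(P_N), g(0, P_{N+1}), \ldots, g(0, P_{N+N_\partial})]^T$$

for the solution of the semidiscrete problem, where

$$M = [M_{ij}]_{N \times \bar{N}}, \quad K = [K_{ij}]_{N \times \bar{N}}, \quad \mathbf{f} = [f_i]_{N \times 1}, \qquad (9)$$

$$M_{ij} = \int_\Omega \varphi_j \varphi_i \,dx, \quad K_{ij} = B(\varphi_j, \varphi_i), \quad f_i = \int_\Omega f \varphi_i \,dx. \qquad (10)$$

The above defined matrices $M$ and $K$ are called mass and stiffness matrices,
respectively.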
The above defined matrices M and K are called mass and stiffness matrices,
respectively.
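In order to get a fully discrete numerical scheme, we choose a time-step
$\Delta t$ and denote the approximation to $\mathbf{v}^h(n\Delta t)$ and $\mathbf{f}(n\Delta t)$ by $\mathbf{v}^n$ and $\mathbf{f}^n$, $n =
0, 1, \ldots, n_T$ ($n_T \Delta t = T$), respectively. To discretize (8), we apply the $\theta$-method
($\theta \in [0, 1]$ is a given parameter) and obtain the system of linear algebraic
equations

$$(M + \theta\Delta t K)\mathbf{v}^{n+1} = (M - (1 - \theta)\Delta t K)\mathbf{v}^n + \Delta t\, \mathbf{f}^{(n,\theta)}, \quad n = 0, 1, \ldots, n_T - 1, \qquad (11)$$

where $\mathbf{v}^0 = \mathbf{v}^h(0)$ and $\mathbf{f}^{(n,\theta)} = \theta \mathbf{f}^{n+1} + (1 - \theta)\mathbf{f}^n$.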
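One step of the iteration (11) can be sketched as follows. This is a simplified illustration, assuming square matrices stored as nested lists (for instance after the boundary values have been eliminated) and using a small dense Gaussian elimination in place of a production linear solver; the function names are hypothetical.

```python
def solve_dense(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so that the inputs are left untouched.
    aug = [list(map(float, matrix[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def theta_step(mass, stiff, v_old, f_theta, dt, theta):
    """Advance (M + theta dt K) v_new = (M - (1 - theta) dt K) v_old + dt f."""
    n = len(v_old)
    lhs = [[mass[i][j] + theta * dt * stiff[i][j] for j in range(n)]
           for i in range(n)]
    rhs_matrix = [[mass[i][j] - (1.0 - theta) * dt * stiff[i][j]
                   for j in range(n)] for i in range(n)]
    rhs = [sum(rhs_matrix[i][j] * v_old[j] for j in range(n)) + dt * f_theta[i]
           for i in range(n)]
    return solve_dense(lhs, rhs)
```

Setting `theta = 0` gives the explicit Euler step, `theta = 1` the implicit Euler step and `theta = 0.5` the Crank-Nicolson step; `f_theta` is the already combined load vector of the step.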
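Further, let the matrices $M + \theta\Delta t K$ and $M - (1 - \theta)\Delta t K$ be denoted
by $A$ and $B$, respectively. In what follows, we shall use the partitions

$$A = [A_0 \,|\, A_\partial], \quad B = [B_0 \,|\, B_\partial], \quad \mathbf{v}^n = [(\mathbf{u}^n)^T \,|\, (\mathbf{g}^n)^T]^T, \qquad (12)$$

where $A_0$ and $B_0$ are square matrices from $\mathbb{R}^{N \times N}$; $A_\partial, B_\partial \in \mathbb{R}^{N \times N_\partial}$,
$\mathbf{u}^n = [u_1^n, \ldots, u_N^n]^T \in \mathbb{R}^N$, and $\mathbf{g}^n = [g_1^n, \ldots, g_{N_\partial}^n]^T \in \mathbb{R}^{N_\partial}$. (A similar
partition is used for the matrices $M$ and $K$.) Then, the iteration (11) can also be
written as

$$A_0 \mathbf{u}^{n+1} + A_\partial \mathbf{g}^{n+1} = B_0 \mathbf{u}^n + B_\partial \mathbf{g}^n + \Delta t\, \mathbf{f}^{(n,\theta)}. \qquad (13)$$

The iteration is well-defined, because $A_0$ is a Gram matrix, thus it is invertible.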

2.2 Entries of the mass and stiffness matrices

To calculate the entries of the mass matrix $M$, we first calculate the integral of
the product $\varphi_j \varphi_i$ only on a single rectangle $R$ (denoted by $M_{ij}|_R$). Obviously,
it is sufficient to do that on the reference rectangle defined by the vertices
$P_1 = (0, 0)$, $P_2 = (a, 0)$, $P_3 = (a, b)$ and $P_4 = (0, b)$. The four basis functions
corresponding to the vertices of the reference rectangle are:

$$\varphi_1(x_1, x_2) = \frac{1}{ab}(x_1 - a)(x_2 - b), \quad \varphi_2(x_1, x_2) = -\frac{1}{ab} x_1 (x_2 - b),$$
$$\varphi_3(x_1, x_2) = \frac{1}{ab} x_1 x_2, \quad \varphi_4(x_1, x_2) = -\frac{1}{ab}(x_1 - a) x_2.$$

A simple calculation leads to ($M_{ij}|_R = M_{ji}|_R$)

$$M_{11}|_R = M_{22}|_R = M_{33}|_R = M_{44}|_R = \frac{ab}{9},$$
$$M_{12}|_R = M_{14}|_R = M_{23}|_R = M_{34}|_R = \frac{ab}{18},$$
$$M_{13}|_R = M_{24}|_R = \frac{ab}{36},$$

where $ab$ denotes the area of the rectangle $R$.
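The element mass matrix listed above can be tabulated in a few lines. The sketch assumes the counter-clockwise vertex numbering of the reference rectangle used in the text; the function name is illustrative.

```python
def element_mass_matrix(a, b):
    """4x4 mass matrix of the bilinear element on an a-by-b rectangle.

    Vertices are numbered counter-clockwise starting at (0, 0), so
    vertices 1-2 and 3-4 share horizontal edges, 1-4 and 2-3 share
    vertical edges, and 1-3 and 2-4 are diagonal neighbours.
    """
    pattern = [
        [4, 2, 1, 2],
        [2, 4, 2, 1],
        [1, 2, 4, 2],
        [2, 1, 2, 4],
    ]
    scale = a * b / 36.0
    return [[scale * entry for entry in row] for row in pattern]
```

Because the four basis functions sum to one on the rectangle, all entries of this matrix add up to the area of the rectangle, which is a convenient sanity check.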


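The elements of the stiffness matrix (integrating only on $R$) can be obtained
similarly. Thus,

$$K_{11}|_R = \frac{\alpha(a^2 + b^2)}{3ab}, \quad K_{12}|_R = -\frac{\alpha(2b^2 - a^2)}{6ab}, \qquad (14)$$

$$K_{13}|_R = -\frac{\alpha(a^2 + b^2)}{6ab}, \quad K_{14}|_R = -\frac{\alpha(2a^2 - b^2)}{6ab},$$

and any other value is equal to one of the above four numbers.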
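The element stiffness matrix can be assembled in the same way. In the sketch below the diagonal-neighbour and vertical-edge entries follow the values stated in the text, while the diagonal entry and the horizontal-edge entry are obtained here by evaluating the integrals directly for the standard bilinear nodal basis; they should be read as an assumption to be checked against (14). The function name is illustrative.

```python
def element_stiffness_matrix(a, b, alpha=1.0):
    """4x4 stiffness matrix of the bilinear element on an a-by-b rectangle.

    Vertices are numbered counter-clockwise starting at (0, 0); alpha is
    the constant diffusion coefficient of the parabolic problem.
    """
    denom = 6.0 * a * b
    k_diag = alpha * 2.0 * (a * a + b * b) / denom   # same vertex
    k_horiz = alpha * (a * a - 2.0 * b * b) / denom  # neighbours along an edge of length a
    k_cross = -alpha * (a * a + b * b) / denom       # diagonal neighbours
    k_vert = alpha * (b * b - 2.0 * a * a) / denom   # neighbours along an edge of length b
    return [
        [k_diag, k_horiz, k_cross, k_vert],
        [k_horiz, k_diag, k_vert, k_cross],
        [k_cross, k_vert, k_diag, k_horiz],
        [k_vert, k_cross, k_horiz, k_diag],
    ]
```

Each row sums to zero, as it must for a basis that reproduces constants. On a square all off-diagonal entries are negative, whereas for `a = 2`, `b = 1` the horizontal-edge entry becomes positive.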

2.3 Non-narrow rectangular meshes

In paper [5], a geometrical condition, the acuteness of the triangular meshes,
guaranteed the nonpositivity of the off-diagonal entries of $K$. The situation
is similar for the case of the bilinear elements. The nonpositivity of the
off-diagonal entries of $K$ is fulfilled for so-called non-narrow rectangular meshes.
Let $\mathcal{R}_h$ be a rectangular mesh and let us introduce the notation

where $a_R$ and $b_R$ denote the lengths of the edges of the rectangle $R$. A rectangular
mesh $\mathcal{R}_h$ is called non-narrow if $\mu \le 0$. It is called strictly non-narrow
if $\mu < 0$. Hence, the non-narrowness of a mesh means that the longest edge
of each rectangle is not greater than $\sqrt{2}$ times the shortest one. The
non-narrowness of the mesh will imply the nonpositivity of the off-diagonal
elements of the stiffness matrix (see [1], page 254).
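The geometric characterization just stated (longest edge at most the square root of two times the shortest) can be checked per rectangle. The following sketch works with that characterization only and compares squared lengths to avoid rounding in the square root; the function names are illustrative.

```python
def is_non_narrow(a, b, strict=False):
    """Check whether an a-by-b rectangle is (strictly) non-narrow.

    Non-narrow means longest edge <= sqrt(2) * shortest edge, which is
    tested in squared form: longest**2 <= 2 * shortest**2.
    """
    longest, shortest = max(a, b), min(a, b)
    if strict:
        return longest * longest < 2.0 * shortest * shortest
    return longest * longest <= 2.0 * shortest * shortest


def mesh_is_non_narrow(edge_lengths, strict=False):
    """Apply the check to every rectangle, given as (a, b) pairs."""
    return all(is_non_narrow(a, b, strict) for a, b in edge_lengths)
```

A mesh of squares passes the strict test, while a single rectangle with edge ratio 1.5 makes the whole mesh fail.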

3 The Discrete Maximum Principle


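Let us define the values

$$g_{\min}^n = \min\{0, g_1^n, \ldots, g_{N_\partial}^n\}, \quad g_{\max}^n = \max\{0, g_1^n, \ldots, g_{N_\partial}^n\},$$
$$v_{\min}^n = \min\{0, g_{\min}^n, u_1^n, \ldots, u_N^n\}, \quad v_{\max}^n = \max\{0, g_{\max}^n, u_1^n, \ldots, u_N^n\}$$

for $n = 0, \ldots, n_T$, and

$$f_{\min}^{(n,n+1)} = \min\{0, \min_{x \in \overline{\Omega},\ \tau \in (n\Delta t, (n+1)\Delta t)} f(\tau, x)\},$$
$$f_{\max}^{(n,n+1)} = \max\{0, \max_{x \in \overline{\Omega},\ \tau \in (n\Delta t, (n+1)\Delta t)} f(\tau, x)\},$$

for $n = 0, \ldots, n_T - 1$.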
The discrete analogue, the so-called discrete maximum principle (DMP),
for the continuous maximum principle (3) can be written in the form (cf. [5,
p. 100])

$$\min\{0, g_{\min}^{n+1}, v_{\min}^n\} + \Delta t\, f_{\min}^{(n,n+1)} \le u_i^{n+1} \le \max\{0, g_{\max}^{n+1}, v_{\max}^n\} + \Delta t\, f_{\max}^{(n,n+1)}, \qquad (15)$$

$i = 1, \ldots, N$; $n = 0, \ldots, n_T - 1$.
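The two-sided bound (15) can be tested a posteriori for a computed time step. The sketch below assumes that the nodal values are available as plain lists and that the extreme values of the source term on the time slab are supplied by the caller; all names are illustrative.

```python
def dmp_bounds(v_prev, g_next, dt, f_min, f_max):
    """Lower and upper bound of (15) for the interior values at level n + 1.

    v_prev : all nodal values at level n (interior and boundary)
    g_next : boundary nodal values at level n + 1
    f_min, f_max : extreme values of the source term on the time slab;
                   they are clipped at zero as in the definitions above
    """
    lower = min([0.0] + list(g_next) + list(v_prev)) + dt * min(0.0, f_min)
    upper = max([0.0] + list(g_next) + list(v_prev)) + dt * max(0.0, f_max)
    return lower, upper


def satisfies_dmp(u_next, v_prev, g_next, dt, f_min, f_max):
    """True if every interior value at level n + 1 respects (15)."""
    lower, upper = dmp_bounds(v_prev, g_next, dt, f_min, f_max)
    return all(lower <= value <= upper for value in u_next)
```

Such a check does not replace the sufficient conditions derived in this section; it only reports whether a particular computed step violates the bound.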
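Let us introduce the denotations

$$\mathbf{f}_{\max}^{(n,n+1)} = f_{\max}^{(n,n+1)} \mathbf{e} \in \mathbb{R}^{\bar{N}}, \quad \mathbf{f}_0^{(n,n+1)} = f_{\max}^{(n,n+1)} \mathbf{e}_0 \in \mathbb{R}^N, \quad \mathbf{f}_\partial^{(n,n+1)} = f_{\max}^{(n,n+1)} \mathbf{e}_\partial \in \mathbb{R}^{N_\partial},$$
$$\mathbf{v}_{\max}^n = v_{\max}^n \mathbf{e} \in \mathbb{R}^{\bar{N}}, \quad \mathbf{v}_0^n = v_{\max}^n \mathbf{e}_0 \in \mathbb{R}^N, \quad \mathbf{v}_\partial^n = v_{\max}^n \mathbf{e}_\partial \in \mathbb{R}^{N_\partial},$$

where $\mathbf{e}$, $\mathbf{e}_0$ and $\mathbf{e}_\partial$ denote the vectors with all entries equal to one, of the indicated sizes.
For simplicity, we denote zero matrices and zero vectors by the symbol $0$,
whose size is always chosen according to the context. The ordering relation is
meant elementwise.
Before proving the sufficient condition of the DMP, in the next auxiliary
lemmas, some important properties of the matrices $M$, $K$, $A$ and $B$ are summarized.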
Lemma 1. Let the rectangular mesh $\mathcal{R}_h$ for $\Omega$ be of non-narrow type ($\mu \le 0$).
Then $K_{ij} \le 0$ ($i \ne j$, $i = 1, \ldots, N$, $j = 1, \ldots, \bar{N}$).
PROOF. We denote $\mathrm{supp}\,\varphi_i \cap \mathrm{supp}\,\varphi_j$ by $S_{ij}$ and calculate $K_{ij}$ ($i \ne j$):

$$K_{ij} = \alpha \int_\Omega \mathrm{grad}\,\varphi_j \cdot \mathrm{grad}\,\varphi_i \,dx = \alpha \sum_{R \subseteq S_{ij}} \int_R \mathrm{grad}\,\varphi_j \cdot \mathrm{grad}\,\varphi_i \,dx = \sum_{R \subseteq S_{ij}} K_{ij}|_R.$$

Because $K_{ij}|_R$ ($i \ne j$) is nonpositive for any non-narrow rectangle, we observe
that the off-diagonal entries of the stiffness matrix are nonpositive. □

Lemma 2. Let the rectangular mesh $\mathcal{R}_h$ for $\Omega$ be of non-narrow type ($\mu \le 0$)
and condition

$$B_{ii} = M_{ii} - (1 - \theta)\Delta t K_{ii} \ge 0, \quad i = 1, \ldots, N, \qquad (16)$$

be satisfied. Then $B \ge 0$.
PROOF. The matrix $M$ is nonnegative, because the basis functions are
nonnegative. Moreover, the previous lemma guarantees the nonpositivity of
$K_{ij}$ ($i \ne j$), which implies that the off-diagonal entries of $B$ are nonnegative.
The nonnegativity of the diagonal entries of $B$ follows from the condition (16). □
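Lemma 3. The relations

$$\text{(P1)} \ K\mathbf{e} = 0, \qquad \text{(P2)} \ \mathbf{f}^{(n,\theta)} \le A\, \mathbf{f}_{\max}^{(n,n+1)}$$

are valid.
PROOF. (P1) For the $i$-th coordinate of the vector $K\mathbf{e}$, we have

$$(K\mathbf{e})_i = \sum_{j=1}^{\bar{N}} K_{ij} = \alpha \int_\Omega \mathrm{grad}\Big(\sum_{j=1}^{\bar{N}} \varphi_j\Big) \cdot \mathrm{grad}\,\varphi_i \,dx = \alpha \int_\Omega \mathrm{grad}\, 1 \cdot \mathrm{grad}\,\varphi_i \,dx = 0,$$

which proves the statement.
(P2) For the $i$-th element of $\mathbf{f}^{(n,\theta)}$, we observe that

$$(\mathbf{f}^{(n,\theta)})_i = \int_\Omega \big((1 - \theta) f(n\Delta t, x) + \theta f((n+1)\Delta t, x)\big) \varphi_i(x) \,dx \le$$
$$\le \int_\Omega f_{\max}^{(n,n+1)} \varphi_i(x) \,dx = f_{\max}^{(n,n+1)} \int_\Omega \Big(\sum_{j=1}^{\bar{N}} \varphi_j(x)\Big) \varphi_i(x) \,dx =$$
$$= f_{\max}^{(n,n+1)} \sum_{j=1}^{\bar{N}} M_{ij} = (M \mathbf{f}_{\max}^{(n,n+1)})_i = ((M + \theta\Delta t K)\mathbf{f}_{\max}^{(n,n+1)})_i = (A \mathbf{f}_{\max}^{(n,n+1)})_i.$$

In the above, we used the facts that the basis functions are nonnegative, their
sum equals the constant one function, and property (P1). □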
Lemma 4. If

$$A_{ij} = M_{ij} + \theta\Delta t K_{ij} \le 0, \quad i \ne j, \ i = 1, \ldots, N, \ j = 1, \ldots, \bar{N}, \qquad (17)$$

then $A_0^{-1} \ge 0$. Furthermore, the relations

$$-A_0^{-1} A_\partial \ge 0, \quad -A_0^{-1} A_\partial \mathbf{e}_\partial \le \mathbf{e}_0 \qquad (18)$$

are valid.
PROOF. Combining the positive definite property of $A_0$ with the assumption
of the lemma, we obtain that $A_0$ is a non-singular M-matrix. This yields
that the inverse of $A_0$ is nonnegative. Because of relation (17), the matrix $A_\partial$
is nonpositive, which implies $-A_0^{-1} A_\partial \ge 0$. The matrix $M$ is nonnegative,
because the basis functions are nonnegative. Thus,

$$0 \le M\mathbf{e} = (M + \theta\Delta t K)\mathbf{e} = A\mathbf{e} = A_0 \mathbf{e}_0 + A_\partial \mathbf{e}_\partial,$$

and the last statement of the lemma can be obtained by multiplying both sides
by $A_0^{-1}$. □
Now, we prove the main result of our paper, which presents a sufficient
condition for the validity of DMP (cf. Theorem 1 in [5]).

Theorem 1. Let the rectangular mesh $\mathcal{R}_h$ of $\Omega$ be of non-narrow type and
let the time increment $\Delta t$ satisfy (16) and (17). Then the discrete maximum
principle (relation (15)) is valid.

PROOF. Using (13), property (P2), the relation $B\mathbf{v}_{\max}^n = A\mathbf{v}_{\max}^n$ (it follows
from (P1)) and Lemma 2, we have

$$A\mathbf{v}^{n+1} = B\mathbf{v}^n + \Delta t\, \mathbf{f}^{(n,\theta)} \le B\mathbf{v}_{\max}^n + \Delta t A \mathbf{f}_{\max}^{(n,n+1)} = A\mathbf{v}_{\max}^n + \Delta t A \mathbf{f}_{\max}^{(n,n+1)}. \qquad (19)$$

From (19), using the partition (12), multiplying both sides by $A_0^{-1}$ ($\ge 0$, see
Lemma 4), and regrouping the inequality, we get

$$\mathbf{u}^{n+1} - \mathbf{v}_0^n - \Delta t\, \mathbf{f}_0^{(n,n+1)} \le -A_0^{-1} A_\partial \big(\mathbf{g}^{n+1} - \mathbf{v}_\partial^n - \Delta t\, \mathbf{f}_\partial^{(n,n+1)}\big).$$

Obviously,

$$\big(\mathbf{g}^{n+1} - \mathbf{v}_\partial^n - \Delta t\, \mathbf{f}_\partial^{(n,n+1)}\big)_i = g_i^{n+1} - v_{\max}^n - \Delta t f_{\max}^{(n,n+1)} \le \max\{0, \max_j \{g_j^{n+1} - v_{\max}^n\}\}. \qquad (20)$$

Therefore, using (19)-(20) and Lemma 4, we get

$$\mathbf{u}^{n+1} - \mathbf{v}_0^n - \Delta t\, \mathbf{f}_0^{(n,n+1)} \le \max\{0, \max_j \{g_j^{n+1} - v_{\max}^n\}\}\, \mathbf{e}_0. \qquad (21)$$

Writing (21) for the $i$-th component, and expressing $u_i^{n+1}$, we obtain the
right-hand side inequality in (15). The left-hand side inequality in (15) can be
proved in a similar manner. □
The previous theorem does not say anything about the choice of the rectangular
mesh and the choice of the parameters $\theta$ and $\Delta t$ in order to guarantee
the DMP. The validity of the DMP can be checked only after the direct calculation
of the elements of the matrices $A$ and $B$ by testing the two inequalities
in the above theorem. The next theorem can guarantee the DMP a priori.

Theorem 2. The finite element solution of (1)-(2), using bilinear basis functions
on a strictly non-narrow rectangular mesh $\mathcal{R}_h$ of a rectangular domain
$\overline{\Omega}$, satisfies the discrete maximum principle (15) if the conditions

(22)

and
2
Llt< 'Y (23)
- 3 (1- e)a'
are fulfilled, where

PROOF. It is easy to show that under conditions (22)-(23), the sufficient
conditions of the DMP in Theorem 1 are satisfied. The two inequalities in the
theorem can be proven using condition (23) and (22), respectively. □

4 Final comments

In this paper a priori sufficient conditions for the validity of the discrete max-
imum principle have been given for the Galerkin finite element methods based
on bilinear finite elements in space. We close the paper with some remarks
regarding our results.

- As usually happens in the qualitative analysis of finite element approximations,
there are both upper and lower bounds for the time-step, which
means that $\Delta t$ can be chosen neither too small nor too large.
- A square mesh with the mesh-size $h$ is, obviously, strictly non-narrow. On
such meshes, the sufficient conditions for the DMP are

$$\Delta t \ge \frac{h^2}{3\theta\alpha} \qquad (24)$$

and

$$\Delta t \le \frac{h^2}{6(1 - \theta)\alpha}. \qquad (25)$$

This shows that the time step can be chosen only for the values $\theta \ge 2/3$,
i.e., the Crank-Nicolson scheme is not included.
- The results of Theorem 1 are similar to Theorem 1 in [5]. The only difference
is the application of the condition of the strict non-narrowness instead of
the acute type condition.
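The admissible range of time steps on a square mesh, given by (24) and (25), can be computed as follows. This is a small sketch with illustrative names; it returns no interval when the lower bound exceeds the upper bound, and it treats the fully implicit case, where (25) imposes no restriction, as an unbounded upper limit.

```python
import math


def dmp_time_step_window(h, theta, alpha):
    """Interval of time steps satisfying (24) and (25) on a square mesh.

    Returns (lower, upper), or None if no time step satisfies both bounds.
    """
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1] for the bound (24)")
    lower = h * h / (3.0 * theta * alpha)
    if theta == 1.0:
        upper = math.inf  # (25) gives no restriction for the implicit scheme
    else:
        upper = h * h / (6.0 * (1.0 - theta) * alpha)
    return (lower, upper) if lower <= upper else None
```

For the Crank-Nicolson value `theta = 0.5` the function returns `None`, in line with the remark that this scheme is not covered, while `theta = 0.75` yields a nonempty interval.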

References

1. O. AXELSSON, V. A. BARKER, Finite Element Solution of Boundary Value Problems,
Theory and Computation. Academic Press, Inc. 1984.
2. I. CHRISTIE, C. HALL, The Maximum Principle for Bilinear Elements, Internat.
J. Numer. Methods Engrg. 20 (1984), pp. 549-553.
3. P. G. CIARLET, Discrete Maximum Principle for Finite Difference Operators,
Aequationes Math. 4 (1970), pp. 338-352.
4. P. G. CIARLET, P. A. RAVIART, Maximum Principle and Uniform Convergence
for the Finite Element Method, Comput. Methods Appl. Mech. Engrg. 2 (1973),
pp. 17-31.
5. H. FUJII, Some Remarks on Finite Element Analysis of Time-Dependent Field
Problems, Theory and Practice in Finite Element Structural Analysis, Univ.
Tokyo Press, Tokyo (1973), pp. 91-106.
6. S. KOROTOV, M. KŘÍŽEK, P. NEITTAANMÄKI, Weakened Acute Type Condition
for Tetrahedral Triangulations and the Discrete Maximum Principle, Math.
Comp. 70 (2001), pp. 107-119.
7. O. A. LADYZENSKAJA, V. A. SOLONNIKOV, N. N. URAL'CEVA, Linear and
Quasilinear Equations of Parabolic Type, Translations of Mathematical Monographs,
Vol. 23, American Mathematical Society, Providence, R.I., 1968.
8. V. RUAS SANTOS, On the Strong Maximum Principle for Some Piecewise Linear
Finite Element Approximate Problems of Non-Positive Type, J. Fac. Sci. Univ.
Tokyo Sect. IA Math. 29 (1982), pp. 473-491.
9. J. SMOLLER, Shock Waves and Reaction-Diffusion Equations, Springer Verlag,
1981.
Cubature-Differences Method for Singular
Integro-differential Equations

Alexander I. Fedotov

Chebotarev Institute of Mathematics & Mechanics, Kazan, Russia, fedotov@mi.ru

Summary. In the papers [1] - [4] the quadrature-differences methods for various classes of 1-dimensional periodic singular integro-differential equations with Hilbert kernels were justified. The convergence of the methods was proved and error estimates were obtained. Here we propose and justify the cubature-differences method for 2-dimensional¹ linear periodic singular integro-differential equations. Such equations appear in the theory of elasticity (see [5]) and in some problems of diffraction of electromagnetic waves (see e.g. [6]). The convergence of the method is proved and an error estimate is obtained.

1 Statement of the problem

Let us define the sets $\mathbf{N} = \mathbb{N}^2$, $\mathbf{Z} = \mathbb{Z}^2$, $\mathbf{R} = \mathbb{R}^2$, $\Delta = [-\pi, \pi]^2$. For the elements of these sets (2-component vectors), besides the usual operations, we define the following operations

and the partial order

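For fixed $s \in \mathbb{R}$ let us denote by $H^s$ the Sobolev space of 2-dimensional $2\pi$-periodic complex-valued functions with the norm

$$\|u\|_s = \|u\|_{H^s} = \Big(\sum_{k \in \mathbf{Z}} (1 + k^2)^s\, |\hat{u}(k)|^2\Big)^{1/2},$$

where
$$\hat{u}(k) = (2\pi)^{-2} \int_{\Delta} u(\tau)\, \overline{e_k(\tau)}\, d\tau$$

are the Fourier coefficients of the function $u(\tau)$ with respect to the system of trigonometric monomials
$$e_k(\tau) = \exp(i k \cdot \tau), \quad k \in \mathbf{Z},\ \tau \in \Delta.$$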
¹ The 2-dimensional case is considered only for the sake of simplicity. All results can easily be generalised to the case of $m$ ($m \ge 3$) dimensions.

In what follows we assume that $s > 1$, which provides (see e.g. [8]) the embedding of $H^s$ into the space of continuous functions.
Consider the linear singular integro-differential equation

$$ABu + Tu = f, \tag{1}$$
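where $A$ is a 2-dimensional singular integral operator

$$Au \equiv a_{00}(t)u(t) + a_{01}(t)(J_{01}u)(t) + a_{10}(t)(J_{10}u)(t) + a_{11}(t)(J_{11}u)(t),$$

with the singular integrals

$$(J_{01}u)(t) = (2\pi)^{-1} \int_{-\pi}^{\pi} u(t_1, \tau_2) \cot\frac{\tau_2 - t_2}{2}\, d\tau_2,$$
$$(J_{10}u)(t) = (2\pi)^{-1} \int_{-\pi}^{\pi} u(\tau_1, t_2) \cot\frac{\tau_1 - t_1}{2}\, d\tau_1,$$
$$(J_{11}u)(t) = (2\pi)^{-2} \int_{-\pi}^{\pi}\!\int_{-\pi}^{\pi} u(\tau_1, \tau_2) \cot\frac{\tau_1 - t_1}{2} \cot\frac{\tau_2 - t_2}{2}\, d\tau_2\, d\tau_1,$$

which are to be interpreted in the sense of the Cauchy-Lebesgue principal value, $B$ is an elliptic differential operator

$$Bu \equiv (Bu)(t) = \sum_{|\alpha| = |\beta| = m} b_{\alpha\beta}(t)\,(D^{\alpha+\beta}u)(t), \quad B : H^{s+2m} \to H^s, \quad m \in \mathbb{N},$$
with derivatives
$$D^{\alpha}u = \frac{\partial^{|\alpha|}u}{\partial t_1^{\alpha_1}\, \partial t_2^{\alpha_2}},$$
and $T : H^{s+2m} \to H^s$ is a known linear operator. The coefficients $a_{kl}(t)$, $k, l = 0, 1$, $b_{\alpha\beta}(t)$, $|\alpha| = |\beta| = m$, and the right-hand side $f(t)$ of equation (1) are assumed to belong to $C^\infty$.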

2 Calculation scheme

Let us fix $n = (n_1, n_2) \in \mathbf{N}$, denote by

$$I_n = I_{n_1} \times I_{n_2}, \quad I_{n_j} = \{k_j \mid k_j \in \mathbb{Z},\ |k_j| \le n_j\}, \quad j = 1, 2,$$
the index set and define the grid

on $\Delta$. The approximate solution of equation (1) is sought as a periodic grid function (vector of values) $u_n = u_n(t)$ defined on $\Delta_n$.

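The differential operators $D^{\alpha+\beta}$ of equation (1) are approximated by the operators
(2)
where
$$\bar\partial^{\alpha} u_n = \bar\partial_1^{\alpha_1} \bar\partial_2^{\alpha_2} u_n,$$
$$\bar\partial_j u_n = h_j^{-1}\big(u_n(t) - u_n(t - h_j \delta_j)\big),$$
$$\delta_j = (\delta_{j1}, \delta_{j2}), \quad j = 1, 2,$$
and $\delta_{jk}$ is the Kronecker symbol.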
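The backward difference quotient used above can be illustrated in one variable. The following is a minimal sketch, not the author's implementation: it assumes a uniform periodic grid with $2n+1$ nodes and step $h = 2\pi/(2n+1)$ (the grid definition is not legible in the text), and the function name is hypothetical.

```python
import math


def backward_difference(values, h):
    """Periodic backward difference (u[i] - u[i-1]) / h; index -1 wraps around."""
    return [(values[i] - values[i - 1]) / h for i in range(len(values))]


n = 20
count = 2 * n + 1
h = 2.0 * math.pi / count                     # assumed uniform step
nodes = [h * k for k in range(-n, n + 1)]     # assumed nodes t_k = k * h
u = [math.sin(t) for t in nodes]

du = backward_difference(u, h)
# Compare with the exact derivative cos(t): the error is of first order in h.
max_err = max(abs(d - math.cos(t)) for d, t in zip(du, nodes))
print(max_err, h)
```

Because the grid covers exactly one period, Python's negative indexing supplies the periodic wrap-around at the first node.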
Singular integrals are approximated by cubatures and quadratures. To do this we integrate the Lagrange interpolation polynomials

$$(P_n u_n)(\tau) = \sum_{k \in I_n} u_n(t_k)\, \xi_n(\tau, t_k),$$

$$\xi_n(\tau, t_k) = \prod_{j=1,2} \frac{\sin\big((2n_j + 1)(\tau_j - t_{k_j})/2\big)}{(2n_j + 1) \sin\big((\tau_j - t_{k_j})/2\big)},$$
$$\tau = (\tau_1, \tau_2) \in \Delta, \quad t_k = (t_{k_1}, t_{k_2}) \in \Delta_n.$$
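Then the integrals take the form

$$(J_{01} P_n u_n)(t_k) = (2n_2 + 1)^{-1} \sum_{l_2 \in I_{n_2}} \gamma^{(n_2)}_{k_2 - l_2}\, u_n(t_{k_1}, t_{l_2}), \tag{3}$$

$$(J_{10} P_n u_n)(t_k) = (2n_1 + 1)^{-1} \sum_{l_1 \in I_{n_1}} \gamma^{(n_1)}_{k_1 - l_1}\, u_n(t_{l_1}, t_{k_2}),$$

$$(J_{11} P_n u_n)(t_k) = [2n + \mathbf{1}]^{-1} \sum_{l \in I_n} \gamma^{(n_1)}_{k_1 - l_1} \gamma^{(n_2)}_{k_2 - l_2}\, u_n(t_l), \quad t_k \in \Delta_n, \quad \mathbf{1} = (1, 1),$$

and the coefficients $\gamma_r^{(q)}$ are

$$\gamma_r^{(q)} = \begin{cases} \tan\dfrac{r\pi}{2(2q+1)}, & r \text{ even}, \\[2mm] -\cot\dfrac{r\pi}{2(2q+1)}, & r \text{ odd}. \end{cases}$$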
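The quadrature coefficients can be checked numerically in one variable. The sketch below rests on assumptions that are not legible in the text: the nodes are taken as $t_l = 2\pi l/(2n+1)$, $l = -n, \dots, n$, and the coefficient index is read as $r = k - l$. With the kernel $\cot((\tau - t)/2)$, the Hilbert-type integral maps $e^{im\tau}$ to $i\,\mathrm{sign}(m)\,e^{imt}$, and the quadrature should reproduce this exactly for $|m| \le n$. Function names are hypothetical.

```python
import cmath
import math


def gamma(r, q):
    """Coefficient gamma_r^(q): tan(x) for even r, -cot(x) for odd r, x = r*pi/(2(2q+1))."""
    x = r * math.pi / (2 * (2 * q + 1))
    return math.tan(x) if r % 2 == 0 else -1.0 / math.tan(x)


def hilbert_quadrature(values, n):
    """Apply the quadrature to grid values u(t_l), l = -n..n; returns values at t_k."""
    count = 2 * n + 1
    result = []
    for k in range(-n, n + 1):
        total = sum(gamma(k - l, n) * values[l + n] for l in range(-n, n + 1))
        result.append(total / count)
    return result


n, m = 4, 3                                   # grid parameter and test frequency, |m| <= n
nodes = [2.0 * math.pi * l / (2 * n + 1) for l in range(-n, n + 1)]
u = [cmath.exp(1j * m * t) for t in nodes]

ju = hilbert_quadrature(u, n)
# For m > 0 the exact result is i * exp(i m t) at every node.
max_err = max(abs(a - 1j * b) for a, b in zip(ju, u))
print(max_err)
```

The product rule for $J_{11}$ in (3) then amounts to applying the same one-dimensional sum in each variable.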

The operator $T$ is approximated by any convergent operator $T_n$.

Substituting the numerical differentiation formulas (2), the cubature and quadrature sums (3), and the values of the coefficients $a_{kl}(t)$, $k, l = 0, 1$, $b_{\alpha\beta}(t)$, $|\alpha| = |\beta| = m$, of the operator $(T_n u_n)(t)$ and of the right-hand side $f(t)$ at the nodes of the grid $\Delta_n$ into equation (1), we obtain a system of linear algebraic equations

$$a_{00}(t_k) \sum_{|\alpha| = |\beta| = m} b_{\alpha\beta}(t_k)\,(D_n^{\alpha+\beta} u_n)(t_k) + \dots \tag{4}$$

of the cubature-differences method.

3 Preliminaries

Let us denote by $H^s_n$ the set of grid functions (vectors of values) on $\Delta_n$ with the norm

$$\|u_n\|_{s,n} = \|u_n\|_{H^s_n} = \Big(\sum_{k \in I_n} (1 + k^2)^s\, |\hat{u}_n^{(n)}(k)|^2\Big)^{1/2},$$
where
$$\hat{u}_n^{(n)}(k) = [2n + \mathbf{1}]^{-1} \sum_{l \in I_n} u_n(t_l)\, \overline{e_k(t_l)}$$
are the Fourier-Lagrange coefficients of the function $u_n(t)$ on the grid $\Delta_n$.
The sets $H^s$ and $H^s_n$ are mapped onto each other by the operators

$$p_n u = (u(t_k))_{k \in I_n}, \quad p_n : H^s \to H^s_n,$$

$$(P_n u_n)(\tau) = \sum_{k \in I_n} u_n(t_k)\, \xi_n(\tau, t_k), \quad P_n : H^s_n \to H^s.$$

By $E_n(u)_s$ we denote the best approximation of the function $u \in H^s$ by trigonometric polynomials of order not higher than $n$.
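The grid norm defined through the Fourier-Lagrange coefficients can be computed directly for small grids. The following is a minimal sketch under assumptions that are not legible in the text: the nodes are taken as $t_l = (2\pi l_1/(2n_1+1),\ 2\pi l_2/(2n_2+1))$, $k^2$ is read as $k_1^2 + k_2^2$, and $[2n+\mathbf{1}]$ as the product $(2n_1+1)(2n_2+1)$. The function name is hypothetical.

```python
import cmath
import math


def grid_norm(u, n1, n2, s):
    """Discrete H^s norm of grid values u[(l1, l2)] via Fourier-Lagrange coefficients."""
    c1, c2 = 2 * n1 + 1, 2 * n2 + 1
    total = 0.0
    for k1 in range(-n1, n1 + 1):
        for k2 in range(-n2, n2 + 1):
            coeff = 0.0
            for l1 in range(-n1, n1 + 1):
                for l2 in range(-n2, n2 + 1):
                    phase = 2 * math.pi * (k1 * l1 / c1 + k2 * l2 / c2)
                    coeff += u[(l1, l2)] * cmath.exp(-1j * phase)  # conjugate of e_k(t_l)
            coeff /= c1 * c2
            total += (1 + k1 * k1 + k2 * k2) ** s * abs(coeff) ** 2
    return math.sqrt(total)


# Grid values of the monomial e_m with m = (1, -2): only one coefficient is non-zero.
n1 = n2 = 2
m1, m2 = 1, -2
u = {
    (l1, l2): cmath.exp(2j * math.pi * (m1 * l1 / 5 + m2 * l2 / 5))
    for l1 in range(-n1, n1 + 1)
    for l2 in range(-n2, n2 + 1)
}
norm = grid_norm(u, n1, n2, 1.5)
print(norm, 6 ** 0.75)
```

For a single monomial $e_m$ with $|m_j| \le n_j$ the norm reduces to $(1 + m^2)^{s/2}$, which gives a quick consistency check.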

Lemma 1. For any $u \in H^s$, $s \in \mathbb{R}$, $s > 1$, and $n \in \mathbf{N}$ the following estimates are valid:

$$\|p_n\|_{H^s \to H^s_n} \le 2M(n, s)\sqrt{\zeta(2s - 1)},$$

$$\|P_n p_n u - u\|_s \le \big(1 + 2M(n, s)\sqrt{\zeta(2s - 1)}\big) E_n(u)_s,$$

where M(n,s) = (m~))S, $n \in \mathbf{N}$, and $\zeta(t)$ is Riemann's $\zeta$-function.

To prove the convergence of the method we need the function $M(n, s)$ to be bounded. For some $c, s \in \mathbb{R}$ let us define the set

$$N(c, s) = \{n \mid n \in \mathbf{N},\ M(n, s) \le c\}.$$

Obviously, $N(c, s) = \emptyset$ for $c < 2^{s/2}$ and $N(c, s) = \{n \mid n = (j, j),\ j \in \mathbb{N}\}$ for $c = 2^{s/2}$. In what follows we assume that all indices $n$, $n_0$, $n_1$ mentioned below belong to $N(c, s)$ for some $c \ge 2^{s/2}$.

Lemma 2. For any $s \le p$, $u \in H^p$

4 Justification

Theorem. Let for some $c, s \in \mathbb{R}$, $s > 1$, $c \ge 2^{s/2}$, equation (1) and the calculation scheme (2)-(4) of the method satisfy the following conditions:
1) for any $n$ the operator $A$ maps the set of all trigonometric polynomials of order not higher than $n$ to itself,
2) $B$ is an elliptic operator,
3) the operator $T : H^{s+2m} \to H^{s+\varepsilon}$ is bounded for some $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$,
4) the sequence of the operators $T_n$ approximates the operator $T$ with respect to $p_n$, i.e. for any function $u \in H^s$:

$$\|T_n p_n u - p_n T u\|_{s,n} = \eta_n \|u\|_{s+2m} \quad \text{with } \eta_n \to 0 \text{ for } n \to \infty,$$

5) equation (1) has a unique solution $u^* \in H^{s+2m}$ for any right-hand side $f \in H^s$.
Then for all $n$, beginning from some $n_0$, the system of equations (4) is uniquely solvable and the approximate solutions $u_n^*$ converge to the exact solution $u^*$ of equation (1):
$$\|u_n^* - p_n u^*\|_{s+2m,n} \to 0, \quad n \to \infty.$$
If, in addition, $u^* \in H^{s+2m+2}$, then the error estimate

$$\|u_n^* - p_n u^*\|_{s+2m,n} \le C(h^2 + \eta_n)$$

is valid.
Proof. Let us take an arbitrary constant $r \in \mathbb{R}$ which is not an eigenvalue of the problem $Bu + ru = 0$, $u \in H^{s+2m}$, and substitute into equation (1)

$$v = Bu + ru. \tag{5}$$
The existence of such a constant follows from the properties of the spectrum of elliptic operators (see e.g. [7]). Then

$$u = Gv, \quad Bu = v - rGv, \tag{6}$$



where $G$ is the inverse of the operator $Bu + ru$, and equation (1) takes the form

$$Kv \equiv Av - rAGv + TGv = f, \quad K : H^s \to H^s, \tag{7}$$

which is still equivalent to the original one. The equivalence here means that solvability of one of the equations yields solvability of the other, and their solutions are related by the relationships (5), (6). Now let us rewrite the system of equations (4) as an operator equation

(8)

$$A_n = p_n A P_n, \quad f_n = p_n f,$$
$$(B_n u_n)(t_k) = \sum_{|\alpha| = |\beta| = m} b_{\alpha\beta}(t_k)\,(D_n^{\alpha+\beta} u_n)(t_k),$$
and make the substitution

(9)
As is shown in [10], equation (9) is uniquely solvable for all $n$, beginning from some $n_1$, and for $v_n = p_n v$ the solutions $u_n = G_n v_n = G_n p_n v$ converge to the solution $u = Gv$ of equation (5). Here $G_n$ is the inverse of the operator $B_n u_n + r u_n$ and
(10)
By substitution (9) we get the equation

(11)
which is equivalent to equation (8). As before, the equivalence here means that solvability of one of the equations yields solvability of the other, and their solutions are related by the relationships (9), (10).
The invertibility of the operators $K_n : H^s_n \to H^s_n$ is proved following [9]. To do this we have to establish the following:
a) $\|P_n f_n - f\|_s \to 0$ for $n \to \infty$;
b) the sequence of operators $(K_n)$ approximates the operator $K$ compactly;
c) $K$ is invertible.
The validity of a) follows immediately from Lemma 1.²

To check b) we have to show first that the sequence $(K_n)$ approximates the operator $K$ with respect to $P_n$, and then that for any bounded sequence $(v_n)$, $v_n \in H^s_n$, the sequence $(P_n K_n v_n - K P_n v_n)$ is compact in $H^s$.
For arbitrary $v_n \in H^s_n$ we write

(12)
² Here and further $C$ denotes generic real positive constants, independent of $n$.

$$+\ |r|\, \|P_n A_n G_n v_n - A G P_n v_n\|_s + \|P_n T_n G_n v_n - T G P_n v_n\|_s$$

and estimate each summand of the right-hand side. From the definition of the operator $A_n$ and condition 1) of the Theorem it follows that the first summand is equal to zero. For the second summand, using once more the definition of the operator $A_n$, condition 1) of the Theorem and the boundedness of the operators $A$ and $P_n$, we have

$$|r|\, \|P_n A_n G_n v_n - A G P_n v_n\|_s \le C \|P_n p_n A P_n G_n v_n - A G P_n v_n\|_s \le C\big(\|G_n v_n - p_n G P_n v_n\|_{s+2m,n} + E_n(G P_n v_n)_{s+2m}\big).$$

For the third summand, using Lemma 1 and the boundedness of the operators $T_n$, we obtain

$$\|P_n T_n G_n v_n - T G P_n v_n\|_s \le C\big(\|G_n v_n - p_n G P_n v_n\|_{s+2m,n} + \|T_n p_n G P_n v_n - p_n T G P_n v_n\|_{s,n} + E_n(T G P_n v_n)_s\big).$$

Finally, estimate (12) takes the form

$$\|P_n K_n v_n - K P_n v_n\|_s \le C\big(\|G_n v_n - p_n G P_n v_n\|_{s+2m,n} + \|T_n p_n G P_n v_n - p_n T G P_n v_n\|_{s,n} + E_n(G P_n v_n)_{s+2m} + E_n(T G P_n v_n)_s\big),$$

which, taking into account condition 4) of the Theorem, the convergence of the operators $(G_n)$ and the convergence to zero of the best approximations of the functions $G P_n v_n$ and $T G P_n v_n$, means that

and thus the approximation of the operator $K$ by the sequence of operators $(K_n)$ with respect to $P_n$.
Let us assume now that the sequence $(v_n)$, $v_n \in H^s_n$, is bounded, $\|v_n\|_{s,n} \le 1$, and prove that the sequence $(P_n K_n v_n - K P_n v_n)$ is compact in $H^s$. We write

and prove the compactness of each summand of the right-hand side. The operators $G : H^s \to H^{s+2m}$, $T : H^{s+2m} \to H^{s+\varepsilon}$, $A : H^{s+2m} \to H^{s+2m}$ are bounded, so the sequences $(rAGP_n v_n)$ and $(TGP_n v_n)$ are bounded in $H^{s+\gamma}$, $\gamma = \min(2m, \varepsilon)$, and thus compact in $H^s$. The operators $G_n : H^s_n \to H^{s+2m}_n$ and $T_n G_n : H^s_n \to H^{s+\varepsilon}_n$ are also bounded, so the polynomials $P_n G_n v_n$ and $P_n T_n G_n v_n$ are bounded in $H^{s+\gamma}$ and thus, due to the Riesz theorem, the sequences $(rAP_n G_n v_n)$ and $(P_n T_n G_n v_n)$ are also compact in $H^s$, which gives the compactness of the sequence $(P_n K_n v_n - K P_n v_n)$.
The validity of c) follows from condition 5) of the Theorem and the equivalence of equations (1) and (7).
Therefore, according to Theorem 6.1 of [9], for all $n$, beginning from some $n_0$, $n_0 \ge n_1$, equations (11), (8), and thus the system of equations (4), are uniquely solvable and the approximate solutions $(u_n^*)$ of the system of equations (4) converge to the exact solution $u^*$ of equation (1) with the rate

$$\|u_n^* - p_n u^*\|_{s+2m,n} \le C \|p_n(ABu^* + Tu^*) - (A_n B_n p_n u^* + T_n p_n u^*)\|_{s,n} \le C\big(E_n(Bu^*)_s + \|p_n Bu^* - B_n p_n u^*\|_{s,n} + \|p_n Tu^* - T_n p_n u^*\|_{s,n}\big).$$

If, moreover, $u^* \in H^{s+2m+2}$, then $Bu^* \in H^{s+2}$ and, as is shown in [9],
$$\|p_n Bu^* - B_n p_n u^*\|_{s,n} \le C h^2.$$

On the other hand, according to Lemma 2, and using the inequality $(1 + n^2)^{-q} \le C(h^2)^q$, $q \in \mathbb{R}$, $q > 0$, we have

$$E_n(Bu^*)_s \le (1 + n^2)^{-1} E_n(Bu^*)_{s+2} \le C h^2,$$

which, together with condition 4) of the Theorem, gives the desired estimate

$$\|u_n^* - p_n u^*\|_{s+2m,n} \le C(h^2 + \eta_n).$$

Hence, the Theorem is proved.

References
1. Fedotov, A. I., On convergence of quadrature-differences method for linear singular integrodifferential equations, Zh. Vychisl. Mat. i. Mat. Fiz. 29 (1989), N9, 1301-1307 (in Russian).
2. Fedotov, A. I., On convergence of quadrature-differences method for one class of singular integro-differential equations, Izv. Vyssh. Uchebn. Zaved. Mat. (1989), N8, 64-68 (in Russian).
3. Fedotov, A. I., On convergence of quadrature-differences method for linear singular integrodifferential equations with discontinuous coefficients, Zh. Vychisl. Mat. i. Mat. Fiz. 31 (1991), N2, 261-271 (in Russian).
4. Fedotov, A. I., On convergence of quadrature-differences method for nonlinear singular integrodifferential equations, Zh. Vychisl. Mat. i. Mat. Fiz. 31 (1991), N5, 781-787 (in Russian).
5. Parton, V. Z., Perlin, P. I., Mathematical methods of the theory of elasticity. Vol. 1, "MIR", Moscow 1984.
6. Kas'anov V. I., Suchov R. N., Galerkin's methods for a solution of a problem of diffraction of electromagnetic waves on a dielectric wedge, Progress in Electromagnetic Research Symposium, July 5-14, 2000, Cambridge, MA, USA.
7. John, F., Partial differential equations, 4th ed., Springer-Verlag, New York 1981.
8. Taylor, M. E., Pseudodifferential operators, Princeton University Press, Prince-
ton 1981.
9. Vainikko, G. M., The compact approximation of the operators and the approxi-
mate solution of the equations, Tartu University Press, Tartu 1970 (in Russian).
10. Vainikko, G. M., Tamme, E. E., Convergence of the differences method in a
problem of periodic solutions of elliptical type equations, coefficients, Zh. Vychisl.
Mat. i. Mat. Fiz. 16 (1976), N3, 261-271 (in Russian).
Nonconforming Discretization Techniques for
Overlapping Domain Decompositions

Bernd Flemisch, Michael Mair, and Barbara Wohlmuth

Inst. for Appl. Analysis and Num. Simulation, Pfaffenwaldring 57, 70569 Stuttgart, Germany, {flemisch, mair, wohlmuth}@ians.uni-stuttgart.de. Supported in part by DFG, SFB 404, C12.

Summary. For the numerical solution of coupled problems on two nested domains, two meshes are used which are completely independent of each other. Especially in the case of a moving subdomain, this leads to great flexibility for employing different meshsizes, discretizations or model equations on the two domains. We present a general setting for these problems in terms of saddle point formulations, and investigate one- and bi-directionally coupled applications.

1 Introduction

We consider coupled problems on two nested domains, the global domain $\Omega$ and the subdomain $\omega$, see Figure 1. In order to approximate the involved solution components on $\Omega$ and $\omega$, two meshes are used which are completely independent of each other. We would like to be able to deal with different meshsizes, discretizations and model equations on the two domains. Our approach is useful especially for a moving subdomain, i.e., when $\omega$ changes its position inside the global domain. In this case, no remeshing is necessary and only the matrices responsible for the coupling have to be reassembled. In Section 2, we start with the general variational setting in terms of a saddle point formulation. A one-directionally coupled model problem is investigated in Section 3. In Section 4, we consider bi-directionally coupled formulations on the examples of a linear elasticity problem and an eddy current simulation.

Fig. 1. Two nested domains (left), independent grids (right)

2 Variational Setting

Generalized saddle point problems. Our goal is to find a solution $u = (u_\Omega, u_\omega)$ consisting of two components defined on the global domain $\Omega$ and on the subdomain $\omega$, respectively. We denote by $V_\Omega$ and $V_\omega$ the appropriate weak function spaces for the solution components as well as for the test functions. Without taking into account any coupling between the two solution components, the involved differential operators are in general described by continuous bilinear forms $a_\Omega(\cdot,\cdot)$ acting on $V_\Omega \times V_\Omega$, and $a_\omega(\cdot,\cdot)$ acting on $V_\omega \times V_\omega$. Indicating by $V$ the product space $V_\Omega \times V_\omega$, a composed bilinear form $a(\cdot,\cdot) : V \times V \to \mathbb{R}$ is obtained by

$$a(u, v) = a_\Omega(u_\Omega, v_\Omega) + a_\omega(u_\omega, v_\omega).$$

The coupling between the two solution components is realized via the Lagrange multiplier space $M$ in terms of two continuous bilinear forms $b_1(\cdot,\cdot)$ and $b_2(\cdot,\cdot)$ acting on $V \times M$. For the applications in Section 3 and Subsection 4.2, $M$ is the dual of the trace space $H^{1/2}(\Gamma)$, i.e., $M = H^{-1/2}(\Gamma)$, whereas in Subsection 4.1 $M = H^{-1/2}(\Gamma)^2$, with $\Gamma = \partial\omega$ indicating the subdomain boundary.
Solving additionally for the Lagrange multiplier $p \in M$, the following generalized saddle point problem is derived: find $(u, p) \in V \times M$ such that

$$\begin{aligned} a(u, v) + b_1(v, p) &= \langle f, v\rangle_{V' \times V}, & v &\in V, \\ b_2(u, q) &= \langle g, q\rangle_{M' \times M}, & q &\in M, \end{aligned} \tag{1}$$

where $\langle\cdot,\cdot\rangle_{V' \times V}$ and $\langle\cdot,\cdot\rangle_{M' \times M}$ denote the usual duality pairings. We point out that for $b_1(\cdot,\cdot) = b_2(\cdot,\cdot)$, problem (1) has the usual symmetric structure, which is encountered for example in the framework of mixed [1] and mortar [2] finite element methods. Moreover, if $b_1(\cdot, q)$ acts only either on $V_\Omega$ or $V_\omega$, one-directionally coupled problems are derived.
The bilinear forms $b_i(\cdot,\cdot)$ define coupling operators $B_i : V \to M'$ and $B_i^T : M \to V'$ by $\langle B_i v, q\rangle_{M' \times M} = \langle v, B_i^T q\rangle_{V \times V'} = b_i(v, q)$ for $v \in V$ and $q \in M$. The validation of the following coercivity and inf-sup conditions guarantees the unique solvability of problem (1) in $V \times M/\operatorname{Ker} B_1^T$, [5]:

$$\exists\, \alpha_0 > 0 : \quad \sup_{v_0 \in \operatorname{Ker} B_1} \frac{a(w_0, v_0)}{\|w_0\|_V \|v_0\|_V} \ge \alpha_0, \quad w_0 \in \operatorname{Ker} B_2, \tag{2}$$

$$\sup_{w_0 \in \operatorname{Ker} B_2} \frac{a(w_0, v_0)}{\|w_0\|_V \|v_0\|_V} \ge \alpha_0, \quad v_0 \in \operatorname{Ker} B_1, \tag{3}$$

$$\exists\, k_0 > 0 : \quad \inf_{q \in M} \sup_{v \in V} \frac{b_i(v, q)}{\|v\|_V \|q\|_{M/\operatorname{Ker} B_i^T}} \ge k_0, \quad i = 1, 2. \tag{4}$$

We note that the above conditions can be relaxed further [3, 11].

Discretization. We use two different shape regular quasi-uniform triangulations $\mathcal{T}_H$ on $\Omega$ and $\mathcal{T}_h$ on $\omega$, as illustrated in Figure 1, with $H$ and $h$ indicating the corresponding maximum element diameter. The function spaces $V_\Omega$, $V_\omega$, and $M$ are replaced by discrete approximations $V_H \subset V_\Omega$, $V_h \subset V_\omega$, and $M_h \subset M$, respectively. We denote an element $(v_H, v_h)$ of the product space $V_h^H = V_H \times V_h$ by $v_h^H$. It may become necessary to involve approximate bilinear forms $a_h(\cdot,\cdot)$, $b_{1,h}(\cdot,\cdot)$, and $b_{2,h}(\cdot,\cdot)$. The discrete saddle point formulation reads: find $(u_h^H, p_h) \in V_h^H \times M_h$ such that

$$\begin{aligned} a_h(u_h^H, v) + b_{1,h}(v, p_h) &= \langle f, v\rangle_{V' \times V}, & v &\in V_h^H, \\ b_{2,h}(u_h^H, q) &= \langle g, q\rangle_{M' \times M}, & q &\in M_h. \end{aligned} \tag{5}$$
If the conditions corresponding to (2)-(4) hold for $a_h(\cdot,\cdot)$, $b_{1,h}(\cdot,\cdot)$, and $b_{2,h}(\cdot,\cdot)$ with constants independent of the meshsizes $H$ and $h$, it is possible to derive optimal a priori estimates. This is a consequence of the next lemma, which follows from [3, Thm. 2.2].
Lemma 1. Under conditions (2)-(4), the following estimate holds with a constant $C$ depending on $\alpha_0$, $k_0$ and the continuity constants of the involved bilinear forms:

The most delicate step for the quality of the discretization and the computational complexity is the information transfer between the two grids via the discrete Lagrange multiplier space $M_h$. In all our considered applications, we essentially couple between the global grid on $\Omega$ and the subdomain boundary $\Gamma$. On $\Gamma$, dual Lagrange multipliers [13] are used to approximate $M$, which have optimal stability and approximation properties. Moreover, they have local support and satisfy a biorthogonality relation with the basis functions of the trace space $V_h|_\Gamma$. Therefore, the implementation of the corresponding operators $B_{1,h}$ and $B_{2,h}$ can be performed with low computational costs.

3 A one-directionally coupled model problem


We apply the framework presented in the last section to a one-directionally coupled model problem. We present a uniqueness proof and an a priori error estimate, which we confirm by a numerical example.
Continuous formulation. Consider the problem

$$-\Delta u_\Omega = f_\Omega \quad \text{in } \Omega, \qquad u_\Omega|_{\partial\Omega} = 0, \tag{6}$$

with its associated bilinear form $a_\Omega(w_\Omega, v_\Omega) := \int_\Omega \nabla w_\Omega \cdot \nabla v_\Omega$, for $w_\Omega, v_\Omega \in H_0^1(\Omega)$. We want to solve an additional problem for the subdomain $\omega$, namely,

(7)

with $f_\omega = f_\Omega|_\omega$. Using Green's formula, we obtain

$$a_\omega(u_\omega, v_\omega) + \Big\langle v_\omega, -\frac{\partial u_\omega}{\partial n}\Big\rangle_{M' \times M} = (f_\omega, v_\omega)_\omega, \quad v_\omega \in H^1(\omega),$$

with the obvious meanings for $a_\omega(\cdot,\cdot)$ and $(\cdot,\cdot)_\omega$, and where $M = H^{-1/2}(\Gamma)$. Introducing the Lagrange multiplier $p = -\partial u_\omega/\partial n$, we find that $b_1(v, q) = \langle v_\omega, q\rangle_{M' \times M}$ for $v = (v_\Omega, v_\omega) \in V = H_0^1(\Omega) \times H^1(\omega)$, $q \in M$. We realize the continuity requirement along the boundary $\Gamma$ in (7) by the bilinear form $b_2(v, q) = \langle v_\Omega - v_\omega, q\rangle_{M' \times M}$, $v \in V$, $q \in M$, and obtain the saddle point formulation of (6), (7) given in (1) with $g = 0$.
It is obvious that problem (1) has a unique solution, since the problem on the global domain $\Omega$ is not influenced by the problem on the subdomain $\omega$, and its solution $u_\Omega$ yields the boundary data for a well posed subdomain problem. Nevertheless, we provide a complete proof within the saddle point setting.
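The one-directional structure described above (solve on the global domain first, then use that solution as boundary data on the subdomain) can be illustrated in one space dimension. The following is a minimal sketch under assumptions that are not part of the paper: the global domain is the interval (0, 1), the subdomain is (0.3, 0.7), the load is the constant $f = 2$ so that $u = x(1-x)$, and P1 elements are used on both meshes. In 1D the interface consists of two points, so the Lagrange multiplier coupling reduces to point evaluation of the coarse solution; the saddle point system itself is not assembled. All function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def solve_poisson_p1(left, right, cells, f, u_left, u_right):
    """P1 elements for -u'' = f (constant f) with Dirichlet data; returns (nodes, values)."""
    h = (right - left) / cells
    m = cells - 1                       # number of interior nodes
    diag = [2.0 / h] * m
    off = [-1.0 / h] * m
    rhs = [f * h] * m
    rhs[0] += u_left / h                # lift the Dirichlet data to the right-hand side
    rhs[-1] += u_right / h
    inner = solve_tridiagonal(off, diag, off, rhs)
    nodes = [left + i * h for i in range(cells + 1)]
    return nodes, [u_left] + inner + [u_right]


def interpolate(nodes, values, x):
    """Evaluate the piecewise linear function given by (nodes, values) at x."""
    for i in range(len(nodes) - 1):
        if nodes[i] <= x <= nodes[i + 1]:
            t = (x - nodes[i]) / (nodes[i + 1] - nodes[i])
            return (1 - t) * values[i] + t * values[i + 1]
    raise ValueError("x outside the mesh")


# Step 1: global problem on the coarse mesh (H = 0.25), independent of the subdomain.
coarse_nodes, u_H = solve_poisson_p1(0.0, 1.0, 4, 2.0, 0.0, 0.0)

# Step 2: subdomain problem on a fine, non-matching mesh (h = 0.05);
# the boundary data are the coarse solution evaluated at the interface points.
a, b = 0.3, 0.7
g_a = interpolate(coarse_nodes, u_H, a)
g_b = interpolate(coarse_nodes, u_H, b)
fine_nodes, u_h = solve_poisson_p1(a, b, 8, 2.0, g_a, g_b)

print(g_a, g_b, u_h[4])
```

The interface points do not coincide with coarse nodes, so the boundary data carry the interpolation error of the coarse solution; this error is then propagated into the subdomain solution.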
Theorem 1. With the above definitions, problem (1) is uniquely solvable.

Proof. We show the unique solvability of problem (1) by validating the conditions (2)-(4). Our main tool is the harmonic extension operator $\mathcal{H} : M' \to V_\omega$, defined by

$$a_\omega(\mathcal{H}w, v_\omega) = 0, \quad v_\omega \in V_\omega^0 = H_0^1(\omega), \qquad (\mathcal{H}w)|_\Gamma = w. \tag{8}$$

We observe that the trace of $V_\omega$ onto $\Gamma$ is the space $M'$, and that $b_1((0, v), q) = b_2((0, -v), q)$. Taking $v = (0, \pm\mathcal{H}w)$, $w \in M'$, condition (4) is a consequence of the definition of the $H^{-1/2}$-norm and of the fact that $\|\mathcal{H}w\|_{1,\omega} \le C\|w\|_{M'}$.
Let us focus on condition (2). The kernels of the coupling operators are $\operatorname{Ker} B_1 = V_\Omega \times V_\omega^0$, and $\operatorname{Ker} B_2 = \{v \in V : \operatorname{tr} v_\Omega = v_\omega|_\Gamma\}$, where $\operatorname{tr} : H^1(\Omega) \to H^{1/2}(\Gamma)$ denotes the trace operator. We uniquely decompose $v_\omega \in V_\omega$ into $v_B + v_I$ such that $v_B = \mathcal{H}(v_\omega|_\Gamma)$ and $v_I \in V_\omega^0$. For an arbitrary $w_0 = (w_\Omega, w_B + w_I) \in \operatorname{Ker} B_2$, we consider $v_0 = (w_\Omega, w_I) \in \operatorname{Ker} B_1$. By using the properties of the harmonic extension, we get

Condition (2) follows from (9):

$$a(w_0, v_0) = a_\Omega(w_\Omega, w_\Omega) + a_\omega(w_I, w_I) \ge c\|w_\Omega\|_{1,\Omega}^2 + c\|w_I\|_{1,\omega}^2 \ge c\|w_0\|_V \|v_0\|_V. \tag{10}$$

The proof of condition (3) is similar. For an arbitrary $v_0 = (v_\Omega, v_I) \in \operatorname{Ker} B_1$, we set $w_0 = (v_\Omega, \mathcal{H}(\operatorname{tr} v_\Omega) + v_I) \in \operatorname{Ker} B_2$, and obtain (10).

Discretization. We use standard conforming finite elements of order $r$ and $s$ on $\mathcal{T}_H$ and $\mathcal{T}_h$, respectively. The associated discrete spaces with no boundary conditions are denoted by $S_H(\Omega)$ and $S_h(\omega)$, and we set $S_{0,H}(\Omega) = S_H(\Omega) \cap H_0^1(\Omega)$ and $S_{0,h}(\omega) = S_h(\omega) \cap H_0^1(\omega)$ to be the spaces taking into account homogeneous Dirichlet conditions on $\partial\Omega$ and $\Gamma$, respectively. For the discrete Lagrange multiplier space $M_h$, we propose the use of dual basis functions [13] adapted to the order $s$ of elements from the trace space of $S_h(\omega)$, which is indicated by $W_h(\Gamma)$. Setting $V_h^H = S_{0,H}(\Omega) \times S_h(\omega)$, the discrete saddle point problem (5) is obtained. The unique solvability of problem (5) can be shown by replacing the harmonic extension $\mathcal{H}$ and the trace operator $\operatorname{tr}$ in the proof of Theorem 1 by discrete operators $\mathcal{H}_h : W_h(\Gamma) \to S_h(\omega)$ and $\operatorname{tr}_h : S_{0,H}(\Omega) \to W_h(\Gamma)$, respectively. In order to obtain the estimate (9), these operators have to satisfy certain stability and extension properties with respect to the $H^{1/2}$-norm. The discrete harmonic extension $\mathcal{H}_h$ is naturally obtained by taking $S_{0,h}(\omega)$ as a test function space in (8), and the operator $\operatorname{tr}_h$ is given by the mortar projection associated with the discrete Lagrange multiplier space $M_h$; in particular, for this choice, we find $\int_\Gamma w_\Omega = \int_\Gamma \operatorname{tr}_h w_\Omega$.
We intend to use a smaller meshsize $h < H$ or a higher order $s > r$ on the subdomain, and, therefore, expect a better solution $u_h$ compared to $u_H|_\omega$. Thus, the finite element solution $u_{FE}$ is defined by

$$u_{FE} := \begin{cases} u_H & \text{in } \omega^c = \Omega \setminus \omega, \\ u_h & \text{in } \omega. \end{cases}$$

Lemma 1 only provides a global estimate, which is not sufficient here, since we would like to disregard the approximate solution component $u_H$ on the subdomain $\omega$. The necessary tools for a more local analysis can be found in [12], resulting in the following estimate which is proved in [7].
Theorem 2. Let $B \supset \omega^c$ such that $d = \operatorname{dist}(\partial B \setminus \partial\Omega, \partial\omega^c \setminus \partial\Omega) > 0$. Then for $H$ small enough and $u$ regular enough, there exists a constant $C$ depending on $d$ such that

$$\|u - u_H\|_{1,\omega^c} + \|u - u_h\|_{1,\omega} \le C h^s |u|_{s+1,\omega} + C H^r |u|_{r+1,B} + C H^{r+1} |u|_{r+1,\Omega}. \tag{11}$$
We note that the last term in (11) is the fundamental difference of our approach to the estimates obtained by standard adaptive finite element methods. It is due to the fact that in our one-directionally coupled approach no pollution effect is taken into account.

Numerical test. Consider the model problem (6) on $\Omega := (0,1)^2$ with source term $f$ derived from the exact solution $u(x, y) := \exp(-100((x - 0.6)^2/a^2 + (y - 0.5)^2/b^2))$. An elliptic patch with radii 0.25 and 0.15 is placed in the domain $\Omega$ with its center at (0.6, 0.5), as illustrated in Figure 1. Since the solution goes to zero with an exponential decay, we may have a coarser triangulation far enough away from (0.6, 0.5). Therefore, we choose an initial triangulation with $h/H = 1/4$. We use P1 elements on $\mathcal{T}_H$, whereas on $\mathcal{T}_h$ we consider two different cases and use P1 elements for one test, and P2 elements for another test. Figure 2 shows the decay of the errors $e_H = u - u_H$ and $e_{FE} = u - u_{FE}$ in the $H^1$-norm under uniform refinement. The errors $e_H$ and $e_{FE}$ both satisfy the a priori estimates. Choosing the same number of unknowns for the standard and the overlapping method, the solution obtained by the P1-P1 coupling is significantly better than the solution obtained by the standard method. For the P1-P2 coupling, the error decay is almost optimal with respect to the piecewise quadratic finite elements used on $\mathcal{T}_h$. In agreement with (11), the error behaves like $c_1 h^2 + c_2 H$, and, numerically, $c_2 \ll c_1$.

Fig. 2. Error decay in the $H^1$-norm of P1-P1 and P1-P2 coupling (error versus degrees of freedom)
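The source term of the numerical test follows from the exact solution by $f = -\Delta u$. Writing $u = e^{-g}$ gives $f = (g_{xx} + g_{yy} - g_x^2 - g_y^2)\,u$. The following is a minimal sketch of this manufactured right-hand side; the values of $a$ and $b$ are not stated in the text, and taking them equal to the patch radii 0.25 and 0.15 is an assumption made here for illustration only. Function names are hypothetical.

```python
import math

X0, Y0 = 0.6, 0.5
A, B = 0.25, 0.15  # assumption: a and b taken as the radii of the elliptic patch


def exact_u(x, y):
    """Exact solution u = exp(-g) with g = 100 ((x-x0)^2/a^2 + (y-y0)^2/b^2)."""
    return math.exp(-100.0 * ((x - X0) ** 2 / A ** 2 + (y - Y0) ** 2 / B ** 2))


def source_f(x, y):
    """Source term f = -Laplace(u) = (g_xx + g_yy - g_x^2 - g_y^2) * u."""
    gx = 200.0 * (x - X0) / A ** 2
    gy = 200.0 * (y - Y0) / B ** 2
    gxx = 200.0 / A ** 2
    gyy = 200.0 / B ** 2
    return (gxx + gyy - gx * gx - gy * gy) * exact_u(x, y)


def fd_minus_laplace(x, y, eps=1e-5):
    """Central finite difference approximation of -Laplace(u), used only as a check."""
    uxx = (exact_u(x + eps, y) - 2.0 * exact_u(x, y) + exact_u(x - eps, y)) / eps ** 2
    uyy = (exact_u(x, y + eps) - 2.0 * exact_u(x, y) + exact_u(x, y - eps)) / eps ** 2
    return -(uxx + uyy)


print(source_f(0.62, 0.51), fd_minus_laplace(0.62, 0.51))
```

The finite difference check is independent of the closed-form derivation, so agreement of the two printed numbers indicates that the formula for $f$ matches the stated $u$.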

4 Bi-directionally coupled problems


We present two applications which result in bi-directionally coupled problems.
The first one illustrates a complementary coupling procedure, the second one
considers an eddy current problem.

4.1 Natural boundary conditions at the hole ω


We want to solve a boundary value problem on the domain $\Omega \setminus \omega =: \omega^c$ with natural boundary conditions on the hole boundary $\Gamma$. The solution on the global domain $\Omega$ yields the solution on the domain with hole $\omega^c$. This problem is analyzed for the linear elasticity setting in [9]. In addition, we show an application for rotating "holes", see Figure 4.
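Saddle point formulation for the linear elasticity problem. We consider the problem: find $u_c \in (H^1(\omega^c))^2$ such that

$$\begin{aligned} -\operatorname{Div} \sigma(u_c) &= f_c && \text{in } \omega^c, \\ \sigma(u_c)\, n_c &= t && \text{on } \Gamma, \end{aligned}$$
with $u_c = 0$ on $\Gamma_0$ and $\sigma(u_c)\, n_c = g$ on $\Gamma_1$, where $\Gamma_0 \subset \partial\Omega$ has a positive measure and $\partial\Omega = \Gamma_0 \cup \Gamma_1$, $\Gamma_0 \cap \Gamma_1 = \emptyset$, with body forces $f_c \in (L^2(\omega^c))^2$ and surface tractions $g \in (L^2(\Gamma_1))^2$, $t \in (L^2(\Gamma))^2$. An equivalent formulation is given by

$$\begin{aligned} -\operatorname{Div} \sigma(u_\Omega) &= f_\Omega && \text{in } \Omega, \\ -\operatorname{Div} \sigma(u_\omega) &= f_\omega && \text{in } \omega, \\ u_\Omega &= u_\omega && \text{on } \Gamma, \\ [\sigma(u_\Omega)\, n_c] &= -\sigma(u_\omega)\, n_c + t && \text{on } \Gamma, \end{aligned}$$
with $u_\Omega = 0$ on $\Gamma_0$ and $\sigma(u_\Omega)\, n_c = g$ on $\Gamma_1$, and where $[w_\Omega] := w_\Omega|_{\omega^c} - w_\Omega|_\omega$, $f_\omega = f_\Omega|_\omega$ and $f_\Omega \in (L^2(\Omega))^2$ is an extension of $f_c$ to $\Omega$. It can be written in its weak formulation as a saddle point problem with the bilinear forms $a_\Omega(u_\Omega, v_\Omega) = \int_\Omega \sigma(u_\Omega) : \varepsilon(v_\Omega)$, $a_\omega(u_\omega, v_\omega) = \int_\omega \sigma(u_\omega) : \varepsilon(v_\omega)$, $b_1(v, q) = \langle v_\omega + v_\Omega, q\rangle_{M \times M'}$, $b_2(v, q) = \langle v_\omega - v_\Omega, q\rangle_{M \times M'}$, and $V = (H^1(\Omega))^2 \times (H^1(\omega))^2$, $M = (H^{-1/2}(\Gamma))^2$. For the discretization, we use linear and quadratic finite elements on quasi-uniform and shape regular triangulations. The conditions (2)-(4) hold for $h/H$ small enough in the discrete setting as well as in the continuous setting, yielding unique solvability. For details, we refer to [9].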
The realization of this approach allows for an easy shift of the hole without having to remesh and can be used in shape optimization algorithms to determine an optimal hole position. Note that the quantity we pass back from the hole to the background is the jump in the fluxes, i.e., in general the solution $u_\Omega$ is only $H^{3/2 - \varepsilon}$-regular, $\varepsilon > 0$. Another application of this complementary coupling technique is given by time dependent problems where the hole is an object moving through the domain $\Omega$ and emanating some flux into $\omega^c$.

Numerical examples

Beam with one hole. We consider the problem domain $\omega^c = (-5, 5) \times (0, 1) \setminus \{(x, y) \in \mathbb{R}^2 \mid \|(x, y) - (-1, 0.5)\| \le 0.3\}$ with $u_c = 0$ for $x = \pm 5$, $\sigma(u_c)\, n_c = (0, -1)$ for $y = 1$, $\sigma(u_c)\, n_c = 0$ elsewhere and $f = 0$. We use Young's modulus $E = 200$ and Poisson ratio $\nu = 0.3$. The stress $\sigma_{xx}$ is monitored as a graphical quantity, see Figure 3. The iterative solver is based on a block Gauß-Seidel method for the symmetric positive definite system arising from static condensation of the Lagrange multiplier. The convergence rates are level independent, see [9].

Fig. 3. Top: Problem setup and start grids. Bottom: $\sigma_{xx}$ on $\Omega \times \omega$ using complementary coupling; we show only the values on $\omega^c$.

Rotating hole. The hole domain is a smooth star with five points described by (x, y) = (m_x, m_y) + (r_i + Δr) cos(10πλ) (cos(α(λ, t)), sin(α(λ, t))) with λ, t ∈ [0, 1] and α(λ, t) = 2π(λ - t - cos(10π(λ + ~)/51)), and a center (m_x, m_y) = (1.1, 0.65), the medium radius r_i = 0.2 and the radius change amplitude Δr = 0.05. We solve the Poisson problem. The boundary segment described by λ ∈ (0, 0.1) carries non-zero natural data. The background region is Ω = (0, 3) × (0, 1). The situation and the solution at t = 0, 0.15, 0.3, 0.45, 0.65, 0.85 can be found in Figure 4. For each new position of the star, no remeshing procedure has to be carried out; the existing grid simply has to be rotated. Moreover, only the coupling matrix has to be reassembled; all other involved matrices stay the same throughout the whole computation.

Fig. 4. Problem setup (top left): Zero natural b.c. in hatched areas, 0 and 1 Dirichlet
b.c. at the top left and bottom right side, respectively, influx of 5 (natural b.c.)
at the back side of the first wing. Initial background grid plus rotated star grid
at t = 0,0.15,0.3,0.45 (top right). Bottom: Solutions at different times t: t = 0
(complete), t = 0.15,0.3,0.45,0.65,0.85 (partly displayed).

4.2 Eddy current simulation

We want to approximate the eddy currents inside a conductor $\omega$ which is exposed to a time dependent electromagnetic field acting in the global domain $\Omega$. A detailed problem description and analysis concerning the statically condensed elliptic system can be found in [8]; numerical results are available in [6]. Here, we present an alternative approach which fits into the saddle point framework presented in Section 2.

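Saddle point formulation. Elimination of the other involved field quantities from the quasistationary Maxwell equations yields for the magnetic field $H$

$$\operatorname{div} \mu H = 0 \quad \text{in } \Omega, \tag{12}$$

$$\partial_t H + \frac{1}{\sigma\mu} \operatorname{curl} \operatorname{curl} H = 0 \quad \text{in } \omega, \tag{13}$$

with positive material parameters $\mu$ and $\sigma$, and $\mu$ constant in $\omega$. We assume that a source vector potential $T_s$ is known such that $\operatorname{curl} T_s = J_s$ in $\omega^c$, with $J_s$ a given source current density. The magnetic field $H$ is decomposed into $T - \operatorname{grad} \phi$ on $\omega$ and $T_s - \operatorname{grad} \phi$ on $\omega^c$, where $T \in H_0^{\mathrm{curl}}(\omega)$ is a vector valued potential defined on the conductor $\omega$, and $\phi \in H_0^1(\Omega)$ is a scalar valued potential defined on the global domain $\Omega$. Moreover, we use the Coulomb gauge, i.e., $T$ is chosen to be solenoidal. From (12), we obtain

$$a_\Omega(\phi, v) - \int_\Gamma (T \cdot n)\, v = \int_{\omega^c} \beta\, T_s \cdot \operatorname{grad} v, \quad v \in V_\Omega = H_0^1(\Omega), \tag{14}$$

with $a_\Omega(w, v) = \int_\Omega \beta \operatorname{grad} w \cdot \operatorname{grad} v$ and $\beta$ depending on $\mu$. Taking $v \in H_0^1(\omega)$, (14) implies that $\phi$ is harmonic on $\omega$, thus, there exists $\gamma = \phi|_\Gamma \in H^{1/2}(\Gamma)$ such that $\phi|_\omega = \mathcal{H}\gamma$, with the harmonic extension operator $\mathcal{H} : H^{1/2}(\Gamma) \to H^1(\omega)$. Furthermore, due to the solenoidality of $T$, it holds that $\int_\Gamma (T \cdot n)\lambda - \int_\omega T \cdot \operatorname{grad} \mathcal{H}\lambda = 0$ for an arbitrary $\lambda \in H^{1/2}(\Gamma)$. After time discretization by an implicit Euler scheme with time step size $\Delta t$, we obtain from (13) at each time step:

$$a_\omega((\gamma, T), (\lambda, W)) + \int_\Gamma (T \cdot n)\lambda = \int_\omega f_\omega \cdot W, \quad (\lambda, W) \in V_\omega = H^{1/2}(\Gamma) \times H_0^{\mathrm{curl}}(\omega), \tag{15}$$
with

$$a_\omega((\gamma, T), (\lambda, W)) = \int_\omega \alpha \operatorname{curl} T \cdot \operatorname{curl} W + T \cdot W - W \cdot \operatorname{grad} \mathcal{H}\gamma - T \cdot \operatorname{grad} \mathcal{H}\lambda, \tag{16}$$

where $\alpha = \Delta t/(\mu\sigma)$, and $f_\omega$ contains the information from the preceding time step.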
This suggests the introduction of the Lagrange multiplier $p = T \cdot n \in M = H^{-1/2}(\Gamma)$, and of the coupling bilinear form $b((v, \lambda, W), q) = \langle \lambda - v, q\rangle_{M' \times M}$ for $(v, \lambda, W) \in V = V_\Omega \times V_\omega$ and $q \in M$. Setting $g = 0$ and $b_1(\cdot,\cdot) = b_2(\cdot,\cdot) = b(\cdot,\cdot)$, problem (1) is obtained. By choosing $(\lambda, W) = (0, \operatorname{grad} v)$, $v \in H_0^1(\omega)$, in (15), it is easy to see that the solenoidality of $T$ is guaranteed provided that $f_\omega$ is divergence free. The unique solvability of the statically condensed formulation of problem (1) is proved in [8].

Discretization. For the approximation of $\phi$, piecewise linear finite elements
are used on $\mathcal{T}_H$. Concerning the vector potential $T$, we employ curl-conforming
edge elements [10] on $\mathcal{T}_h$, which are ideally suited for the approximation of
$H_0^{\mathrm{curl}}(\omega)$, whereas for $\gamma$ we use piecewise linear finite elements on $\Gamma$. As before,
we approximate the Lagrange multiplier space $M$ by dual basis functions.
In contrast to the preceding applications, the bilinear form $a(\cdot,\cdot)$ cannot be
implemented directly, and we need an approximation $a_h(\cdot,\cdot)$. Therefore, the
harmonic extension operator $\mathcal{H}$ in (16) is replaced by its discrete analogue $\mathcal{H}_h$
corresponding to piecewise linear finite elements on $\mathcal{T}_h$. The gradient operator
can be easily realized by the node-to-edge incidence matrix $G$ which acts on
the degrees of freedom associated with the linear elements on $\mathcal{T}_h$, and gives the
gradient as a linear combination of the basis functions for the edge element
space [4]. An optimal a priori estimate based on Lemma 1 for the finite element
solution $(\phi_H, T_h)$ of the statically condensated form of (5) is obtained in [8],
provided that the ratio $h/H$ is small enough.
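The action of the node-to-edge incidence matrix can be illustrated on a toy mesh. The following is a minimal sketch, assuming lowest-order edge elements whose degrees of freedom are associated with oriented edges; the one-triangle mesh, the edge orientation and the function name `node_to_edge_incidence` are hypothetical choices made for the illustration and are not taken from [4] or [8].

```python
import numpy as np

# Hypothetical toy mesh: a single triangle with vertices 0, 1, 2.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Each edge is oriented from its first node to its second node.
edges = [(0, 1), (1, 2), (0, 2)]


def node_to_edge_incidence(n_nodes, edges):
    """Return G with G[e, a] = -1 and G[e, b] = +1 for the edge e = (a, b)."""
    G = np.zeros((len(edges), n_nodes))
    for e, (a, b) in enumerate(edges):
        G[e, a] = -1.0
        G[e, b] = 1.0
    return G


G = node_to_edge_incidence(len(nodes), edges)

# Nodal values of the linear function phi(x, y) = 2x + 3y.
grad_phi = np.array([2.0, 3.0])
phi = nodes @ grad_phi

# One coefficient per oriented edge: the difference of the nodal values,
# i.e. the line integral of grad(phi) along that edge.
edge_dofs = G @ phi
```

Applying `G` to a constant nodal vector gives zero, which reflects that constants have vanishing gradient.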

References

1. Babuska, I. (1973): The finite element method with Lagrangian multipliers. Nu-
mer. Math., 20, 179-192.
2. Ben Belgacem, F. (1999): The mortar finite element method with Lagrange mul-
tipliers. Numer. Math., 84, 173-197.
3. Bernardi, C., Canuto, C., Maday, Y. (1988): Generalized inf-sup conditions for
Chebyshev spectral approximation of the Stokes problem. SIAM J. Numer. Anal.,
25, 1237-1271.
4. Bossavit, A. (1998): Computational electromagnetism: variational formulations,
complementarity, edge elements. Academic Press, New York.
5. Brezzi, F., Fortin, M. (1991): Mixed and hybrid finite element methods. Springer,
New York.
6. Flemisch, B., Maday, Y., Rapetti, F., Wohlmuth, B.I. (2003): Coupling scalar and
vector potentials on nonmatching grids for eddy currents in a moving conductor.
To appear in J. Comput. Appl. Math.
7. Flemisch, B., Wohlmuth, B.I. (2003): A domain decomposition method on nested
domains and nonmatching grids. To appear in Numer. Methods Partial Differ.
Equations.
8. Maday, Y., Rapetti, F., Wohlmuth, B.I. (2003): Mortar element coupling be-
tween global scalar and local vector potentials to solve eddy current problems.
In: Brezzi, F. et al. (eds.), Numerical Mathematics and Advanced Applications,
Proceedings of ENUMATH 2001, Springer, Berlin, 847-865.
9. Mair, M., Wohlmuth, B.I. (2003): A domain decomposition method for domains
with holes using a complementary decomposition. Report SFB 404 2003/38.
10. Nédélec, J.-C. (1980): Mixed finite elements in $\mathbb{R}^3$. Numer. Math., 35, 315-341.
11. Nicolaides, R.A. (1982): Existence, uniqueness and approximation for generalized
saddle point problems. SIAM J. Numer. Anal., 19, 349-357.
12. Wahlbin, L.B. (1991): Local behavior in finite element methods. In: Ciarlet, P.G.,
Lions, J.L. (eds.), Handbook of Numerical Analysis, Vol. II, Elsevier Science
Publishers B.V., 1991.
13. Wohlmuth, B.I. (2000): A mortar finite element method using dual spaces for
the Lagrange multiplier. SIAM J. Numer. Anal., 38, 989-1012.
On the Use of Implicit Updates in Minimum
Curvature Multi-step Quasi-Newton Methods

John A. Ford¹ and Issam A. Moghrabi²

¹ Department of Computer Science, University of Essex, Wivenhoe Park, Colchester, Essex, United Kingdom [email protected]
² Beirut Arab University, Beirut, Lebanon [email protected]

Summary. Multi-step quasi-Newton methods for optimization employ, at each
iteration, an interpolating polynomial in the variable space to construct a multi-step
version of the well-known Secant Equation (the relation which constrains the updat-
ing of the Hessian approximation). There is some freedom in the choice of the inter-
polating polynomial and this freedom is exploited, in the case of two-step methods,
by the so-called "Minimum Curvature" algorithms, which produce the 'smoothest'
interpolation, in the sense of obtaining the polynomial with the smallest possible
second derivative (measured in some suitable norm). Typically, these norms are de-
fined by a positive-definite matrix and, in this paper, we will consider and compare
the use of different matrices in defining the norm. In particular, we will describe
the construction of implicit methods, in which, as we will demonstrate, there is no
requirement to compute the matrix defining the norm explicitly.

1 Introduction
We consider two-step quasi-Newton methods for the unconstrained optimization
problem

\[ \min_{x \in \mathbb{R}^n} f(x). \]

If we denote the gradient and Hessian of $f$ by $g$ and $G$, respectively, then
such methods are organised in a manner that closely resembles the structure
of standard quasi-Newton methods, except that the approximation $B_{i+1}$ to
the Hessian $G(x_{i+1})$ is required to satisfy a condition of the form (where $\gamma_i$ is
a scalar determined by the precise variant of the method under consideration)

\[ B_{i+1}(s_i - \gamma_i s_{i-1}) = y_i - \gamma_i y_{i-1}, \quad \text{or} \tag{1} \]
\[ B_{i+1} r_i = w_i, \quad \text{say}, \tag{2} \]

in place of the usual condition (known as the Secant Equation)

\[ B_{i+1} s_i = y_i. \tag{3} \]

(In equations (1) and (3), the step-vectors $s_j$ and $y_j$ are defined by

\[ s_j \stackrel{\mathrm{def}}{=} x_{j+1} - x_j, \tag{4} \]
\[ y_j \stackrel{\mathrm{def}}{=} g(x_{j+1}) - g(x_j) = g_{j+1} - g_j, \quad \text{say}, \tag{5} \]

where $\{x_j\}$ are the successive iterates produced by the method.) A matrix
satisfying (1)/(2) can be constructed by appropriately modifying, for example,
the BFGS update formula, as follows:

\[ B_{i+1} = B_i - \frac{B_i r_i r_i^T B_i}{r_i^T B_i r_i} + \frac{w_i w_i^T}{w_i^T r_i} \tag{6} \]
\[ \phantom{B_{i+1}} \stackrel{\mathrm{def}}{=} \mathrm{BFGS}(B_i, r_i, w_i), \quad \text{say}. \tag{7} \]
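As a small illustration of (1), (2) and (6), the following sketch forms the vectors $r_i$ and $w_i$ and applies the modified BFGS formula. The function names, the value of $\gamma_i$ and the quadratic test data are hypothetical and serve only to check that the updated matrix satisfies $B_{i+1} r_i = w_i$; $B_i$ is assumed to be symmetric.

```python
import numpy as np


def bfgs_update(B, r, w):
    """Return BFGS(B, r, w) as in (6); assumes B symmetric, r^T B r > 0, w^T r > 0."""
    Br = B @ r
    return B - np.outer(Br, Br) / (r @ Br) + np.outer(w, w) / (w @ r)


def two_step_vectors(s_prev, s_cur, y_prev, y_cur, gamma):
    """Form r_i = s_i - gamma_i s_{i-1} and w_i = y_i - gamma_i y_{i-1}, cf. (1)-(2)."""
    return s_cur - gamma * s_prev, y_cur - gamma * y_prev


# Hypothetical data: gradient differences of a convex quadratic with Hessian A.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
s_prev = np.array([1.0, 0.0])
s_cur = np.array([0.5, 1.0])
y_prev, y_cur = A @ s_prev, A @ s_cur

r, w = two_step_vectors(s_prev, s_cur, y_prev, y_cur, gamma=0.25)
B_next = bfgs_update(np.eye(2), r, w)
```

Setting `gamma=0.0` recovers the standard single-step update, for which the Secant Equation (3) holds instead.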
The derivation of the condition (1) is described by Ford and Moghrabi [5, 4].
In short, quadratic curves $x(\tau)$ and $g(\tau)$ in $\mathbb{R}^n$ (where $\tau \in \mathbb{R}$) are constructed
which interpolate respectively, for the same set of values of $\tau$, the three most
recent iterates $x_{i-1}$, $x_i$ and $x_{i+1}$, and the three associated gradient evaluations
(assumed to be known). The derivatives of these two curves (at $\tau = \tau_2$, where
$\tau_2$ is the value of $\tau$ corresponding to $x_{i+1}$ and $g(x_{i+1})$ on the respective curves)
are then substituted into the relation

\[ g'(x(\tau)) = G(x(\tau))\, x'(\tau), \tag{8} \]

derived by applying the Chain Rule to the function $g(x(\tau))$. (In (8), primes
denote differentiation with respect to $\tau$.) Of course,

\[ w_i \stackrel{\mathrm{def}}{=} g'(\tau_2) \tag{9} \]

will, in general, only be an approximation to the vector $g'(x(\tau_2))$ required in
(8), whereas

\[ r_i \stackrel{\mathrm{def}}{=} x'(\tau_2) \tag{10} \]

may be computed exactly. Nevertheless, on making these substitutions into
(8) and removing a common scaling factor, a relation of the form (1) for
$B_{i+1} \approx G(x_{i+1})$ is obtained.
The remainder of this paper is organised as follows: in Section 2, we review
the "minimum curvature" approach to determining a suitable set of parameters
$\{\tau_j\}_{j=0}^{2}$, while Section 3 describes the concept of implicit updates. Section
4 then develops the use of implicit updates within "minimum curvature" methods.
Finally, we present the results of numerical experiments in Section 5, and
Section 6 draws conclusions on the basis of the results.

2 Minimum curvature methods

If the values of $\tau$ corresponding to $x_{i-1}$/$g(x_{i-1})$ and $x_i$/$g(x_i)$ are denoted by
$\tau_0$ and $\tau_1$ respectively, then Ford and Moghrabi [6] observed that, without loss
of generality, the values $\tau_0 = 0$ and $\tau_2 = 1$ could be specified, leaving the
remaining value $\tau_1$ to be chosen according to suitable criteria. Defining the
related quantity

\[ \delta = (\tau_2 - \tau_1)/(\tau_1 - \tau_0) = (1 - \tau_1)/\tau_1 \;\Rightarrow\; \tau_1 = (1 + \delta)^{-1}, \tag{11} \]

they then discussed choosing the parameter $\delta$ to minimize the norm $\| x''(\tau) \|_M$,
where
(12)
and where $M$ is a given symmetric positive-definite matrix, thus producing
an interpolating curve $x(\tau)$ which is the "smoothest" in the sense of the norm
$\| \cdot \|_M$. It was shown in [6] that fulfilling this criterion leads to the requirement
of solving the cubic polynomial
(13)
at each iteration, where

(14)
(15)
In [6], the properties of the polynomial $\psi$ were analyzed and it was shown that
its zeroes may be determined efficiently. In addition, the circumstances under
which $\psi$ has three real zeroes (rather than only one) were identified and it was
shown (in that case) which zero should be selected to yield the lowest curvature.
It was observed that, in general, this approach was capable (depending on the
relative dispositions of the three iterates $x_{i-1}$, $x_i$ and $x_{i+1}$) of producing all
three essentially different orderings of the iterates on the interpolating curve.
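The minimization can be made concrete with a short computation. The sketch below is our own derivation from the definitions above and is not copied from [6], whose cubic may be scaled differently. With $\tau_0 = 0$, $\tau_2 = 1$ and $\tau_1 = (1+\delta)^{-1}$, the interpolating quadratic has the constant second derivative $x'' = 2(1+\delta)(s_i/\delta - s_{i-1})$, so that, assuming $\sigma_j = s_j^T M s_j$ and $\mu_i = s_{i-1}^T M s_i$, the squared norm is $4(1+\delta)^2(\sigma_i/\delta^2 - 2\mu_i/\delta + \sigma_{i-1})$. Setting its derivative to zero and discarding the degenerate factor $(1+\delta)$ gives $\sigma_{i-1}\delta^3 - \mu_i\delta^2 + \mu_i\delta - \sigma_i = 0$. The function names are hypothetical.

```python
import numpy as np


def curvature(delta, sigma_i, sigma_im1, mu_i):
    """Squared M-norm of x''(tau), up to the constant factor 4."""
    return (1.0 + delta) ** 2 * (sigma_i / delta**2 - 2.0 * mu_i / delta + sigma_im1)


def min_curvature_delta(s_prev, s_cur, M):
    """Return (delta, tau_1) at the real stationary point of lowest curvature."""
    sigma_im1 = s_prev @ M @ s_prev
    sigma_i = s_cur @ M @ s_cur
    mu_i = s_prev @ M @ s_cur
    # Stationarity condition derived in the lead-in (an assumption of this sketch).
    roots = np.roots([sigma_im1, -mu_i, mu_i, -sigma_i])
    real_roots = roots[np.isclose(roots.imag, 0.0)].real
    delta = min(real_roots, key=lambda d: curvature(d, sigma_i, sigma_im1, mu_i))
    return float(delta), float(1.0 / (1.0 + delta))


# Hypothetical steps with M = I (the choice made in algorithm A).
s_prev = np.array([1.0, 0.0])
s_cur = np.array([0.5, 1.0])
delta, tau1 = min_curvature_delta(s_prev, s_cur, np.eye(2))
```

When the cubic has three real zeroes, the sketch simply evaluates the curvature at each of them; the selection rule analyzed in [6] is not reproduced here.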
In their original paper [6] on these "minimum curvature" methods, the
authors reported the results of numerical experiments conducted with the
version of the algorithm (called A) obtained by making the straightforward
choice $M = I$, and showed that this yielded a substantial improvement in performance,
when compared with the standard BFGS method. In a subsequent
paper [7], they investigated the performance of the algorithm, called B, arising
from choosing $M = B_i$ (this choice being motivated by the previous success
of other multi-step methods employing the same matrix), and showed that
a further improvement in performance was thereby obtained. In this paper, we
will pursue this avenue of investigation further by considering related choices
for the matrix $M$. In particular, we will consider the use of implicit updates,
that is, updated forms of the matrix $B_i$ which are not calculated explicitly.

3 Implicit updates
Since the choice $M = B_i$ produces an algorithm with good numerical performance,
a natural question to consider is whether related matrices might yield
further gains. An obvious line of enquiry to pursue in answering this question
is the use of updated versions of $B_i$, where the update employs data from
the most recent iteration(s). Because this updated matrix (call it $H_i$ for the
present) will be used to compute $\{\tau_j\}_{j=0}^{2}$ and hence $\gamma_i$, $r_i$ and $w_i$, it cannot
be the matrix $B_{i+1}$ which will be produced via equation (6). Thus, it appears
that use of such a matrix would necessitate carrying out a second update (requiring
$O(n^2)$ operations) at each iteration. However, we observe that explicit
knowledge of the updated matrix is not our real goal; rather, we only need this
matrix to enable us to calculate $\sigma_i$, $\sigma_{i-1}$ and $\mu_i$ and hence (by solving the cubic
polynomial) $\delta$, $\gamma_i$, $r_i$ and $w_i$. Therefore, if it can be shown that the expressions
required in equations (14) and (15) may be computed at low cost without
explicit calculation of $H_i$, then we will have gained our objective of using an
updated matrix, while avoiding most of the computational expense. Because
explicit computation of $H_i$ is avoided, methods which use this technique were
termed implicit methods in [3].

4 Implicit updates in minimum curvature methods

For the purposes of simplicity and of easing the computational requirements,
we will only consider here single-step implicit updates of $B_i$ (although use of
two-step updates is an obvious further line of research). We therefore propose
the following matrices for use in the norm $\| \cdot \|_M$, denoting the algorithms
thus defined by C and D:

(16)
(17)
In order to avoid the explicit computation of these updated matrices (for the
reasons explained above), it is necessary to show (in the context of "minimum
curvature" methods) how the quantities $\sigma_i$, $\sigma_{i-1}$ and $\mu_i$ can be calculated without
explicit knowledge of the matrix. We consider this issue for each update
in turn.

4.1 Update C

Since the matrix Bi - 1 is constructed by means of a standard single-step BFGS


update, it follows immediately from (16) that
(18)

(19)

(In deriving (19), we have assumed that $x_{i+1}$ is obtained by some form of
search along the 'quasi-Newton' direction
(20)
which implies that

\[ B_i s_i = -t_i g_i, \tag{21} \]

for some positive scalar $t_i$.) Thus we are able to derive the following expressions
for the quantities $\sigma_i$, $\sigma_{i-1}$ and $\mu_i$:
(22)

(23)

(24)

4.2 Update D

In a similar manner, we can derive the following expressions for $\sigma_i$, $\sigma_{i-1}$ and
$\mu_i$ in the case when the implicit update (17) is applied:

\[ \sigma_{i-1} = s_{i-1}^T B_i s_{i-1} + t_i \frac{(s_{i-1}^T g_i)^2}{s_i^T g_i} + \frac{(s_{i-1}^T y_i)^2}{s_i^T y_i}; \tag{25} \]
\[ \sigma_i = s_i^T y_i; \tag{26} \]
\[ \mu_i = s_{i-1}^T y_i. \tag{27} \]

It is evident, from equations (23) and (25), that there remains one obstacle
to be overcome, in each case, in achieving our goal of avoiding terms requiring
$O(n^2)$ computation in the calculation of $\sigma = \sigma_i/\sigma_{i-1}$. That difficulty resides
in the computation of the product $s_{i-1}^T B_i s_{i-1}$. This problem has been tackled
in two ways: by alternation of updates and by use of a recurrence.
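The expressions for update D can be checked numerically against an explicitly formed matrix. The sketch below assumes that the implicit update (17) is the single-step BFGS update of $B_i$ with the pair $(s_i, y_i)$, i.e. $M = \mathrm{BFGS}(B_i, s_i, y_i)$; this reading is consistent with the expressions above but is an assumption, since (17) is not reproduced here. The function name and the random test data are hypothetical.

```python
import numpy as np


def implicit_d_quantities(lam, t_i, s_prev, s_cur, g_cur, y_cur):
    """Return (sigma_{i-1}, sigma_i, mu_i) in O(n) work, without forming M.

    lam is the scalar s_{i-1}^T B_i s_{i-1}; t_i satisfies B_i s_i = -t_i g_i.
    """
    sigma_prev = (lam
                  + t_i * (s_prev @ g_cur) ** 2 / (s_cur @ g_cur)
                  + (s_prev @ y_cur) ** 2 / (s_cur @ y_cur))
    return sigma_prev, s_cur @ y_cur, s_prev @ y_cur


# Hypothetical data for a consistency check against the explicit update.
rng = np.random.default_rng(1)
n = 4
X = rng.standard_normal((n, n))
B = X @ X.T + n * np.eye(n)               # B_i, symmetric positive definite
g_cur = rng.standard_normal(n)            # g_i
t_i = 0.7
s_cur = -t_i * np.linalg.solve(B, g_cur)  # enforces B_i s_i = -t_i g_i
s_prev = rng.standard_normal(n)           # s_{i-1}
y_cur = (B + np.eye(n)) @ s_cur           # stand-in y_i with s_i^T y_i > 0

fast = implicit_d_quantities(s_prev @ B @ s_prev, t_i, s_prev, s_cur, g_cur, y_cur)

# Explicit O(n^2) reference: M = BFGS(B_i, s_i, y_i).
Bs = B @ s_cur
M = B - np.outer(Bs, Bs) / (s_cur @ Bs) + np.outer(y_cur, y_cur) / (y_cur @ s_cur)
explicit = (s_prev @ M @ s_prev, s_cur @ M @ s_cur, s_prev @ M @ s_cur)
```

In this check the scalar `lam` is still computed with a matrix-vector product; obtaining it cheaply is exactly the remaining obstacle discussed above.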

4.3 Alternation

Alternation [8] involves the repeated application of a cycle consisting (in its
basic form) of two iterations, the first of which is a standard single-step BFGS
iteration and the second of which is the required two-step method. The consequence
of this arrangement is that each two-step iteration can be implemented
in the knowledge that the relation

\[ B_i s_{i-1} = y_{i-1} \tag{28} \]

holds (because of the preceding single-step iteration). This implies that

\[ s_{i-1}^T B_i s_{i-1} = s_{i-1}^T y_{i-1}, \tag{29} \]

which means that $s_{i-1}^T B_i s_{i-1}$ can be computed in $O(n)$ operations.

4.4 Recurrence

In [9], the authors showed how the following recurrence for the efficient calculation
of the quantity $\lambda_i \stackrel{\mathrm{def}}{=} s_{i-1}^T B_i s_{i-1}$ could be derived:

\[ \pi_{i-1} = t_{i-1} s_{i-1}^T g_{i-1} + \gamma_{i-1}^2 \lambda_{i-1} - 2 t_{i-1} r_{i-1}^T g_{i-1}; \]

\[ \lambda_i = -t_{i-1} s_{i-1}^T g_{i-1} + \frac{(s_{i-1}^T w_{i-1})^2}{r_{i-1}^T w_{i-1}} - \frac{t_{i-1}^2 (r_{i-1}^T g_{i-1})^2}{\pi_{i-1}}, \tag{30} \]

where

\[ r_{i-1} = s_{i-1} - \gamma_{i-1} s_{i-2}, \quad \text{and} \tag{31} \]
\[ w_{i-1} = y_{i-1} - \gamma_{i-1} y_{i-2}. \tag{32} \]

Again, only $O(n)$ operations are required in order to obtain $\lambda_i = s_{i-1}^T B_i s_{i-1}$.
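The recurrence can be verified on random data. The following sketch assumes, as in the text, that $B_i = \mathrm{BFGS}(B_{i-1}, r_{i-1}, w_{i-1})$ and that $B_{i-1} s_{i-1} = -t_{i-1} g_{i-1}$; the function names and the test data are hypothetical, and the explicit matrix is formed only to provide a reference value.

```python
import numpy as np


def bfgs_update(B, r, w):
    """Explicit BFGS(B, r, w), used here only as an O(n^2) reference."""
    Br = B @ r
    return B - np.outer(Br, Br) / (r @ Br) + np.outer(w, w) / (w @ r)


def lambda_recurrence(lam_prev, t_prev, gamma_prev, s1, s2, y1, y2, g1):
    """Return lambda_i = s_{i-1}^T B_i s_{i-1} via (30)-(32) in O(n) work.

    s1, y1, g1, t_prev belong to iteration i-1; s2, y2 to iteration i-2;
    lam_prev is lambda_{i-1} = s_{i-2}^T B_{i-1} s_{i-2}.
    """
    r = s1 - gamma_prev * s2                                   # (31)
    w = y1 - gamma_prev * y2                                   # (32)
    pi = t_prev * (s1 @ g1) + gamma_prev**2 * lam_prev - 2.0 * t_prev * (r @ g1)
    return (-t_prev * (s1 @ g1)
            + (s1 @ w) ** 2 / (r @ w)
            - t_prev**2 * (r @ g1) ** 2 / pi)


# Hypothetical data for the consistency check.
rng = np.random.default_rng(0)
n = 5
X = rng.standard_normal((n, n))
B_prev = X @ X.T + n * np.eye(n)             # B_{i-1}, symmetric positive definite
A = B_prev + np.eye(n)                       # stand-in Hessian, so that r^T w > 0
g1 = rng.standard_normal(n)                  # g_{i-1}
t_prev, gamma_prev = 0.7, 0.3
s1 = -t_prev * np.linalg.solve(B_prev, g1)   # enforces B_{i-1} s_{i-1} = -t_{i-1} g_{i-1}
s2 = rng.standard_normal(n)                  # s_{i-2}
y1, y2 = A @ s1, A @ s2

lam_fast = lambda_recurrence(s2 @ B_prev @ s2, t_prev, gamma_prev, s1, s2, y1, y2, g1)
B_cur = bfgs_update(B_prev, s1 - gamma_prev * s2, y1 - gamma_prev * y2)
lam_explicit = s1 @ B_cur @ s1
```

In an actual implementation `lam_prev` would be carried over from the previous iteration rather than recomputed from the matrix.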

4.5 New methods

In fact, the algorithm denoted by B was developed before the derivation of
the recurrence (30), so we have a total of five new algorithms to compare with
the existing methods BFGS, A (using $M = I$) and B (using $M = B_i$ and
alternation). For consistency of notation, we will now rename B as Balt, since
it uses alternation. The five new algorithms are therefore

1. Brecur (using $M = B_i$ and the recurrence (30))
2. Calt (using $M = H_{i-1}$ and alternation)
3. Crecur (using $M = H_{i-1}$ and the recurrence (30))
4. Dalt (using $M = H_{i+1}$ and alternation)
5. Drecur (using $M = H_{i+1}$ and the recurrence (30)).

5 Numerical experiments
The algorithms Calt, Crecur, Dalt and Drecur derived from the new implicit
updates were compared with each other, in our first set of experiments. All the
multi-step algorithms tested in these and the following experiments employed
the BFGS formula to update the inverse Hessian approximations $H_i = B_i^{-1}$,
but with the usual vectors $s_i$ and $y_i$ replaced by $r_i$ and $w_i$:

\[ H_{i+1} = H_i + \left(1 + \frac{w_i^T H_i w_i}{r_i^T w_i}\right) \frac{r_i r_i^T}{r_i^T w_i} - \frac{H_i w_i r_i^T + r_i w_i^T H_i}{r_i^T w_i}. \tag{33} \]
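A direct transcription of (33) is given below as a minimal sketch; the function name and the test vectors are hypothetical. The check confirms that the updated matrix satisfies the inverse form $H_{i+1} w_i = r_i$ of condition (2) and agrees with the inverse of the matrix produced by (6).

```python
import numpy as np


def inverse_bfgs_update(H, r, w):
    """Update (33) for the inverse Hessian approximation; assumes H symmetric, r^T w > 0."""
    rw = r @ w
    Hw = H @ w
    return (H
            + (1.0 + (w @ Hw) / rw) * np.outer(r, r) / rw
            - (np.outer(Hw, r) + np.outer(r, Hw)) / rw)


# Hypothetical data.
B = np.array([[3.0, 1.0, 0.0], [1.0, 2.0, 0.5], [0.0, 0.5, 4.0]])
r = np.array([1.0, -0.5, 0.25])
w = np.array([2.0, 0.5, 1.0])          # r @ w = 2.0 > 0

H_next = inverse_bfgs_update(np.linalg.inv(B), r, w)

# Reference: the direct update (6) applied to B, then inverted.
Br = B @ r
B_next = B - np.outer(Br, Br) / (r @ Br) + np.outer(w, w) / (w @ r)
```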

The line-search employed by all the algorithms was an implementation of safeguarded
cubic interpolation and was required to produce a point $x_{i+1}$ satisfying
the following standard stability conditions (see Fletcher [2], for example):

\[ f(x_{i+1}) \le f(x_i) + 10^{-4} s_i^T g(x_i); \tag{34} \]
\[ s_i^T g(x_{i+1}) \ge 0.9\, s_i^T g(x_i). \tag{35} \]
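Conditions (34) and (35) are straightforward to test for a trial step. The sketch below only checks the two inequalities and does not implement the safeguarded cubic interpolation itself; the function name and the quadratic example are hypothetical.

```python
import numpy as np


def satisfies_line_search_conditions(f, grad, x, s, c1=1e-4, c2=0.9):
    """Check (34) and (35) for the trial point x + s."""
    x_new = x + s
    slope = s @ grad(x)                                  # s_i^T g(x_i)
    sufficient_decrease = f(x_new) <= f(x) + c1 * slope  # condition (34)
    curvature_ok = s @ grad(x_new) >= c2 * slope         # condition (35)
    return bool(sufficient_decrease and curvature_ok)


# Hypothetical example: f(x) = 0.5 * x^T x, whose gradient is x.
f = lambda x: 0.5 * (x @ x)
grad = lambda x: x
x0 = np.array([1.0, 1.0])

full_step_ok = satisfies_line_search_conditions(f, grad, x0, -x0)
tiny_step_ok = satisfies_line_search_conditions(f, grad, x0, -1e-3 * x0)
```

The very short step reduces $f$ but is rejected by (35), which is the purpose of the second condition.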
The suite of test functions used in the experiments is as described in Ford
and Moghrabi [4], with a small number of modifications to starting-points and
convergence criteria. The suite comprises 107 functions, in all, and includes
many test functions which are well-documented in the literature (for example,
Moré, Garbow and Hillstrom [11], Conn et al. [1] and Toint [12]), such as the
Penalty I and II functions, the Chained Wood function, the Discrete Boundary-Value
function, the Engvall function, the Discrete Integral Equation function
and the Gragg-Levy function. The dimensions of these test problems ranged
from 2 to 80. For each function, four starting-points were used, giving a total of
428 test problems. For convenience, the functions were notionally classified into
those of 'low' ($2 \le n \le 15$), 'medium' ($16 \le n \le 45$) and 'high' ($46 \le n \le 80$)
dimension. In total, there were 29 functions in the 'low' set, 43 in the 'medium'
set and 35 in the 'high' set, giving 116, 172 and 140 test problems in the
respective sets.

Table 1. Comparison of four new minimum curvature methods

Problem set   Low             Medium          High            Combined
Calt          20739 (15148)   32634 (28769)   25085 (23239)   78458 (67156)
Scores        45              84              71              200
Crecur        21820 (15659)   35036 (30606)   24561 (22926)   81417 (69191)
Scores        38              73              64              175
Dalt          20898 (15560)   37327 (33663)   29485 (27933)   87710 (77156)
Scores        31              17              8               56
Drecur        21172 (15299)   36247 (32237)   27571 (26001)   84990 (73537)
Scores        31              10              6               47

Results from these first experiments are summarized in Table 1. Each of
the tables which we present is divided into five columns, three of which correspond
to the subsets of functions referred to above, while the last refers to
the complete set. The main entry for a method in each column gives the total
number of function / gradient evaluations required by that method to solve
all the problems in the specified set, followed by the total number of iterations
(in brackets). A 'best performance' for each problem was decided on the basis
of the lowest number of evaluations, with ties resolved by the number of iter-
ations. The row labelled 'Scores' shows the number of best performances by
each method for the relevant set.
It is evident, from Table 1, that methods based on the implicit update
D are not competitive with methods based on C. (Comparison with Table 2
below shows that they are not even competitive, on the problems with highest
dimension, with the original "minimum curvature" method A.) Although this
result may be somewhat surprising (it might have been expected, instead,
that an update employing the 'latest' data would be more successful still than
the method using $M = B_i$, let alone the method A), we point out that it is
consistent with the results obtained by Ford and Tharmlikit [10] when using
a similar implicit update. On the basis of the results reported in Table 1, the
methods Dalt and Drecur will not be considered further here.

Table 2. Comparison of new and old minimum curvature methods

              Low               Medium            High              Combined
BFGS          21269 (16073)     42202 (38478)     33549 (32192)     97020 (86743)
Ratios        100.0% (100.0%)   100.0% (100.0%)   100.0% (100.0%)   100.0% (100.0%)
A             21053 (15300)     32769 (28898)‡    25721 (23908)     79543 (68106)‡
Ratios        99.0% (95.2%)     77.6% (75.1%)‡    76.7% (74.3%)     82.0% (78.5%)‡
Scores        30                17                12                59
Balt          21201 (15557)     34848 (30625)     25083 (23306)     81132 (69488)
Ratios        99.7% (96.8%)     82.6% (79.6%)     74.8% (72.4%)     83.6% (80.1%)
Scores        31                55                53                139
Brecur        22015 (15869)     36646 (31840)     24853 (23152)     83514 (70861)
Ratios        103.5% (98.7%)    86.8% (82.7%)     74.1% (71.9%)     86.1% (81.7%)
Scores        18                21                16                55
Calt          20739 (15148)     32634 (28769)     25085 (23239)     78458 (67156)
Ratios        97.5% (94.2%)     77.3% (74.8%)     74.8% (72.2%)     80.9% (77.4%)
Scores        35                77                53                165
Crecur        21820 (15659)     35036 (30606)     24561 (22926)     81417 (69191)
Ratios        102.6% (97.4%)    83.0% (79.5%)     73.2% (71.2%)     83.9% (79.8%)
Scores        29                56                48                133
A second set of experiments (using the same test functions) was then conducted,
in order to compare the more successful new methods Calt and Crecur
with the existing "minimum curvature" methods A and Balt, and the new
version Brecur of B. These results are reported in Table 2. For comparison,
we have also included the results returned by the standard single-step BFGS
method. In this Table, the entries in each 'Ratios' row give the proportions
of evaluations and (in brackets) iterations for that method, expressed as percentages
of the corresponding figures for the BFGS method. Finally, scores
(indicating 'best performances') are recorded again for this set of experiments,
but we point out that the results returned by the BFGS method were not
included in assessing these best performances, since our primary purpose is to
compare the "minimum curvature" methods. (The notation ‡ placed against
two of the results for the method A indicates that there was one failure for this
method [for a test problem in the Medium category], where it was unable to
converge to an acceptable minimum within the permitted limit of 5000 iterations.
The evaluations and iterations incurred on this failure have not been
included in the relevant totals.)
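The 'Ratios' rows can be reproduced directly from the totals. The short sketch below recomputes the Combined column of Table 2 from the tabulated evaluation and iteration counts; the dictionary layout and the function name are our own.

```python
# Combined totals (evaluations, iterations) copied from Table 2.
totals = {
    "BFGS": (97020, 86743),
    "A": (79543, 68106),
    "Balt": (81132, 69488),
    "Brecur": (83514, 70861),
    "Calt": (78458, 67156),
    "Crecur": (81417, 69191),
}


def ratios(method, baseline="BFGS"):
    """Percentages of the baseline figures, rounded to one decimal place."""
    return tuple(round(100.0 * m / b, 1)
                 for m, b in zip(totals[method], totals[baseline]))


calt_ratios = ratios("Calt")
crecur_ratios = ratios("Crecur")
```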

6 Summary and conclusions


It has been shown how the technique of "implicit updates" may be embedded
within the "minimum curvature" approach to determining smooth interpolatory
polynomials for use in two-step quasi-Newton methods and, further, how
this embedding can be carried out in a computationally inexpensive manner
(without, for example, requiring any additional $O(n^2)$ quasi-Newton updates).
Numerical experiments have demonstrated that methods based on the implicit
update D are not competitive with other "minimum curvature" methods. Further
experiments have shown that some "minimum curvature" methods are
capable of out-performing the standard BFGS method on higher-dimension
problems by as much as 27-29% in terms of both function/gradient evaluations
and iterations. On the basis of total evaluations, total iterations and
scores, the most successful method is the alternating implicit method Calt introduced
in this paper. Its nearest competitors are Balt and Crecur. We also
note that methods based on the recurrence (30) tend to perform a little less
effectively than the corresponding methods employing alternation. The alternating
methods are, in addition, a little cheaper to operate, because they do
not need to compute the recurrence and because they only need to solve the
"minimum curvature" sub-problem once in every second iteration, instead of
on each iteration.

References
1. Conn, A.R., Gould, N.I.M., Toint, Ph.L. (1986): Testing a class of methods for
solving minimization problems with simple bounds on the variables. Research
Report CS-86-45, University of Waterloo.
2. Fletcher, R. (1987): Practical Methods of Optimization (2nd ed.). Wiley, New
York.
3. Ford, J.A. (2001): Implicit updates in multistep quasi-Newton methods. Comput.
Math. Appl., 42, 1083-1091.
4. Ford, J.A., Moghrabi, I.A. (1993): Alternative parameter choices for multi-step
quasi-Newton methods. Optimization Methods and Software, 2, 357-370.
5. Ford, J.A., Moghrabi, I.A. (1994): Multi-step quasi-Newton methods for optimization.
J. Comput. Appl. Math., 50, 305-323.
6. Ford, J.A., Moghrabi, I.A. (1996): Minimum curvature multistep quasi-Newton
methods. Comput. Math. Appl., 31, 179-186.
7. Ford, J.A., Moghrabi, I.A. (1997): Further development of minimum curvature
multi-step quasi-Newton methods. In: Bainov, D. (ed.), Proceedings of the Seventh
International Colloquium on Differential Equations. VSP, Utrecht, Tokyo.
8. Ford, J.A., Moghrabi, I.A. (1997): Alternating multi-step quasi-Newton methods
for unconstrained optimization. J. Comput. Appl. Math., 82, 105-116.
9. Ford, J.A., Moghrabi, I.A. (2002): On the Use of Alternation and Recurrences
in Two-Step Quasi-Newton Methods. Submitted to Comput. Math. Appl.
10. Ford, J.A., Tharmlikit, S. (2003): New implicit methods in multi-step quasi-Newton
methods for unconstrained optimisation. J. Comput. Appl. Math., 152,
133-146.
11. Moré, J.J., Garbow, B.S., Hillstrom, K.E. (1981): Testing unconstrained optimization
software. ACM Trans. Math. Software, 7, 17-41.
12. Toint, Ph.L. (1987): On large scale nonlinear least squares calculations. SIAM
J. Sci. Stat. Comput., 8, 416-435.
A Boundary Movement Identification Method
for a Parabolic Partial Differential Equation

Tom P. Fredman

Heat Engineering Laboratory, Åbo Akademi University, Biskopsgatan 8, FIN-20500 Åbo, Finland [email protected]

Summary. We study boundary movement identification for a parabolic partial
differential equation describing a dynamic diffusion process, on the basis of internally
recorded data. Formulated as a sideways diffusion equation, the problem is treated
by a spatial continuation technique to extend the solution to a known boundary
condition at the desired boundary position. Recording the positions traversed in the
continuation for each time instant yields the boundary position trajectory and hence
the solution of the identification problem. As the problem is ill-posed, a hyperbolic
approximation approach is used to regularize the computation and recast the equa-
tions into a form amenable to analysis.

1 Introduction
A common feature of inverse problems is their objective of determining the
"cause from the effects". A consequence of this is the ill-posed mathemat-
ical nature [1] of such formulations. This means that they do not satisfy
Hadamard's definition of well-posedness: (i) For all admissible data, a solu-
tion exists. (ii) For all admissible data, the solution is unique. (iii) The solu-
tion depends continuously on the data. Practical consequences of ill-posedness
are strong error growth in the computation and the fact that approximation
errors as well as measurement noise cause the results to blow up. Hence, special
regularization techniques are necessary for stabilizing the computations.
Here, estimation of boundary position from real process data is considered
based on a regularized, slowly divergent space marching [6] method. Possible
approaches for this have been adopted for the sideways heat equation [2], in-
cluding sequential and entire time-domain computation by output-norm mini-
mization as well as direct methods. An entire time-domain direct computation
method will be demonstrated by application to simulated input data contam-
inated by random noise. If necessary, nonlinearities through variation (tem-
perature, time or geometry) of material properties and mixed (Robin type)
boundary conditions are readily included. The integrated regularization and
straightforward space marching algorithm make the method especially suited
to industrial applications. To our knowledge, the proposed approach has not
previously been used to tackle dynamical boundary identification problems,
although the main component in the calculation - the sideways diffusion equa-
tion - has been extensively investigated.
The sideways diffusion equation is a useful formulation in boundary estimation
when the boundary shape follows a certain solution value or conforms
to a specified flux. Then, an iterative scheme can be devised for identifying
the boundary. Solution of the sideways diffusion equation is feasible through
discretization of the time variable, using a stabilizing approximation, and solution
of the resulting system of ordinary differential equations [8, 10]. Possible
stabilizing approximations are central and forward differences [9, 10], Fourier
transform, wavelets or mollification filtering [11]. Our approach follows that
of Weber [22], who approximated the sideways heat equation by a well-posed
hyperbolic partial differential equation. In fact, such a model of heat conduction
(frequently termed the telegraph equation) had been proposed by Morse
and Feshbach [16], who pointed out the nonphysical nature of instantaneous
heat transfer in space permitted by the conventional parabolic heat conduction
model.

2 Hyperbolic regularization

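Consider the hyperbolic system in the static and bounded domain

\[ \frac{\partial^2 u(x,t)}{\partial x^2} = \gamma^2 \frac{\partial^2 u(x,t)}{\partial t^2} + \frac{\partial u(x,t)}{\partial t}, \qquad (x,t) \in \,]0,S[\, \times \,]0,t_f[, \tag{1} \]

\[ u(0,t) = u_m(t), \qquad \left(\frac{\partial u(x,t)}{\partial x}\right)_{x=0} = q_m(t), \tag{2} \]

\[ u(x,0) = u_0(x), \qquad u(x,t_f) = u_f(x). \tag{3} \]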
Here, $\gamma > 0$ is a small regularization parameter and $u_f$ an arbitrary function.
For consistency, it is presumed that $u_m(0) = u_0(0)$ and $u_m(t_f) = u_f(0)$. The
characteristics of (1) are straight lines with slopes $\pm\gamma$ and since the domain
of dependence for the solution at any point is bounded by the characteristics
through that point, influence of the endpoint condition $u(x,t_f) = u_f(x)$ becomes
negligible as $\gamma$ becomes small. Since (1, 2, 3) is well-posed and, as will
be seen, a regularization of the conventional sideways diffusion problem, it can
be used as an approximation for analysis of the conventional problem. There is
a well-established theory for this type of partial differential equation, ensuring
existence, uniqueness and stability of the solution for realistic measurement
data $u_m(t)$, $q_m(t)$. In the case of a moving boundary, $(x,t) \in \,]0,s(t)[\, \times \,]0,t_f[$,
to be determined from the measurements, a priori conditions on the solution
together with the hyperbolic regularization ensure a meaningful solution.
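To indicate how a space marching computation for (1, 2, 3) may be organized, the following minimal sketch uses explicit central differences in both variables. This particular discretization, the function name `space_march` and the exponential test solution are assumptions made for the illustration and are not taken from the method described here; the step restriction $\gamma\,\Delta x \le \Delta t$ reflects the characteristic slopes $\pm\gamma$.

```python
import numpy as np


def space_march(u_m, q_m, u0, uf, gamma, S, t_f, n_t, n_x):
    """March u_xx = gamma^2 u_tt + u_t from x = 0 to x = S on a uniform grid."""
    t = np.linspace(0.0, t_f, n_t + 1)
    x = np.linspace(0.0, S, n_x + 1)
    dt, dx = t[1] - t[0], x[1] - x[0]
    if gamma * dx > dt:
        raise ValueError("step sizes violate gamma * dx <= dt")

    def rhs(v):
        """gamma^2 v_tt + v_t by central differences at the interior times."""
        out = np.zeros_like(v)
        out[1:-1] = (gamma**2 * (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dt**2
                     + (v[2:] - v[:-2]) / (2.0 * dt))
        return out

    u = np.empty((n_x + 1, n_t + 1))
    u[0] = u_m(t)
    # First step: Taylor expansion in x using the flux (2) and u_xx from (1).
    u[1] = u[0] + dx * q_m(t) + 0.5 * dx**2 * rhs(u[0])
    u[1, 0], u[1, -1] = u0(x[1]), uf(x[1])
    for j in range(1, n_x):
        u[j + 1] = 2.0 * u[j] - u[j - 1] + dx**2 * rhs(u[j])
        # Conditions (3) at t = 0 and t = t_f.
        u[j + 1, 0], u[j + 1, -1] = u0(x[j + 1]), uf(x[j + 1])
    return x, t, u


# Hypothetical noise-free test: u = exp(a x + t) solves (1) when a^2 = gamma^2 + 1.
gamma = 0.5
a = np.sqrt(gamma**2 + 1.0)
x, t, u = space_march(lambda tt: np.exp(tt), lambda tt: a * np.exp(tt),
                      lambda xx: np.exp(a * xx), lambda xx: np.exp(a * xx + 1.0),
                      gamma, S=0.5, t_f=1.0, n_t=100, n_x=50)
exact = np.exp(a * x[:, None] + t[None, :])
max_error = float(np.max(np.abs(u - exact)))
```

The test uses exact data only; the behavior under measurement noise is not examined by this sketch.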
Conceptually (1, 2, 3) is an admissible regularization of the conventional
sideways diffusion problem, where it is desired to obtain the solution and flux
at $x = S$, if it is required that $\gamma \downarrow 0$ as the measurement error diminishes to
zero. Assuming the solution and flux at the origin in (2) can be approximated
by solution of a direct parabolic diffusion equation in $x \in \,]-\infty,0[$, the solution
at $x = S$ takes a particularly simple form. Then, the solution trajectory $f(t)$
at the position $S$ can be solved from the Volterra integral equation

\[ (Kf)(t) = u_m(t), \qquad 0 \le t < \infty, \tag{4} \]

where the integral operator is defined by

\[ (Kf)(t) = \frac{S}{2\sqrt{\pi}} \int_0^t \frac{f(\tau)}{(t-\tau)^{3/2}} \exp\left[ \frac{-S^2}{4(t-\tau)} \right] d\tau. \tag{5} \]
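The forward operator in (5) is easy to evaluate numerically, in contrast to its inversion. The sketch below applies a plain trapezoidal rule after the substitution $w = t - \tau$; the function name `apply_K` and the number of quadrature points are hypothetical choices. For $f \equiv 1$ the integral has the closed form $\operatorname{erfc}(S/(2\sqrt{t}))$, which serves as a check.

```python
import math

import numpy as np


def apply_K(f, t, S, n=20000):
    """Approximate (Kf)(t) from (5) with the trapezoidal rule in w = t - tau."""
    # The kernel vanishes rapidly as w -> 0+, so the grid starts at w = t/n.
    w = np.linspace(t / n, t, n)
    kernel = S / (2.0 * math.sqrt(math.pi)) * w ** -1.5 * np.exp(-S**2 / (4.0 * w))
    integrand = f(t - w) * kernel
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(w)))


# Check against the closed form for a constant trajectory f = 1.
value = apply_K(lambda tau: np.ones_like(tau), t=1.0, S=1.0)
reference = math.erfc(0.5)
```

The uniform grid is adequate only when $S^2/t$ is not very small; otherwise the kernel becomes sharply peaked near $\tau = t$ and a graded grid would be needed.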

The integral equation (4) is inherently ill-posed although exact knowledge of
$u_m(t)$ on any finite time horizon uniquely determines the sought temperature
trajectory $f(t)$ on the same time interval. The following Lemma was proved
by Carasso [5], using auxiliary results in [13].
Lemma 1 Let $t_f > 0$. Then, $\forall p$, $1 \le p \le \infty$, $K$ is a compact, linear operator
in $L^p[0, t_f]$. $Kf = 0$ implies $f = 0$. Thus, $K^{-1}$ exists and is unbounded.
In the literature, various additional regularization methods are reported in
conjunction with iterative solution of the sideways diffusion equation. Filtering
has been done, either directly in the frequency domain [18], through higher-order
finite differencing [20] or mollification [14, 15, 2]. Wavelets and spline
approximation have also been applied in some cases to ensure well-behaved
computation [3, 11, 2]. For our purposes, regularization beyond hyperbolic approximation
is unnecessary and would only require further tradeoffs in solution
accuracy.

3 Boundary identification

3.1 Existence, Uniqueness and Stability

If the space and time variables of (1, 2, 3) are interchanged, a conventional
hyperbolic initial-boundary value problem is obtained, to which a solution
$u(x,t)$, $0 < x \le S$ can be obtained using Riemann functions [21, 19, 22].
From such a solution, existence, uniqueness and continuity with respect to
initial data, here the measurements $u_m(t)$ and $q_m(t)$ in (2), can be verified.
Furthermore, as the solution and distance are connected through the flux,
stability of the boundary identification problem follows whenever a nonzero
flux and parameter $\gamma$ may be presumed. More interesting is the error estimate
of how well the case $\gamma > 0$ approximates the case $\gamma = 0$. Such an estimate was
obtained by Eldén [7], indicating the same log-convex behavior as for stability
of the sideways heat conduction equation [9, 11].
A different approach to investigating existence, uniqueness and stability of
the boundary identification is to convert the problem to a coefficient estimation
problem by introducing the spatial variable $y(t) = x/s(t)$. For $\gamma = 0$ the
objective is then to find the solution $(v(y,t), 1/s^2(t))$ to

\[ \frac{1}{s^2(t)} \frac{\partial^2 v(y,t)}{\partial y^2} = \frac{\partial v(y,t)}{\partial t}, \qquad (y,t) \in \,]0,1[\, \times \,]0,t_f[, \tag{6} \]

\[ v(0,t) = u_m(t), \qquad \left(\frac{\partial v(y,t)}{\partial y}\right)_{y=0} = q_m(t), \tag{7} \]
(8)
With certain restrictions on the input data um(t), qm(t), this problem can be
solved "in a well-posed manner", quoting [4], where the problem was treated in
a modern way applied to a multidimensional geometry. For the one-dimensional
situation in (6, 7, 8) the input data must be differentiable and at least one of
the measured functions must be monotone [12]. For our purposes this formu-
lation is too restrictive, as in particular um(t) may be an oscillating function
and all inputs may contain noise. The coefficient estimation setting is useful,
however, in demonstrating the requirements on the data to achieve a well-
posed computation and in illustrating the degree to which our problem can be
considered ill-posed.
An alternative strategy for stabilization at $\gamma = 0$ with respect to input
noise can be formulated in a case where $s(t) \ll S$ and an a priori bound
$u(S, t) \le M$ as in (9) exists with $u(s(t), t) \ll u(S, t)$. When the solution
and flux at the origin in (2) can be approximated by solution of a direct
parabolic diffusion equation in $x \in \,]-\infty, 0[$, a bound on the measurement
error amounting to $\varepsilon$ results in a log-convex stability bound proportional to
$M^{x/S} \varepsilon^{1-x/S}$ for the solution [9, 11]. For $\gamma > 0$, a similar, slightly less tight
bound has been established [7].
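The log-convex bound above interpolates between the noise level at the sensor ($x = 0$) and the a priori bound at $x = S$. The following is a minimal Python sketch of that interpolation; the function name, the unit proportionality constant and the numerical values are assumptions made for illustration and are not taken from [9, 11].

```python
def log_convex_bound(x, S, M, eps):
    """Evaluate M**(x/S) * eps**(1 - x/S) for 0 <= x <= S.

    M is the a priori bound at x = S and eps the measurement error level.
    The proportionality constant of the bound is assumed to be one here.
    """
    if not 0.0 <= x <= S:
        raise ValueError("x must lie in [0, S]")
    weight = x / S
    return M ** weight * eps ** (1.0 - weight)


# Hypothetical values, chosen only to show the behavior of the bound.
M, eps, S = 1.0, 1e-3, 1.0
bounds = [log_convex_bound(x, S, M, eps) for x in (0.0, 0.5, 1.0)]
```

The bound equals the noise level at the sensor and degrades monotonically to the a priori bound as the marching distance approaches $S$.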
Discretization of the time variable in (1, 2, 3) and solution of the resulting
system of ordinary differential equations in the space variable has a stabilizing
effect as such, since this prevents blow-up of high-frequency components in
the solution [9]. Discretization represents the unbounded time-differentiation
operator on the left-hand side of (1) by a bounded matrix; although this matrix
has a high condition number, it limits magnification of measurement noise. The
recommended step size for time discretization is $t_f (\log(M/\varepsilon))^{-2}/2$ [9, 10].
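As a worked illustration of the recommended step size, the sketch below evaluates $t_f (\log(M/\varepsilon))^{-2}/2$ in Python. It assumes that the logarithm is the natural logarithm, which the text does not state explicitly; the function name and the sample values are hypothetical.

```python
import math


def recommended_time_step(t_f, M, eps):
    """Return t_f / (2 * log(M / eps)**2), assuming a natural logarithm."""
    if not M > eps > 0.0:
        raise ValueError("expected M > eps > 0")
    return t_f / (2.0 * math.log(M / eps) ** 2)


# Hypothetical values: unit time horizon, a priori bound 1, noise level 1e-3.
dt = recommended_time_step(t_f=1.0, M=1.0, eps=1e-3)
```

A smaller noise level gives a smaller recommended step, since a finer time grid passes more high-frequency content to the space marching.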

3.2 Preliminaries

Consider now the situation depicted in the (x, t)-plane in Fig. 1 for (1) having
a moving boundary s(t) > 0 and a specified solution function on this boundary,
i.e., u(s(t), t) = us(t). It is presumed that this function is such that there exists
an intermediate value solution s(t), i.e.,

$$u_m(t) \le u_s(t) \le u(S, t) \quad \forall\, 0 \le t \le t_f. \qquad (9)$$

Furthermore, for solution uniqueness it is presumed that the desired boundary
trajectory is found when reaching the correct boundary solution function the
first time according to the procedure:
340 T.P. Fredman

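[Figure 1: schematic in the (x, t)-plane up to t = t_f, showing the measurements u_m(t) and q_m(t) at x = 0, the boundary trajectory s(t), the position S and the direction of continuation.]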

Fig. 1. The relation between domains for the classical sideways diffusion problem
and the boundary identification problem for a finite time horizon

- As continuation proceeds and the solution boundary condition is reached,
  the solution in $\mathcal{D}_S \setminus \mathcal{D}_s$ is put equal to $u_s(t)$.
- As continuation proceeds and the solution boundary condition is reached,
  the flux $\partial u(x, t)/\partial x$ in $\mathcal{D}_S \setminus \mathcal{D}_s$ is put equal to zero.
Keeping in mind the regularized formulation of the problem as a damped wave
equation (1), it is clear that the speed of boundary movement $s'(t)$ must not
exceed the wave velocity $1/\gamma$ in absolute value. The rather restrictive
assumption (9) together with the above procedure ensures (by a simple intermediate
value argument) that the condition $u(s(t), t) = u_s(t)$ can be satisfied for almost
all times, yielding a unique $s(t)$. Hence, conditions (i) and (ii) of the
introduction can be considered fulfilled through formulation, noting that (9) is
realistic for the class of industrial boundary identification problems we have in
mind for application of the method proposed in this paper. Condition (iii) remains
and is tackled as follows: accept possible invalidity of (iii) (for $\gamma = 0$) and
devise a method diverging slowly enough that moderate measurement errors in
$u_m(t)$, $q_m(t)$ are amplified only to within bounds of resolution. A Lax-Richtmyer
analysis is given below for theoretical discussion of this issue.
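The speed restriction on the boundary can be checked on sampled data. The sketch below is a hypothetical helper (not part of the paper) that estimates the boundary speed by first-order differences of a sampled trajectory and compares it with the wave velocity $1/\gamma$; the sample trajectory is invented for illustration.

```python
def boundary_speed_admissible(s_samples, dt, gamma):
    """Check |s'(t)| <= 1/gamma using first-order differences of s_samples."""
    if gamma <= 0.0:
        raise ValueError("gamma must be positive for a finite wave velocity")
    speed_limit = 1.0 / gamma
    return all(abs(later - earlier) / dt <= speed_limit
               for earlier, later in zip(s_samples, s_samples[1:]))


# Hypothetical trajectory s(t) = 0.2 + 0.1 t sampled with step 0.01.
dt = 0.01
s_samples = [0.2 + 0.1 * k * dt for k in range(101)]
slow_enough = boundary_speed_admissible(s_samples, dt, gamma=0.01)
too_fast = boundary_speed_admissible(s_samples, dt, gamma=20.0)
```

For small $\gamma$ the limit $1/\gamma$ is large, so the restriction is mild; it only becomes binding when the regularization parameter is large.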

4 Computational method
4.1 Cauchy Problem

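Transform (1, 2, 3) to an abstract Cauchy problem applying the method of
lines by using subscript $x$ for spatial differentiation

$$\begin{bmatrix} u \\ u_x \end{bmatrix}_x = \begin{bmatrix} 0 & I \\ \frac{\partial}{\partial t} + \gamma^2 \frac{\partial^2}{\partial t^2} & 0 \end{bmatrix} \begin{bmatrix} u \\ u_x \end{bmatrix}. \qquad (10)$$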

(11)

Here, the initial and endpoint conditions (3) will be incorporated into the
differential operator in the matrix of (10).

4.2 Solution Procedure

Adopting a difference approximation to the differential operator

$$\left( \frac{\partial}{\partial t} + \gamma^2 \frac{\partial^2}{\partial t^2} \right) u(x, t) \approx \Delta_t u(x, t), \qquad (12)$$

the sampled measurement data

(13)

can be used for integration of (10) to obtain the discrete solution profile

$$U(x) = [u(x, 0) \;\; u(x, \Delta t) \;\; \ldots \;\; u(x, N \Delta t)]^T, \quad 0 \le x \le s(t), \qquad (14)$$
Direct numerical integration yields solution trajectories (functions of time in
vector form) at positions $0 < x \le s(t)$. When an element of the solution
vector reaches the boundary condition $u(s(t), t) = u_s(t)$, this element is excluded
and the time interval split into two subintervals for which the computation
is repeated. To obtain the boundary position at the element, record the in-
tegration distance. Depending on the shape of the initial-value vector (13),
a small number of such interval divisions are necessary to satisfy the bound-
ary condition for most of the nodes. For each subinterval, starting point and
endpoint conditions are not known (except for the first interval starting point,
specified by the initial condition at t = 0). Therefore, it is reasonable to as-
sume that the solution extends smoothly across these points and to introduce
a corresponding linear extrapolation [10] into the difference scheme (12).
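The recording of the integration distance can be sketched as follows. This is a simplified illustration, not the procedure of the paper: it only detects, for each time node, the first marching position at which the solution reaches $u_s$, and it omits the splitting of the time interval and the repeated computation. The function name, the linear interpolation between marching positions and the synthetic data are assumptions of this sketch.

```python
def first_crossing_positions(profiles, dx, target):
    """Return, per time node, the first marching distance at which u reaches u_s.

    profiles[n][j] approximates u(n*dx, j*dt); target[j] approximates u_s(j*dt).
    Nodes that never reach the target keep the value None.
    """
    positions = [None] * len(target)
    for n in range(1, len(profiles)):
        previous, current = profiles[n - 1], profiles[n]
        for j, u_s in enumerate(target):
            if positions[j] is None and previous[j] < u_s <= current[j]:
                # Linear interpolation between the two marching positions
                # (an addition of this sketch, not prescribed by the text).
                fraction = (u_s - previous[j]) / (current[j] - previous[j])
                positions[j] = (n - 1 + fraction) * dx
    return positions


# Synthetic profiles u(x, t_j) = (1 + j) * x, so the exact crossing is u_s / (1 + j).
dx = 0.1
profiles = [[(1 + j) * n * dx for j in range(3)] for n in range(11)]
identified = first_crossing_positions(profiles, dx, target=[0.45, 0.45, 0.45])
```

Taking only the first crossing mirrors the uniqueness convention stated earlier, namely that the boundary is found when the boundary solution function is reached for the first time.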

4.3 Lax-Richtmyer Analysis

Applying the Lax-Richtmyer theory [17], the discretized solution operator
of (10), viewed as the mapping from the noisy sensor data to the unknown
boundary, can be investigated [6]. The system (10), discretized through (12),
can hence be considered as a space marching scheme

[Figure 2: amplification factor (logarithmic axis, 10^0 to 10^5) versus normalized frequency θ from 0 to 3; curves: forward, forward hyperbolic, central, analytic.]

Fig. 2. Amplification factors for the analytical case, two-point central (S3 in [6])
and four-point forward difference approximations ((17) for $\gamma = 0$) and the forward
difference hyperbolic regularization in (17)

Introducing the Fourier image $G(\Delta x, \theta)$, in terms of the normalized frequency
$\theta$, of $C(\Delta x, \Delta t)$, one has the relation

(16)

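where $\|\cdot\|$ denotes the $L^2$ norm in the time variable, $\|\cdot\|_{l^2}$ the Euclidean norm
and $P$ is the maximum number of spatial steps required to reach the desired
boundary. The matrix $G(\theta)$ is $2 \times 2$ and, e.g., a four-step forward difference
approximation to $\partial/\partial t$ combined with a central difference approximation to
$\partial^2/\partial t^2$ in (12) yields

$$G(\theta) = \begin{bmatrix} 1 & \Delta x \\ \frac{\Delta x}{4 \Delta t} \left( e^{4i\theta} - 1 \right) + \gamma^2 \frac{\Delta x}{(\Delta t)^2} \left( e^{i\theta} - 2 + e^{-i\theta} \right) & 0 \end{bmatrix}. \qquad (17)$$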

For comparison, the corresponding analytical solution operator for one space
step $\Delta x$ is a convolution kernel with the Fourier image $e^{\Delta x \sqrt{i\theta/\Delta t}}$. The
magnitude of this is $g(\Delta x, \theta) = e^{\Delta x \sqrt{|\theta|/(2\Delta t)}}$, to which $\|G(\Delta x, \theta)\|_{l^2}$ is a discrete
approximation. Thus, marching the input data $P$ spatial steps to reach the
desired boundary amplifies the noise by a factor $g(\Delta x, \theta)^P$. Alternative
discretizations (12) are treated extensively by Carasso [6], including the cases
with $\gamma = 0$ and central difference as well as the forward difference
approximation of $\partial/\partial t$ in (17). For the sake of illustration, an example computation with
exaggerated parameter values was performed, results of which are depicted in
Fig. 2 together with the analytic amplification factor $g(\Delta x, \theta)^P$ for the
parameter values $\Delta t = 0.5$, $\Delta x = 5 \cdot 10^{-4}$, $P = 10^4$ and $\gamma = 0.5$. From Fig. 2 it
appears that the hyperbolic regularization in (17) performs well with respect
to error growth.
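The analytic amplification factor can be evaluated directly. Since $\sqrt{i\theta}$ has real part $\sqrt{|\theta|/2}$, the modulus of the Fourier image $e^{\Delta x \sqrt{i\theta/\Delta t}}$ is $e^{\Delta x \sqrt{|\theta|/(2\Delta t)}}$. The Python sketch below evaluates $g(\Delta x, \theta)^P$ for the parameter values quoted for Fig. 2; the function name is hypothetical, and only the analytic curve is reproduced, not the discrete schemes.

```python
import math


def analytic_amplification(theta, dx, dt, steps):
    """Return g(dx, theta)**steps with g = exp(dx * sqrt(|theta| / (2 dt)))."""
    return math.exp(steps * dx * math.sqrt(abs(theta) / (2.0 * dt)))


# Parameter values quoted in the text for Fig. 2.
dt, dx, steps = 0.5, 5e-4, 10 ** 4
amplification = {theta: analytic_amplification(theta, dx, dt, steps)
                 for theta in (0.0, 1.0, 2.0, 3.0)}
```

With these values the exponent is $5\sqrt{\theta}$, so the factor stays between $10^3$ and $10^4$ at $\theta = 3$, consistent with the vertical range of Fig. 2.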


Fig. 3. (Left) Diffusion profile and flux measured at the origin (solid) together with
their theoretical trajectories (dashed), for increasing and oscillating thickness tra-
jectory. (Right) Corresponding identified (solid) and theoretical (dashed) thickness
trajectories

5 Example simulation

A simulation was carried out in order to test the ability of the outlined
computation method to identify different types of smooth thickness variation.
To generate corresponding simulated measurements at $x = 0$, the direct
parabolic problem ((1) with $\gamma = 0$) was solved with the theoretical
trajectories for $s(t)$ and $u_s(t)$ as boundary conditions. After superposition of
a normally distributed noise signal (relative amplitude 0.1%), the solid curves
$u_m(t)$ and $q_m(t)$, depicted in Fig. 3, were obtained. Subsequently, the
system (10, 13) was solved with the "measurement signals" as initial conditions
using a simple explicit Euler method for space marching with $\Delta x = 10^{-3}$,
$\Delta t = 7.9 \cdot 10^{-3}$ and $\gamma = 0.01$. Hence, the applicable Courant-Friedrichs-Lewy
condition $\gamma \Delta x / \Delta t \le 1$ is fulfilled. Identification of the boundary trajectory
from the direct integration as described above yielded the results on the
right-hand side of Fig. 3.
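The Courant-Friedrichs-Lewy condition can be verified with the step sizes quoted above. The snippet is a small arithmetic check in Python; the function name is hypothetical.

```python
def cfl_number(gamma, dx, dt):
    """Return gamma * dx / dt, the quantity bounded by one in the CFL condition."""
    return gamma * dx / dt


# Values quoted in the text for the example simulation.
cfl = cfl_number(gamma=0.01, dx=1e-3, dt=7.9e-3)
cfl_satisfied = cfl <= 1.0
```

The resulting number is of the order $10^{-3}$, so the condition holds with a wide margin for this choice of parameters.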
The results confirm the feasibility of the method for situations with
noisy measurement data. As seen in the figures, the solutions are stable and
consistent with the theoretical (dashed) curves. It should be noted that there
is a natural time lag involved before the variation induced by boundary
movement reaches the sensor location at $x = 0$. In the direct integration method, the
entire time domain is treated simultaneously without account of this causality
aspect. Causality of the problem would require measurement data for the
interval $(1, 1.1]$ for the solution of the final time interval $(0.9, 1.0]$ to be entirely
trustworthy, despite the extrapolation made for the initial and final solution
nodes (cf. the discussion in Elden [8]). Thus, if heads and tails of the function
trajectories are disregarded, the results are encouraging.

6 Conclusions

A simple one-dimensional, dynamic model for boundary identification was
formulated on the basis of direct integration of the sideways diffusion equation.
Simulated, noisy input signals were used to illustrate stability against measurement
errors and model response to different types of boundary variation. The results
indicate model feasibility by both of these criteria.

References
1. Alifanov, O. M. (1994): Inverse Heat Transfer Problems. International Series in
Heat and Mass Transfer, Springer-Verlag, Berlin
2. Berntsson, F. (2001): Numerical methods for solving a non-characteristic Cauchy
problem for a parabolic equation. Technical Report LiTH-MAT-R-2001-17,
Department of Mathematics, Linköping University, Linköping, Sweden
3. Berntsson, F., Elden, L., Loyd, D., Garcia-Padron, R. (1997): A comparison of
three numerical methods for an inverse heat conduction problem and an
industrial application. In: Proc. Tenth Int. Conf. for Numerical Methods in Thermal
Problems, volume X, Institute for Numerical Methods in Engineering, University
of Wales Swansea, Pineridge Press, Swansea, UK
4. Cannon, J. R., Rundell, W. (1991): Recovering a time dependent coefficient in a
parabolic differential equation. Journal of Mathematical Analysis and
Applications, 572-582
5. Carasso, A. (1982): Determining surface temperatures from interior observations.
SIAM J. Appl. Math., 42, 558-574
6. Carasso, A. S. (1992): Space marching difference schemes in the nonlinear inverse
heat conduction problem. Inverse Problems, 8, 25-43
7. Elden, L. (1988): Hyperbolic approximations for a Cauchy problem for the heat
equation. Inverse Problems, 4, 59-70
8. Elden, L. (1995): Numerical solution of the sideways heat equation. In: Engl,
H. W., Rundell, W. (eds.) Proc. GAMM-SIAM Symp., Gesellschaft für
Angewandte Mathematik und Mechanik, GAMM-SIAM, Regensburg, Germany
9. Elden, L. (1995): Numerical solution of the sideways heat equation by difference
approximation in time. Inverse Problems, 11, 913-923
10. Elden, L. (1997): Solving an inverse heat conduction problem by a "method of
lines". Transactions of the ASME, 119, 406-412

11. Elden, L., Berntsson, F., Reginska, T. (2000): Wavelet and Fourier methods for
solving the sideways heat equation. SIAM J. Sci. Comput., 21, 2187-2205
12. Jones, Jr, B. F. (1963): Various methods for finding unknown coefficients in
parabolic differential equations. Communications on Pure and Applied Mathe-
matics, XVI, 33-44
13. Kato, T. (1984): Perturbation Theory for Linear Operators, 2nd edition. A Series
of Comprehensive Studies in Mathematics, Springer-Verlag, Berlin
14. Mejia, C. E., Murio, D. A. (1993): Mollified hyperbolic method for coefficient
identification problems. Computers Math. Applic., 26, 1-12
15. Mejia, C. E., Murio, D. A. (1996): Numerical solution of generalized IHCP by
discrete mollification. Computers Math. Applic., 32, 33-50
16. Morse, P. M., Feshbach, H. (1953): Methods of Theoretical Physics, volume I of
International Series in Pure and Applied Physics. McGraw-Hill Book Company,
Inc., New York
17. Richtmyer, R. D., Morton, K. W. (1967): Difference Methods for Initial-Value
Problems, 2nd edition. Tracts in Pure and Applied Mathematics, Wiley Inter-
science, New York
18. Seidman, T. I., Elden, L. (1990): An 'optimal filtering' method for the sideways
heat equation. Inverse Problems, 6, 681-696
19. Sobolev, S. L. (1989): Partial Differential Equations of Mathematical Physics.
Dover Publications Inc., New York
20. Taler, J., Duda, P. (2001): Solution of non-linear inverse heat conduction prob-
lems using the method of lines. Heat and Mass Transfer, 37, 147-155
21. Tikhonov, A. N., Samarskii, A. A. (1990): Equations of Mathematical Physics.
Dover Publications Inc., New York
22. Weber, C. F. (1981): Analysis and solution of the ill-posed inverse heat
conduction problem. Int. J. Heat Mass Transfer, 24, 1783-1792
On Computational Properties of a Posteriori
Error Estimates Based upon the Method of
Duality Error Majorants

Maxim Frolov 1a, Pekka Neittaanmäki 1b and Sergey Repin 2

1 Department of Mathematical Information Technology, University of Jyväskylä,
P.O. Box 35, FIN-40014, Finland [email protected] a, [email protected] b
2 V.A. Steklov Institute of Mathematics in St.-Petersburg, 191011, Fontanka 27,
St.-Petersburg, Russia [email protected]

Summary. In the present paper, we analyze computational properties of the
functional type a posteriori error estimates that have been derived for elliptic type
boundary-value problems by duality theory in calculus of variations. We are con-
cerned with the ability of this type of a posteriori estimates to provide accurate
upper bounds of global errors and properly indicate the distribution of local ones.
These questions were analyzed on a series of boundary-value problems for linear
elliptic operators of 2nd and 4th order. The theoretical results are confirmed by nu-
merical tests in which the duality error majorant for the classical diffusion problem
is compared with the standard error indicator used in the MATLAB PDE Toolbox.
Numerical tests performed show that the meshes generated on the basis of the ma-
jorant are very close to those that would be computed if on each step of the mesh
refinement process we knew the exact error distribution. At the same time, meshes
generated by the MATLAB code may considerably differ from them.

1 Introduction
For several decades the attention of a number of authors has been focused
on questions of reliability and efficiency of calculations in computational en-
gineering. These questions are closely related to the progress in the theory of
a posteriori error control. On the one hand, it is necessary to have guaran-
teed upper bounds on the errors computed in a suitable norm. On the other
hand, it is very desirable to also have qualitative indication of their local
behavior. These efforts aim to decrease computational costs while ensuring
accurate and reliable modelling of physical phenomena.
Nowadays, in the framework of Finite Element Methods several approaches
to error control are used widely. The first of these was formulated at the end of
the 1970s in the works of Babuška and Rheinboldt (see [2, 3]). Further investigations
of this subject were pursued by a number of authors and the amount of the
corresponding literature is very large. The most complete description of the
methods and the associated literature are given, for instance, in [1, 4, 11].
The main idea of the method which we investigate in this paper was
introduced in [8, 9]. An important property of this method (which makes it different
from other approaches) is that the Duality Error Majorants (DEM) allow us to
estimate the accuracy of any conforming approximate solution independently
of the type of approximation (i.e., majorants are also suitable for methods other
than FEM). The main advantage of this technique is that it yields
guaranteed upper bounds of the energy norm of the error. In principle, bounds can
be computed as accurately as required (subject to the implied computational
effort). This fact has been mathematically justified in [6] and [10]. As has been
shown in [10], the method also provides local indication of the error. Both
results were confirmed by a considerable amount of numerical testing.
In view of the above-mentioned properties, it is natural to expect that
a combination of the DEM error estimator with standard packages will lead
to highly effective numerical procedures. This expectation is confirmed by the
tests performed. In this paper, we use the MATLAB PDE Toolbox for mesh
generation and adaptive refinements. In addition, we compare the majorant with
the standard error indicator of the toolbox.
In the very last example, we also present the results of numerical experi-
ments for the biharmonic problem.

2 Duality Error Majorant for a 2nd Order Model Problem

In this section, we take the classical diffusion problem with a Dirichlet type
boundary condition as a basis of our investigations (we state it in the varia-
tional form and call it the primal problem).
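Problem P. Find $u \in V$ such that

$$J(u) = \inf_{v \in V} J(v), \quad J(v) := \int_\Omega \left( \frac{1}{2} A \nabla v \cdot \nabla v - f v \right) dx,$$

where $V := \{ v \in H^1(\Omega) \mid v = w + u_0, \; w \in V_0 \}$ and $V_0 := \mathring{H}^1(\Omega)$.
It is assumed that $\Omega$ is a bounded connected domain in $\mathbb{R}^n$ with a Lipschitz
continuous boundary $\partial\Omega$, $f \in L^2(\Omega)$, $A \in M_s^{n \times n}$, and there exist positive
constants $\alpha_1$, $\alpha_2$ such that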

From the theory of the Calculus of Variations, it is well-known that Problem
P has the dual counterpart Problem P* (see, e.g., [5]).
Problem P*. Find $p^* \in Q_f^*$ such that

$$J^*(p^*) = \sup_{q^* \in Q_f^*} J^*(q^*), \quad J^*(q^*) := \int_\Omega \left( \nabla u_0 \cdot q^* - \frac{1}{2} A^{-1} q^* \cdot q^* - f u_0 \right) dx,$$

where $Q_f^* := \{ q^* \in L^2(\Omega, \mathbb{R}^n) \mid \int_\Omega q^* \cdot \nabla w \, dx = \int_\Omega f w \, dx \;\; \forall w \in V_0 \}$.


Solutions of Problem P and Problem P* satisfy the relations

$$J(u) = \inf_{v \in V} J(v) = \sup_{q^* \in Q_f^*} J^*(q^*) = J^*(p^*), \qquad (1)$$

$$p^* = A \nabla u, \quad -\nabla \cdot p^* = f \quad \text{a.e. in } \Omega. \qquad (2)$$


For any conforming approximate solution $v$, using the relation (1), we
immediately arrive at the a posteriori error estimate for the energy norm of the
error $e = v - u$

$$|||e|||^2 := \int_\Omega A \nabla e \cdot \nabla e \, dx \le \int_\Omega (A \nabla v - q^*) \cdot (\nabla v - A^{-1} q^*) \, dx,$$

where $q^*$ is any element of $Q_f^*$.


This estimate is mainly of theoretical value. In many cases, it is practically
hard to construct conforming approximations that belong to $Q_f^*$. But, as was
originally shown in [8, 9] (see, also, [10]), this difficulty can be overcome by
extending the set of admissible functions for the dual variable. Hence we arrive
at the estimate:

$$|||e|||^2 \le M(v, \beta, y^*) := M_D(v, \beta, y^*) + M_R(\beta, y^*), \qquad (3)$$

$$M_D(v, \beta, y^*) := (1 + \beta) \int_\Omega (A \nabla v - y^*) \cdot (\nabla v - A^{-1} y^*) \, dx,$$

$$M_R(\beta, y^*) := (1 + 1/\beta) \, \frac{C_{\nabla\Omega}^2}{\alpha_1} \int_\Omega (\nabla \cdot y^* + f)^2 \, dx,$$

where $y^*$ is an element of $Q_\nabla^* := \{ q^* \in L^2(\Omega, \mathbb{R}^n) \mid \nabla \cdot q^* \in L^2(\Omega) \}$, $\beta$ is a
positive number, and $C_{\nabla\Omega}$ is a constant in the Poincaré-Friedrichs inequality.
The functional $M$ is called the Duality Error Majorant. It is defined on the
pair of free variables $(\beta, y^*)$. For any $v$, any values of $\beta$ and $y^*$ from $\mathbb{R}_+ \times Q_\nabla^*$
provide a guaranteed upper bound on the error. However, we should minimize
$M(v, \beta, y^*)$ with respect to $\beta$ and $y^*$ to compute a sharp estimate.
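To make the role of the free variables concrete, the following Python sketch evaluates the majorant for a one-dimensional model problem. Everything specific in it is an assumption chosen for illustration and does not come from the paper: the problem $-u'' = 1$ on $(0, 1)$ with $u(0) = u(1) = 0$ and $A = 1$ (so $\alpha_1 = 1$ and the Friedrichs constant is $1/\pi$), the trial function $v = a \sin(\pi x)$, and the one-parameter family $y^* = (1 - t) v' + t u'$. For fixed $y^*$ the minimization over $\beta$ is done in closed form: $(1 + \beta) D + (1 + 1/\beta) R$ is minimal at $\beta = \sqrt{R/D}$ with value $(\sqrt{D} + \sqrt{R})^2$.

```python
import math


def trapezoid(values, h):
    """Composite trapezoidal rule on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


# Assumed model problem: -u'' = f on (0, 1), u(0) = u(1) = 0, A = 1, f = 1,
# hence u = x (1 - x) / 2 and alpha_1 = 1.
n = 2000
h = 1.0 / n
xs = [i * h for i in range(n + 1)]
a = 0.12                                   # amplitude of the trial function v
du = [0.5 - x for x in xs]                 # u'
dv = [a * math.pi * math.cos(math.pi * x) for x in xs]                 # v'
residual_v = [1.0 - a * math.pi ** 2 * math.sin(math.pi * x) for x in xs]  # v'' + f

error_sq = trapezoid([(p - q) ** 2 for p, q in zip(dv, du)], h)  # |||e|||^2
friedrichs_sq = 1.0 / math.pi ** 2         # squared Friedrichs constant on (0, 1)


def majorant(t):
    """Minimum over beta of M(v, beta, y*) for y* = (1 - t) v' + t u'."""
    # grad v - y* = t (v' - u')  and  div y* + f = (1 - t) (v'' + f).
    d_part = trapezoid([(t * (p - q)) ** 2 for p, q in zip(dv, du)], h)
    r_part = friedrichs_sq * trapezoid([((1.0 - t) * r) ** 2 for r in residual_v], h)
    # Closed-form minimum over beta > 0 (a limit value if d_part or r_part is zero).
    return (math.sqrt(d_part) + math.sqrt(r_part)) ** 2


results = {t: majorant(t) for t in (0.0, 0.5, 0.9, 1.0)}
```

In this sketch every choice of $t$ gives an upper bound on the squared energy error, and the bound tightens as $y^*$ approaches the exact flux ($t \to 1$), which is the behavior the minimization over $y^*$ is meant to exploit.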
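Let us consider a sequence $\{Q_{\nabla k}^*\}_{k=1}^{+\infty}$ of finite dimensional subspaces of $Q_\nabla^*$
that possess the limit density property, i.e., for any $\delta > 0$ and any $q^* \in Q_\nabla^*$
there exists a positive integer $k_\delta$ such that

$$\inf_{y^* \in Q_{\nabla k}^*} \| q^* - y^* \|_{Q_\nabla^*} \le \delta \quad \forall k \ge k_\delta, \qquad \| q^* \|_{Q_\nabla^*}^2 := \| q^* \|^2 + \| \nabla \cdot q^* \|^2$$

(we denote by $\|\cdot\|$ two different $L^2$-norms).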



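Let us introduce the functional

$$M_k(v) := M(v, \beta_k, y_k^*) = \inf_{\beta > 0, \; y^* \in Q_{\nabla k}^*} M(v, \beta, y^*),$$

whose value can be computed by solving an auxiliary finite-dimensional
problem on the subspace $Q_{\nabla k}^*$.
Theorem 1. If the sequence $\{Q_{\nabla k}^*\}_{k=1}^{+\infty}$ possesses the limit density property in
$Q_\nabla^*$, then

$$M_k(v) \to |||e|||^2 \quad \text{as } k \to +\infty.$$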

The proof of this theorem is based on the relations (2) and the above-mentioned
property.
It is also important to emphasize the following result concerning the
convergence of the optimal sequence $\{y_k^*\}_{k=1}^{+\infty}$ to $p^*$ in $Q_\nabla^*$ (see [6]).
Theorem 2. If the sequence of pairs $(\beta_k, y_k^*)$ that minimizes $M(v, \beta, y^*)$ on
$\mathbb{R}_+ \times Q_{\nabla k}^*$ is such that $\beta_k \to 0$, then

and

The theorems state properties of the duality error majorant as a global esti-
mator.
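Let us denote by $\varepsilon(x)$ and $\mu_k(x)$ the integrands of the error and
majorant $M(v, \beta_k, y_k^*)$, respectively:

$$\varepsilon(x) := A \nabla e(x) \cdot \nabla e(x),$$

$$\mu_k(x) := (1 + \beta_k) (A \nabla v(x) - y_k^*(x)) \cdot (\nabla v(x) - A^{-1} y_k^*(x)) + (1 + 1/\beta_k) \, \frac{C_{\nabla\Omega}^2}{\alpha_1} \, (\nabla \cdot y_k^*(x) + f(x))^2.$$

For any positive $\sigma$, we define the set

$$\Omega_\sigma := \{ x \in \Omega : |\mu_k(x) - \varepsilon(x)| \ge \sigma \}.$$

Theorem 3. Under the assumptions of Theorem 2,

$$\operatorname{meas}(\Omega_\sigma) \to 0 \quad \text{for any given } \sigma > 0 \text{ as } k \to \infty.$$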

The proof of Theorem 3 can be found in [10].


Theorem 3 states that the Duality Error Majorant also provides effective
indication of the distribution of local errors, and, therefore, this proposition
has great importance for any mesh refinement processes.

3 Duality Error Majorant for a 4th Order Model Problem
In this section, we consider the primal problem related to the biharmonic
operator and the corresponding dual problem.
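Problem P. Find $u \in W_0$ such that

$$J(u) = \inf_{v \in W_0} J(v), \quad J(v) := \int_\Omega \left( \frac{1}{2} B \nabla\nabla v : \nabla\nabla v - f v \right) dx,$$

where $W_0 := \mathring{H}^2(\Omega)$. As in Section 2, we assume that $f \in L^2(\Omega)$, the tensor
$B = \{b_{ijsl}\}$ possesses the symmetry property $b_{(ij)(sl)} = b_{(sl)(ij)}$ for $i, j, s, l = 1, \ldots, n$,
and there exist positive constants $\alpha_1$, $\alpha_2$ such that

$$\alpha_1 |\varkappa|^2 \le B \varkappa : \varkappa \le \alpha_2 |\varkappa|^2 \quad \forall \varkappa \in M_s^{n \times n}, \qquad (4)$$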
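Problem P*. Find $m^* \in N_f^*$ such that

$$J^*(m^*) = \sup_{n^* \in N_f^*} J^*(n^*), \quad J^*(n^*) := \int_\Omega \left( -\frac{1}{2} B^{-1} n^* : n^* \right) dx,$$

where $N_f^* := \{ n^* \in L^2(\Omega, M_s^{n \times n}) \mid \int_\Omega n^* : \nabla\nabla w \, dx = \int_\Omega f w \, dx \;\; \forall w \in W_0 \}$.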


The relationship between Problem P and Problem P* here is similar to that
presented by (1) and (2):

$$J(u) = \inf_{v \in W_0} J(v) = \sup_{n^* \in N_f^*} J^*(n^*) = J^*(m^*),$$

$$m^* = B \nabla\nabla u, \quad \nabla \cdot \nabla \cdot m^* = f \quad \text{a.e. in } \Omega. \qquad (5)$$


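In this case, the respective majorant was derived in [7] and has the form

$$|||e|||^2 := \int_\Omega B \nabla\nabla e : \nabla\nabla e \, dx \le M(v, \beta, \varkappa^*) := M_D(v, \beta, \varkappa^*) + M_R(\beta, \varkappa^*), \qquad (6)$$

$$M_D(v, \beta, \varkappa^*) := (1 + \beta) \int_\Omega (B \nabla\nabla v - \varkappa^*) : (\nabla\nabla v - B^{-1} \varkappa^*) \, dx,$$

$$M_R(\beta, \varkappa^*) := (1 + 1/\beta) \, \frac{C_{\nabla\nabla\Omega}^2}{\alpha_1} \int_\Omega (\nabla \cdot \nabla \cdot \varkappa^* - f)^2 \, dx,$$

where $\varkappa^* \in N_{\nabla\nabla}^* := \{ n^* \in L^2(\Omega, M_s^{n \times n}) \mid \nabla \cdot \nabla \cdot n^* \in L^2(\Omega) \}$, $\beta \in \mathbb{R}_+$, $C_{\nabla\nabla\Omega}$ is
a constant in the inequality

$$\| w \| \le C_{\nabla\nabla\Omega} \| \nabla\nabla w \| \quad \forall w \in W_0,$$

and $\alpha_1$ is as in (4).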
In general terms, the techniques used for deriving estimates (3) and (6)
are rather close. As in Sect. 2, we consider a sequence $\{N_{\nabla\nabla k}^*\}_{k=1}^{+\infty}$ of finite
dimensional subspaces of $N_{\nabla\nabla}^*$ and introduce the corresponding functionals

Let us also formulate analogues of Theorems 1-3.

Theorem 4. If $\{N_{\nabla\nabla k}^*\}_{k=1}^{+\infty}$ possesses the limit density property in $N_{\nabla\nabla}^*$, then

$$M_k(v) \to |||e|||^2 \quad \text{as } k \to \infty.$$
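Proof. By the limit density property, for the exact solution $m^*$ of the dual
problem and any given $\delta > 0$ we can find $k_\delta$ such that, for $k \ge k_\delta$, there exists
an element $m_k \in N_{\nabla\nabla k}^*$ satisfying the inequality

where $e_k := m_k - m^*$. Let $k \ge k_\delta$; then

Given the relations (5), we consider parts of the majorant $M(v, \delta, m_k)$:

$$\int_\Omega (B \nabla\nabla v - m_k) : (\nabla\nabla v - B^{-1} m_k) \, dx = \int_\Omega (B \nabla\nabla e - e_k) : (\nabla\nabla e - B^{-1} e_k) \, dx$$
$$= |||e|||^2 - 2 \int_\Omega \nabla\nabla e : e_k \, dx + \int_\Omega B^{-1} e_k : e_k \, dx \le |||e|||^2 + 2 \delta \, |||e||| / \sqrt{\alpha_1} + \delta^2 / \alpha_1;$$

$$\int_\Omega (\nabla \cdot \nabla \cdot m_k - f)^2 \, dx = \int_\Omega (\nabla \cdot \nabla \cdot e_k)^2 \, dx \le \delta^2.$$

Taking into account the multipliers $(1 + \delta)$ and $(1 + 1/\delta) \, C_{\nabla\nabla\Omega}^2 / \alpha_1$, we arrive
at the estimate $|||e|||^2 \le M_k(v) \le |||e|||^2 + \delta C$. Therefore,

$$M_k(v) \to |||e|||^2 \quad \text{as } k \to \infty. \qquad \square$$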

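Theorem 5. If the sequence of pairs $(\beta_k, \varkappa_k^*)$ that minimizes $M(v, \beta, \varkappa^*)$ on
$\mathbb{R}_+ \times N_{\nabla\nabla k}^*$ is such that $\beta_k \to 0$, then

$$\varkappa_k^* \to m^* \text{ in } N_{\nabla\nabla}^*, \quad M_D(v, \beta_k, \varkappa_k^*) \to |||e|||^2, \quad M_R(\beta_k, \varkappa_k^*) \to 0,$$

and

$$\operatorname{meas}(\Omega_\sigma) \to 0 \quad \text{for any given } \sigma > 0 \text{ as } k \to \infty.$$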

4 Numerical Results

In this section, we justify the method of Duality Error Majorants as an efficient
tool for numerical simulations. During the last few years we have performed
various types of tests (see [6, 10] and other papers cited therein). In these
tests, it was observed that the method is accurate and robust. The next step
in our analysis, which is considered in the present paper, is to investigate the
behavior of the DEM in the process of adaptive mesh refinement. Certainly, in
a short note it is impossible to describe all of the tests that we have performed.
Therefore, we select only one interesting and representative example.
The main purpose of our investigations is to compare the efficiency of
different error indicators in the process of mesh adaptation. It is important to
emphasize that all refinements are based on the same principle (the marking
strategy is quite standard: we flag elements if the corresponding local error is
greater than half of the maximum local error). Therefore, any difference in the
final results is due only to the differences in the efficiencies of the approaches.
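The marking strategy described above can be written in a few lines. The Python sketch below flags the elements whose local error exceeds half of the maximum local error; the function name and the sample indicator values are hypothetical, and the refinement itself (done in the paper with the MATLAB PDE Toolbox) is not reproduced.

```python
def mark_elements(local_errors, fraction=0.5):
    """Return indices of elements whose local error exceeds fraction * max error."""
    threshold = fraction * max(local_errors)
    return [index for index, eta in enumerate(local_errors) if eta > threshold]


# Hypothetical local indicator values on four elements.
marked = mark_elements([0.1, 0.6, 1.0, 0.5])
```

Because the same rule is applied to every indicator, differences between the resulting meshes can be attributed to the indicators alone.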
We have used three error indicators. The first indicator is computed by
comparing an approximate solution $v$ with the exact solution and provides
an objective judgement of the quality of mesh adaptations. The value of this
reference error indicator on an element $T$ is denoted by $\eta_T^E$:

$$(\eta_T^E)^2 := \int_T A (\nabla u - \nabla v) \cdot (\nabla u - \nabla v) \, dx.$$

Two further indicators to be compared with $\eta_T^E$ are defined as follows:
a local indicator based on the Duality Error Majorant that arises from (3):

$$(\eta_T^D)^2 := (1 + \beta) \int_T (A \nabla v - y_k^*) \cdot (\nabla v - A^{-1} y_k^*) \, dx + (1 + 1/\beta) \, \frac{C_{\nabla\Omega}^2}{\alpha_1} \int_T (\nabla \cdot y_k^* + f)^2 \, dx,$$

and the standard local indicator of the MATLAB PDE Toolbox:

$$\eta_T^S := C_1 \| h_T f \|_T + C_2 \left\{ \frac{1}{2} \sum_{E \in \partial T} h_E^2 \, [n_E \cdot (A \nabla u_h)]^2 \right\}^{1/2}.$$

In general, the numerical tests performed can be classified into two groups.
For those in the first group, it was observed that the standard error indicator
provides adaptively refined meshes of suitable quality. For this group, the DEM
is also preferable but its advantage is not very considerable. However, for the
second group of examples, the distinction is rather more obvious. Below, we
present such a case.
Example. Let us consider the classical problem

$$\begin{cases} -\nabla \cdot (A \nabla u) = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases}$$

where the values of $A$ and $f$ are given in Table 1 (the subdomains of $\Omega$ are
depicted in Fig. 1). From Fig. 2, we conclude that the local errors computed
by the DEM reproduce the actual distribution on the initial mesh with high
accuracy. We also observe a disadvantage of the standard approach: overestimation
of local errors in subdomains where $f \ne 0$. From this point of view,
it is natural to predict that the DEM will lead to a more effective adaptation
than the one obtained by the indicator $\eta_T^S$.

Fig. 1. Domain $\Omega$

Table 1. The matrix $A$ and the right-hand side $f$

Subdomain   1        2        3
A           [~O~]    [~ ~]    [~O ~]
f           1        0        1


Fig. 2. Indication on the initial mesh: (E) exact error distribution; (D) error indi-
cation by the DEM; (S) error indication by the standard indicator

The results of several mesh refinements are collected in Table 2; we denote
by $N$ the corresponding number of degrees of freedom for the meshes
generated. The quantity % is the relative error of the approximate solutions
in percent. Let us compare similar steps of adaptation computed by
the reference technique (the 1st block of columns) and DEM (the 2nd block).
Final meshes obtained on the basis of the indicators are also depicted in Fig. 3.
For every step, the DEM gives accuracy of approximations that is very close
to the optimal. It is worth pointing out that the effectivity index

$$I_{\mathrm{eff}} := \sqrt{M} / |||e|||$$

of the computed upper bounds is very close to 1 in each step of the mesh
refinement. Eventually, the mesh (D15) almost coincides with the optimal mesh
(E13) (see Fig. 3). At the same time, the standard approach leads to essentially
worse results (see the 3rd block of columns). The difference is clearly observed
for meshes (D16) and (S11). The same accuracy of computed solutions is
provided by the DEM-based technology of adaptation with approximately half
the number of degrees of freedom.

Table 2. Mesh adaptation performed by the considered indicators

        (E)                     (D)                             (S)
Iter.   N      %        Iter.   N      %      I_eff     Iter.   N       %
0       225    19.02    0       225    19.02  1.09      0       225     19.02
5       862    6.99     7       876    7.12   1.07      4       1022    7.79
7       1422   5.42     9       1428   5.50   1.07      5       1929    5.70
11      3268   3.45     13      3376   3.50   1.06      7       3989    4.01
(13)    5552   2.70     (15)    5698   2.74   1.06      9       7417    3.01
                        (16)    7142   2.38   1.06      (11)    13560   2.37
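The claim about the number of degrees of freedom can be checked against the last row of Table 2. The Python snippet below copies the two final entries from the table and computes their ratio; it also includes a small helper for the effectivity index $I_{\mathrm{eff}} = \sqrt{M}/|||e|||$, whose name and sample arguments are hypothetical.

```python
import math

# (mesh label, degrees of freedom N, relative error in percent), copied from Table 2.
dem_final = ("D16", 7142, 2.38)
standard_final = ("S11", 13560, 2.37)

dof_ratio = dem_final[1] / standard_final[1]
error_gap = abs(dem_final[2] - standard_final[2])


def effectivity_index(majorant_value, error_energy_sq):
    """Return sqrt(M) / |||e|||, given M and the squared energy norm of the error."""
    return math.sqrt(majorant_value / error_energy_sq)


# Hypothetical values: a majorant 21% above the squared error gives an index of 1.1.
sample_index = effectivity_index(1.21, 1.0)
```

The ratio is about 0.53 at nearly identical relative errors (2.38% versus 2.37%), which supports the statement that the DEM-based adaptation needs roughly half the degrees of freedom in this example.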

[Figure 3 panels: (E13), (D15), (D16), (S11)]

Fig. 3. Final meshes computed by different indicators



In the very last example, we show that the DEM that stems from (6) also
leads to error estimates of high quality.
Example. Let us consider the biharmonic problem with homogeneous boundary
conditions:
in $\Omega$,
on $\partial\Omega$,

where $\Omega$ is the unit square. In this example, we aim to show that the DEM
provides effective error control not only in the framework of Finite Element
Methods but more widely. For this purpose, we choose the exact solution of
the following form: $u = \phi(x_1) \phi(x_2)$, where $\phi(x) = (1 - x)^2 x^2$. An approximate
solution is taken in the form $v = u + C w$, where $w = \psi(x_1, k_1) \psi(x_2, k_2)$
and $\psi(x, k) = e^{10xk} \phi(x)$. We select a value of the constant $C$ such that the
accuracy of $v$ is about 5%. An approximation of the dual variable $\varkappa^*$ is also
constructed as a combination of global basis functions (the total number of
which is denoted by $N_b$). The effectivity of the corresponding error bounds for
the various cases is presented in Table 3. From these results, we conclude that
a combination of 196 basis functions is quite enough to provide high-quality
estimates for the various values of the parameters $k_1$ and $k_2$.

Table 3. Efficiency of the DEM for the biharmonic problem

k_1                  0.1    0.2    0.5    0.5
k_2                  0.1    0.3    0.5    1.0
N_b = 144   I_eff    1.42   1.49   1.52   1.53
N_b = 196   I_eff    1.11   1.13   1.13   1.14
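The test functions of this example are conforming because both factors vanish together with their first derivatives at the ends of the unit interval. The Python sketch below checks this for $\phi$ and $\psi$ with the values of $k$ listed in Table 3; the derivative formulas are worked out by hand for this illustration, and the function names are hypothetical.

```python
import math


def phi(x):
    """phi(x) = (1 - x)^2 x^2."""
    return (1.0 - x) ** 2 * x ** 2


def dphi(x):
    """phi'(x) = 2 x (1 - x) (1 - 2 x)."""
    return 2.0 * x * (1.0 - x) * (1.0 - 2.0 * x)


def psi(x, k):
    """psi(x, k) = exp(10 x k) phi(x)."""
    return math.exp(10.0 * x * k) * phi(x)


def dpsi(x, k):
    """psi'(x, k) = exp(10 x k) (10 k phi(x) + phi'(x))."""
    return math.exp(10.0 * x * k) * (10.0 * k * phi(x) + dphi(x))


# Values of k taken from Table 3; end points of the unit interval.
boundary_values = []
for k in (0.1, 0.2, 0.3, 0.5, 1.0):
    for x in (0.0, 1.0):
        boundary_values.extend([phi(x), dphi(x), psi(x, k), dpsi(x, k)])

# Central-difference sanity check of the hand-derived formula for phi'.
step = 1e-5
dphi_numeric = (phi(0.3 + step) - phi(0.3 - step)) / (2.0 * step)
```

Since each factor and its derivative vanish at $0$ and $1$, the products $u$ and $w$ vanish together with their normal derivatives on the boundary of the unit square, so $v = u + C w$ satisfies the homogeneous boundary conditions for any constant $C$.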

5 Conclusions

We have justified theoretically and numerically that the method of Duality Error
Majorants (a) provides sharp upper bounds on the global energy norm of
the error, and (b) reproduces the local behavior of the error with high
accuracy. For this reason, mesh adaptations based on the DEM are very close
to those that would be obtained on the basis of exact knowledge of the
error distribution.

Acknowledgement
The research was partially supported by grant N201728 of the Academy of
Sciences of Finland.

References
1. Ainsworth, M., aden, J.T. (2000): A posteriori error estimation in finite element
analysis. Wiley, New York
2. Babuska, I., Rheinboldt, W.C. (1978): A-posteriori error estimates for the finite
element method. Int. J. Numer. Methods Eng., 12, 1597-1615
3. Babuska, I., Rheinboldt, W.C. (1978): Error estimates for adaptive finite ele-
ment computations. SIAM J. Numer. Anal., 15, no. 4, 736-754
4. Babuska, I., Strouboulis, T. (2001): The finite element method and its reliabil-
ity. Numerical Mathematics and Scientific Computation, The Clarendon Press,
Oxford University Press, New York
5. Ekeland, I., Temam, R. (1976): Convex analysis and variational problems. Stud-
ies in Mathematics and its Applications, Vol. 1., North-Holland Publishing Co.,
Amsterdam-Oxford, American Elsevier Publishing Co., Inc., New York
6. Frolov, M., Neittaanmiiki, P., Repin, S. (2003): On the reliability, effectivity
and robustness of a posteriori error estimation methods. In: Kuznetsov, Y.,
N eittaanmiiki, P., Pironneau, O. (eds.) Numerical Methods for Scientific Com-
puting. Variational problems and applications. CIMNE Barcelona (in press)
7. Neittaanmiiki, P., Repin, S.l. (2001): A posteriori error estimates for boundary-
value problems related to the biharmonic operator. East-West J. Numer. Math.,
9, no. 2, 157-178.
8. Repin, S.1. (1997): A posteriori error estimation for nonlinear variational prob-
lems by duality theory. Zap. Nauchn. Semin. POMI, 243, 201-214 ((2000):
english translation in J. Math. Sci., New York, 99, no. 1, 927-935)
9. Repin, S.l. (2000): A posteriori error estimation for variational problems with
uniformly convex functionals. Math. Comp., 69, no. 230, 481-500
10. Repin, S.I, Sauter, S., Smolianski, A. (2003): A posteriori error estimation for the
Dirichlet problem with account of the error in the approximation of boundary
conditions. Computing, 70, 205-233
11. Verfiirth, R. (1996): A review of a posteriori error estimation and adaptive mesh-
refinement techniques. Wiley-Teubner Series Advances in Numerical Mathemat-
ics
Efficient Algorithm for Local-Bound-Preserving
Remapping in ALE Methods

Rao Garimella¹, Milan Kuchařík² and Mikhail Shashkov³

¹ Los Alamos National Laboratory, T-7, MS-284, Los Alamos, NM 87545, USA
[email protected]
² Czech Technical University in Prague, Břehová 7, 115 19 Prague 1, Czech
Republic [email protected]
³ Los Alamos National Laboratory, T-7, MS-284, Los Alamos, NM 87545, USA
[email protected]

Summary. The remapping algorithm is an essential part of the ALE (Arbitrary
Lagrangian-Eulerian) method. In this talk we present such an algorithm based on
linear function reconstruction, approximate integration and mass redistribution.

1 Introduction

Conservative remapping is an essential part of the ALE (Arbitrary
Lagrangian-Eulerian) method for fluid dynamics computations. This method tries
to combine the advantages of the Lagrangian and Eulerian approaches.
First, several time steps of the pure Lagrangian computation are performed.
As the grid moves together with the fluid, it may become distorted or tangled
due to shear flow. The Eulerian part of the algorithm then takes over: we
prepare a new rezoned grid and recompute (remap) the quantities from the
distorted grid to the rezoned one.
The remapping step must satisfy several conditions. It must be efficient
enough to be usable in real computations. The total sum of each conservative
quantity must be preserved, i.e. the algorithm must be conservative. It must
not create new local extrema, i.e. it must be local-bound preserving. It must
be stable and applicable to general unstructured meshes in 2D and 3D. In this
article we introduce a 3D algorithm that satisfies these conditions. A similar
procedure in 2D is described in [1].

2 Algorithm Description

2.1 Problem statement

Suppose we have two grids: the Lagrangian grid $C = \{c\}$ and the rezoned grid
$\tilde C = \{\tilde c\}$. The grids have the same topology. The rezoned grid
is created from the original one just by small movements of the grid nodes.
There exists some underlying function $g(\mathbf{r})$, $\mathbf{r} = (x, y, z)$,
in the Lagrangian cells (for example $g = \rho$, $g = \rho u$, $g = \rho v$,
$g = \rho w$, $g = \rho\,(\varepsilon + |\mathbf{U}|^2/2)$, where $\rho$ is the
mass density, $\mathbf{U} = (u, v, w)$ is the velocity vector and $\varepsilon$
is the internal energy). We do not know the function itself; we know only its
mean values in the grid cells, together with the cell masses and volumes

$$\bar g_c = \frac{\int_c g(\mathbf{r})\,dV}{V(c)} = \frac{m(c)}{V(c)}, \qquad m(c) \equiv \int_c g(\mathbf{r})\,dV, \qquad V(c) \equiv \int_c 1\,dV. \tag{1}$$

The total mass (momentum, energy) in the computational domain $\Omega$ can be
computed as

$$M \equiv \int_\Omega g(\mathbf{r})\,dV = \sum_{\forall c} \int_c g(\mathbf{r})\,dV = \sum_{\forall c} m(c). \tag{2}$$

We want to compute new masses $m^*(\tilde c)$ and the corresponding mean values
in the rezoned cells

$$\bar g^*_{\tilde c} = \frac{m^*(\tilde c)}{V(\tilde c)} \tag{3}$$

and we want them to be as close to the exact values as possible
($m^*(\tilde c) \approx m(\tilde c) = \int_{\tilde c} g(\mathbf{r})\,dV$). We
also do not want to create new local extrema,

$$g_c^{\min} \le \bar g^*_{\tilde c} \le g_c^{\max}, \qquad g_c^{\min} = \min_{c_n \in C(c)} \bar g_{c_n}, \qquad g_c^{\max} = \max_{c_n \in C(c)} \bar g_{c_n}, \tag{4}$$

where $C(c) \subset C$ is the neighborhood of cell $c$, and we want the method
to be conservative (the total mass must be the same),

$$\sum_{\forall \tilde c} m^*(\tilde c) = M.$$

If the underlying function is linear, we want our method to be exact:

$$m^*(\tilde c) = m(\tilde c) = \int_{\tilde c} g(\mathbf{r})\,dV \quad \text{for } g(\mathbf{r}) = a + bx + cy + dz.$$

2.2 Remapping Algorithm


We design our algorithm in three stages. In the first stage, we make a piecewise
linear reconstruction of the underlying function on the original mesh. This can
be done using different methods, with or without limiters. In the second stage,
we integrate this reconstructed function to obtain means on the new grid. The
most natural approach would be exact integration, but it requires computing
the intersections of the Lagrangian grid with the rezoned one. Computing these
intersections is very time consuming in 2D and almost infeasible in 3D, so we
use numerical quadrature, namely swept integration. It does not require finding
these intersections, so it is much faster. However, it is an approximate method
and the local bounds may be violated, so we also need the third stage, repair,
which enforces local-bound preservation.

2.3 Stage 1 - Piecewise Linear Reconstruction

We want to reconstruct the underlying function in the form

$$\rho_c(\mathbf{r}) = \rho_c(x, y, z) = \bar\rho_c + s_c^x\,(x - x_c) + s_c^y\,(y - y_c) + s_c^z\,(z - z_c), \tag{5}$$

where

$$x_c = \frac{\int_c x\,dV}{V(c)}, \qquad y_c = \frac{\int_c y\,dV}{V(c)}, \qquad z_c = \frac{\int_c z\,dV}{V(c)} \tag{6}$$

are the coordinates of the cell center and $V(c)$ is the volume of the cell
defined in (1).
For the computation of the slopes we use the limited form

$$s_c^x = \Phi_c\, s_c^{x,\mathrm{unlim}}, \qquad s_c^y = \Phi_c\, s_c^{y,\mathrm{unlim}}, \qquad s_c^z = \Phi_c\, s_c^{z,\mathrm{unlim}}, \tag{7}$$

where $s_c^{\{x,y,z\},\mathrm{unlim}}$ are the unlimited slopes, which must be
computed first, and $\Phi_c$ is the Barth-Jespersen limiter.

Unlimited Slopes. In 1D we can simply use the central difference as the
unlimited slope. To compute unlimited slopes in 2D we construct a contour
surrounding the cell and use Green's Theorem. In 3D this would require
computing intersections of this neighborhood with the original grid, which
would be too slow, so we must use another method.
For each cell, let us construct the functional

$$F_c(s_c^x, s_c^y, s_c^z) = \sum_{c_n \in C(c)} \left( \bar\rho_{c_n} - \frac{\int_{c_n} \rho_c(x, y, z)\,dV}{V(c_n)} \right)^2, \tag{8}$$

which measures the sum of squared differences between the mean values in the
neighboring cells and the averages, over the same neighboring cells, of the
function reconstructed in the original cell. We minimize this functional, so
that the reconstructed function is as close as possible to the mean values in
the neighboring cells.
We compute the derivatives of this functional with respect to all three
slopes and set them equal to zero. This gives a linear system

$$\frac{\partial F_c(s_c^x, s_c^y, s_c^z)}{\partial s_c^{\{x,y,z\}}} = 0, \tag{9}$$

which can be easily solved and gives the unlimited slopes
$s_c^{\{x,y,z\},\mathrm{unlim}}$.
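As a hedged illustration of this least-squares step, the following Python sketch computes unlimited slopes for a single cell. It assumes NumPy; the function name `unlimited_slopes` and the sample data are hypothetical and not taken from the paper. The sketch relies on the fact that the average of a linear function over a cell equals its value at the cell centroid, so the neighbor-cell integrals reduce to centroid evaluations.

```python
import numpy as np

def unlimited_slopes(center, mean, nbr_centers, nbr_means):
    """Least-squares slopes of a linear reconstruction in one cell.

    Minimises sum_n (mean_n - [mean + s . (r_n - r_c)])^2 over s, where
    r_c and r_n are the centroids of the cell and of its neighbours.
    """
    dr = np.asarray(nbr_centers, float) - np.asarray(center, float)  # (n, 3)
    dg = np.asarray(nbr_means, float) - float(mean)                  # (n,)
    # lstsq solves the same minimisation as the 3x3 normal equations
    # (dr^T dr) s = dr^T dg, but in a numerically safer way.
    s, *_ = np.linalg.lstsq(dr, dg, rcond=None)
    return s

# Hypothetical usage: six face neighbours of a cell, sampled from a linear g.
def g(r):
    return 1.0 + 2.0 * r[0] - 3.0 * r[1] + 0.5 * r[2]

c = np.array([0.5, 0.5, 0.5])
offsets = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                    [0, -1, 0], [0, 0, 1], [0, 0, -1]], float)
nbrs = c + offsets
slopes = unlimited_slopes(c, g(c), nbrs, [g(r) for r in nbrs])
```

For a linear underlying function the recovered slopes coincide with its gradient, which is the linearity-preservation property the method requires.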

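Limited Slopes. For the computation of the slopes $s_c^{\{x,y,z\}}$ we evaluate
the Barth-Jespersen limiter at each cell vertex $n$ and then take the minimum
over the vertices as the cell limiter

$$\Phi_n = \begin{cases} \min\left(1, \dfrac{\rho_c^{\max} - \bar\rho_c}{\rho_n^{\mathrm{unlim}} - \bar\rho_c}\right) & \text{for } \rho_n^{\mathrm{unlim}} - \bar\rho_c > 0, \\ \min\left(1, \dfrac{\rho_c^{\min} - \bar\rho_c}{\rho_n^{\mathrm{unlim}} - \bar\rho_c}\right) & \text{for } \rho_n^{\mathrm{unlim}} - \bar\rho_c < 0, \\ 1 & \text{for } \rho_n^{\mathrm{unlim}} - \bar\rho_c = 0, \end{cases} \qquad \Phi_c = \min_{n \in N(c)} \Phi_n, \tag{10}$$

which ensures the preservation of local extrema and also the preservation of a
linear function. Here $\rho_n^{\mathrm{unlim}}$ is the value of the
reconstructed function (using the unlimited slopes) at the node $n$, and
$\rho_c^{\min}$, $\rho_c^{\max}$ are the minimum and maximum of the mean values
over the neighborhood of cell $c$. The limiter is described in detail in [2].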
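The vertex-wise limiting can be sketched in a few lines of Python. This is a minimal illustration of the standard Barth-Jespersen form, not the authors' code; the function name and the sample numbers are hypothetical.

```python
def barth_jespersen(mean, vertex_values, local_min, local_max):
    """Cell limiter: minimum over vertices of the vertex limiters.

    mean           cell mean value
    vertex_values  unlimited reconstruction evaluated at the cell vertices
    local_min/max  bounds taken from the neighbouring cell means
    """
    phi_c = 1.0
    for value in vertex_values:
        d = value - mean
        if d > 0.0:
            phi_n = min(1.0, (local_max - mean) / d)
        elif d < 0.0:
            phi_n = min(1.0, (local_min - mean) / d)
        else:
            phi_n = 1.0
        phi_c = min(phi_c, phi_n)
    return phi_c

# Hypothetical usage: the unlimited reconstruction overshoots both bounds.
phi = barth_jespersen(1.0, [0.0, 2.0], local_min=0.5, local_max=1.2)
limited = [1.0 + phi * (v - 1.0) for v in [0.0, 2.0]]
# No limiting is applied when all vertex values already lie within the bounds.
phi_linear = barth_jespersen(1.0, [0.8, 1.1], local_min=0.5, local_max=1.2)
```

Multiplying all three slopes by the same factor keeps the direction of the gradient and scales the vertex deviations from the mean uniformly, which is why a single scalar per cell is enough.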

Integration over an Arbitrary Polyhedron. The only part of the functional that
we do not yet know is the integral

$$\int_{c_n} \rho_c(x, y, z)\,dx\,dy\,dz. \tag{11}$$

We also need to compute the integrals in the definition of the cell centers
$x_c$, $y_c$, $z_c$ (6) and the cell volumes $V(c_n)$ (1). So we need a method
for the integration of a linear function over an arbitrary polyhedron. We note
that the boundary of the polyhedron is not uniquely defined, because we know
only the vertices of each face: if the face vertices do not lie in one plane,
the face is curved and its shape is ambiguous.
We demonstrate our integration procedure on the example of the cell volume;
the integration of an arbitrary linear function is similar. The cell volume
can be written in the form

$$V(c) = \int_c 1\,dV = \frac{1}{3} \int_c \operatorname{div}(x, y, z)\,dV \tag{12}$$

and using the Divergence Theorem we can rewrite it as an integral over the
boundary $\partial c$,

$$V(c) = \frac{1}{3} \oint_{\partial c} (x, y, z)^T \cdot \mathbf{S}\,dA. \tag{13}$$

Here the superscript $T$ denotes the transposition of a vector and $\mathbf{S}$
is the unit outward normal to the boundary. The boundary integral can be split
into the sum of the face integrals over all faces $\Pi$,

$$V(c) = \frac{1}{3} \sum_{\Pi \in \partial c} \int_\Pi (x, y, z)^T \cdot \mathbf{S}\,dA. \tag{14}$$

Next, we compute the center of each face by averaging the coordinates of its
vertices, connect it with all face vertices, and split the face integrals into
integrals over the triangles $\Delta$ defined in this way. On each such
triangle the face normal $\mathbf{S}$ is constant, so it can be taken out of
the integral:

$$V(c) = \frac{1}{3} \sum_{\Pi \in \partial c} \sum_{\Delta \in \Pi} \left( S^x \int_\Delta x\,dA + S^y \int_\Delta y\,dA + S^z \int_\Delta z\,dA \right). \tag{15}$$

We then project each triangle onto the coordinate plane in which its
projection has the largest area, which avoids numerical problems with nearly
degenerate projections. Using Green's Theorem we reduce these integrals over
triangles to 1D edge integrals, which can be computed directly from the vertex
coordinates. This algorithm gives us a method for computing the integral of an
arbitrary linear function over an arbitrary polyhedron. More details can be
found in [3].
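The volume computation via the Divergence Theorem and the face-center triangulation can be sketched as follows. This is a simplified illustration under stated assumptions: faces are given as vertex-index lists ordered counter-clockwise when seen from outside, and each triangle integral is evaluated directly from its area vector instead of the coordinate-plane projection and Green's Theorem used in the text. The function name `polyhedron_volume` is hypothetical.

```python
import numpy as np

def polyhedron_volume(vertices, faces):
    """Volume as (1/3) * sum over face triangles of r . (area vector).

    vertices  (n, 3) array-like of vertex coordinates
    faces     lists of vertex indices, counter-clockwise seen from outside
    """
    v = np.asarray(vertices, float)
    volume = 0.0
    for face in faces:
        face = list(face)
        fc = v[face].mean(axis=0)  # face centre = average of its vertices
        for a, b in zip(face, face[1:] + face[:1]):
            # On the flat triangle (fc, v[a], v[b]) the product r . S is
            # constant, so its integral is fc . (area vector of the triangle).
            area_vec = 0.5 * np.cross(v[a] - fc, v[b] - fc)
            volume += np.dot(fc, area_vec) / 3.0
    return volume

# Hypothetical usage: the unit cube and a translated copy of it.
cube = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
                 [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]], float)
cube_faces = [[0, 3, 2, 1], [4, 5, 6, 7], [0, 1, 5, 4],
              [3, 7, 6, 2], [0, 4, 7, 3], [1, 2, 6, 5]]
vol = polyhedron_volume(cube, cube_faces)
vol_shifted = polyhedron_volume(cube + np.array([2.0, 3.0, 4.0]), cube_faces)
```

Because every face is split about its center, the same routine also gives a well-defined result for faces whose vertices are not coplanar.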

2.4 Stage 2 - Swept Integration

The swept region quadrature concept is explained in detail in [1].
The swept region is created by the movement of a face from its position in the
original grid to its new position. It is bounded by the old face, the new
face, and the not necessarily flat quadrilaterals connecting each edge of the
original face to the corresponding edge of the new face. We can compute the
volume and mass of such a region; we call them the swept volume and the swept
mass, and we use these terms in their signed sense. Suppose we have a cell in
the original mesh and we move just one face, as illustrated in Fig. 1. In this
case, the right face moves outward from the original cell and the middle part
is the swept region. In general, all faces can move in different ways and
swept regions can be tangled. If most of the swept region lies outside the
original cell, the swept volume and swept mass are positive; otherwise they
are negative. The mass of the swept region is computed by integrating, over
the swept region, the reconstructed function of the cell in which most of the
swept region lies.

Fig. 1. One swept region (labels: original cell, new face position, swept region)

The new cell mass is composed of the mass of the original cell and the masses
of all swept regions,

$$m^*(\tilde c) = m(c) + \sum_{f \in F(c)} \delta m_f. \tag{16}$$

Here $f$ denotes a swept region from the set $F(c)$ of all swept regions of the
cell $c$. The new mean value can then be computed as

$$\bar\rho^*_{\tilde c} = \frac{m^*(\tilde c)}{V(\tilde c)} \tag{17}$$

and, as noted before, it can violate the local bounds because the integration
is only approximate. The third stage is therefore necessary to enforce
local-bound preservation.
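The structure of the swept-mass update is easiest to see in one dimension, where faces are nodes and a swept region is the interval between the old and the new node position. The following Python sketch is a 1D analogue written for illustration only (the paper treats 3D polyhedral cells); the function name, the fixed boundary nodes and the sample data are assumptions.

```python
import numpy as np

def swept_remap_1d(x_old, x_new, mean, slope):
    """1D analogue of swept integration: m*(c) = m(c) + sum of swept masses."""
    x_old = np.asarray(x_old, float)
    x_new = np.asarray(x_new, float)
    mean = np.asarray(mean, float)
    slope = np.asarray(slope, float)
    centre = 0.5 * (x_old[:-1] + x_old[1:])
    mass = mean * np.diff(x_old)
    flux = np.zeros(len(x_old))  # signed mass swept by each node (+x positive)
    for i in range(1, len(x_old) - 1):  # boundary nodes are assumed fixed
        dx = x_new[i] - x_old[i]
        donor = i if dx > 0 else i - 1  # cell in which the swept region lies
        mid = x_old[i] + 0.5 * dx       # midpoint rule is exact for linear data
        flux[i] = dx * (mean[donor] + slope[donor] * (mid - centre[donor]))
    new_mass = mass + flux[1:] - flux[:-1]
    return new_mass, new_mass / np.diff(x_new)

# Hypothetical usage: remap the linear function g(x) = 2 + 3x.
x_old = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
x_new = np.array([0.0, 0.30, 0.45, 0.80, 1.0])
centre_old = 0.5 * (x_old[:-1] + x_old[1:])
centre_new = 0.5 * (x_new[:-1] + x_new[1:])
mean = 2.0 + 3.0 * centre_old
slope = np.full(4, 3.0)
new_mass, new_mean = swept_remap_1d(x_old, x_new, mean, slope)
```

Each interior node contributes the same swept mass with opposite signs to its two adjacent cells, so the total mass is preserved by construction, and a linear function is remapped exactly.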

2.5 Stage 3 - Repair

The repair stage is a conservative redistribution of a conserved quantity. It
corrects the overshoots and undershoots back to their local bounds. First, we
must compute the local extrema. For each cell $c$ we define a
bound-determining neighborhood $C(c)$, which is a piece of the original grid
fully covering the new cell. Usually we use the original cell plus its nearest
neighbors. We compute the local extrema in this neighborhood,

$$\rho_c^{\min} = \min_{c_n \in C(c)} \bar\rho_{c_n}, \qquad \rho_c^{\max} = \max_{c_n \in C(c)} \bar\rho_{c_n}. \tag{18}$$

We show the repair on the example of a violation of the lower bound,

$$\bar\rho^*_{\tilde c} < \rho_c^{\min}; \tag{19}$$

a violation of the upper bound is treated similarly. First we compute the mass
that is needed in the cell to bring the mean value back to the local minimum,

$$\delta m_c^{\mathrm{needed}} = \left( \rho_c^{\min} - \bar\rho^*_{\tilde c} \right) V(\tilde c). \tag{20}$$

We want our algorithm to be conservative, so we do not simply add this mass
to the violating cell; instead we look for available mass in the
bound-determining neighborhood. For each neighboring cell we compute the mass

$$\delta m_{c_n}^{\mathrm{avail}} = \max\left(0, \left( \bar\rho^*_{\tilde c_n} - \rho_{c_n}^{\min} \right) V(\tilde c_n)\right), \tag{21}$$

which can safely be taken from the cell without violating its own local bound.
The total available mass in the neighborhood is

$$\delta m_{C(c)}^{\mathrm{avail}} = \sum_{c_n \in C(c)} \delta m_{c_n}^{\mathrm{avail}}. \tag{22}$$

If the available mass is too small
($\delta m_{C(c)}^{\mathrm{avail}} < \delta m_c^{\mathrm{needed}}$), we extend
the stencil and look for the available mass in a larger area. If there is
enough mass available, we perform the repair. We bring the violating value
back to the local minimum,

$$m'(\tilde c) = m^*(\tilde c) + \delta m_c^{\mathrm{needed}}, \tag{23}$$

and we take the mass from the neighborhood proportionally to the mass
available,

$$m'(\tilde c_n) = m^*(\tilde c_n) - \frac{\delta m_{c_n}^{\mathrm{avail}}}{\delta m_{C(c)}^{\mathrm{avail}}}\, \delta m_c^{\mathrm{needed}}. \tag{24}$$

In [1] we proved that this algorithm succeeds in a finite number of steps and
that the repair stage corrects all local-bound violations.
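A minimal Python sketch of the lower-bound repair is given below. It is an illustration under simplifying assumptions, not the authors' implementation: cells are processed in index order, the neighborhood lists are supplied by the caller, and when the available mass is insufficient the sketch raises an error instead of extending the stencil. All names are hypothetical.

```python
import numpy as np

def repair_lower_bounds(mass, volume, rho_min, neighbours):
    """Conservatively move mass so that mass/volume >= rho_min in every cell."""
    mass = np.array(mass, float)
    for c in range(len(mass)):
        needed = rho_min[c] * volume[c] - mass[c]
        if needed <= 0.0:
            continue  # lower bound is not violated in this cell
        # Mass each neighbour can give up without violating its own bound.
        avail = {n: max(0.0, mass[n] - rho_min[n] * volume[n])
                 for n in neighbours[c]}
        total = sum(avail.values())
        if total < needed:
            # The method in the text would extend the stencil here.
            raise ValueError("not enough available mass in the neighbourhood")
        mass[c] += needed
        for n, a in avail.items():
            mass[n] -= a / total * needed  # take proportionally to availability
    return mass

# Hypothetical usage: cell 0 undershoots the lower bound 0.
volume = [1.0, 1.0, 1.0]
rho_min = [0.0, 0.0, 0.0]
neighbours = [[1], [0, 2], [1]]
repaired = repair_lower_bounds([-0.1, 0.5, 0.6], volume, rho_min, neighbours)
```

Taking mass in proportion to what each neighbor can spare guarantees that no donor cell is pushed below its own bound, and the sum of all masses is unchanged.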

3 Numerical Tests

3.1 Orthogonal Uniform Grid

In the first example the underlying function is zero everywhere except in a
spherical region around the center of the computational domain $(0,1)^3$,
where it is equal to 1:

$$g(x, y, z) = \begin{cases} 1 & \text{for } r \le 0.25, \\ 0 & \text{else}, \end{cases} \qquad r = \sqrt{\left(x - \tfrac{1}{2}\right)^2 + \left(y - \tfrac{1}{2}\right)^2 + \left(z - \tfrac{1}{2}\right)^2}. \tag{25}$$

We define a uniform orthogonal grid in the computational domain; the initial
function is shown in Fig. 2.

Fig. 2. Initial spherical function on the $40^3$ orthogonal uniform grid

We move the grid by the tensor-product movement

x~ew = (I-a) Xn +a x;·5 ,y~ew = (I-a) Yn +ay;,'o ,z~ew = (I-a) Zn +a Z;'·5 ,


(26)
where

$$\alpha = 0.5 \sin(4\pi t), \qquad t = N/N_{\max}, \qquad t \in (0, 1), \tag{27}$$

and $t$ is the time of the $N$-th time step. We perform $N_{\max} = 200$
remappings to accumulate errors and to make the problematic regions visible.
Fig. 3 shows the spherical function remapped using only unlimited slopes. This
causes larger errors, so the effect of the repair stage is more obvious. Part
a) of the figure shows the function without the repair stage; the light gray
cells mark areas where the extrema are violated. Part b) shows the same
remapping with repair; no values violate the bounds.

Fig. 3. Spherical function remapped 200 times using only unlimited
reconstruction: a) without repair, b) with repair

3.2 Tetrahedral Grid with Random Movement

The second numerical example uses the same cubical computational domain, now
with a tetrahedral mesh of about 9000 tetrahedra. We use the same spherical
function as before, shown in Fig. 4. We shake the grid randomly 10 times and
remap between these grids; in the last time step we remap back to the original
grid. Fig. 5 shows the result obtained with the Barth-Jespersen limiter, with
and without the repair stage. Again, part a) contains several white cells
where the bounds are violated. In part b), the repair stage corrects all of
them and no bound violation is observed.

Fig. 4. Initial spherical function on about 9000 tetrahedra

Fig. 5. Spherical function remapped 10 times using the Barth-Jespersen
limiter: a) without repair, b) with repair

4 Conclusion

In this article we constructed an efficient algorithm for function remapping
between two similar grids. It is face-based and usable in 3D, unlike the most
natural exact integration algorithm, which is not feasible in 3D. The algorithm
is conservative (total mass remains constant), local-bound preserving (does
not create new extrema), stable and linearity preserving. We presented several
numerical examples to show that it can be used for different types of grids
and grid movements.

5 Acknowledgments

This work was performed under the auspices of the US Department of Energy
at Los Alamos National Laboratory, under contract W-7405-ENG-36. The
authors acknowledge the partial support of the DOE/ASCR Program in the
Applied Mathematical Sciences and the Laboratory Directed Research and
Development program (LDRD). R. Garimella and M. Shashkov acknowledge the
partial support of DOE's Accelerated Strategic Computing Initiative (ASCI).
M. Kuchařík also acknowledges the partial support of the Czech Technical
University grant CTU0310614. The authors thank L. Margolin, B. Wendroff,
B. Swartz, R. Liska, M. Berndt and V. Dyadechko for fruitful discussions and
constructive comments.

References

1. M. Kucharik, M. Shashkov, and B. Wendroff. An efficient linearity-and-bound-
preserving remapping method. Journal of Computational Physics, 188(2):462-471,
2003.
2. T. J. Barth. Numerical methods for gasdynamic systems on unstructured meshes.
In D. Kröner, M. Ohlberger, and C. Rohde, editors, An Introduction to Recent
Developments in Theory and Numerics for Conservation Laws, Proceedings of the
International School on Theory and Numerics for Conservation Laws, Berlin, 1997.
Lecture Notes in Computational Science and Engineering, Springer.
3. B. Mirtich. Fast and accurate computation of polyhedral mass properties. Journal
of Graphics Tools, 1(2):31-50, 1996.
Mimetic Finite Difference Methods for
Diffusion Equations on Unstructured
Triangular Grid

Victor Ganzha¹, Richard Liska², Mikhail Shashkov³ and Christoph Zenger¹

¹ Department of Informatics, Technical University of Munich, Boltzmannstr. 3,
D-85748 Garching, Germany [email protected], [email protected]
² Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University
in Prague, Břehová 7, 115 19 Prague 1, Czech Republic [email protected]
³ Group T-7, Los Alamos National Laboratory, Los Alamos, NM 87544, USA
[email protected]

Summary. A finite difference algorithm for the solution of the stationary
diffusion equation on an unstructured triangular grid was developed earlier by
the support operator method. The support operator method first constructs a
discrete divergence operator from the divergence theorem and then constructs a
discrete gradient operator as the adjoint operator of the divergence. The
adjointness of the operators is based on the continuum Gauss theorem, which
remains valid also for the discrete operators. Here we extend the method to
general Robin boundary conditions, generalize it to the time-dependent heat
equation and analyze the space discretization. A one-parameter family of
discrete vector inner products, which produce exact gradients for linear
functions, is designed. Our method works very well for discontinuous diffusion
coefficients and for very rough or very distorted grids, which appear quite
often e.g. in Lagrangian simulations.

1 Introduction

The mimetic finite difference methods preserve fundamental properties of the
original continuum differential operators and allow the discrete approximations
of partial differential equations (PDEs) to mimic critical properties including
conservation laws and symmetries in the solution of the underlying physical
problem [1]. The discrete analogs of differential operators satisfy the
identities and theorems of vector and tensor calculus [2] and provide new
reliable algorithms for a wide class of PDEs. In [3] the mimetic method for the
parabolic diffusion equation has been developed on 2D quadrilateral logically
rectangular grids, and in [4] the method has been developed for the stationary
diffusion equation on unstructured triangular grids. In this paper we apply our
ideas to the construction of mimetic methods for the solution of parabolic
diffusion problems in strongly heterogeneous materials on unstructured
triangular computational grids in 2D, capable of treating arbitrary
computational regions.

1.1 Continuous problem

We consider the heat equation with general Robin boundary conditions

$$u_t - \operatorname{div} K \operatorname{grad} u = f \quad \text{on } \Omega, \qquad \alpha u + \beta\,(K \operatorname{grad} u, \mathbf{n}) = \psi \quad \text{on } \partial\Omega \tag{1}$$

on an arbitrary 2D region $\Omega$ with boundary $\partial\Omega$, with a
possibly discontinuous diffusion coefficient $K$ and unknown function $u$. The
goal of the paper is to develop a numerical method for this problem on a given
triangular grid. The method should work well also on the bad quality meshes
that typically appear in Lagrangian hydrodynamics computations, where we need
to treat the parabolic part of the model, i.e. the heat conductivity of the
fluid. The spatial discretization will be the same as in [4]. For the
derivation of the discretization we will use the first order coordinate
invariant operators div and grad, so we first transform the heat equation (1)
into the first order system $u_t + \operatorname{div} \mathbf{w} = f$,
$\mathbf{w} = -K \operatorname{grad} u$, where $\mathbf{w}$ is the heat flux.
For further analysis we introduce the generalized gradient operator

$$G u = -K \operatorname{grad} u \tag{2}$$

and the extended divergence operator

$$D \mathbf{w} = \begin{cases} \operatorname{div} \mathbf{w} & \text{on } \Omega, \\ -(\mathbf{w}, \mathbf{n}) & \text{on } \partial\Omega, \end{cases} \tag{3}$$

and look at some integral properties of these operators.
First we note that the divergence Green formula

$$\int_\Omega \operatorname{div} \mathbf{w}\,d\Omega - \oint_{\partial\Omega} (\mathbf{w}, \mathbf{n})\,dS = 0 \tag{4}$$

can be written as $(D\mathbf{w}, 1)_H = 0$, where the inner product of scalar
functions $(\cdot, \cdot)_H$ on the space $H$ of sufficiently smooth scalar
functions on $\Omega$ is defined by
$(u, v)_H = \int_\Omega u\,v\,d\Omega + \oint_{\partial\Omega} u\,v\,dS$. On the
other hand, the Gauss theorem

$$\int_\Omega u \operatorname{div} \mathbf{w}\,d\Omega - \oint_{\partial\Omega} u\,(\mathbf{w}, \mathbf{n})\,dS + \int_\Omega (\mathbf{w}, K^{-1} K \operatorname{grad} u)\,d\Omega = 0 \tag{5}$$

can be written as

$$(D\mathbf{w}, u)_H = (\mathbf{w}, G u)_{\mathcal{H}}, \tag{6}$$

where the inner product of vector functions $(\cdot, \cdot)_{\mathcal{H}}$ on
the space $\mathcal{H}$ of vector functions is defined by
$(\mathbf{A}, \mathbf{B})_{\mathcal{H}} = \int_\Omega (K^{-1} \mathbf{A}, \mathbf{B})\,d\Omega$.
Our operators $G$, $D$ act between the spaces $H$ and $\mathcal{H}$ as
$G : H \to \mathcal{H}$, $D : \mathcal{H} \to H$. The Gauss theorem (6) implies
that the operator $G$ is the adjoint of the operator $D$ in the sense of the
inner products defined above, $G = D^*$.
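The step from the Gauss theorem (5) to the adjointness relation (6) can be written out explicitly. The following short derivation is a sketch that assumes $K$ is symmetric (or scalar), so that $(K^{-1}\mathbf{w}, K \operatorname{grad} u) = (\mathbf{w}, \operatorname{grad} u)$; this assumption is not stated in the text above.

$$
\begin{aligned}
(D\mathbf{w}, u)_H
  &= \int_\Omega u \operatorname{div} \mathbf{w}\,d\Omega - \oint_{\partial\Omega} u\,(\mathbf{w}, \mathbf{n})\,dS
   && \text{by (3) and the definition of } (\cdot,\cdot)_H \\
  &= -\int_\Omega (\mathbf{w}, \operatorname{grad} u)\,d\Omega
   && \text{by (5)} \\
  &= \int_\Omega (K^{-1}\mathbf{w}, -K \operatorname{grad} u)\,d\Omega
   && \text{since } K \text{ is symmetric} \\
  &= (\mathbf{w}, G u)_{\mathcal{H}}
   && \text{by (2) and the definition of } (\cdot,\cdot)_{\mathcal{H}}.
\end{aligned}
$$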

1.2 Semidiscrete problem

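The heat equation problem (1) is discretized in time by the fully implicit
scheme

$$\frac{u^{n+1} - u^n}{\Delta t} - \operatorname{div} K \operatorname{grad} u^{n+1} = f \quad \text{on } \Omega, \qquad \alpha u^{n+1} + \beta\,(K \operatorname{grad} u^{n+1}, \mathbf{n}) = \psi \quad \text{on } \partial\Omega, \tag{7}$$

which can be written in operator form as $A u^{n+1} = F u^n$, where the
operators $A$ and $F$ are given by

$$A u = \begin{cases} u/\Delta t - \operatorname{div} K \operatorname{grad} u & \text{on } \Omega, \\ \alpha u + \beta\,(K \operatorname{grad} u, \mathbf{n}) & \text{on } \partial\Omega, \end{cases} \qquad F u = \begin{cases} u/\Delta t + f & \text{on } \Omega, \\ \psi & \text{on } \partial\Omega. \end{cases}$$

For simplicity we assume Neumann boundary conditions in the semidiscrete
problem (7), so $\alpha = 0$, $\beta = 1$. By (2) and (3),
$DGu = -\operatorname{div} K \operatorname{grad} u$ on $\Omega$ and
$DGu = (K \operatorname{grad} u, \mathbf{n})$ on $\partial\Omega$, so in this
case the global operator $A$ is given by

$$A = I/\Delta t + DG, \tag{8}$$

where the operator $I$ is the identity inside the region $\Omega$ and zero on
the boundary $\partial\Omega$, $G$ is the generalized gradient operator (2) and
$D$ is the extended divergence operator (3). Since $G$ is the adjoint operator
of $D$, $G = D^*$, one can quite easily show [4, 3] that the global operator $A$
is self-adjoint and positive definite, $A = A^* > 0$. The same conclusion can
also be reached for the case of Dirichlet and Robin boundary conditions [5].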
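To make the fully implicit time stepping concrete, here is a small Python sketch on a deliberately simplified model: the 1D heat equation $u_t - u_{xx} = f$ with $K = 1$, homogeneous Dirichlet conditions and standard central differences. This is a stand-in chosen for brevity, not the mimetic discretization of the paper; the function name and parameters are hypothetical.

```python
import numpy as np

def implicit_heat_step(u, dt, dx, f):
    """One fully implicit step of u_t - u_xx = f with u = 0 at both ends.

    u holds the values at the interior grid points only.
    """
    n = len(u)
    # A = I/dt - (discrete Laplacian): symmetric and positive definite.
    A = (np.eye(n) * (1.0 / dt + 2.0 / dx**2)
         - np.eye(n, k=1) / dx**2
         - np.eye(n, k=-1) / dx**2)
    return np.linalg.solve(A, u / dt + f)

# Hypothetical usage: march from u = 0 to the steady state of -u_xx = 2,
# whose exact solution on (0, 1) is u = x (1 - x).
x = np.linspace(0.0, 1.0, 11)[1:-1]
u = np.zeros_like(x)
for _ in range(200):
    u = implicit_heat_step(u, dt=0.1, dx=0.1, f=2.0)
```

The scheme is unconditionally stable, so a time step far larger than the explicit limit $dx^2/2$ can be used; each step then requires the solution of one symmetric positive definite linear system.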

2 Spatial discretization
We first describe the approximation of scalar and vector functions on the
given unstructured triangulation of the region $\Omega$. Triangles of the grid
are numbered by index $i$, vertices by index $j$ and edges by index $k$, with
boundary edges ordered first. The scalar function $u$ is approximated by a
piecewise constant discrete function with constant values $U_i$ inside each
triangle $i$ and constant values $U_k$ on each boundary edge $k$. The vector
heat flux function $\mathbf{w}$ is discretized by point values at the center
of each edge, namely by the projection $W_k$ of $\mathbf{w}$ onto the normal to
the edge, as shown in Fig. 1. The normal flux is continuous across the edges.
We define the space $HC$ of discrete scalar functions (piecewise constant
functions inside each triangle and on each boundary edge) with the natural
inner product

$$(U, V)_{HC} = \sum_{i=1}^{N_t} U_i V_i\,VC_i + \sum_{k=1}^{N_{eb}} U_k V_k\,S_k,$$

where $N_t$ is the number of triangles, $N_{eb}$ is the number of boundary
edges, $VC_i$ is the area of triangle $i$ and $S_k$ is the length of the
boundary edge $k$. The
space HL of discrete vector functions has the natural inner product

Fig. 1. Projection of vector function w to the edges normals at centers of the edges

Both natural inner products of discrete functions are approximations of inner


products of continuum functions defined above in section 1.1.
The traditional definition of the discrete inner product (A, B)i of vector
functions at triangle i is

(9)

part of which at the vertex j/ of the triangle i is

Such inner product gives the exact gradient of linear functions.

2.1 General vector local inner product

The general discrete inner product of vector functions at a triangle $i$ can be
defined by a symmetric positive definite matrix $M$,

$$(A, B)_i^M = (M \cdot A) \cdot B, \qquad M = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{12} & m_{22} & m_{23} \\ m_{13} & m_{23} & m_{33} \end{pmatrix}.$$

Any triangle can be transformed into the triangle with vertices $(0,0)$,
$(1,0)$, $(x, y)$ by translation, rotation and scaling. We continue the
explanation on this triangle. The Gauss theorem (5) is applied to our triangle
with a linear scalar function $u$ and an arbitrary vector function
$\mathbf{w}$. We require the inner product to define the exact gradient, which
results in a system of 6 linear equations for 6 variables, the elements
$m_{kl}$ of the matrix $M$. Only 5 equations of the system are independent and
their solution is

$$
\begin{aligned}
m_{11} &= \frac{((x-2)x + y^2 + 1)(x+1)\,s_2 + 3 m_{12} s_1 y^2}{3 s_2 y^2}, \\
m_{13} &= \frac{-(x^2 + y^2 - 1)\,s_1 s_2 + 3 m_{12} y^2}{3 s_2 y^2}, \\
m_{22} &= \frac{-(x^2 + y^2)(x-2)\,s_1 + 3 m_{12} s_2 y^2}{3 s_1 y^2}, \\
m_{23} &= \frac{-(x^2 - 2x + y^2)\,s_1 s_2 + 3 m_{12} y^2}{3 s_1 y^2}, \\
m_{33} &= \frac{((x-1)x + y^2 + 1)\,s_1 s_2 + 3 m_{12} y^2}{3 s_1 s_2 y^2},
\end{aligned}
$$

where $s_1$, $s_2$ are the lengths of the triangle edges,
$s_1 = \sqrt{(x-1)^2 + y^2}$, $s_2 = \sqrt{x^2 + y^2}$, and $m_{12}$ remains as
a free parameter.
The Sylvester criterion for positive definiteness of the matrix $M$ results in
a set of 3 inequalities which, after simplification by quantifier elimination,
reduce to the single constraint

$$m_{12} > m_{12}^{\min} = s_1 s_2 \left(1 + (x+1)(x-2)/y^2\right)/9.$$

We have thus found a family of inner products, depending on the parameter
$m_{12}$, which produce the exact gradient for linear functions. The
traditional scalar product defined by (9)-(10) belongs to this family. The free
parameter $m_{12}$ can be used to improve some properties of our numerical
algorithm, such as its accuracy and the condition number of the local matrix
$M$ [6].
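The one-parameter family and its positivity constraint can be checked numerically. The Python sketch below assembles $M$ for the reference triangle with vertices $(0,0)$, $(1,0)$, $(x,y)$ from the closed-form entries and inspects its eigenvalues on either side of the threshold. The function names are hypothetical, and the denominator $3 s_1 s_2 y^2$ used for $m_{33}$ is an assumption, chosen because it makes the determinant vanish exactly at the stated threshold.

```python
import numpy as np

def local_inner_product_matrix(x, y, m12):
    """Matrix M of the local vector inner product for a given free parameter."""
    s1 = np.hypot(x - 1.0, y)  # length of the edge from (1, 0) to (x, y)
    s2 = np.hypot(x, y)        # length of the edge from (0, 0) to (x, y)
    y2 = y * y
    m11 = (((x - 2) * x + y2 + 1) * (x + 1) * s2 + 3 * m12 * s1 * y2) / (3 * s2 * y2)
    m13 = (-(x * x + y2 - 1) * s1 * s2 + 3 * m12 * y2) / (3 * s2 * y2)
    m22 = (-(x * x + y2) * (x - 2) * s1 + 3 * m12 * s2 * y2) / (3 * s1 * y2)
    m23 = (-(x * x - 2 * x + y2) * s1 * s2 + 3 * m12 * y2) / (3 * s1 * y2)
    # Denominator of m33 is an assumption (see the lead-in above).
    m33 = (((x - 1) * x + y2 + 1) * s1 * s2 + 3 * m12 * y2) / (3 * s1 * s2 * y2)
    return np.array([[m11, m12, m13], [m12, m22, m23], [m13, m23, m33]])

def m12_min(x, y):
    """Lower bound on m12 for positive definiteness of M."""
    s1 = np.hypot(x - 1.0, y)
    s2 = np.hypot(x, y)
    return s1 * s2 * (1 + (x + 1) * (x - 2) / (y * y)) / 9

def is_positive_definite(M):
    return bool(np.linalg.eigvalsh(M).min() > 0.0)

# Hypothetical usage: an equilateral and a right-angled reference triangle.
triangles = [(0.5, np.sqrt(3.0) / 2.0), (0.0, 1.0)]
above = [is_positive_definite(local_inner_product_matrix(x, y, m12_min(x, y) + 0.01))
         for x, y in triangles]
below = [is_positive_definite(local_inner_product_matrix(x, y, m12_min(x, y) - 0.01))
         for x, y in triangles]
```

For the equilateral triangle the matrix reduces to $\tfrac{2}{3} I + m_{12}\,\mathbf{1}\mathbf{1}^T$, whose smallest eigenvalue changes sign at $m_{12} = -2/9$, in agreement with the threshold formula.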

2.2 Divergence and gradient discretization

Before proceeding to the divergence and gradient discretizations we need to
define formal inner products. The formal inner product for scalar discrete
functions is given by

$$[U, V]_{HC} = \sum_{i=1}^{N_t} U_i V_i + \sum_{k=1}^{N_{eb}} U_k V_k, \qquad (U, V)_{HC} = [\mathcal{M} U, V]_{HC},$$

and for vector discrete functions by

$$[A, B]_{HL} = \sum_{k=1}^{N_e} A_k B_k, \qquad (A, B)_{HL} = [\mathcal{L} A, B]_{HL},$$

where we have introduced the operators $\mathcal{M}$, $\mathcal{L}$ which
connect the natural and formal inner products. Note that the formal inner
products are plain sums of products of discrete values, while the natural inner
products approximate the inner products on the spaces of continuum functions.
The discrete divergence $D$ and gradient $G$ act between the spaces of discrete
functions as $D : HL \to HC$, $G : HC \to HL$. The divergence Green formula (4)
can be written in the discrete case as $(DW, 1)_{HC} = 0$, and when applied to
one triangle $i$ it gives the discretization of the divergence

$$(DW)_i = \frac{1}{VC_i} \sum_{l=1}^{3} W_{k_l^i}\, S_{k_l^i}\, \operatorname{sign}(j_{l+1}^i - j_l^i)$$

inside the triangle $i$ (the factor $\operatorname{sign}(j_{l+1}^i - j_l^i)$
distinguishes the unique direction on the edge $k_l^i$ connecting the vertices
$j_{l+1}^i$ and $j_l^i$ [4]) and $(DW)_k = -W_k n_k$ on the boundary edge $k$.
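The discrete divergence on one triangle can be sketched in Python as follows. The orientation convention is an assumption made for this illustration: each edge carries a unique unit normal obtained by rotating the direction from its lower-numbered to its higher-numbered vertex clockwise by 90 degrees, and triangles list their vertices counter-clockwise, so that the sign factor turns the edge normal into the outward normal. The function names are hypothetical.

```python
import numpy as np

def edge_normal_flux(w, xa, xb):
    """W_k: normal component of the field w at the midpoint of edge (a, b)."""
    t = xb - xa
    n = np.array([t[1], -t[0]]) / np.linalg.norm(t)  # unique normal of the edge
    return float(np.dot(w(0.5 * (xa + xb)), n))

def discrete_divergence(tri, coords, w):
    """(DW)_i for one counter-clockwise triangle of global vertex indices."""
    p = [np.asarray(coords[j], float) for j in tri]
    d1, d2 = p[1] - p[0], p[2] - p[0]
    area = 0.5 * (d1[0] * d2[1] - d1[1] * d2[0])
    total = 0.0
    for l in range(3):
        ja, jb = tri[l], tri[(l + 1) % 3]
        lo, hi = min(ja, jb), max(ja, jb)
        xa = np.asarray(coords[lo], float)
        xb = np.asarray(coords[hi], float)
        W = edge_normal_flux(w, xa, xb)
        S = np.linalg.norm(xb - xa)
        total += W * S * np.sign(jb - ja)  # sign selects the outward direction
    return total / area

# Hypothetical usage on a single triangle.
coords = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.8)]
div_const = discrete_divergence((0, 1, 2), coords, lambda r: np.array([1.0, -2.0]))
div_linear = discrete_divergence((0, 1, 2), coords, lambda r: np.array([r[0], r[1]]))
```

A constant field gives zero divergence because the outward normals weighted by edge lengths sum to zero, and the field $(x, y)$ gives exactly 2 because the midpoint value integrates a linear normal flux exactly.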
The Gauss theorem (6) can be rewritten in the discrete case as
$(DW, U)_{HC} = (W, GU)_{HL}$, so that the discrete gradient is the adjoint of
the discrete divergence, $G = D^*$. When we transform this equality of inner
products into formal inner products using the operators $\mathcal{L}$ and
$\mathcal{M}$ we get

$$[\mathcal{M}\, DW, U]_{HC} = [\mathcal{L}\, W, GU]_{HL}.$$

Now the formal adjoint $D^\dagger$ can be constructed [4], and to get the
gradient $W = GU$ (which is the natural adjoint $D^* U$) of the scalar grid
function $U$ the system

$$\mathcal{L} W = D^\dagger \mathcal{M} U \tag{11}$$

has to be solved. The gradient constructed in this way, as the adjoint of the
divergence, has a global stencil.
The discrete approximation of the global operator (8) is symmetric and
positive definite, and for its inversion, which is needed in each time step of
the implicit method, we employ the conjugate gradient method. The numerical
gradient needed in every iteration of the conjugate gradient method is
obtained by solving (11) for $W$ with the standard Gauss-Seidel method. Our
method is exact for piecewise constant or piecewise linear solutions;
otherwise it is second order accurate.

2.3 Boundary conditions

The mimetic discretization outlined above handles different kinds of boundary
conditions, namely Dirichlet, Neumann and Robin ones. Each type of boundary
condition is treated differently. For Dirichlet boundary conditions the value
of the scalar function $u$ on the boundary is known, the gradient on the
boundary is computed and the boundary conditions are fulfilled exactly. For
Neumann boundary conditions the gradient on the boundary is known, we do not
solve for it, and the boundary flux (gradient) is moved to the right hand side
of the global system, which is the discretization of the semidiscrete system
(7). The value of the scalar function $u$ on the Neumann boundary is not needed
and again the boundary conditions are fulfilled exactly.
The situation is different for Robin boundary conditions, where both the value
of the scalar function $u$ and the boundary flux (gradient of $u$) are unknown
on the boundary. The discrete form of the boundary conditions in (7) is
included in the global discrete system, and the boundaries with Robin
conditions are included in the scalar products used for computing the
conjugate gradient coefficients and residuals. More details on the numerical
treatment of boundary conditions by mimetic methods can be found in [5].

3 Numerical tests

In this section we provide several tests of the developed mimetic method. For
all tests we use the initial condition $u = 0$ everywhere and compute until a
time sufficiently large for the solution to reach the steady state. The exact
discrete solutions used in the error evaluation are given by the point values
of the exact continuous solution at the centroid (the average of the vertices)
of each triangle.

3.1 Piecewise linear and quadratic tests

Both the piecewise linear and the piecewise quadratic tests are solved on the
region $(x, y) \in (0,1) \times (0,1)$ with a discontinuous piecewise constant
diffusion coefficient

$$k = \begin{cases} k_1, & 0 < x < 0.5, \\ k_2, & 0.5 < x < 1. \end{cases}$$

We solve these problems for the particular values $k_1 = 1$, $k_2 = 2$ of the
diffusion coefficient until time $t = 10$, when the solution has converged to
the stationary one. The triangulation is constructed so that the whole
discontinuity line $x = 0.5$ is covered by edges, so that the diffusion
coefficient is constant inside each triangle.
The stationary exact solution of the piecewise linear test, coming e.g. from
[7, 3], is a piecewise linear function. The maximal numerical errors for this
test are shown in Table 1 (a) and are close to machine precision, showing that
our method is exact for piecewise linear solutions.
The stationary exact solution of the piecewise quadratic test, coming e.g.
from [8, 3], is a piecewise quadratic function with coefficients
$a_i = -1/k_i$, $b_1 = (3a_2 + a_1)k_2/(4(k_1 + k_2))$, $b_2 = k_2 b_1/k_1$,
$c_2 = -b_2 - a_2/2$. The convergence analysis for this test is presented by
maximum errors and numerical orders of convergence in Table 1 (a) and confirms
that our method is second order accurate for non-linear solutions, even in the
case of discontinuous diffusion coefficients.

3.2 Anisotropic triangulation

One of our aims was to develop a method that also works well on bad quality,
rather distorted triangular grids, which appear quite often in Lagrangian
meshes moving with the fluid. To show how our mimetic method works on bad
quality grids including triangles with large angles, we choose the initial
grid on the region $(x, y) \in (-1,1) \times (0,1)$ as shown in Fig. 2 (a) and
stretch this grid by a parameter $a$, producing bad quality grids on the region
$(x, y) \in (-a, a) \times (0,1)$. The initial grid stretched by the parameter
$a = 5$ is shown in Fig. 2 (b). We solve one problem on a series of grids
obtained by stretching with increasing parameter $a$.

Fig. 2. Grid used for stretching the triangulation; (a) for parameter $a = 1$,
i.e. $x \in (-1,1)$, (b) for parameter $a = 5$, i.e. $x \in (-5,5)$

The problem with anisotropic triangulation is the heat equation (1) with
diffusion coefficient $K = 1$, right-hand side $f = -2/a^2$, zero Dirichlet
boundary conditions on the left and right boundaries and zero Neumann boundary
conditions on the top and bottom boundaries. The stationary solution of this
problem is $u = x^2/a^2 - 1$. The problem is solved until time $t = 10a^2$, when
the solution reaches the stationary state. We compare the results of our mimetic
support operator method with those of the standard linear finite element method,
for which we use the implementation in the Partial Differential Equation Toolbox
in Matlab. The comparison is presented by maximal errors in Table 1 (b) for the
stretching parameter $a$ growing from 1 to 10 000. Our mimetic method keeps its
accuracy for all values of $a$, while the finite element method is losing
accuracy already for $a = 100$. The minimal value of the numerical solution,
which should be minus one, is also presented in Table 1 (b). One can see that
the finite element method does not resolve this minimum for large $a$: its
solution remains flat and very close to the initial condition $u = 0$. The
reason for this behavior of the finite element method is that, for large $a$,
linear interpolation introduces zig-zagging in the $y$ direction (there are no
edges close to parallel to the $y$ axis) for a solution with curvature in the
$x$ direction. This zig-zagging consumes too much of the overall energy, and the
parabola in the $x$ direction is not resolved well.
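
The stationary profile quoted for this test can be checked in one dimension. The following is a minimal sketch, assuming that equation (1) has the form $u_t = \nabla\cdot(K\nabla u) + f$ (its statement lies outside this section), so that the steady problem reduces to $u'' = 2/a^2$ on $(-a, a)$ with $u(\pm a) = 0$. The name `steady_profile` and the grid size are illustrative only; this is a plain central-difference solve, not the mimetic scheme of the paper.

```python
def steady_profile(a, n=41):
    """Solve u'' = 2/a^2 on (-a, a), u(-a) = u(a) = 0, by central differences."""
    h = 2.0 * a / (n - 1)
    x = [-a + i * h for i in range(n)]
    m = n - 2                          # number of interior unknowns
    rhs = [2.0 / a**2 * h * h] * m     # h^2 * u'' at the interior nodes
    # Thomas algorithm for the tridiagonal system with stencil (1, -2, 1)
    c = [0.0] * m                      # modified super-diagonal
    d = [0.0] * m                      # modified right-hand side
    c[0] = 1.0 / -2.0
    d[0] = rhs[0] / -2.0
    for i in range(1, m):
        denom = -2.0 - c[i - 1]
        c[i] = 1.0 / denom
        d[i] = (rhs[i] - d[i - 1]) / denom
    u = [0.0] * n                      # boundary values stay zero
    u[m] = d[m - 1]
    for i in range(m - 2, -1, -1):
        u[i + 1] = d[i] - c[i] * u[i + 2]
    return x, u


x, u = steady_profile(5.0)
max_err = max(abs(ui - (xi**2 / 25.0 - 1.0)) for xi, ui in zip(x, u))
```

Because central differences are exact for quadratics, `max_err` is at the level of rounding error and `min(u)` equals minus one up to rounding, independently of the stretching parameter in this one-dimensional setting.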

Table 1. (a) Convergence tables of maximum errors $E_{max}$ for the piecewise
linear and piecewise quadratic tests; for the quadratic test the numerical order
of convergence $q$ is also shown. (b) Maximum errors and minima of the numerical
solution of the problem with stationary solution $u = x^2/a^2 - 1$ on grids
stretched by the parameter $a$, for the mimetic support operator method (MSOM)
and the standard linear finite element method (FEM).

(a)
Nr. of      piecewise linear test   piecewise quadratic test
triangles   Emax                    Emax         q
126         1.4 · 10^(?)            0.0024       1.96
504         2.5 · 10^-11            0.00063      1.98
2016        8.9 · 10^-12            0.00016      1.99
8064        9.4 · 10^-12            0.000040

(b)
            MSOM                    FEM
a           Emax      min(u)        Emax      min(u)
1           0.011     -1.0087       0.0039    -1.0028
10          0.0055    -1.0032       0.072     -0.95
100         0.0055    -1.0031       0.88      -0.12
1000        0.0055    -1.0032       1.0       -0.0013
10000       0.0074    -1.0034       1.0       -0.00001
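
The numerical order of convergence can be estimated from the errors on two successive grids. The following is a minimal sketch, assuming that each refinement in Table 1(a) halves the mesh size (the number of triangles grows by a factor of four) and that $q = \log_2(E_h/E_{h/2})$; the paper does not state its formula for $q$, so this definition and the name `convergence_orders` are assumptions. Because the tabulated errors are rounded to two digits, the values obtained this way need not reproduce the printed $q$ column exactly.

```python
import math


def convergence_orders(errors, refinement=2.0):
    """Estimate q_i = log(E_i / E_{i+1}) / log(refinement) on successive grids."""
    return [math.log(e0 / e1) / math.log(refinement)
            for e0, e1 in zip(errors, errors[1:])]


# Rounded maximum errors of the piecewise quadratic test, Table 1(a)
e_max = [0.0024, 0.00063, 0.00016, 0.000040]
orders = convergence_orders(e_max)
```

All three estimates lie close to two, in line with the second-order accuracy reported for the quadratic test.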

Acknowledgment: This research has been partially supported by the International
Bureau of the BMBF and the Czech Ministry of Education in the framework of the
German-Czech joint research project nr. CZE-00-010 and by the project nr.
ME 436/2001 from the program Kontakt of the Czech Ministry of Education. The
work of M. Shashkov was performed under the auspices of the US Department of
Energy under contract W-7405-ENG-36.

References

1. M. Shashkov. Conservative Finite-Difference Methods on General Grids. CRC
Press, Boca Raton, Florida, 1996.
2. L. Margolin, M. Shashkov, and P. Smolarkiewicz. A discrete operator calculus for
finite difference approximations. Comput. Methods Appl. Mech. Engrg., 187:365-
383, 2000.
3. M. Shashkov and S. Steinberg. Solving diffusion equation with rough coefficients
in rough grids. J. Comp. Phys., 129:383-405, 1996.
4. V. Ganzha, R. Liska, M. Shashkov, and C. Zenger. Support operator method for
Laplace equation on unstructured triangular grid. Selcuk Journal of Applied
Mathematics, 3(1):21-48, 2002.
5. J.M. Hyman and M. Shashkov. Approximation of boundary conditions for mimetic
finite-difference methods. Computers Math. Applic., 36:79-99, 1998.
6. R. Liska, M. Shashkov, and V. Ganzha. Analysis and optimization of inner
products for mimetic finite difference methods on triangular grid. Mathematics
and Computers in Simulation, Accepted, 2004.
7. J.M. Morel, J.E. Dendy Jr, M.L. Hall, and S.W. White. A cell-centered Lagrangian-
mesh diffusion differencing scheme. J. Comp. Phys., 103, 1992.
8. R.J. MacKinnon and G.F. Carey. Analysis of material interface discontinuities
and superconvergent fluxes in finite difference theory. J. Comp. Phys., 75, 1988.
On Computational Glaciology:
FE-Simulation of Ice Sheet Dynamics

Günter Gödert 1 and Franz-Theo Suttmeier 2

1 Guenter. [email protected]
2 University Dortmund [email protected]

Summary. The main focus of this paper is on stable FE-discretisations for
treating systems of partial differential equations arising in glaciology. The
systems are coupled ones, consisting of a flow problem determining stress,
pressure and velocity and evolution problems for temperature and mean
orientation densities, describing anisotropic material behaviour. The proposed
strategies are applied to a standard model for describing ice sheet dynamics and
an enhanced one, taking into account the development of certain fabrics in the
structure of the ice.

1 Introduction
Climate, climate history and climate forecasting have become more and more
important. If one wants to understand changes of climate in the past and to
make predictions for the future, the climate of an ice age has to be studied.
In the context of climate simulations the flow of polar ice masses represents
an essential part. Hence there is a need for appropriate climate boundary
conditions, see Greve et al. [11], Huybrechts [13], Fabre et al. [3].
Furthermore, there is a need for an appropriate description of the
thermo-mechanical material behaviour. For instance, simulations of future
climate need reliable constitutive relations to generate reliable predictions
of the global hydro-balance. In addition, solutions to problems arising in
cold-region structural engineering, e.g. Calov et al. [1], often become more
reliable through such flow simulations. Moreover, because it may record the
past history of ice and climatic changes, and because it is sensitive to the
deformation history of ice sheets, the microstructure of polar ice is worth
studying.
The growth and retreat of inland ice masses is governed by the snowfall
onto the surface, the melting and calving of the ice close to and at the outer
ice boundaries. Owing to its own weight, the ice deforms with velocities of
typically 100 metres per year, causing a transport of ice towards the ice sheet
boundaries where the ice melts and calves. This process, in turn, is influenced
by the temperature distribution within the ice, implying a delicate balance be-
tween the thermal and mechanical regimes that are established by the climate
input and the geothermal conditions of the substrate. The thermomechanically
coupled ice dynamics together with the mass flux due to snowfall and mass
loss in the vicinity of ice boundaries determine the thickness distribution of
a particular ice sheet.

The deformation of an ice sheet and the variation of its temperature
distribution depend to a large extent on its thermomechanical constitutive
modelling. Here, we treat ice as a rheologically nonlinear, thermally coupled,
viscous fluid, i.e., we assume its fluidity (inverse viscosity) to be stress-
and temperature-dependent, the former according to a power law with exponent 3,
the latter essentially following an Arrhenius-type relationship.
Ice sheets are cold, i.e. below their melting point, except at parts of their
base, where the temperature may reach the melting point. For instance, this
can be concluded from radio-echo sounding of sub-Antarctic lakes (cf. Oswald
and Robin [16]). The usual fluid no-slip boundary conditions only apply when
basal ice is cold, but at the melting temperature, ice can slide over its base
(see Paterson [17]). In Fowler [4] it is mentioned that several models are based
on the assumption of non-zero sliding velocity, and in some cases this is even
required in order to obtain a solution (see Morland and Johnson [15] and
Hutter et al. [12]). On the other hand, basal topography in Antarctica is so
rough that for the sliding law we should expect $v \approx 0$ (cf. Paterson [17],
Richardson [18]).
The underlying mathematical model is a coupled system of partial differ-
ential equations for describing the distributions of velocities, temperature and
evolution of the geometry of the ice sheets. This system is solved numerically
by employing the Finite Element (FE) method. The choice of an appropriate
discretisation is involved to some extent:
1) assuming the temperature to be known, we have to choose stable elements
for the saddle-point problem determining stress, pressure and velocity simul-
taneously.
2) assuming the velocity field to be known, we have to choose stable elements
for the convection-diffusion problem determining the temperature.
The standard fluid model mentioned above is necessarily isotropic and thus
cannot describe the stress-induced anisotropies evident in specimens from
boreholes. The extreme anisotropy of the ice single crystal leads to
heterogeneous intra-granular deformation modes within the polycrystal and hence
to the development of a certain fabric. The expectation is that the climate
becomes to some extent reconstructible from analysing ice-core textures, e.g.,
Thorsteinsson [19], in combination with the numerical solution of the ice-sheet
flow problem. Therefore, considerations for an enhanced model are based on two
geometric scales.

2 Mathematical model
The ice in large ice masses is generally polythermal, i.e., the ice mass consists
of disjoint regions in which the ice is either cold (i.e., its temperature is below
the melting point) or temperate (i.e., it is at the pressure melting point),
but except for a few recent cases theoretical formulations are restricted to
cold ice. For such a case the continuum-mechanical postulate that ice is a slow,
gravity-driven, incompressible, heat-conducting, nonlinear viscous fluid yields
the following balance laws of mass, momentum and temperature as well as
constitutive relations:

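$$\begin{aligned}
C\sigma - \varepsilon(v) &= 0 && \text{(Material)} \\
-\operatorname{div} v &= 0 && \text{(Incompressibility)} \\
-\operatorname{div}\sigma + \nabla p - f &= 0 && \text{(Equilibrium)} \\
\dot{T} - \Delta T - \sigma\varepsilon(v) &= 0 && \text{(Temperature)}
\end{aligned}$$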

which hold on a bounded domain $\Omega$ in $\mathbb{R}^2$. The right-hand side
$f = (0, -g)^T$ is defined by the gravity force $g$. $C$ describes the
constitutive relation between the deviatoric stress $\sigma$ and the strain rate
$\varepsilon(v) = (\nabla v + (\nabla v)^T)/2$ and may be identified with the
4th-order fluidity tensor. The velocity field is denoted by $v = (v_1, v_2)^T$,
the related pressure by $p$ and the temperature by $T$.
Usually the relation $C\sigma = \varepsilon(v)$ is given by

$$\varepsilon(v) = A(T)\,G(\sigma)\,\sigma,$$

where $A(T)$ denotes the Arrhenius law and $G(\sigma) = |\sigma|^2$ according to Glen.
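
The constitutive relation above can be evaluated pointwise. The following is a minimal sketch, assuming the common Arrhenius form $A(T) = A_0 \exp(-Q/(RT))$ and the Frobenius norm for $|\sigma|$; neither is specified in the text, and the values of `A0` and `Q` as well as the function names are placeholders chosen for illustration only.

```python
import math


def arrhenius(T, A0=1.0, Q=6.0e4, R=8.314):
    """Rate factor A(T) = A0 * exp(-Q / (R T)); A0 and Q are placeholder values."""
    return A0 * math.exp(-Q / (R * T))


def strain_rate(sigma, T):
    """epsilon = A(T) * |sigma|^2 * sigma for a symmetric 2x2 deviatoric stress
    given as (s11, s22, s12); |sigma| is taken as the Frobenius norm."""
    s11, s22, s12 = sigma
    norm2 = s11**2 + s22**2 + 2.0 * s12**2
    factor = arrhenius(T) * norm2
    return (factor * s11, factor * s22, factor * s12)


eps_1 = strain_rate((1.0, -1.0, 0.5), 260.0)
eps_2 = strain_rate((2.0, -2.0, 1.0), 260.0)   # doubled stress
```

Since the law is cubic in the stress, doubling $\sigma$ multiplies every component of the strain rate by eight, which is the power-law behaviour with exponent 3 mentioned in the introduction.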
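The basis for applying the Finite Element (FE) method to the classical
system is the formulation in the variational setting

$$\begin{aligned}
(C\sigma, \tau) - (\varepsilon(v), \tau) &= 0, \\
(\operatorname{div} v, q) &= 0, \\
(\sigma, \varepsilon(\varphi)) - (p, \operatorname{div}\varphi) &= (f, \varphi),
\end{aligned}$$

and

$$(\partial_t T, w) + ((v\cdot\nabla)T, w) + (\nabla T, \nabla w) = (\sigma\varepsilon(v), w),$$

for arbitrary $\tau, q, \varphi, w$ chosen from suitable function spaces described below.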
In order to introduce a more compact setting, we define

$$U := \{\sigma, p, v, T\}, \qquad \Phi := \{\tau, q, \varphi, w\}. \quad (1)$$

With these definitions, we consider the problem of finding
$U \in \mathcal{V} := E \times Q \times V \times L$ with
$\mathcal{V} \subset (L^2)^3 \times L^2 \times (H^1)^2 \times H^1$, fulfilling

$$B(U;\Phi) = 0 \quad (2)$$

with

$$\begin{aligned}
B(U;\Phi) := {} & (C\sigma,\tau) - (\varepsilon(v),\tau) - (\operatorname{div} v, q) \\
& + (\sigma,\varepsilon(\varphi)) - (p,\operatorname{div}\varphi) - (f,\varphi) \\
& + (\partial_t T, w) + ((v\cdot\nabla)T, w) + (\nabla T,\nabla w) - (\sigma\varepsilon(v), w).
\end{aligned}$$

Here and in what follows, $(\cdot,\cdot)$ represents the $L^2$ inner product on a
bounded domain $\Omega$ in $\mathbb{R}^2$ and $\|\cdot\|$ the corresponding norm.
Furthermore, $H^m = H^m(\Omega)$ denotes the standard Sobolev space of
$L^2$-functions with derivatives in $L^2(\Omega)$ up to the order $m$, and
$H^1_0 \subset H^1$ is the subspace of $H^1$-functions vanishing on
$\Gamma := \partial\Omega$.

2.1 Boundary conditions

Sliding (basal slip), as opposed to fabric or temperature-enhanced basal shear,
certainly occurs on ice sheets, particularly at the margins. We consider it
unlikely to occur where basal ice is cold and where shear stresses are close to
zero. In addition, large-scale basal roughness may mean that any sliding which
does occur inland will be very small. Following these remarks, we consider in
the test configurations sketched in Figure 1 no-slip boundary conditions, i.e.
$v_1 = v_2 = 0$ on $\Gamma_x$, and, exploiting symmetry, $v_1 = 0$ on $\Gamma_y$,
for the components of the velocity field $v = (v_1, v_2)^T$.
Furthermore, we have free boundary conditions for the temperature $T$ on
$\Gamma_x$ and $\Gamma_y$ (due to symmetry) and for $v$ on
$\Gamma_s := \partial\Omega \setminus (\Gamma_x \cup \Gamma_y)$. Moreover, $T$
is prescribed on $\Gamma_s$ by data from climate input.
Now, we calculate the deformation under the assumption of isotropic material
behaviour. A plot of the velocity field is depicted in Figure 1. We observe
that $\Gamma_s$ is divided into two parts, the inflow and the outflow boundary,
defined by

$$\Gamma_- = \{x \in \Gamma_s \mid v\cdot n < 0\}, \quad (3)$$
$$\Gamma_+ = \{x \in \Gamma_s \mid v\cdot n \ge 0\}, \quad (4)$$

where $n$ denotes the outward unit normal of $\Gamma_s$. This notation will be
used at the end of Section 4.
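
The splitting (3)-(4) of the surface boundary into inflow and outflow parts depends only on the sign of $v\cdot n$. The following is a minimal sketch for sampled boundary points; the function name `split_boundary`, the point-wise representation of the boundary and the constant test velocity are assumptions made for illustration and are not taken from the DEAL-based implementation.

```python
def split_boundary(points, normals, velocity):
    """Split boundary sample points into inflow (v.n < 0) and outflow (v.n >= 0)."""
    inflow, outflow = [], []
    for x, n in zip(points, normals):
        v = velocity(x)
        vn = v[0] * n[0] + v[1] * n[1]     # normal component of the velocity
        (inflow if vn < 0.0 else outflow).append(x)
    return inflow, outflow


# Three points of the unit circle, where the outward normal equals the point
pts = [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
gamma_minus, gamma_plus = split_boundary(pts, pts, lambda x: (1.0, 0.0))
```

With the uniform velocity $(1, 0)$ only the left-most point belongs to the inflow part; the point with $v\cdot n = 0$ is assigned to the outflow part, in agreement with (4).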

3 Discretisation
The full discretisation for the system (2) is derived in two steps. First, we
perform a discretisation with respect to the time variable, yielding a sequence
of problems continuous with respect to the space variable. In the second step,
these problems are approximated by the finite element method.

3.1 Semi-discretisation

For the discretisation, the time interval $[0, t_M]$ is decomposed like
$0 = t_0 < t_1 < \dots < t_M$ into subintervals $I_m := (t_{m-1}, t_m]$ of
length $k_m := t_m - t_{m-1}$.

Fig. 1. Sketch of computational domains for the test examples including structure
of the FE-meshes for the benchmark problem

Integrating in (2) over $I_m$ and approximating the integrals by quadrature
formulas of the type

$$\int_{I_m} w(t)\,dt \approx k_m\{\alpha w^m + (1-\alpha) w^{m-1}\}, \quad (5)$$

with some $\alpha \in (0,1]$, yields the time-discrete schemes. The choice
$\alpha = 1$ corresponds to the backward Euler scheme, while for $\alpha = 1/2$
we obtain the Crank-Nicolson scheme. Here, we only consider the simple Euler
scheme, which reads

$$\begin{aligned}
0 = B(U^m;\Phi) := {} & (T^m - T^{m-1}, w) \\
& + (C\sigma^m,\tau) - (\varepsilon(v^m),\tau) - (\operatorname{div} v^m, q) \\
& + (\sigma^m,\varepsilon(\varphi)) - (p^m,\operatorname{div}\varphi) - (f^m,\varphi) \\
& + k_m\big(((v^m\cdot\nabla)T^m, w) + (\nabla T^m,\nabla w) - (\sigma^m\varepsilon(v^m), w)\big).
\end{aligned} \quad (6)$$

In what follows the superscript $m$ is omitted.
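
The effect of the quadrature parameter in (5) can be illustrated on a scalar model problem. The following is a minimal sketch, assuming the test equation $w' = -\lambda w$ (a stand-in chosen here, not a problem from the paper) and uniform steps; the names `theta_step` and `integrate` are illustrative.

```python
import math


def theta_step(w_old, k, lam, alpha):
    """One step of w' = -lam * w with the rule
    w_new - w_old = k * (alpha * f(w_new) + (1 - alpha) * f(w_old))."""
    return w_old * (1.0 - (1.0 - alpha) * k * lam) / (1.0 + alpha * k * lam)


def integrate(alpha, steps, lam=1.0, t_end=1.0, w0=1.0):
    """Advance from t = 0 to t_end with a uniform step k = t_end / steps."""
    k = t_end / steps
    w = w0
    for _ in range(steps):
        w = theta_step(w, k, lam, alpha)
    return w


exact = math.exp(-1.0)
err_euler = abs(integrate(1.0, 100) - exact)   # alpha = 1: backward Euler
err_cn = abs(integrate(0.5, 100) - exact)      # alpha = 1/2: Crank-Nicolson
```

Halving the step roughly halves the backward Euler error and quarters the Crank-Nicolson error, reflecting first- and second-order accuracy in time, respectively.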

3.2 Nonlinear solution process

In this section, we describe the algorithm (see e.g. Geiger and Kanzow [7]) we
employ to solve the problems arising in (6), which have the general structure

$$B(U;\Phi) = 0. \quad (7)$$

1. Calculate the correction $D^j$ by solving a linear problem

(8)

2. Perform a damped update

(9)
where $\alpha^j$ is chosen s.t.

(10)
with a constant /j ;:::: O.
3. Set j = j + 1 and goto 1.
Remark: The Newton scheme is defined by the choice

(11)

where $B'(U^j;\cdot,\cdot)$ is the derivative of $B(\cdot\,;\cdot)$ in $U^j$.


The problem $B(U;\Phi) = 0$ has, in the cases under consideration, the special form

$$B(U)U - F(U) = 0,$$

omitting the argument $\Phi$ to simplify notation. A full Newton scheme is
determined by

$$L_B(D) = \partial_U\big(B(U)U - F(U)\big)D = B(U)D + B'(U)DU - F'(U)D.$$

Neglecting the last two terms, our nonlinear iteration is given by

$$U^{j+1} = U^j - \alpha^j B^{-1}(U^j)\big(B(U^j)U^j - F(U^j)\big)
= (1-\alpha^j)U^j + \alpha^j B^{-1}(U^j)F(U^j),$$

which allows for the interpretation as a damped variant of Kachanov's method
(freezing of coefficients).
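
The damped fixed-point iteration above can be tried on a scalar problem of the same structure $B(U)U = F(U)$. The following is a minimal sketch with a fixed damping parameter; in the paper $\alpha^j$ is selected by the criterion (10), which is not reproduced here, and the model problem $(1 + u^2)u = 2$ as well as the name `damped_kachanov` are assumptions made for illustration.

```python
def damped_kachanov(B, F, u0, alpha=0.5, tol=1e-12, max_iter=200):
    """Iterate u <- (1 - alpha) * u + alpha * F(u) / B(u) for scalar B(u) u = F(u)."""
    u = u0
    for j in range(max_iter):
        # Coefficient B is frozen at the current iterate, then the update is damped
        u_new = (1.0 - alpha) * u + alpha * F(u) / B(u)
        if abs(u_new - u) < tol:
            return u_new, j + 1
        u = u_new
    return u, max_iter


# Hypothetical model problem: (1 + u^2) u = 2 with solution u = 1
u_star, iterations = damped_kachanov(lambda u: 1.0 + u * u, lambda u: 2.0, u0=0.2)
```

For this model problem the undamped map $u \mapsto 2/(1+u^2)$ has derivative $-1$ at the solution, so it is not a contraction there, whereas the damped map with $\alpha = 1/2$ has derivative zero at $u = 1$ and converges quickly.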

3.3 Spatial discretisation

In order to obtain approximate solutions of the time-discrete problems (6) we
will apply the finite element method on decompositions
$\mathbb{T}_h = \{T_i \mid 1 \le i \le N_h\}$ of $\Omega$ consisting of
quadrilateral elements $T$, satisfying the usual condition of shape regularity.
The width of the mesh $\mathbb{T}_h$ is characterised in terms of a piecewise
constant mesh size function $h = h(x) > 0$, where
$h_T := h|_T = \operatorname{diam}(T)$. With these notations the discrete
solutions $U_h$ of (6) are defined by

$$B(U_h;\Phi) = 0 \quad \forall \Phi \in \mathcal{V}_h, \quad (12)$$

where stable discretisations are established by a cell-wise constant pressure
approximation and discontinuous and continuous bilinear functions ($Q_1$) for
stress and velocity, respectively. Temperature is approximated by continuous
$Q_1$-elements as well, where the corresponding transport-dominated equation
requires further stabilisation, which is provided by the streamline diffusion
method. For a detailed discussion of this method we refer to the textbook
by C. Johnson [14].

3.4 Linear solution process

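1) Assuming the temperature to be known, in each nonlinear iteration step we
have to solve a linear system of the form

$$\begin{pmatrix} A & 0 & -B^T \\ 0 & 0 & C^T \\ -B & C & 0 \end{pmatrix}
\begin{pmatrix} \sigma \\ p \\ v \end{pmatrix} =
\begin{pmatrix} 0 \\ 0 \\ r \end{pmatrix}$$

with the relations

$$A \sim (C(\sigma)\sigma, \tau), \quad B \sim (\sigma, \varepsilon(\varphi)), \quad
C \sim (p, \operatorname{div}\varphi), \quad r \sim -(f, \varphi).$$

Identifying $U = (\sigma, p)^T$, $P = v$, $b = 0$, $c = r$ and

$$\mathcal{A} = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}, \quad \mathcal{B} = (-B, C),$$

we have to treat a general discrete saddle point problem of the form

$$\begin{pmatrix} \mathcal{A} & \mathcal{B}^T \\ \mathcal{B} & 0 \end{pmatrix}
\begin{pmatrix} U \\ P \end{pmatrix} = \begin{pmatrix} b \\ c \end{pmatrix},$$

which can be done by employing augmented Lagrange algorithms (see e.g.
Glowinski [8]).
2) Assuming the velocity field to be known, in each nonlinear iteration step
we have to solve an unsymmetric linear system determining the temperature.
This can be done by using a standard BiCGSTAB algorithm.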

4 Enhanced Model
An important by-product of the simulation of the dynamics of large ice masses
is the determination of the age of the ice as a function of its position where it
is located today. Hereby, the age of an ice particle is defined as the time that
elapsed since it fell as a snowflake on the free surface. The problem of the ice
age-depth correlation has indeed become a central question in the reconstruction
of the past climate from many ice cores in Greenland and Antarctica. The
thermomechanically coupled ice-sheet models yield this information on the ice
motion through spatial and temporal integration; however, the resulting motion
depends on the underlying constitutive behaviour, usually assumed to obey
an isotropic, non-linearly viscous flow law, as described above.
More precisely, such laws should be based upon the microscopic structure
of the material, which is for ice, as for metals, a polycrystalline aggregate
consisting of hexagonal single crystal grains. The ice in glaciers is not isotropic;
at closer examination it is seen to be built by a large number of differently
oriented, strongly anisotropic ice crystals. As structural elements one may
distinguish two building stones defining the crystal: first the basal plane along
which the crystals may relatively easily slide, and second, the unit normal
vector perpendicular to it, defining the so-called c-axis.
The anisotropy of ice crystallites is given by these axes. The fine scale is
given by the crystallites' orientation defined on the sphere $S^2$, whereas the
large scale is to be identified with the space of daily experience, say
$\mathbb{R}^d$. Following Gödert [10], the fine-scale structure is considered
via the second-order structure tensor $A$, denoting mean orientation densities
and yielding an anisotropic material behaviour.
Therefore our classical system is enhanced by

$$\dot{A} = E[v, A]. \quad (13)$$

From now on, $C$ depends on the orientation of the crystallites the ice consists
of. The corresponding evolution of $A$ is determined by $E[v, A]$. Following
Gödert [10], $E$ is given by

$$E[v, A] = \big((\alpha_1 - 1) I_{dev} - 2\alpha_1 P_a\big)\varepsilon(v)
+ \lambda_0 (Id - (d + 1)A) + WA - AW,$$

where $W = (\nabla v - (\nabla v)^T)/2$. The value $\alpha_1$ defined by

$$\alpha_1 := \alpha\,\frac{d}{d-1}\Big(I_{A^2} - \frac{1}{d}\Big) \quad (14)$$

denotes a measure of alignment: $\alpha_1 \to 0$ for randomly distributed
c-axis orientations, whereas $\alpha_1 \to \alpha$ if all c-axes are parallel.
Furthermore, $\alpha = 1.2$ is determined via field data (cf. Gödert [9]) and
$I_{A^2}$ denotes the inner product $I_{A^2} := (Id, A^2)$. The macro-space
dimension is given by $d$, and $\lambda_0$ controls the diffusion due to
recrystallisation, e.g. Gödert [10].
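
The two limiting cases of the alignment measure can be checked directly. The following is a minimal sketch, reading (14) as $\alpha_1 = \alpha\,\frac{d}{d-1}(I_{A^2} - \frac{1}{d})$ and assuming that the inner product $(Id, A^2)$ is the trace of $A^2$; the function name `alignment` and the two sample tensors are illustrative.

```python
def alignment(A, alpha=1.2):
    """alpha_1 = alpha * d / (d - 1) * (I_{A^2} - 1/d), with I_{A^2} = trace(A A)."""
    d = len(A)
    trace_A2 = sum(A[i][j] * A[j][i] for i in range(d) for j in range(d))
    return alpha * d / (d - 1) * (trace_A2 - 1.0 / d)


isotropic = [[0.5, 0.0], [0.0, 0.5]]   # randomly distributed c-axes in d = 2
aligned = [[1.0, 0.0], [0.0, 0.0]]     # all c-axes parallel to the x-axis
alpha_iso = alignment(isotropic)
alpha_par = alignment(aligned)
```

Under these assumptions the isotropic tensor $A = Id/d$ gives $\alpha_1 = 0$ and the fully aligned tensor gives $\alpha_1 = \alpha = 1.2$, matching the limits stated in the text.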
The operator $((\alpha_1 - 1) I_{dev} - 2\alpha_1 P_a)$ is realised as a
3 x 3-matrix, applied to the vector

$P_a$ is symmetric and in this note given by

with $c = a_{12}(2a_{22} - 1)$. The material behaviour is characterised by

(15)

where, in accordance with Gagliardini and Meyssonnier [5], $\beta = 0.25$ can be
interpreted as the ratio of the prismatic and the basal fluidity. Furthermore,
$\mu_0$ is given by

$$\mu_0 = \mu_{00}\,\frac{5}{3\beta + 2},$$

with a constant $\mu_{00} > 0$. In the test calculations below we simply choose
$\mu_{00} = 1$.
For the term $WA - AW$, one obtains

$$WA - AW = \frac{\partial_y v_1 - \partial_x v_2}{2}
\big(2a_{12},\; -2a_{12},\; a_{22} - a_{11}\big)^T.$$

Remark: The evolution for $A$ has a hyperbolic character. Consequently,
we can prescribe inflow boundary data for $A$ on $\Gamma_-$.
On $\Gamma_+$, $A$ is left free.
The discretisation for $A$ is performed using the same strategies as described
for the numerical treatment of the temperature $T$.
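
The closed form of $WA - AW$ can be compared with a direct matrix computation. The following is a minimal sketch, assuming the convention $(\nabla v)_{ij} = \partial v_i/\partial x_j$ and the component ordering (11, 22, 12) for the vector form of a symmetric tensor; both conventions and the function names are assumptions, since the paper does not state them explicitly.

```python
def matmul2(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spin_term(grad_v, A):
    """Return W A - A W for W = (grad_v - grad_v^T) / 2."""
    W = [[0.5 * (grad_v[i][j] - grad_v[j][i]) for j in range(2)] for i in range(2)]
    WA, AW = matmul2(W, A), matmul2(A, W)
    return [[WA[i][j] - AW[i][j] for j in range(2)] for i in range(2)]


def spin_term_closed_form(grad_v, A):
    """Components (11, 22, 12) of W A - A W for a symmetric tensor A."""
    w = 0.5 * (grad_v[0][1] - grad_v[1][0])    # (d_y v_1 - d_x v_2) / 2
    a11, a12, a22 = A[0][0], A[0][1], A[1][1]
    return (2.0 * w * a12, -2.0 * w * a12, w * (a22 - a11))


grad_v = [[0.3, 1.2], [-0.4, -0.3]]
A = [[0.7, 0.1], [0.1, 0.3]]
M = spin_term(grad_v, A)
m11, m22, m12 = spin_term_closed_form(grad_v, A)
```

For any symmetric $A$ the result is again symmetric and trace-free, so three components suffice to represent it.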

5 Numerical results
The numerical results presented throughout this work are obtained by FE-
implementations based on the DEAL-library [2].

5.1 Benchmark problem

Here, we consider our model on a bounded domain $\Omega$ described by


1- ~cx2
3 - y4 -- 0 c = 1.83 . 10- 2 .

The structure of the FE-meshes is shown in Figure 1. In particular, we focus on
the evaluation of the velocity field along axes parallel to the y-axis.
In order to check our discretisation, we compare the computed solution
of $v_1$ along the vertical line $x = 1$ for the isotropic case to Vialov's
profile (cf. Vialov [20]), an analytical description for such a situation. The
result is depicted in Figure 2, demonstrating good agreement between the
computed and the exact solution, denoted by u and f(x), respectively.

Fig. 2. Evaluation of u along vertical line (curves: computed solution u and
exact solution f(x))

5.2 Enhanced model

The computations are performed using about 23365 degrees of freedom. The
time step is chosen adaptively via

with c:5 = 0.001. Here U1 and U2 denote the solution at time step m - 1
obtained by employing the Euler and the Crank-Nicolson scheme, respectively. The
typical development of the local step size is depicted in Figure 3, showing
$k_m$ to increase in the stationary limit. The considerations are restricted to
isothermal flow.
Now, we compare the horizontal velocity along the vertical line $x = 1$
for the isotropic ($\beta = 1$) and the enhanced model ($\beta = 0.25$).
Figure 4 shows that the enhanced model becomes faster after the stationary
limit is reached. This is in agreement with results found in Gagliardini and
Meyssonnier [6].
Finally, we investigate the influence of $\lambda_0$, which controls the effect
of recrystallisation, on the orientation of the c-axes described by the
parameter $\alpha_1$ in (14). The results are shown in Figure 5. As predicted by
theory, the maximum value of $\alpha_1$ is controlled by $\lambda_0$, and we
observe $\alpha_1 \to \alpha$ as $\lambda_0$ tends to zero.

Fig. 3. Typical development of the local step size during the simulation, showing
$k_m$ to increase in the stationary limit

Fig. 4. Evaluation of u along vertical line, demonstrating the enhanced model
($\beta = 0.25$) to become faster compared to the isotropic model ($\beta = 1$)

Fig. 5. Evaluation of $\alpha_1$ along vertical line for different values of the
parameter $\lambda_0$ (curves: $\lambda_0$ = 0.00, 0.05, 0.10, 0.20)

6 Conclusion
We have presented a stable FE-discretisation for treating systems of partial
differential equations arising in glaciology. The system is a coupled one, con-
sisting of a flow problem determining stress, pressure and velocity and evo-
lution problems for temperature and mean orientation densities, describing
anisotropic material behaviour. The time discretisation is done by the stable
backward Euler scheme. Stable discretisations with respect to space are estab-
lished by a cell-wise constant pressure approximation and discontinuous and
continuous Ql-elements for stress and velocity respectively. Temperature and
orientation densities are approximated by continuous bilinear functions as well,
where the corresponding transport dominated equations require further stabil-
isation, which is provided by the streamline diffusion method. The proposed
strategies are applied to a standard model for describing ice sheet dynamics
and an enhanced one, taking into account the development of certain fabrics
in the structure of the ice.

References
1. R. Calov, A.A. Savvin, R. Greve, I. Hansen, and K. Hutter. Simulation of the
Antarctic ice sheet with a three-dimensional polythermal ice sheet model. Annals
of Glaciology, 27:201-206, 1998.
2. DEAL. Differential equations analysis library. Available via
http://www-lsx.mathematik.uni-dortmund.de/user/lsx/suttmeier/deal.html, 1995.
3. A. Fabre, A. Letréguilly, and C. Ritz. Sensitivity of a Greenland ice sheet model
to ice flow and ablation parameters: Consequences on the evolution through the
last climatic cycle. Climate Dynamics, 13:11-24, 1997.
4. A.C. Fowler. Modelling ice sheet dynamics. Geophys. Astrophys. Fluid Dynamics,
63:29-65, 1992.
5. O. Gagliardini and J. Meyssonnier. Plane flow of an ice sheet exhibiting strain-
induced anisotropy. In K. Hutter, Y. Wang, and H. Beer, editors, Advances in
Cold-Region Thermal Engineering and Sciences, pages 171-182. Springer, 1999.
6. O. Gagliardini and J. Meyssonnier. About the condition to apply on the lateral
boundary of a model for local flow of anisotropic ice. To appear, 2001.
7. C. Geiger and C. Kanzow. Numerische Verfahren zur Lösung unrestringierter
Optimierungsaufgaben. Springer, 1999.
8. R. Glowinski. Numerical methods for nonlinear variational problems. Springer
Series in Comp. Physics. Springer, 1983.
9. G. Gödert. Meso-macro model for the description of induced anisotropy of natural
ice, including grain interaction. In K. Hutter, Y. Wang, and H. Beer, editors,
Advances in Cold-Region Thermal Engineering and Sciences, pages 183-196, 1999.
10. G. Gödert. The use of structure tensors to model the evolution of textural
anisotropy of polar ice. Ann. Glaciol., 2002. Submitted.
11. R. Greve, M. Weis, and K. Hutter. Palaeoclimatic and present conditions of
the Greenland ice sheet in the vicinity of Summit: An approach by large-scale
modelling. Paleoclimates, 2:133-161, 1998.
12. K. Hutter, S. Yakowitz, and F. Szidarovsky. A numerical study of plane ice sheet
flow. J. Glaciol., 32:139-160, 1986.
13. P. Huybrechts. The present evolution of the Greenland ice sheet: an assessment
by modelling. Global Planet. Change, 9:39-51, 1995.
14. C. Johnson. Numerical solution of partial differential equations by the finite
element method. Studentlitteratur, 1987.
15. L.W. Morland and I.R. Johnson. Steady motion of ice sheets. J. Glaciol., 28:229-
246, 1980.
16. G.K.A. Oswald and G.de Q. Robin. Lakes beneath the Antarctic ice sheet. Nature,
245:251-254, 1973.
17. W.S.B. Paterson. The physics of glaciers. Pergamon, Oxford, 1981.
18. S. Richardson. On the no-slip boundary condition. J. Fluid Mech., 59:707-719,
1973.
19. T. Thorsteinsson. Textures and fabrics in the GRIP ice core, in relation to climate
history and ice deformation. Technical Report 206, Reports on Polar Research,
1996. ISSN 0176-5027.
20. S.S. Vialov. Regularities of glacial ice shields movements and the theory of
plastic viscous flow. Physics of the movements of ice, IAHS, 47:266-275, 1958.
Nonreflecting Boundary Conditions for
Multiple Domain Wave Scattering in
Unbounded Media

Marcus J. Grote 1, Christoph Kirsch 1 and Patrick Meury 2

1 Department of Mathematics, University of Basel,
Rheinsprung 21, CH-4051 Basel, Switzerland.
2 Seminar for Applied Mathematics, ETH Zurich, Switzerland

Summary. A nonreflecting boundary condition is presented, which generalizes the
well-known Dirichlet-to-Neumann (DtN) approach for time-harmonic scattering in
unbounded domains to multiple scattering problems. Because this boundary
condition allows each scatterer to be enclosed by a separate artificial
boundary, the size of the computational domain, and hence the computational
cost, are greatly reduced.

1 Introduction

For the numerical solution of wave scattering problems in unbounded media,
a well-known approach is to enclose all obstacles, inhomogeneities and
nonlinearities with an artificial boundary $B$. A boundary condition is then
imposed on $B$, which leads to a numerically solvable boundary-value problem in
a finite computational domain $\Omega$. The boundary condition should be chosen
such that the solution of the problem in $\Omega$ coincides with the restriction
to $\Omega$ of the solution in the original unbounded region. Otherwise spurious
reflections will appear at $B$, which will travel back into the interior
computational region and spoil the numerical solution throughout $\Omega$.
Dirichlet-to-Neumann (DtN) maps yield exact nonreflecting boundary conditions
and thus avoid spurious reflections from $B$. They are explicitly known
for various equations or geometries [1, 2, 3, 4, 5]. Once combined with a finite
difference or finite element discretization inside $\Omega$, they lead to a
highly accurate and efficient numerical scheme.
Here we extend the DtN approach to multiple scattering problems, where every
scatterer is enclosed by a separate artificial boundary $B_j$. See [6] for an
introduction to multiple scattering. Hence $\Omega$ consists of multiple
disjoint components $\Omega_j$. We derive an exact DtN boundary condition on
$B$, the disjoint union of all $B_j$, by combining multiple contributions from
purely outgoing wave fields. We present theoretical results that show existence
and uniqueness of the solution to the (artificial) boundary value problem, as
well as numerical results that demonstrate the accuracy and efficiency of our
method.

2 Acoustic waves on two scatterers

We consider two disjoint scatterers in unbounded two-dimensional space. The
scatterers may contain obstacles, inhomogeneities and nonlinearity. Let $\Gamma$
denote the boundary of all obstacles and $\Omega_\infty$ the free space outside
$\Gamma$. The scattered field $u = u(r, \theta)$ is a solution of the exterior
boundary value problem

$$\Delta u + k^2 u = f \quad \text{in } \Omega_\infty \subset \mathbb{R}^2, \quad (1)$$
$$u = g \quad \text{on } \Gamma, \quad (2)$$
$$\lim_{r\to\infty} \sqrt{r}\left(\frac{\partial}{\partial r} - ik\right)u = 0. \quad (3)$$

The wave number $k$ and the source term $f$ can vary in space, while $f$ may be
nonlinear. We have chosen the Dirichlet-type condition (2) for simplicity. The
Sommerfeld radiation condition (3) ensures that the scattered field corresponds
to a purely outgoing wave at infinity.
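
The role of the radiation condition (3) can be illustrated with the leading-order far-field form of a cylindrical wave. The following is a minimal sketch, assuming the model fields $u(r) = e^{\pm ikr}/\sqrt{r}$ (asymptotic forms, not exact solutions of the Helmholtz equation) and the sign convention in which $e^{+ikr}$ is outgoing; the function name `radiation_residual` is illustrative.

```python
import cmath
import math


def radiation_residual(r, k, outgoing=True):
    """|sqrt(r) (d/dr - i k) u| for u(r) = exp(+-i k r) / sqrt(r)."""
    s = 1.0 if outgoing else -1.0
    u = cmath.exp(1j * s * k * r) / math.sqrt(r)
    du = (1j * s * k - 0.5 / r) * u            # analytic radial derivative
    return abs(math.sqrt(r) * (du - 1j * k * u))


res_out = [radiation_residual(r, 2.0) for r in (1e1, 1e2, 1e3, 1e4)]
res_in = [radiation_residual(r, 2.0, outgoing=False) for r in (1e1, 1e2, 1e3, 1e4)]
```

For the outgoing field the residual equals $1/(2r)$ and tends to zero, whereas for the incoming field it approaches $2k$, so condition (3) excludes the incoming wave.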
We assume that both scatterers have compact support and that they are
well separated. In this case, they can be surrounded by two non-overlapping
disks centered at $c_1$, $c_2$ with radii $R_1$, $R_2$, respectively, such that
in the unbounded domain $D$ outside these disks, the scattered field satisfies
the homogeneous Helmholtz equation with a constant wave number $k > 0$, together
with the Sommerfeld radiation condition:

$$\Delta u + k^2 u = 0 \quad \text{in } D, \quad k > 0 \text{ constant}, \quad (4)$$
$$\lim_{r\to\infty} \sqrt{r}\left(\frac{\partial}{\partial r} - ik\right)u = 0. \quad (5)$$

Let $\Omega = \Omega_\infty \setminus D$ denote the finite domain inside the
disks. $\Omega$ consists of two disjoint components, $\Omega_1$ and $\Omega_2$.
$\Omega$ is bounded by $\partial\Omega = \Gamma \cup B$, where
$\Gamma = \Gamma_1 \cup \Gamma_2$ and $B = \partial D$ consists of two circles
$B_1$ and $B_2$. To solve the scattering problem (1)-(3) in the finite domain
$\Omega$, a boundary condition is needed at the artificial boundary $B$, which
ensures that the solution in $\Omega$, with that boundary condition imposed on
$B$, coincides with the restriction to $\Omega$ of the solution in the original
unbounded region $\Omega_\infty$.

2.1 Derivation of the DtN map

Let $D_1$ denote the unbounded domain outside $B_1$ and $D_2$ the unbounded
domain outside $B_2$. We split the scattered field $u$ in the unbounded exterior
region $D$ into two purely outgoing wave fields $u_1$, $u_2$ which solve the following
problems:
Nonreflecting Boundary Conditions in Multiple Scattering 393

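$\Delta u_1 + k^2 u_1 = 0$ in $D_1$,    (6)

$\lim_{r \to \infty} \sqrt{r}\,\left(\frac{\partial}{\partial r} - ik\right) u_1 = 0$,    (7)

$\Delta u_2 + k^2 u_2 = 0$ in $D_2$,    (8)

$\lim_{r \to \infty} \sqrt{r}\,\left(\frac{\partial}{\partial r} - ik\right) u_2 = 0$.    (9)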

Either wave field is influenced by a single scatterer and therefore completely
oblivious to the other. The following proposition shows that solutions to (4)-
(9) are uniquely determined by the values of $u$ on $B$, $u_1$ on $B_1$, and $u_2$ on $B_2$,
respectively.
Proposition 1. Let $K \subset \mathbb{R}^2$ be a compact set with smooth boundary. Then
the exterior Dirichlet problem

$\Delta u + k^2 u = 0$ in $\mathbb{R}^2 \setminus K$, $k > 0$ constant,    (10)

$u = 0$ on $\partial K$,    (11)

$\lim_{r \to \infty} \sqrt{r}\,\left(\frac{\partial}{\partial r} - ik\right) u = 0$    (12)

has only the trivial solution.


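Proof. Without loss of generality, we can assume $0 \in K$. Because $K$ is compact,
there exists an $R_0 > 0$ such that every circle $B_R$ centered at 0 with radius $R$
satisfies $B_R \subset \mathbb{R}^2 \setminus K$ for all $R \ge R_0$. Let $u$ now be a solution to (10)-(12).
A direct computation shows

$\int_{B_R} \left| \sqrt{r}\,\left(\frac{\partial}{\partial r} - ik\right) u \right|^2 ds = R \int_{B_R} \left( \left|\frac{\partial u}{\partial r}\right|^2 + k^2 |u|^2 \right) ds - ikR \int_{B_R} \left( u\,\frac{\partial \bar u}{\partial r} - \frac{\partial u}{\partial r}\,\bar u \right) ds$.    (13)

From (12) we observe that the left side of (13) tends to zero as $R \to \infty$. Next,
we use Green's formula and (10) to conclude that

$\int_{B_R} \left( u\,\frac{\partial \bar u}{\partial r} - \frac{\partial u}{\partial r}\,\bar u \right) ds = \int_{\partial K} \left( u\,\frac{\partial \bar u}{\partial n} - \frac{\partial u}{\partial n}\,\bar u \right) ds$.    (14)

Here $\partial/\partial n$ is the derivative in the direction of the normal vector on $\partial K$
pointing away from $K$. The right hand side of (14) vanishes because of (11). Hence
(13) implies that

$\lim_{R \to \infty} \int_{B_R} |u|^2\, ds = 0$.    (15)

By Rellich's theorem (see for example [7], Lemma 2.11), we conclude

$u \equiv 0$ in $\mathbb{R}^2 \setminus K$,    (16)

which completes the proof of Proposition 1. $\Box$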
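
The "direct computation" behind (13) is not written out in the text. A possible way to fill in this step, offered here only as a reader's sketch, is to expand the squared modulus pointwise for real $k$, then multiply by $r = R$ and integrate over $B_R$:

```latex
\begin{align*}
\Bigl|\Bigl(\frac{\partial}{\partial r} - ik\Bigr)u\Bigr|^2
 &= \Bigl(\frac{\partial u}{\partial r} - iku\Bigr)
    \Bigl(\frac{\partial \bar u}{\partial r} + ik\bar u\Bigr) \\
 &= \Bigl|\frac{\partial u}{\partial r}\Bigr|^2 + k^2|u|^2
    - ik\Bigl(u\,\frac{\partial \bar u}{\partial r}
              - \frac{\partial u}{\partial r}\,\bar u\Bigr).
\end{align*}
```

The first two terms are non-negative, and by (14) and (11) the last term integrates to zero, which is why each of the remaining integrals must vanish in the limit.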

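The solutions $u_1$ and $u_2$ to (6)-(9) can be explicitly written, in local polar
coordinates $(r_1, \theta_1)$, $(r_2, \theta_2)$, as a Fourier series

$u_j(r_j, \theta_j) = \frac{1}{\pi} \sum_{n=0}^{\infty}{}' \, \frac{H_n^{(1)}(k r_j)}{H_n^{(1)}(k R_j)} \int_0^{2\pi} u_j(R_j, \theta') \cos n(\theta_j - \theta')\, d\theta'$, $r_j \ge R_j$,    (17)

for $j = 1, 2$. The prime after the sum indicates that the term for $n = 0$ is
multiplied by 1/2, while $H_n^{(1)}$ denotes the n-th order Hankel function of the
first kind. Now let $u \in C^2(\Omega_\infty)$ be a given function which satisfies (4), (5).
Let $u_1$ and $u_2$ be the solutions to (6), (7) and (8), (9), respectively, together
with the following matching condition on $B$:

$u_1 + u_2 = u$ on $B$.    (18)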

Both $u$ and $u_1 + u_2$ satisfy (4), (5) in $D = D_1 \cap D_2$. Since $u$ and $u_1 + u_2$ coincide
on $B$, they coincide everywhere in the exterior region $D$. We summarize this
result in the following proposition.
Proposition 2. Let $u \in C^2(\Omega_\infty)$ be a function which satisfies (4), (5). Then

$u \equiv u_1 + u_2$ in $D$,    (19)

where $u_1$ and $u_2$ are solutions to the problems (6), (7) and (8), (9), respectively,
together with the matching condition (18). The decomposition of $u$ into the two
purely outgoing wave fields $u_1$ and $u_2$ is unique.

Proof. Uniqueness: The uniqueness of the decomposition follows from
Proposition 1 and from the linear independence of Hankel and Bessel functions; see
[8] for details.
Existence: We define the functions

(20)
Then we introduce the propagation operators

$P_1 : C^0(B_1) \to C^0(B_2)$, $u_1|_{B_1} \mapsto u_1|_{B_2}$,    (21)

$P_2 : C^0(B_2) \to C^0(B_1)$, $u_2|_{B_2} \mapsto u_2|_{B_1}$.    (22)

Explicit formulas for $P_j$, $j = 1, 2$, in local polar coordinates are given by
(17), with some coordinate transformations between $(r_1, \theta_1)$ and $(r_2, \theta_2)$. The
matching condition (18) can be written as an operator equation

(23)

on the Banach space $X = C^0(B_1) \times C^0(B_2)$, where the operator $K : X \to X$
is given by

Note that the operator equation (23) with vanishing right-hand side admits
only the trivial solution, by the uniqueness proof above. Therefore, if $K$ is a
compact operator, existence and uniqueness of a solution to (23) for an
arbitrary right hand side follows from Fredholm's alternative. When the solution
to (23) is found, it can be extended by (17) to a solution $u_1$ of (6), (7) and a
solution $u_2$ of (8), (9), respectively. In their common domain $D = D_1 \cap D_2$, the
superposition $u_1 + u_2$ then satisfies (4), (5), and (18) on $B$. By Proposition 1,
$u \equiv u_1 + u_2$ in $D$.
It remains to show that the operator $K$ is compact. The propagation
operator $P_1$ is defined by
$(P_1 u)(\theta_2) = \frac{1}{\pi} \sum_{n=0}^{\infty}{}' \, \frac{H_n^{(1)}(k r_1)}{H_n^{(1)}(k R_1)} \int_0^{2\pi} u(\theta') \cos n(\theta_1 - \theta')\, d\theta'$.    (25)

In (25), $r_1 = r_1(\theta_2)$ and $\theta_1 = \theta_1(\theta_2)$ denote the polar coordinates of the points
on $B_2$ relative to the center of $B_1$. The truncated version of $P_1$, denoted by
$P_1^N$, $N \in \mathbb{N}$, is defined as in (25), with the infinite sum truncated at $N$.
Lemma 1. The propagation operators have the following properties:
1. $P_1^N : C^0(B_1) \to C^0(B_2)$ is a bounded linear operator of finite rank
2. $P_1 : C^0(B_1) \to C^0(B_2)$ is a bounded linear operator
3. $\|P_1^N - P_1\| \to 0$, $N \to \infty$
From Lemma 1, compactness of $P_1$ follows (see for example [9], Corollary
11.3.3).
Proof. (Lemma 1)
1. This property is obvious from the definition of $P_1^N$.

2. The linearity of $P_1$ is obvious from the definition. We shall now show the
boundedness of $P_1$.
Because the scatterers are well separated, there exists an $r_{\min} > R_1$ such
that $r_1(\theta_2) \ge r_{\min}$ for all $\theta_2 \in [0, 2\pi]$. Because the function $x\,|H_n^{(1)}(x)|^2$ is
monotone decreasing for $x > 0$ and $n > 1/2$ [10], we therefore have

$k r_1 |H_n^{(1)}(k r_1)|^2 \le k r_{\min} |H_n^{(1)}(k r_{\min})|^2 \le k R_1 |H_n^{(1)}(k R_1)|^2$, $n \ge 1$,    (26)

from which we derive

(27)

We will now show the convergence of the series $\sum_{n=1}^{\infty} a_n$. From the
asymptotic behavior of the modulus of the Hankel functions for large orders [11],

$|H_n^{(1)}(z)|^2 \sim \frac{1}{2\pi n} \left( \left(\frac{ez}{2n}\right)^{2n} + 4 \left(\frac{ez}{2n}\right)^{-2n} \right)$, $n \to \infty$, $z \in \mathbb{R}$,    (28)

we derive the following asymptotic behavior for the ratio $|a_{n+1}/a_n|$:

$n \to \infty$,    (29)

with
$\gamma_n := \frac{e k r_{\min}}{2n}$.    (30)
Because $\gamma_n \to 0$, $n \to \infty$ and $\Gamma_n \to 0$, $n \to \infty$, we find that the expression
on the right hand side of (29) converges, and therefore

$\limsup_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \frac{R_1}{r_{\min}} < 1$.    (31)
The ratio test now ensures convergence of the series $\sum_{n=1}^{\infty} a_n$. We define
(32)

and estimate

(33)

(34)

$\le 2 \|u\|_\infty \sum_{n=0}^{\infty} a_n$.    (35)

3. We use the definitions of $P_1$ and $P_1^N$ to estimate

$\|P_1^N - P_1\| \le 2 \sum_{n=N+1}^{\infty} a_n$.    (36)

The right hand side tends to zero as $N \to \infty$. We conclude $\|P_1^N - P_1\| \to 0$,
$N \to \infty$.

This completes the proof of Lemma 1, which implies the compactness of the
operator $P_1$. Compactness of $P_2$ is shown similarly and compactness of $K$ then
follows. This completes the proof of Proposition 2. $\Box$
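
The ratio-test step in the proof of Lemma 1 can be checked numerically from the large-order approximation (28) alone. The sketch below is only an illustration: since the definition of $a_n$ did not survive in this copy, it uses the hypothetical stand-in $q_n = |H_n^{(1)}(k r_{\min})| / |H_n^{(1)}(k R_1)|$, and the values of `k`, `R1` and `r_min` are arbitrary assumptions. Logarithms are used because the individual terms in (28) overflow double precision for large $n$.

```python
import math

def log_hankel_mod_sq_asym(n, z):
    """Logarithm of the large-order approximation (28) of |H_n^(1)(z)|^2."""
    x = math.e * z / (2.0 * n)
    a = 2.0 * n * math.log(x)                  # log of (ez/2n)^(2n)
    b = math.log(4.0) - 2.0 * n * math.log(x)  # log of 4 (ez/2n)^(-2n)
    m = max(a, b)
    # log(exp(a) + exp(b)) evaluated without overflow
    log_sum = m + math.log(math.exp(a - m) + math.exp(b - m))
    return -math.log(2.0 * math.pi * n) + log_sum

def log_decay_factor(n, k, R1, r_min):
    """log of q_n = |H_n(k r_min)| / |H_n(k R1)|, a hypothetical stand-in for a_n."""
    return 0.5 * (log_hankel_mod_sq_asym(n, k * r_min)
                  - log_hankel_mod_sq_asym(n, k * R1))

# Assumed illustrative values, not taken from the paper's experiments.
k, R1, r_min = 8.0 * math.pi, 1.0, 2.0
n = 200
ratio = math.exp(log_decay_factor(n + 1, k, R1, r_min)
                 - log_decay_factor(n, k, R1, r_min))
print(ratio, R1 / r_min)
```

For large orders the consecutive ratio of this stand-in approaches $R_1 / r_{\min} < 1$, the same limit as in (31), which is what makes the truncated operators $P_1^N$ converge in norm.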

As a consequence, for any given function $u \in C^2(\Omega_\infty)$ satisfying (4), (5), we
can determine an explicit relation between the values of $u|_B$ on the artificial
boundary (Dirichlet data) and the values of the normal derivative $(\partial_n u)|_B$
(Neumann data). This DtN map for $u$ is given by

$\partial_n u = M[u_1] + T[u_2]$ on $B_1$,    (37)
$\partial_n u = M[u_2] + T[u_1]$ on $B_2$,    (38)
$u_1 + P[u_2] = u$ on $B_1$,    (39)
$P[u_1] + u_2 = u$ on $B_2$.    (40)

The operators M, T and P are defined by

The expressions on the right hand sides of (37), (38) and on the left hand
sides of (39), (40) are evaluated explicitly by using (41)-(43) and the exact
Fourier representation (17), which involves some straightforward but technical
coordinate transformations.
The matching condition (39), (40) cannot be inverted explicitly, so $u_1$ and
$u_2$ cannot be eliminated from the DtN map (37)-(40). Instead, one also
needs to compute the values of $u_1$ on $B_1$ and $u_2$ on $B_2$.
With the DtN condition (37)-(40), we are now able to solve a boundary
value problem in the finite domain $\Omega$:

$\Delta u + k^2 u = f$ in $\Omega$,    (44)
$u = g$ on $\Gamma$,    (45)
$\partial_n u = M[u_1] + T[u_2]$ on $B_1$,    (46)
$\partial_n u = M[u_2] + T[u_1]$ on $B_2$,    (47)
$u_1 + P[u_2] = u$ on $B_1$,    (48)
$P[u_1] + u_2 = u$ on $B_2$.    (49)

In [8] we prove the following theorem, which ensures existence and uniqueness
of a solution to the DtN problem (44)-(49).
Theorem 1. Assume that the free space problem (1)-(3) has a unique classical
solution $u \in C^2(\Omega_\infty)$ which satisfies (4), (5). Then the double scattering
boundary value problem (44)-(49) has a unique solution in $\Omega$, which coincides
with the restriction of $u$ to $\Omega$.

2.2 Combination with a numerical scheme

The boundary value problem (44)-(49) can be discretized by any numerical
scheme suited for the solution of elliptic boundary value problems, for example
by a finite difference or finite element scheme. Equations (46), (47) are a
Robin-type boundary condition, combined with (48), (49). The discretization of
(44)-(49) will lead to a large system of linear equations for the unknowns, which
are, for example, the values of the solution on the grid points. The matching
condition (48), (49) requires the storage of additional values, namely the values
of the purely outgoing wave fields $u_j$ on the boundary components $B_j$, $j = 1, 2$.
These auxiliary values are useful during post-processing for the evaluation of
$u$ outside the computational domain and in the far-field [8]. Details on the
finite difference and finite element implementation of the multi-DtN condition
(46)-(49) can be found in [8].

3 Numerical example
We consider scattering of an incident plane wave impinging on three sound-soft
obstacles with kidney-shaped boundaries. The wave number is $k = 8\pi$. For the
numerical solution with a second-order finite difference scheme we generalize
the DtN condition presented above, from two to three scatterers. This gener-
alization is straightforward and explicitly described in [8]. The contour lines
of the real part of the total field are shown in Fig. 1. In [8] we compared our
multi-DtN condition with the standard DtN condition applied to one single
very large domain and showed that the two solutions coincide.

4 Conclusion
We have generalized the Dirichlet-to-Neumann boundary conditions for single
scattering to multiple scattering problems. To do so we have used an expan-
sion of the scattered field into multiple purely outgoing wave fields. We have
shown that the multi-DtN condition is exact, which implies that no spurious
reflections appear at the artificial boundary. When used in a numerical scheme,
the multi-DtN condition allows for much smaller computational domains than
the single-DtN condition, especially when the scatterers are far away from
each other. Moreover, the amount of work does not increase with increasing
distance between the obstacles. We have illustrated the use of the multi-DtN
condition via a numerical example.

References
1. Keller, J.B., Givoli, D. (1989): Exact non-reflecting boundary conditions. J.
Comput. Phys. 82, 172-192

Fig. 1. Contour lines of the real part of the total field for plane wave scattering on
three kidney-shaped obstacles with sound-soft boundaries. The plane wave is incident
from the right and the wave number is $k = 8\pi$. The multi-DtN condition is used at
the artificial boundary components, combined with a second-order finite difference
scheme in the interior.

2. Givoli, D. (1992): Numerical Methods for Problems in Infinite Domains. Elsevier
3. Grote, M.J., Keller, J.B. (1995): On Nonreflecting Boundary Conditions. J.
Comput. Phys. 122, 231-243
4. Givoli, D. (1999): Recent Advances in the DtN FE Method. Archives of Comput.
Meth. Engin. 6, 71-116
5. Gächter, G.K., Grote, M.J. (2003): Dirichlet-to-Neumann Map for Three-
Dimensional Elastic Waves. Wave Motion 37, 293-311
6. Martin, P.A. (1995): Multiple Scattering: an Invitation. In: Cohen, G. (ed) Proc.
Third Int. Conf. on Math. and Numerical Aspects of Wave Propagation, SIAM
Philadelphia, 3-16
7. Colton, D., Kress, R. (1992): Inverse Acoustic and Electromagnetic Scattering
Theory. Springer, Berlin, Heidelberg
8. Grote, M.J., Kirsch, C.: Dirichlet-to-Neumann Boundary Conditions for Multiple
Scattering Problems, J. Comput. Phys., submitted
9. Werner, D. (1995): Funktionalanalysis. Springer
10. Ryzhik, I.M., Gradshteyn, I.S. (1994): Table of Integrals, Series, and Products.
Academic Press
11. Abramowitz, M., Stegun, I.A. (1970): Handbook of mathematical functions.
Dover Publications
On the Choice of the Regularization Parameter
in the Case of the Approximately Given Noise
Level of Data*

Uno Hämarik and Toomas Raus

University of Tartu, Institute of Applied Mathematics, ESTONIA


[email protected], toomas. [email protected]

Summary. We consider ill-posed problems $Au = f$ with operator $A \in L(H, H)$,
$A = A^* \ge 0$, where $H$ is the Hilbert space and the range $R(A)$ is non-closed. Regularized
solutions $u_r$ are obtained by a general regularization scheme, including the Lavrentiev
method, iteration methods and others. We assume that instead of $f \in R(A)$ noisy
data $\tilde f$ are available with the approximately given noise level $\delta$: it holds $\|\tilde f - f\|/\delta \le$
const for $\delta \to 0$. We propose a new a-posteriori rule for the choice of the regularization
parameter $r = r(\delta)$ guaranteeing $u_{r(\delta)} \to u_*$ for $\delta \to 0$, where $u_*$ is the solution of
the problem $Au = f$. The error estimates are given.

1 Introduction

We consider an operator equation

$Au = f$, $f \in R(A)$,    (1)

where $A \in L(H, H)$, $A = A^* \ge 0$ is a linear continuous self-adjoint and
non-negative operator; $u$ and $f$ are elements of the real Hilbert space $H$. We
do not suppose that the range $R(A)$ is closed and so in general our problem
is ill-posed. The kernel $N(A)$ may be non-trivial. As usual in the treatment of
ill-posed problems, we suppose that instead of the exact right-hand side $f$ we
have only an approximation $\tilde f \in H$.
The approximate solution $u_r$ of the ill-posed problem $Au = f$ is found by
some regularization method and depends on the regularization parameter $r$. If
the noise level $\delta$ with $\|\tilde f - f\| \le \delta$ is known, then the proper parameter choice
$r = r(\delta)$ guarantees $u_{r(\delta)} \to u_*$ for $\delta \to 0$, where $u_*$ is the solution of $Au = f$
nearest to the initial approximation $u_0$ (see Section 2; often $u_0 = 0$).
In this paper we are interested in the case of an approximately given noise
level $\delta$: it is unknown whether the inequality $\|\tilde f - f\| \le \delta$ holds or not. Instead of this
inequality we assume that $\|\tilde f - f\|/\delta \le$ const for $\delta \to 0$ and we give a new
rule for the parameter choice $r = r(\delta)$ guaranteeing $u_{r(\delta)} \to u_*$ for $\delta \to 0$. The
error estimates are presented as well.

* This work was supported by the Estonian Science Foundation, Grant No. 5785.

2 Regularization methods
We consider the regularization methods in the general form (see [1, 2])

$u_r = (I - A g_r(A))\,u_0 + g_r(A)\,\tilde f$,    (2)

where $u_r$ is the approximate solution, $u_0$ the initial approximation, $r$ the
regularization parameter, $I$ the identity operator, and the function $g_r(\lambda)$ satisfies
the conditions (3) and (4):

$\sup_{0 \le \lambda \le a} |g_r(\lambda)| \le \gamma r$, $r \ge 0$,    (3)

$\sup_{0 \le \lambda \le a} \lambda^p |1 - \lambda g_r(\lambda)| \le \gamma_p r^{-p}$, $r \ge 0$, $0 \le p \le p_0$.    (4)

Here $p_0$, $\gamma$ and $\gamma_p$ are positive constants, $a \ge \|A\|$, $\gamma_0 \le 1$, and the greatest
value of $p_0$ for which the inequality (4) holds is called the qualification of the
method.
The following regularization methods are special cases of the general
method (2).

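M1 The Lavrentiev method $u_\alpha = (\alpha I + A)^{-1} \tilde f$. Here $u_0 = 0$, $r = \alpha^{-1}$,
$g_r(\lambda) = (\lambda + r^{-1})^{-1}$, $p_0 = 1$.
M2 The iterative variant of the Lavrentiev method. Let $m \in \mathbb{N}$, $m \ge 1$, $u_0 =
u_{0,\alpha} \in H$ be the initial approximation and $u_{m,\alpha} = (\alpha I + A)^{-1}(\alpha u_{m-1,\alpha} + \tilde f)$.
Here $r = \alpha^{-1}$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - \left(\frac{1}{1 + r\lambda}\right)^m\right)$, $p_0 = m$.
M3 Explicit iteration scheme (the Landweber method). Let $0 < \mu < 1/\|A\|$
be a constant and

$u_n = u_{n-1} - \mu (A u_{n-1} - \tilde f)$, $n = 1, 2, \ldots$.

Here $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - (1 - \mu\lambda)^r\right)$, $p_0 = \infty$.
M4 Implicit iteration scheme. Let $\alpha > 0$ be a constant and

$\alpha u_n + A u_n = \alpha u_{n-1} + \tilde f$, $n = 1, 2, \ldots$.

Here $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - \left(\frac{\alpha}{\alpha + \lambda}\right)^r\right)$, $p_0 = \infty$.

M5 The method of the Cauchy problem. We take the solution of the Cauchy
problem
$u'(r) + A u(r) = \tilde f$, $u(0) = u_0$
for the approximation $u_r$ to the solution of problem (1). Here
$g_r(\lambda) = \frac{1}{\lambda}\left(1 - e^{-r\lambda}\right)$, $p_0 = \infty$.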
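
The generating functions of M1-M5 can be written down directly for a scalar spectral value $\lambda$, which makes the role of the residual factor $1 - \lambda g_r(\lambda)$ in condition (4) easy to inspect. The following is a minimal sketch under the reading of the formulas given above; the function names are hypothetical and the $\lambda = 0$ branches use the limiting values.

```python
import math

def g_lavrentiev(lam, r):
    # M1: g_r(lambda) = (lambda + 1/r)^(-1)
    return 1.0 / (lam + 1.0 / r)

def g_iterated_lavrentiev(lam, r, m):
    # M2: g_r(lambda) = (1 - (1 + r*lambda)^(-m)) / lambda
    if lam == 0.0:
        return m * r
    return (1.0 - (1.0 + r * lam) ** (-m)) / lam

def g_landweber(lam, n, mu):
    # M3: g_n(lambda) = (1 - (1 - mu*lambda)^n) / lambda
    if lam == 0.0:
        return n * mu
    return (1.0 - (1.0 - mu * lam) ** n) / lam

def g_implicit(lam, n, alpha):
    # M4: g_n(lambda) = (1 - (alpha / (alpha + lambda))^n) / lambda
    if lam == 0.0:
        return n / alpha
    return (1.0 - (alpha / (alpha + lam)) ** n) / lam

def g_cauchy(lam, r):
    # M5: g_r(lambda) = (1 - exp(-r*lambda)) / lambda
    if lam == 0.0:
        return r
    return -math.expm1(-r * lam) / lam

def residual(g_value, lam):
    # The factor 1 - lambda * g_r(lambda) that multiplies u_0 - u_* in the error
    return 1.0 - lam * g_value

print(residual(g_lavrentiev(0.5, 4.0), 0.5))   # equals 1 / (1 + r*lambda)
print(residual(g_landweber(1.0, 3, 0.5), 1.0)) # equals (1 - mu*lambda)^n
```

In each case the residual factor decays as $r$ grows for fixed $\lambda > 0$, while $g_r(\lambda)$ itself grows at most linearly in $r$, which is the trade-off expressed by (3) and (4).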

3 Parameter choice in case of the known noise level of data

In regularization methods (2) the important problem is the choice of a proper
regularization parameter $r$. If $r$ is too big, the numerical implementation will
be unstable and $u_r$ will be useless; if $r$ is too small, the approximation $u_r$ is
dominated by the initial guess $u_0$. The rules of the parameter choice can be
divided into two groups, where the noise level is used in the rules of the first
group and not used in the rules of the second group.
If the noise level $\delta$ with $\|\tilde f - f\| \le \delta$ is known, the most prominent rule for
methods M2-M5 is Morozov's discrepancy principle.
Morozov's discrepancy principle [3]. In this rule the regularization
parameter $r = r_D$ is chosen as the solution of the equation

$\|A u_r - \tilde f\| \approx b\delta$ with $b$ = const > 1.
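
For an iterative method the discrepancy principle becomes a stopping rule. A minimal sketch for the Landweber method M3 on a diagonal operator follows; it assumes the common discrete reading "stop at the first $n$ with $\|A u_n - \tilde f\| \le b\delta$", and the test problem (eigenvalues, exact solution and noise pattern) is an arbitrary assumption made for illustration.

```python
import math

def landweber_discrepancy(lams, f_noisy, delta, mu, b=1.5, max_iter=100000):
    """Iterate u_n = u_{n-1} - mu*(A u_{n-1} - f~) for diagonal A with u_0 = 0
    and stop at the first n with ||A u_n - f~|| <= b*delta."""
    u = [0.0] * len(lams)
    for n in range(1, max_iter + 1):
        u = [ui - mu * (lam * ui - fi) for ui, lam, fi in zip(u, lams, f_noisy)]
        res = math.sqrt(sum((lam * ui - fi) ** 2
                            for ui, lam, fi in zip(u, lams, f_noisy)))
        if res <= b * delta:
            return n, u, res
    raise RuntimeError("discrepancy level b*delta was not reached")

# Assumed test problem: eigenvalues 1/k^2, exact solution 1/k, k = 1..20.
lams = [1.0 / k ** 2 for k in range(1, 21)]
u_exact = [1.0 / k for k in range(1, 21)]
f_exact = [lam * ue for lam, ue in zip(lams, u_exact)]

# Deterministic perturbation with Euclidean norm exactly delta.
delta = 1e-3
signs = [(-1.0) ** k for k in range(1, 21)]
scale = delta / math.sqrt(len(signs))
f_noisy = [fe + scale * sg for fe, sg in zip(f_exact, signs)]

n_stop, u_n, res = landweber_discrepancy(lams, f_noisy, delta, mu=0.9)
print(n_stop, res)
```

Because the residual of the Landweber iteration decreases monotonically in $n$, the first index below the level $b\delta$ is well defined whenever the level is reachable.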


The second rule of the first group is the modification of the discrepancy
principle (MD rule) [4]. In this rule the regularization parameter $r = r_{MD}$
is chosen as the solution of the equation

$\|B_r(A u_r - \tilde f)\| \approx b\delta$ with $b$ = const > 1,

where the operator

$B_r = I$ for $p_0 = \infty$,
$B_r = (I - A g_r(A))^{1/p_0}$ for $p_0 < \infty$

depends on the qualification $p_0$ of the method.


The discrepancy principle and its modification coincide for regularization
methods M3-M5 where $p_0 = \infty$, but these rules differ for the Lavrentiev
method and its iterative variant, where $\|B_r(A u_{m,\alpha} - \tilde f)\| = \|A u_{m+1,\alpha} - \tilde f\|$.
Some useful properties of the MD rule which are important for our further
study are:
1) convergence: $\|u_{r_{MD}} - u_*\| \to 0$ for $\delta \to 0$; here $u_*$ is the solution of
the problem $Au = f$ nearest to $u_0$;
2) order-optimality: if $u_0 - u_* = A^p v$, $v \in H$, $\|v\| \le \rho$, $p > 0$, then
$\|u_{r_{MD}} - u_*\| \le c_p\, \rho^{1/(p+1)} \delta^{p/(p+1)}$, $0 < p \le p_0$;
3) quasioptimality: there exists a constant $c$ such that
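
The identity relating the modified discrepancy of one iterate to the plain discrepancy of the next iterate can be verified directly on a diagonal operator. This is a small illustrative check, assuming $u_{0,\alpha} = 0$ and using the fact that for M1/M2 the operator $B_r = (I + rA)^{-1}$ with $r = 1/\alpha$ acts on an eigencomponent as multiplication by $\alpha/(\alpha + \lambda)$; the data below are arbitrary.

```python
import math

def lavrentiev_iterates(lams, f_noisy, alpha, m):
    """Return [u_{1,alpha}, ..., u_{m,alpha}] for diagonal A, starting from u_0 = 0."""
    u = [0.0] * len(lams)
    iterates = []
    for _ in range(m):
        # Componentwise form of u_m = (alpha*I + A)^(-1) (alpha*u_{m-1} + f~)
        u = [(alpha * ui + fi) / (alpha + lam)
             for ui, lam, fi in zip(u, lams, f_noisy)]
        iterates.append(u)
    return iterates

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Assumed illustrative data.
lams = [1.0 / k for k in range(1, 11)]
f_noisy = [math.sin(k) for k in range(1, 11)]
alpha = 0.05

u1, u2 = lavrentiev_iterates(lams, f_noisy, alpha, 2)

# Left side: ||B_r (A u_1 - f~)|| with B_r acting as alpha / (alpha + lambda)
lhs = norm([alpha / (alpha + lam) * (lam * ui - fi)
            for lam, ui, fi in zip(lams, u1, f_noisy)])
# Right side: ||A u_2 - f~||
rhs = norm([lam * ui - fi for lam, ui, fi in zip(lams, u2, f_noisy)])
print(lhs, rhs)
```

In practice this means the MD rule for the Lavrentiev method costs one extra iterate rather than an explicit application of $B_r$.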

The MD rule has some advantages over the discrepancy principle since
(i) the MD rule can also be used for the Lavrentiev method;
(ii) the method M2 with $r = r_D$ is order-optimal only for the range $p \in
(0, p_0 - 1]$, but not order-optimal for $p > p_0 - 1$;
(iii) it can be shown that the method M2 with $r = r_D$ is not quasi-optimal.
However, the discrepancy principle and its modification have an essential
disadvantage. Namely, these choices are unstable in the sense that if the actual
error of the right-hand side is larger than $b\delta$, then the error of the approximate
solution may be arbitrarily large, independently of the value of the ratio of the
actual and supposed noise level. For example, if $b = 2$ and the actual noise
level is three times larger than the noise level $\delta$ which we use in the rule, then
the error of the approximate solution may be arbitrarily large.
There are also parameter choice rules which do not use the noise level
$\delta$. Sometimes these choices are called heuristic or $\delta$-free choices.
The first such parameter choice rule was the quasioptimality criterion [5, 6].
According to this rule the parameter is chosen for which the function
$k(r) = r\|B_r(A u_r - \tilde f)\|$ has the minimum. Other popular $\delta$-free rules are
Wahba's cross-validation rule [7, 8] and Hansen's L-curve criterion
[9, 10]. Some heuristic rules are also proposed in [11].
Although these rules often work well (or even better than the discrepancy
principle and the MD rule), it was shown by Bakushinskii [12] that one cannot
prove the convergence of the approximate solution for heuristic parameter
choice rules.

4 Parameter choice rule in the case of the approximately given noise level of data

In applied ill-posed problems the exact noise level is often unknown. Therefore
in the following we assume that the actual noise level is unknown and only
some guesses about this level can be made. It means that the supposed error
level $\delta > 0$ is given, but we do not know exactly whether $\|\tilde f - f\| \le \delta$ holds or not.
Our aim is to present a rule for the stable parameter choice which guarantees
the convergence of the approximate solution to the exact solution if only the
ratio $\|\tilde f - f\|/\delta$ is bounded in the process $\delta \to 0$, and to give some error
estimates of the approximate solution.
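In the following the function

$\varphi(r) = \sqrt{r}\, \|A^{1/2} B_r^{3/2} (A u_r - \tilde f)\|$

plays an important role.
Note that for the Lavrentiev method and its iterative variant $B_r =
(I + rA)^{-1}$ and $\varphi(r) = \varphi(\alpha^{-1}) = \alpha^{-1/2} (A u_{m+1,\alpha} - \tilde f, A(A u_{m+2,\alpha} - \tilde f))^{1/2}$;
for iterative methods $\varphi(r) = \varphi(n) = \sqrt{n}\, (A u_n - \tilde f, A(A u_n - \tilde f))^{1/2}$.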
Rule P. Let $0 \le s \le 1$ and let $b_1$, $b_2$ be constants such that $b_2 \ge b_1 > C_m$,
where $C_m = 1/2$, $C_m = 1/\sqrt{2m + 3}$, $C_m = 1/\sqrt{2\mu e}$, $C_m = \sqrt{\alpha/2}$ and $C_m =
1/\sqrt{2e}$ for methods M1-M5 respectively. If $\varphi(1) \le b_2\delta$ then choose $r(\delta) = 1$.
In the contrary case we find at first $r_2(\delta) > 1$ such that

(6)

(7)
For the regularization parameter $r(\delta)$ we choose the parameter $r$ for which
the function $t(r) = r^s \|B_r(A u_r - \tilde f)\|$ has the global minimum on the interval
$[1, r_2(\delta)]$.
In iterative methods the following rule P' can be used for the choice of the
stopping index $n(\delta)$ as the parameter $r$. For the rule P' results analogous to
those for the rule P hold.
Rule P'. Let $0 \le s \le 1$ and let $b$ be a constant such that $b > C_m$. Find
$n_2(\delta)$ as the first $n = 1, 2, \ldots$ for which $\varphi(n) \le b\delta$. For the regularization
parameter $n(\delta)$ we choose $n \in \mathbb{N}$ for which the function $t(n) = n^s \|A u_n - \tilde f\|$
has the global minimum on the interval $[1, n_2(\delta)]$.
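
Rule P' can be sketched in a few lines for the Landweber method on a diagonal operator. This is an illustration under stated assumptions rather than the authors' implementation: it assumes $\varphi(n) = \sqrt{n}\,\|A^{1/2}(A u_n - \tilde f)\|$ for iterative methods and $C_m = 1/\sqrt{2\mu e}$ for M3, and the test problem, the factor 3 by which the noise level is underestimated, and the function name are arbitrary choices.

```python
import math

def rule_p_prime(lams, f_noisy, delta, mu, b, s=0.75, max_iter=200000):
    """Landweber iteration (u_0 = 0) on diagonal A with stopping Rule P'.

    n2 is the first n with phi(n) <= b*delta; the returned index n_best
    minimises t(n) = n^s * ||A u_n - f~|| over 1 <= n <= n2.
    """
    u = [0.0] * len(lams)
    t_values = []
    for n in range(1, max_iter + 1):
        u = [ui - mu * (lam * ui - fi) for ui, lam, fi in zip(u, lams, f_noisy)]
        res = [lam * ui - fi for ui, lam, fi in zip(u, lams, f_noisy)]
        t_values.append(n ** s * math.sqrt(sum(x * x for x in res)))
        # phi(n)^2 = n * (A u_n - f~, A (A u_n - f~)) for diagonal A
        phi = math.sqrt(n * sum(lam * x * x for lam, x in zip(lams, res)))
        if phi <= b * delta:
            n2 = n
            n_best = 1 + min(range(n2), key=t_values.__getitem__)
            return n_best, n2
    raise RuntimeError("phi(n) <= b*delta was not reached")

# Assumed test problem: eigenvalues 1/k^2, exact solution 1/k, k = 1..20.
lams = [1.0 / k ** 2 for k in range(1, 21)]
f_exact = [lam / k for lam, k in zip(lams, range(1, 21))]
noise_norm = 1e-3
signs = [(-1.0) ** k for k in range(1, 21)]
scale = noise_norm / math.sqrt(len(signs))
f_noisy = [fe + scale * sg for fe, sg in zip(f_exact, signs)]

mu = 0.9
C_m = 1.0 / math.sqrt(2.0 * mu * math.e)   # assumed constant for M3
delta = noise_norm / 3.0                   # supposed level, underestimated by a factor 3
n_best, n2 = rule_p_prime(lams, f_noisy, delta, mu, b=1.5 * C_m)
print(n_best, n2)
```

The point of the second stage is that minimising $t(n)$ on $[1, n_2]$ can select an earlier stopping index than $n_2$ when the supposed level $\delta$ is too small.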
In [13] the rule for the choice of the parameter was considered, in which
the parameter $r_2(\delta)$ was taken as the regularization parameter. We can consider
the rule P as the generalization of this rule, since in case $s = 0$ these rules
coincide due to the fact that the function $\|B_r(A u_r - \tilde f)\|$ is monotonically
decreasing with respect to $r$. On the other hand, in case $s = 1$ the rule P
is similar to the non-selfadjoint analogue of the parameter choice rule by the
quasioptimality criterion, since we choose the minimum point of the function
$r\|B_r(A u_r - \tilde f)\|$ for the regularization parameter. The only difference between
the rule P and the analogue of the quasioptimality criterion is the interval on
which the function $r\|B_r(A u_r - \tilde f)\|$ is minimized: the intervals are $[1, r_2(\delta)]$ and
$[1, \infty)$ respectively.
In [13] the following results are proven for methods M1-M5:
(i) for each $\tilde f \in H$ we have $\lim_{r \to \infty} \varphi(r) = 0$;
(ii) if $\|\tilde f - f\| \le \delta$, $\|u_0 - u_*\| \le M$, $b \ge C_m$, then for each $r \ge R_{M,\delta} =
c_m M/((b - C_m)\delta)$ we have $\varphi(r) \le b\delta$; here $c_m = 12\sqrt{15}/125$, $c_m =
(3/2)^{3/2} m^m/(m + 3/2)^{m+3/2}$, $c_m = (3/(2\mu e))^{3/2}$, $c_m = (3\alpha/2)^{3/2}$,
$c_m = (3/(2e))^{3/2}$ for methods M1-M5 respectively;
(iii) if $\|\tilde f - f\|/\delta \le$ const for $\delta \to 0$ then $\|u_{r_2(\delta)} - u_*\| \to 0$ for $\delta \to 0$.
Due to the continuity of the function $\varphi(r)$, property (i) implies
that the choice of finite parameters $r_2(\delta)$ and $r(\delta) \le r_2(\delta)$ according to Rule
P is possible. The property (ii) says that if we know a constant $M > 0$ such
that $\|u_0 - u_*\| \le M$, then it is sufficient to search the parameter $r_2(\delta)$ in
the finite interval $[1, R_{M,\delta}]$. Note that the function $\varphi(r)$ is non-monotone and
therefore in Rule P we must use the conditions (6)-(7) instead of the inequalities
$b_1\delta \le \varphi(r) \le b_2\delta$.
Note that the analogues of the results of the paper [13] for non-selfadjoint
problems are presented in [14].

Theorem 1. Let $A \in L(H, H)$, $A = A^* \ge 0$, $f \in R(A)$. Let the parameter
$r(\delta)$ be chosen according to Rule P. If $\|\tilde f - f\|/\delta \le$ const in the process $\delta \to 0$,
then in methods M1-M5

$\|u_{r(\delta)} - u_*\| \to 0$ for $\delta \to 0$.

Proof. Denote $G_r := I - A g_r(A)$. Then we have

$u_r - u_* = G_r(u_0 - u_*) + g_r(A)(\tilde f - f)$,

from which with (3) follows that

$\|u_r - u_*\| \le \|G_r(u_0 - u_*)\| + \gamma r \|\tilde f - f\|$.    (8)
To prove the theorem, it suffices to show the convergence of the right-hand
side of (8). In [13] it is proven that

(9)
From the inequality $r(\delta) \le r_2(\delta)$ and from (9) follows the convergence of the
second term of (8). To show the convergence of the first term of (8), we prove
at first that

(10)
We have
$B_r(A u_r - \tilde f) = A B_r G_r(u_0 - u_*) - B_r G_r(\tilde f - f)$,    (11)
from which with regard to the inequality $\|B_r G_r(\tilde f - f)\| \le \|\tilde f - f\| \le C\delta$
follows that

To show the convergence

we consider the cases a) $r_2(\delta) \to \infty$, b) $r_2(\delta) \le \bar r$ = const separately. If
$r_2(\delta) \to \infty$ in the process $\delta \to 0$ then using the Banach-Steinhaus theorem we
can prove similarly as in [2] (p. 43) that $r^p \|A B_r G_r(u_0 - u_*)\| \to 0$ if $r \to \infty$
($0 \le p \le 1$). Now we consider the case $r_2(\delta) \le \bar r$ = const. Using (11), (4) we
get

$r_2(\delta)^{1/2} \|A^{3/2} B_{r_2(\delta)}^{3/2} G_{r_2(\delta)}(u_0 - u_*)\| \le r_2(\delta)^{1/2} \|A^{1/2} B_{r_2(\delta)}^{3/2} (A u_{r_2(\delta)} - \tilde f)\|$
$+\, r_2(\delta)^{1/2} \|A^{1/2} B_{r_2(\delta)}^{3/2} G_{r_2(\delta)}(\tilde f - f)\| \le b_2\delta + \gamma_{1/2} \|\tilde f - f\| \le (b_2 + C\gamma_{1/2})\delta$,
from which follows that

$\|A^{3/2} B_{r_2(\delta)}^{3/2} G_{r_2(\delta)}(u_0 - u_*)\| \to 0$ for $\delta \to 0$.

In [2] (p. 66) the implication
$A G_{r_n}(u_0 - u_*) \to 0 \ (n \to \infty) \ \Rightarrow\ G_{r_n}(u_0 - u_*) \to 0 \ (n \to \infty)$    (14)
is proven. Similarly we can show that if $A^{3/2} B_{r_n}^{3/2} G_{r_n}(u_0 - u_*) \to 0$ ($n \to \infty$),
then $A B_{r_n} G_{r_n}(u_0 - u_*) \to 0$ ($n \to \infty$), which proves the convergence (13) in
this case. Now the convergence (10) follows from (12), (13) and (9).
Recall that the parameter $r(\delta)$ is the global minimum point of the
function $t(r) = r^s \|B_r(A u_r - \tilde f)\|$ in $[1, r_2(\delta)]$. Therefore from (10) follows the
convergence
$r^s(\delta) \|B_{r(\delta)}(A u_{r(\delta)} - \tilde f)\| \to 0$ for $\delta \to 0$.
Using (11) we get
$r^s(\delta) \|A B_{r(\delta)} G_{r(\delta)}(u_0 - u_*)\| \le r^s(\delta) \|B_{r(\delta)}(A u_{r(\delta)} - \tilde f)\| + r^s(\delta) C\delta \to 0$
if $\delta \to 0$.
From this relation with the implication of type (14) we get the convergence
$\|G_{r(\delta)}(u_0 - u_*)\| \to 0$ for $\delta \to 0$, which with (8) proves the theorem.
In the next two theorems we give estimates for the error of the approximate
solution in the cases $\|\tilde f - f\| \le \delta$ and $\|\tilde f - f\| > \delta$ respectively. The proofs of these
theorems will be presented in another forthcoming paper.
Theorem 2. Let $A \in L(H, H)$, $A = A^* \ge 0$, $f \in R(A)$, $\|\tilde f - f\| \le \delta$. Let
the parameter $r(\delta)$ be chosen according to Rule P with $s \in (0, 1)$ and let the
function $t(r) = r^s \|B_r(A u_r - \tilde f)\|$ be monotonically increasing on the interval
$[r(\delta), r_2(\delta)]$. Then for methods M1-M5 the error estimate

$\|u_{r(\delta)} - u_*\| \le c(b_1, b_*)\, \frac{1}{1 - s}\, \inf_{r \ge 0} \{ \|(I - A g_r(A))(u_0 - u_*)\| + \gamma r \delta \}$    (15)

holds, where $b_* = \max_{r(\delta) \le r \le R(\delta)} \varphi(r)/\delta \ge b_2$ and $R(\delta)$ is the greatest parameter
for which $\varphi(r) = b_2\delta$.
Theorem 3. Let $A \in L(H, H)$, $A = A^* \ge 0$, $f \in R(A)$. Let the parameter
$r(\delta)$ be chosen according to Rule P with $s \in (0, 1)$. Then in case $\|\tilde f - f\| > \delta$
for methods M1-M5 the following error estimates hold:
a) if $\delta \le \|\tilde f - f\| \le \delta_0$, where $\delta_0 := \|B_{r(\delta)}(A u_{r(\delta)} - \tilde f)\|$, then

$\|u_{r(\delta)} - u_*\| \le c_1(b_1, b_*)\, \frac{1}{1 - s}\, \inf_{r \ge 0} \{ \|(I - A g_r(A))(u_0 - u_*)\| + \gamma r \|\tilde f - f\| \}$;    (16)

5 Numerical experiments
Discretization of ill-posed problems often leads to linear systems of
algebraic equations with large condition numbers. In our numerical experiments
we solved many linear systems of equations $Au = f$, where $A$ was the 100 x 100
diagonal matrix with the diagonal elements $\lambda_1, \ldots, \lambda_{100}$. For generating the
eigenvalues $\lambda_k$ of $A$, the solution $u = (u_1, \ldots, u_{100})^T$, the noise in $f$ and the supposed
noise level $\delta$, different schemes were used, and the concrete scheme was chosen
randomly by computer. Schemes for the eigenvalues $\lambda_k$ of $A$ and for the solution $u$ were

Ak = rand (0, l)a k- 1 , a = 1.3; 2; 5; 10;


$\lambda_k = \mathrm{rand}(0,1)\, k^{-a}$, $a = 1; 2; 3; 4; 10$;
$\lambda_k = P_k(\mathrm{rand}(0,1))\, \lambda_{k-1}$, $\lambda_1 = 1$; $P_k(x) = x;\ x^2;\ e^{-x};\ x/\sqrt{k}$;
$k = 2, 3, \ldots, 100$;
$u_k = \mathrm{rand}(0,1)\, k^{-a}$, $a = -1; -0.5; 0; 0.5; 1; 2; 4$;
$u_k = R_k(\mathrm{rand}(0,1))\, u_{k-1}$, $u_1 = 1$; $R_k(x) = x;\ x^2;\ e^{-x};\ x^3$;
$k = 2, 3, \ldots, 100$;
where rand(0,1) is a random number in the interval (0,1). The exact right hand
side $f$ was found by the formula $f = Au$ and the perturbations satisfied the relation
$\|\tilde f - f\| = 10^{-j/2} \|f\|$, $j = 1, 2, \ldots, 13$ (Euclidean norms). Concrete random
perturbations were distributed uniformly by the formula

$\tilde f_k = f_k + 2(\mathrm{rand}(0,1) - 0.5)\, \|\tilde f - f\| \Big/ \Big[ \sum_{i=1}^{n} (2(\mathrm{rand}(0,1) - 0.5))^2 \Big]^{1/2}$,
$k = 1, 2, \ldots, n$,
or all the noise was concentrated on one eigenvalue, chosen randomly:

$\tilde f_{k_0} = f_{k_0} + (\mathrm{rand}(0,1) - 0.5)\, \|\tilde f - f\| / |\mathrm{rand}(0,1) - 0.5|$,
$\tilde f_k = f_k$ for $k \ne k_0$.
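
The two noise models can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes that the same uniform draws appear in the numerator and in the normalising sum, so that the perturbation has exactly the prescribed Euclidean norm, and that the concentrated perturbation only carries a random sign. The function names and the seeded generator are hypothetical.

```python
import math
import random

def uniform_noise(f, noise_norm, rng):
    """Spread the perturbation over all components, rescaled so that
    the Euclidean norm of (f~ - f) equals noise_norm."""
    e = [2.0 * (rng.random() - 0.5) for _ in f]
    scale = noise_norm / math.sqrt(sum(x * x for x in e))
    return [fk + scale * ek for fk, ek in zip(f, e)]

def concentrated_noise(f, noise_norm, rng):
    """Put the whole perturbation, with a random sign, on one random component k0."""
    k0 = rng.randrange(len(f))
    sign = 1.0 if rng.random() >= 0.5 else -1.0
    f_noisy = list(f)
    f_noisy[k0] += sign * noise_norm
    return f_noisy

def diff_norm(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

rng = random.Random(0)              # fixed seed for reproducibility
f = [1.0 / k for k in range(1, 101)]
f_norm = math.sqrt(sum(x * x for x in f))
j = 4
noise_norm = 10.0 ** (-j / 2.0) * f_norm   # ||f~ - f|| = 10^(-j/2) ||f||

f_uniform = uniform_noise(f, noise_norm, rng)
f_single = concentrated_noise(f, noise_norm, rng)
print(diff_norm(f_uniform, f), diff_norm(f_single, f), noise_norm)
```

Both constructions fix the actual noise norm, so the supposed level used by a parameter choice rule can then be set independently of it.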

The supposed noise level was taken as $\delta = \|\tilde f - f\|/d$, where the values of $d$ were
1, 3, 5, 10, 20, 50, 100. By the Lavrentiev method 6000 different variants of
the problem $Au = f$ were solved; the parameter $r$ was chosen by the rule P. The
ratios

were computed, characterizing the coefficients of the quasioptimality (compare
with formula (5)). The ratio $V$ also shows how well the rule P works in
comparison with the MD rule, which uses the actual noise level. Namely, we solved
the same problems also by the MD rule using the actual noise level and then the
ratio $V$ was almost 1 (average of $V$ was 0.95, maximum of $V$ was 1.24).
The results of numerical experiments are given in Table 1. The results show
that in the case of the exactly estimated noise level ($d = \|\tilde f - f\|/\delta = 1$) rule
P works nearly as well as the MD rule (averages of ratio $V$ were 0.991 and
0.95 respectively). As expected, in the case of the underestimated noise level
($d > 1$) the error of the approximate solution is larger than for the MD rule
which uses the exact noise level. But the ratio of errors of these approximate
solutions is relatively small in comparison with the error made by estimating
the noise level. For example, if the supposed noise level was $d = 100$ times smaller than
the real value, the error of the approximate solution for rule P was only 7%
larger than for the MD rule which uses the exact noise level (the corresponding
averages of $V$ were 1.022 and 0.95), and for 96% of problems the ratio $V$ for
rule P was smaller than 1.5. Numerical experiments also showed that if the
actual noise level is larger than the noise level used in the MD rule, then this
rule performs poorly. For example, if the actual noise level was 3 times larger than
the noise level used in the MD rule, the ratio $V$ was in most cases larger than
10.
Table 1. The Lavrentiev method, Rule P, $s = 0.75$, $b_1 = b_2 = 1.5\,C_m$

$d = \|\tilde f - f\|/\delta$   Number of   Average V   Maximal V   % problems   % problems
                                problems                            with V < 1.5   with V < 3
1                               851         0.991       3.95        97.70          99.19
3                               901         0.982       4.74        97.44          99.36
5                               826         0.984       7.13        96.51          98.88
10                              873         0.956       5.99        97.89          99.60
20                              861         0.953       6.83        97.32          99.46
50                              874         0.974       11.30       97.10          99.08
100                             813         1.022       18.50       96.03          98.58
All cases                       6000        0.980       18.50       97.15          99.17

References

1. Vainikko G. (1982): The discrepancy principle for a class ofregularization meth-


ods. U.S.S.R. Comput. Math. Math. Phys., 22, 3, 1-19.
2. Vainikko G., Veretennikov A. (1986): Iteration Procedures in Ill-Posed Problems.
Nauka, Moscow, (in Russian).
3. Morozov V. (1966): On the solution of functional equations by the method of
regularization. Soviet Math. Dokl., 7, 414-417.
4. Raus T. (1984): Residue principle for ill-posed problems. Acta et Comment.
Univ. Tartuensis, 672, 16-26 (in Russian).
5. Tikhonov A. N., Glasko V. B. (1965): Use of the regularization method in non-
linear problems. USSR Comput. Math. Math. Phys., 5, 3, 93-107.
Choice of the Regularization Parameter 409

6. Tikhonov A., Arsenin V. (1977): Solution of Ill-Posed Problems. Wiley, New


York.
7. Wahba G. (1977): Practical approximate solutions to linear operator equations
when the data are noisy. SIAM J. Numer.Anal. 14, 651-657.
8. Golub G. H., von Matt Urs. (1997): Generalized cross-validation for large-scale
problems. J. Comput. Graph. Statist., 6, 1, 1-34.
9. Hansen P.C. (1992): Analysis of discrete ill-posed problems by means of the
L-curve. SIAM Rev., 34, 561-580.
10. Calvetti D., Morigi S., Reichel L., Sgallari F. (2000): Tikhonov regularization
and the L-curve for large discrete ill-posed problems. Numerical analysis 2000,
Vol. III. Linear algebra. J. Comput. Appl. Math., 123, 1-2, 423-446.
11. Hanke M., Raus T. (1996): A general heuristic for choosing the regularization
parameter in ill-posed problems. SIAM J. Sci. Comput., 17, 954-972.
12. Bakushinskii A. (1984): Remarks on choosing a regularization parameter us-
ing the quasi-optimality and ratio criterion. U.S.S.R. Comput. Math. Math.
Physics., 24, 181-182.
13. Raus T. (1990): An a posteriori choice of the regularization parameter in case of
approximately given error bound of data. Acta et Comment. Univ. Tartuensis,
913, 73-87.
14. Raus T. (1992): About regularization parameter choice in case of approximately
given error bounds of data. Acta Comment. Univ. Tartuensis, 937, 77-89.
Adaptive Discontinuous Galerkin Finite
Element Methods with Interior Penalty for the
Compressible Navier-Stokes Equations

Ralf Hartmann 1 and Paul Houston 2

1 Institute of Aerodynamics and Flow Technology, German Aerospace Center,
Lilienthalplatz 7, 38108 Braunschweig, Germany. Email: [email protected]
2 Department of Mathematics, University of Leicester, Leicester LE1 7RH, UK.
Email: [email protected]. Supported by the EPSRC (Grant
GR/R76615)

Summary. In this article we consider goal-oriented a posteriori error estimation for
the symmetric interior penalty discontinuous Galerkin finite element discretization of
the compressible Navier-Stokes equations. Numerical experiments demonstrating the
accuracy of the error estimation and the performance of the adaptive mesh refinement
strategy will be presented.

1 Introduction

In recent years there has been tremendous interest in the design of discon-
tinuous Galerkin finite element methods (DGFEM) for the discretization of
compressible fluid flow problems; see, for example, [2, 3, 6, 7] and the refer-
ences cited therein. The key advantages of these schemes are that DGFEM pro-
vide robust and high-order accurate approximations, particularly in transport-
dominated regimes, and that they are considerably flexible in the choice of
mesh design. Indeed, DGFEM can easily handle non-matching grids and non-
uniform, even anisotropic, polynomial approximation degrees.
In this paper we introduce the symmetric version of the interior penalty
DGFEM for the numerical approximation of the compressible Navier-Stokes
equations. We then consider the a posteriori error analysis and adaptive mesh
design for the underlying discretization method. In particular, here we focus
on so-called 'goal-oriented' a posteriori error estimation which bounds the er-
ror measured in terms of certain target functionals of real or physical interest.
Typical examples that we shall consider here include the drag and lift coeffi-
cients of a body immersed in a viscous fluid. By employing a duality argument
we derive a weighted (Type I) a posteriori error bound which reflects the error
creation and error propagation mechanisms inherent in viscous compressible
fluid flows. On the basis of this a posteriori estimate, we design and implement
the corresponding adaptive algorithm to ensure both the reliable and efficient
control of the error in the prescribed target functional. The superiority of the
proposed approach over standard mesh refinement algorithms which employ
empirical error indicators will be demonstrated. This paper represents a
continuation of our previous work presented in the articles [10, 11].

2 The compressible Navier-Stokes equations

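Writing $\rho$, $\mathbf{v} = (v_1, v_2)^T$, $p$, $E$ and $T$ to denote the density, velocity vector,
pressure, specific total energy and temperature, respectively, the equations are
given by

$$\nabla\cdot\mathcal{F}^c(\mathbf{u}) - \nabla\cdot\mathcal{F}^v(\mathbf{u},\nabla\mathbf{u}) = 0 \quad \text{in } \Omega, \qquad (1)$$

where $\Omega$ is an open bounded domain in $\mathbb{R}^2$ and $\mathcal{F}^c = (\mathbf{f}_1^c, \mathbf{f}_2^c)$,
$\mathcal{F}^v = (\mathbf{f}_1^v, \mathbf{f}_2^v)$. Here, the vector of conservative
variables $\mathbf{u}$, the convective fluxes $\mathbf{f}_i^c$, $i = 1,2$, and the viscous fluxes $\mathbf{f}_i^v$,
$i = 1,2$, are defined by $\mathbf{u} = [\rho, \rho v_1, \rho v_2, \rho E]^T$,
$\mathbf{f}_i^c(\mathbf{u}) = [\rho v_i, \rho v_1 v_i + \delta_{1i}p, \rho v_2 v_i + \delta_{2i}p, \rho H v_i]^T$
and $\mathbf{f}_i^v = [0, \tau_{1i}, \tau_{2i}, \tau_{i1}v_1 + \tau_{i2}v_2 + \mathcal{K}T_{x_i}]^T$, respectively. Here, $\mathcal{K}$
is the thermal conductivity coefficient and $H$ is the total enthalpy defined
by $H = E + p/\rho$. The pressure is determined by the equation of state of an
ideal gas, i.e., $p = (\gamma - 1)\rho(E - \tfrac{1}{2}\mathbf{v}^2)$, where $\gamma = c_p/c_v$ is the ratio of specific
heat capacities; for dry air, $\gamma = 1.4$. For a Newtonian fluid, the viscous stress
tensor is given by $\tau = \mu\left(\nabla\mathbf{v} + (\nabla\mathbf{v})^T - \tfrac{2}{3}(\nabla\cdot\mathbf{v})I\right)$, where $\mu$ is the dynamic
viscosity coefficient; the temperature $T$ is given by $\mathcal{K}T = \frac{\mu\gamma}{Pr}(E - \tfrac{1}{2}\mathbf{v}^2)$, where
$Pr = 0.72$ is the Prandtl number.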
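As an illustration of the definitions above, the following minimal Python sketch evaluates the ideal-gas pressure and the convective flux from a vector of conservative variables. It is not taken from the authors' code: the function names `pressure` and `convective_flux`, the zero-based direction index and the sample state are assumptions made purely for illustration.

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heat capacities for dry air, as stated in the text


def pressure(u):
    """Ideal-gas pressure p = (gamma - 1) * rho * (E - |v|^2 / 2)."""
    rho, m1, m2, rho_e = u  # u = [rho, rho*v1, rho*v2, rho*E]
    kinetic = 0.5 * (m1 * m1 + m2 * m2) / rho  # rho * |v|^2 / 2
    return (GAMMA - 1.0) * (rho_e - kinetic)


def convective_flux(u, i):
    """Convective flux f_i^c(u); i is 0 or 1 (zero-based coordinate direction)."""
    rho, m1, m2, rho_e = u
    v = np.array([m1, m2]) / rho
    p = pressure(u)
    total_enthalpy = rho_e / rho + p / rho  # H = E + p / rho
    flux = np.array([rho * v[i], m1 * v[i], m2 * v[i], rho * total_enthalpy * v[i]])
    flux[1 + i] += p  # Kronecker-delta pressure contribution
    return flux


# Hypothetical sample state: rho = 1, v = (0.5, 0), p = 1
rho, v1, v2, p = 1.0, 0.5, 0.0, 1.0
E = p / ((GAMMA - 1.0) * rho) + 0.5 * (v1 * v1 + v2 * v2)
u = np.array([rho, rho * v1, rho * v2, rho * E])
f1 = convective_flux(u, 0)
f2 = convective_flux(u, 1)
```

The pressure is recovered from the conservative variables first, so the same routine can serve both flux directions.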
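The non-dimensionalized form of the Navier-Stokes equations (1) is given
by

$$\nabla\cdot\mathcal{F}^c(\mathbf{u}) - \nabla\cdot\left(G_{1j}\,\partial\mathbf{u}/\partial x_j,\; G_{2j}\,\partial\mathbf{u}/\partial x_j\right) = 0 \quad \text{in } \Omega, \qquad (2)$$

where repeated indices are summed through their range. Here, the matrices
$G_{ij}$ are given by $G_{ij} = \partial\mathbf{f}_i^v(\mathbf{u},\nabla\mathbf{u})/\partial\mathbf{u}_{x_j}$ for $i,j = 1,2$, i.e.,
$\mathbf{f}_i^v(\mathbf{u},\nabla\mathbf{u}) = \sum_{j=1}^{2} G_{ij}\,\partial\mathbf{u}/\partial x_j$, $i = 1,2$, where

[explicit entries of the matrices $G_{11}$, $G_{12}$, $G_{21}$, $G_{22}$ not reproduced]

and Re denotes the Reynolds number.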


Given that $\Omega \subset \mathbb{R}^2$ is a bounded region, with boundary $\Gamma$, the system of
conservation laws (2) must be supplemented by appropriate boundary conditions.
For simplicity of presentation, we assume that $\Gamma$ may be decomposed
as follows: $\Gamma = \Gamma_D \cup \Gamma_{N,sup} \cup \Gamma_{N,sub} \cup \Gamma_W$, where $\Gamma_D$, $\Gamma_{N,sup}$, $\Gamma_{N,sub}$ and $\Gamma_W$ are
distinct subsets of $\Gamma$ representing the Dirichlet (inflow), Neumann (supersonic-outflow),
Neumann (subsonic-outflow) and solid wall boundaries, respectively.
Thereby, we may specify the following boundary conditions: $\mathbf{u} = \mathbf{u}_D$ on $\Gamma_D$,
$\mathcal{F}^v(\mathbf{u},\nabla\mathbf{u})\cdot\mathbf{n} = 0$ on $\Gamma_{N,sup} \cup \Gamma_{N,sub}$; note that on $\Gamma_{N,sub}$ an additional condition
which imposes a given pressure $p_{out}$ is enforced. For solid wall boundaries, we
consider the distinction between isothermal and adiabatic conditions. To this
end, decomposing $\Gamma_W = \Gamma_{W,iso} \cup \Gamma_{W,adia}$, we set $\mathbf{v} = 0$ on $\Gamma_W$, $T = T_{wall}$ on
$\Gamma_{W,iso}$, $\mathbf{n}\cdot\nabla T = 0$ on $\Gamma_{W,adia}$, where $T_{wall}$ is a given wall temperature; we refer
to [2, 3, 5, 6] and the references cited therein for further details concerning the
imposition of suitable boundary conditions.

3 Discontinuous Galerkin Discretization

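In this section we introduce the discontinuous Galerkin method with interior
penalty for the discretization of the compressible Navier-Stokes equations (2).
We assume that $\Omega$ can be subdivided into shape-regular meshes $\mathcal{T}_h = \{\kappa\}$
consisting of quadrilateral elements $\kappa$. For each $\kappa \in \mathcal{T}_h$, we denote by $\mathbf{n}_\kappa$ the
unit outward normal vector to the boundary $\partial\kappa$, and by $h_\kappa$ the elemental
diameter. An interior edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior
of $\partial\kappa^+ \cap \partial\kappa^-$, where $\kappa^+$ and $\kappa^-$ are two adjacent elements of $\mathcal{T}_h$. Similarly,
a boundary edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior of $\partial\kappa \cap \Gamma$
which consists of entire edges of $\partial\kappa$. We denote by $\mathcal{E}_\mathcal{I}$ the union of all interior
edges of $\mathcal{T}_h$, by $\mathcal{E}_\Gamma$ the union of all boundary edges, and set $\mathcal{E} = \mathcal{E}_\mathcal{I} \cup \mathcal{E}_\Gamma$.
Next, we define average and jump operators. To this end, let $\kappa^+$ and $\kappa^-$
be two adjacent elements of $\mathcal{T}_h$ and $x$ be an arbitrary point on the interior
edge $e = \partial\kappa^+ \cap \partial\kappa^- \subset \mathcal{E}_\mathcal{I}$. Moreover, let $\mathbf{v}$ and $\underline{\tau}$ be vector- and matrix-valued
functions, respectively, that are smooth inside each element $\kappa^\pm$. By
$(\mathbf{v}^\pm, \underline{\tau}^\pm)$ we denote the traces of $(\mathbf{v}, \underline{\tau})$ on $e$ taken from within the interior of
$\kappa^\pm$, respectively. Then, we define the averages at $x \in e$ by $\{\!\{\mathbf{v}\}\!\} = (\mathbf{v}^+ + \mathbf{v}^-)/2$
and $\{\!\{\underline{\tau}\}\!\} = (\underline{\tau}^+ + \underline{\tau}^-)/2$. Similarly, the jumps at $x \in e$ are given by
$\underline{[\![\mathbf{v}]\!]} = \mathbf{v}^+ \otimes \mathbf{n}_{\kappa^+} + \mathbf{v}^- \otimes \mathbf{n}_{\kappa^-}$ and $[\![\underline{\tau}]\!] = \underline{\tau}^+ \cdot \mathbf{n}_{\kappa^+} + \underline{\tau}^- \cdot \mathbf{n}_{\kappa^-}$. For matrices
$\underline{\sigma}, \underline{\tau} \in \mathbb{R}^{m\times n}$, $m, n \ge 1$, we use the standard notation $\underline{\sigma} : \underline{\tau} = \sum_{k=1}^{m}\sum_{l=1}^{n} \sigma_{kl}\tau_{kl}$;
additionally, for vectors $\mathbf{v} \in \mathbb{R}^m$, $\mathbf{w} \in \mathbb{R}^n$, the matrix $\mathbf{v} \otimes \mathbf{w} \in \mathbb{R}^{m\times n}$ is defined
by $(\mathbf{v} \otimes \mathbf{w})_{kl} = v_k w_l$.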
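The average and jump operators can be illustrated pointwise with a few lines of NumPy. This is only a sketch under stated assumptions: the function names are hypothetical, the traces are passed in as plain arrays at a single point of an interior edge, and the normal of the neighbouring element is taken to be the negative of the normal of the first element, as is the case for two elements sharing an edge.

```python
import numpy as np


def average(v_plus, v_minus):
    """Mean value {{v}} = (v+ + v-) / 2 at a point of an interior edge."""
    return 0.5 * (np.asarray(v_plus, dtype=float) + np.asarray(v_minus, dtype=float))


def jump_vector(v_plus, v_minus, n_plus):
    """Matrix-valued jump of a vector: v+ (x) n+ + v- (x) n-, with n- = -n+."""
    n_plus = np.asarray(n_plus, dtype=float)
    return np.outer(v_plus, n_plus) + np.outer(v_minus, -n_plus)


def jump_matrix(tau_plus, tau_minus, n_plus):
    """Vector-valued jump of a matrix: tau+ . n+ + tau- . n-, with n- = -n+."""
    n_plus = np.asarray(n_plus, dtype=float)
    return np.asarray(tau_plus, dtype=float) @ n_plus - np.asarray(tau_minus, dtype=float) @ n_plus


def double_dot(sigma, tau):
    """Contraction sigma : tau = sum over k, l of sigma_kl * tau_kl."""
    return float(np.sum(np.asarray(sigma, dtype=float) * np.asarray(tau, dtype=float)))
```

For a function that is continuous across the edge both jumps vanish, which is why the jump terms in the discretization penalize only the discontinuity of the discrete solution.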
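Given a polynomial degree $p \ge 1$, we define the finite element space
$\mathbf{V}_h = \{\mathbf{v} \in [L^2(\Omega)]^4 : \mathbf{v}|_\kappa \in [Q_p(\kappa)]^4, \kappa \in \mathcal{T}_h\}$, where $Q_p(\kappa)$ denotes the space of
tensor product polynomials on $\kappa$ of degree $p$ in each coordinate direction. We
consider the following interior penalty discontinuous Galerkin discretization of
equations (2): find $\mathbf{u}_h \in \mathbf{V}_h$ such that

$$
\begin{aligned}
\mathcal{N}(\mathbf{u}_h, \mathbf{v}_h) \equiv{} & -\int_\Omega \mathcal{F}^c(\mathbf{u}_h) : \nabla_h \mathbf{v}_h \,dx
 + \sum_{\kappa\in\mathcal{T}_h} \int_{\partial\kappa\setminus\Gamma} \mathcal{H}(\mathbf{u}_h^+, \mathbf{u}_h^-, \mathbf{n}_\kappa)\cdot\mathbf{v}_h^+ \,ds \\
& + \int_\Omega \mathcal{F}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h) : \nabla_h\mathbf{v}_h \,dx
 - \int_{\mathcal{E}_\mathcal{I}} \{\!\{\mathcal{F}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h)\}\!\} : \underline{[\![\mathbf{v}_h]\!]} \,ds \\
& - \int_{\mathcal{E}_\mathcal{I}} \{\!\{(G_{i1}^T\,\partial_h\mathbf{v}_h/\partial x_i,\; G_{i2}^T\,\partial_h\mathbf{v}_h/\partial x_i)\}\!\} : \underline{[\![\mathbf{u}_h]\!]} \,ds \\
& + \int_{\mathcal{E}_\mathcal{I}} \delta\,\underline{[\![\mathbf{u}_h]\!]} : \underline{[\![\mathbf{v}_h]\!]} \,ds + \mathcal{N}_\Gamma(\mathbf{u}_h, \mathbf{v}_h) = 0 \qquad (3)
\end{aligned}
$$

for all $\mathbf{v}_h$ in $\mathbf{V}_h$. Here, the subscript $h$ on the operators $\nabla_h$ and $\partial_h/\partial x_i$,
$i = 1, 2$, is used to denote the discrete counterparts of $\nabla$ and $\partial/\partial x_i$, $i = 1, 2$,
respectively, taken elementwise. Furthermore, $\mathcal{H}(\cdot,\cdot,\cdot)$ denotes a numerical
(convective) flux function, assumed to be Lipschitz continuous, consistent and
conservative. The function $\delta \in L^\infty(\mathcal{E})$ denotes the so-called discontinuity
penalization function; defining $h \in L^\infty(\mathcal{E})$ by $h(x) = \min\{h_{\kappa^+}, h_{\kappa^-}\}$ when
$x \in e = \partial\kappa^+ \cap \partial\kappa^- \subset \mathcal{E}_\mathcal{I}$, and $h(x) = h_\kappa$ when $x \in e = \partial\kappa \cap \Gamma \subset \mathcal{E}_\Gamma$, we set
$\delta = C_\delta/(h\,Re)$, with a parameter $C_\delta > 0$ that is independent of $h$. Finally,
we write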

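$$
\begin{aligned}
\mathcal{N}_\Gamma(\mathbf{u}_h, \mathbf{v}_h) ={} & \int_\Gamma \mathcal{H}(\mathbf{u}_h^+, \mathbf{u}_\Gamma(\mathbf{u}_h^+), \mathbf{n})\cdot\mathbf{v}_h^+ \,ds
 + \int_{\Gamma_D\cup\Gamma_W} \delta\,(\mathbf{u}_h^+ - \mathbf{u}_\Gamma(\mathbf{u}_h^+))\cdot\mathbf{v}_h^+ \,ds \\
& - \int_{\Gamma_D\cup\Gamma_{W,iso}} \mathcal{F}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h) : \mathbf{v}_h^+\otimes\mathbf{n} \,ds
 - \int_{\Gamma_{W,adia}} \hat{\mathcal{F}}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h) : \mathbf{v}_h^+\otimes\mathbf{n} \,ds \\
& - \int_{\Gamma\setminus\Gamma_{N,sup}} (G_{i1}^T(\mathbf{u}_h^+)\,\partial_h\mathbf{v}_h^+/\partial x_i,\; G_{i2}^T(\mathbf{u}_h^+)\,\partial_h\mathbf{v}_h^+/\partial x_i) : (\mathbf{u}_h^+ - \mathbf{u}_\Gamma(\mathbf{u}_h^+))\otimes\mathbf{n} \,ds,
\end{aligned}
$$

where the boundary function $\mathbf{u}_\Gamma(\mathbf{u})$ is given according to the type of boundary
condition imposed. We set $\mathbf{u}_\Gamma(\mathbf{u}) = \mathbf{u}_D$ on $\Gamma_D$, $\mathbf{u}_\Gamma(\mathbf{u}) = \mathbf{u}$ on $\Gamma_{N,sup}$, and
$\mathbf{u}_\Gamma(\mathbf{u}) = (\rho, \rho v_1, \rho v_2, \frac{p_{out}}{\gamma - 1} + \tfrac{1}{2}\rho\mathbf{v}^2)^T$ on $\Gamma_{N,sub}$. Furthermore, we set
$\mathbf{u}_\Gamma(\mathbf{u}) = (\rho, 0, 0, \rho c_v T_{wall})^T$ on $\Gamma_{W,iso}$, and $\mathbf{u}_\Gamma(\mathbf{u}) = (\rho, 0, 0, \rho E - \tfrac{1}{2}\rho\mathbf{v}^2)^T$
on $\Gamma_{W,adia}$. Finally, we note that on the adiabatic boundary $\Gamma_{W,adia}$, we define
$\hat{\mathcal{F}}^v(\mathbf{u}, \nabla\mathbf{u})$ such that

$$\hat{\mathcal{F}}^v(\mathbf{u}, \nabla\mathbf{u})\cdot\mathbf{n} = (0,\; \tau_{11}n_1 + \tau_{12}n_2,\; \tau_{21}n_1 + \tau_{22}n_2,\; 0)^T.$$
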

Remark 1. We remark that the discretization of the viscous terms has been
done by employing the symmetric version of the interior penalty method, cf.
[1], and the references cited therein. In particular, we note that this scheme
is derived by first re-writing (2) as a system of first-order partial differential
equations through the introduction of appropriate auxiliary variables. By
defining suitable numerical flux functions, these additional variables are
subsequently eliminated; see [1, 8] for details. We note that within this process
the transposes of the matrices $G_{ij}$, $i, j = 1, 2$, naturally arise in the definition
of the DGFEM. Moreover, this is necessary to ensure the adjoint consistency
of the resulting method which is essential for the approximation of functionals
of the solution, cf. [1, 8]. As a final remark, we note that the discontinuity
penalization parameter $C_\delta$ must be chosen sufficiently large in order to guarantee
the stability of the underlying method.

4 Goal-oriented a posteriori error estimation

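In this section, we shall be concerned with controlling the error in the
numerical solution measured in terms of a given target functional $J(\cdot)$; for a
detailed discussion, we refer to the review articles [4, 13]. Assuming that $J(\cdot)$
is differentiable, we write

$$\bar{J}(\mathbf{u}, \mathbf{u}_h; \mathbf{u} - \mathbf{u}_h) = J(\mathbf{u}) - J(\mathbf{u}_h) = \int_0^1 J'[\theta\mathbf{u} + (1 - \theta)\mathbf{u}_h](\mathbf{u} - \mathbf{u}_h)\,d\theta, \qquad (4)$$

where $J'[\mathbf{w}](\cdot)$ denotes the Fréchet derivative of $J(\cdot)$ evaluated at some $\mathbf{w}$
in $\mathbf{V}$. Here, $\mathbf{V}$ is some suitably chosen function space such that $\mathbf{V}_h \subset \mathbf{V}$.
Analogously, we write

$$\mathcal{M}(\mathbf{u}, \mathbf{u}_h; \mathbf{u} - \mathbf{u}_h, \mathbf{v}) = \mathcal{N}(\mathbf{u}, \mathbf{v}) - \mathcal{N}(\mathbf{u}_h, \mathbf{v}) = \int_0^1 \mathcal{N}'_{\mathbf{u}}[\theta\mathbf{u} + (1 - \theta)\mathbf{u}_h](\mathbf{u} - \mathbf{u}_h, \mathbf{v})\,d\theta \qquad (5)$$

for all $\mathbf{v}$ in $\mathbf{V}$. Here, $\mathcal{N}'_{\mathbf{u}}[\mathbf{w}](\cdot, \mathbf{v})$ denotes the Fréchet derivative of $\mathbf{u} \mapsto \mathcal{N}(\mathbf{u}, \mathbf{v})$,
for $\mathbf{v} \in \mathbf{V}$ fixed, at some $\mathbf{w}$ in $\mathbf{V}$. We remark that the linearization defined in
(5) is only a formal calculation, in the sense that $\mathcal{N}'_{\mathbf{u}}[\mathbf{w}](\cdot, \cdot)$ may not in general
exist. Instead, a suitable approximation to $\mathcal{N}'_{\mathbf{u}}[\mathbf{w}](\cdot, \cdot)$ must be determined, for
example, by computing appropriate finite difference quotients of $\mathcal{N}(\cdot, \cdot)$, cf.
[9, 10]. Given a suitable linearization, we introduce the following dual problem:
find $\mathbf{z} \in \mathbf{V}$ such that

$$\mathcal{M}(\mathbf{u}, \mathbf{u}_h; \mathbf{w}, \mathbf{z}) = \bar{J}(\mathbf{u}, \mathbf{u}_h; \mathbf{w}) \quad \forall \mathbf{w} \in \mathbf{V}. \qquad (6)$$

We assume that (6) possesses a unique solution. Clearly, the validity of this
assumption depends on both the definition of $\mathcal{M}(\mathbf{u}, \mathbf{u}_h; \cdot, \cdot)$ and the choice of
the target functional under consideration, cf. [10]. For the error analysis that
follows, we must therefore assume that the dual problem (6) is well-posed.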
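The finite difference quotients mentioned above can be sketched as follows. This is a generic forward-difference approximation of a directional derivative, not the specific construction of [9, 10]; the function names, the step size `eps` and the toy residual are assumptions chosen for illustration only.

```python
import numpy as np


def fd_directional_derivative(residual, w, direction, eps=1e-7):
    """Forward-difference quotient (R(w + eps*d) - R(w)) / eps approximating R'[w](d)."""
    w = np.asarray(w, dtype=float)
    d = np.asarray(direction, dtype=float)
    return (residual(w + eps * d) - residual(w)) / eps


def toy_residual(w):
    """Hypothetical smooth nonlinear map standing in for u -> N(u, v)."""
    return np.array([w[0] ** 2 + w[1], np.sin(w[0]) * w[1]])


# Compare with the exact derivative of the toy map in the direction (1, 0) at w = (1, 2)
approx = fd_directional_derivative(toy_residual, [1.0, 2.0], [1.0, 0.0])
exact = np.array([2.0, 2.0 * np.cos(1.0)])
```

The step size trades truncation error against round-off; the value used here is a common default for double precision rather than a recommendation from the paper.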

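Proposition 1. Let $\mathbf{u}$ and $\mathbf{u}_h$ denote the solutions of (2) and (3), respectively,
and suppose that the dual problem (6) is well-posed. Then,

$$J(\mathbf{u}) - J(\mathbf{u}_h) \le \sum_{\kappa\in\mathcal{T}_h} |\eta_\kappa|, \qquad (7)$$

where

$$
\begin{aligned}
\eta_\kappa ={} & \int_\kappa \mathbf{R}_h\cdot\mathbf{w}_h \,dx
 + \int_{\partial\kappa\setminus\Gamma} \left(\mathcal{F}^c(\mathbf{u}_h^+)\cdot\mathbf{n}_\kappa - \mathcal{H}(\mathbf{u}_h^+, \mathbf{u}_h^-, \mathbf{n}_\kappa)\right)\cdot\mathbf{w}_h^+ \,ds \\
& + \int_{\partial\kappa\cap\Gamma} \left(\mathcal{F}^c(\mathbf{u}_h^+)\cdot\mathbf{n}_\kappa - \mathcal{H}(\mathbf{u}_h^+, \mathbf{u}_\Gamma(\mathbf{u}_h^+), \mathbf{n}_\kappa)\right)\cdot\mathbf{w}_h^+ \,ds \\
& - \int_{\partial\kappa\setminus\Gamma} \delta\,\underline{[\![\mathbf{u}_h]\!]} : \mathbf{w}_h^+\otimes\mathbf{n}_\kappa \,ds \\
& + \frac{1}{2}\int_{\partial\kappa\setminus\Gamma} \left((G_{i1}^T\,\partial_h\mathbf{w}_h^+/\partial x_i,\; G_{i2}^T\,\partial_h\mathbf{w}_h^+/\partial x_i) : \underline{[\![\mathbf{u}_h]\!]} - [\![\mathcal{F}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h)]\!]\cdot\mathbf{w}_h^+\right) ds \\
& - \int_{\partial\kappa\cap(\Gamma_D\cup\Gamma_W)} \delta\,(\mathbf{u}_h^+ - \mathbf{u}_\Gamma(\mathbf{u}_h^+))\cdot\mathbf{w}_h^+ \,ds \\
& - \int_{\partial\kappa\cap(\Gamma_{N,sub}\cup\Gamma_{N,sup})} \left(\mathcal{F}^v(\mathbf{u}_h^+, \nabla_h\mathbf{u}_h^+)\cdot\mathbf{n}_\kappa\right)\cdot\mathbf{w}_h^+ \,ds \\
& - \int_{\partial\kappa\cap\Gamma_{W,adia}} \left(\mathcal{F}^v(\mathbf{u}_h^+, \nabla_h\mathbf{u}_h^+) - \hat{\mathcal{F}}^v(\mathbf{u}_h^+, \nabla_h\mathbf{u}_h^+)\right) : \mathbf{w}_h^+\otimes\mathbf{n} \,ds \\
& + \int_{\partial\kappa\cap(\Gamma\setminus\Gamma_{N,sup})} (G_{i1}^T(\mathbf{u}_h^+)\,\partial_h\mathbf{w}_h^+/\partial x_i,\; G_{i2}^T(\mathbf{u}_h^+)\,\partial_h\mathbf{w}_h^+/\partial x_i) : (\mathbf{u}_h^+ - \mathbf{u}_\Gamma(\mathbf{u}_h^+))\otimes\mathbf{n} \,ds,
\end{aligned}
$$

and $\mathbf{w}_h = \mathbf{z} - \mathbf{z}_h$ for all $\mathbf{z}_h$ in $\mathbf{V}_h$. Here, $\mathbf{R}_h|_\kappa = -\nabla_h\cdot\mathcal{F}^c(\mathbf{u}_h) + \nabla_h\cdot\mathcal{F}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h)$,
$\kappa \in \mathcal{T}_h$, denotes the elementwise residual.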
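Proof. Choosing $\mathbf{w} = \mathbf{u} - \mathbf{u}_h$ in (6), recalling the linearization performed in
(4), and exploiting the Galerkin orthogonality property of the DGFEM, we
get

$$
\begin{aligned}
J(\mathbf{u}) - J(\mathbf{u}_h) &= \bar{J}(\mathbf{u}, \mathbf{u}_h; \mathbf{u} - \mathbf{u}_h) = \mathcal{M}(\mathbf{u}, \mathbf{u}_h; \mathbf{u} - \mathbf{u}_h, \mathbf{z}) \\
&= \mathcal{M}(\mathbf{u}, \mathbf{u}_h; \mathbf{u} - \mathbf{u}_h, \mathbf{z} - \mathbf{z}_h) = -\mathcal{N}(\mathbf{u}_h, \mathbf{z} - \mathbf{z}_h) \quad \forall \mathbf{z}_h \in \mathbf{V}_h.
\end{aligned}
$$

Equation (7) now follows by the divergence theorem together with the triangle
inequality.
We end this section by noting that the Type I a posteriori error bound (7)
depends on the unknown analytical solutions to the primal and dual problems.
Thus, in order to render these quantities computable, both $\mathbf{u}$ and $\mathbf{z}$ must
be replaced by suitable approximations. Here, the linearizations leading to
$\mathcal{M}(\mathbf{u}, \mathbf{u}_h; \cdot, \cdot)$ and $\bar{J}(\mathbf{u}, \mathbf{u}_h; \cdot)$ are performed about $\mathbf{u}_h$ and the dual solution $\mathbf{z}$
is replaced by a DGFEM approximation $\hat{\mathbf{z}}$ computed on the same mesh $\mathcal{T}_h$
used for $\mathbf{u}_h$, but with a higher degree polynomial.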

5 Numerical example
In this section we present a numerical example to highlight the advantages
of designing an adaptive finite element algorithm based on the weighted error
indicator $|\eta_\kappa|$ in comparison with both uniform mesh refinement, as well
as an adaptive algorithm based on an empirical refinement indicator which
does not require the solution of an auxiliary (dual) problem; for simplicity, we
employ a Type II residual indicator of the form derived in [10]. Throughout
this section, we employ the Vijayasundaram flux for the discretization of the
convective terms and set $p = 1$ (bilinear elements). Finally, for both adaptive
refinement strategies, we use the fixed fraction refinement algorithm with
refinement and derefinement fractions set to 20% and 10%, respectively; we also
note that for the computation of $|\eta_\kappa|$, the dual solution is approximated using
piecewise biquadratic polynomials.

[Figure 1, panel (b) legend: DG(1), global refinement; DG(1), residual ind.; DG(1), weighted-residuals ind.]

Fig. 1. (a) Mach isolines, $Ma = \frac{i}{10}$, $i = 1, \ldots, 10$, of the flow around the NACA0012
airfoil; (b) Convergence of the error $|J(\mathbf{u}) - J(\mathbf{u}_h)|$ using each mesh refinement
strategy

Table 1. Adaptive algorithm based on the weighted error indicator $|\eta_\kappa|$.

Elements   DOF     $J(\mathbf{u}) - J(\mathbf{u}_h)$   $\sum_\kappa \eta_\kappa$   $\theta_1$   $\sum_\kappa |\eta_\kappa|$   $\theta_2$
63         1008    -1.534e-01    -1.299e-01   0.85   9.2e-01   6.02
105        1680    -4.838e-02    -1.114e-02   0.23   7.4e-01   15.35
189        3024     7.814e-03     5.574e-03   0.71   7.9e-01   100.60
306        4896    -3.819e-02    -1.296e-02   0.34   8.1e-01   21.11
522        8352    -2.107e-02    -9.249e-03   0.44   8.5e-01   40.25
909        14544   -7.686e-03    -3.746e-03   0.49   7.9e-01   102.9
1512       24192   -1.049e-03    -9.130e-04   0.87   1.0e+00   962
2559       40944   -2.628e-04    -2.395e-04   0.91   6.2e-01   2374
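The fixed fraction strategy with refinement and derefinement fractions of 20% and 10% can be sketched as below. The paper does not spell out the marking routine, so the function name, the rounding of the element counts and the tie-breaking by sort order are assumptions of this illustration.

```python
import numpy as np


def fixed_fraction_marking(indicators, refine_fraction=0.2, coarsen_fraction=0.1):
    """Return sorted index arrays (refine, coarsen) chosen by indicator magnitude."""
    magnitude = np.abs(np.asarray(indicators, dtype=float))
    order = np.argsort(magnitude)  # element indices, smallest indicator first
    n = magnitude.size
    n_refine = int(round(refine_fraction * n))
    n_coarsen = int(round(coarsen_fraction * n))
    refine = order[n - n_refine:]  # elements with the largest indicators
    coarsen = order[:n_coarsen]    # elements with the smallest indicators
    return np.sort(refine), np.sort(coarsen)


# Hypothetical indicator values on a mesh of ten elements
eta = [0.5, -3.0, 0.1, 2.0, 0.05, 0.3, 0.2, 0.7, 0.9, 0.4]
refine, coarsen = fixed_fraction_marking(eta)
```

Only the magnitude of each indicator enters the marking, so the same routine can be used with the weighted indicators and with an empirical residual indicator.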
We consider a Mach 0.8 flow at an angle of attack $\alpha = 10°$ with Reynolds
number Re = 73 and constant temperature on the profile, cf. [3, 12]. The
solution to this problem consists of a flow that is mainly subsonic with a small
supersonic region above the airfoil, see Fig. 1(a). Here, we consider the
evaluation of the inviscid drag coefficient ($c_{dp}$) on the surface of the airfoil. On the
basis of a fine grid computation, the reference value of the functional is given
by $J(\mathbf{u}) \approx 0.224$.

Table 2. Nonlinear Newton residuals (Res.) and convergence rates on a sequence of
uniformly refined meshes. *: On the coarsest mesh only the last 4 out of 6 iteration
steps are displayed.

Mesh 1           Mesh 2           Mesh 3           Mesh 4           Mesh 5
Res.      Rate   Res.      Rate   Res.      Rate   Res.      Rate   Res.      Rate
6.5e-02*         8.1e-01          4.4e-01          2.4e-01          1.1e-01
2.5e-02   3      4.2e-02   19     3.3e-02   13     1.6e-02   16     5.9e-03   19
6.7e-04   37     1.2e-03   36     1.6e-03   20     2.6e-04   62     1.4e-04   43
6.6e-07   1021   1.4e-06   808    1.8e-05   92     2.6e-07   976    1.7e-07   804
3.7e-10   1755   3.1e-10   4538   2.9e-09   6199   5.5e-11   4750   1.7e-10   1021

[Figure 2: nonlinear residual plotted against the number of Newton steps; panel (a) DG(1), global refinement; panel (b) DG(1), local refinement]

Fig. 2. Convergence of the nonlinear residual with the number of Newton steps
employed: (a) Uniform mesh refinement; (b) Adaptive mesh refinement using the
empirical error indicator
In Fig. 1(b) we compare the true error in the computed target functional
$J(\cdot)$ using all three mesh refinement strategies. Here, we clearly observe the
superiority of the weighted a posteriori error indicator; at all refinement steps
the error in the computed functional is less than the corresponding quantity
when either uniform refinement or adaptive refinement based on an empirical
residual indicator is employed. Indeed, on the final mesh the true error in $J(\cdot)$
is over an order of magnitude smaller than $|J(\mathbf{u}) - J(\mathbf{u}_h)|$ computed on the
sequence of meshes generated by the empirical indicator.
In Table 1 we collect the data of the adaptive algorithm when employing
the weighted indicators. Here, we show the number of elements and
degrees of freedom (DOF) in $\mathbf{V}_h$, the true error in the functional $J(\mathbf{u}) - J(\mathbf{u}_h)$,
the computed error representation formula, the approximate a posteriori error
bound and their respective effectivity indices $\theta_1 = \sum_\kappa \eta_\kappa/(J(\mathbf{u}) - J(\mathbf{u}_h))$ and
$\theta_2 = \sum_\kappa |\eta_\kappa|/|J(\mathbf{u}) - J(\mathbf{u}_h)|$. First we note that on all refinement steps the
correct sign of the error is predicted by the computed error representation
formula. Furthermore, whereas on very coarse meshes the quality of $\sum_\kappa \eta_\kappa$ is
rather poor, in the sense that $\theta_1$ is noticeably smaller than one, we see that
the effectivity indices $\theta_1$ slowly tend towards unity as the mesh is refined. On
the other hand, even just the application of the triangle inequality leads to
significant over-estimation of the true error in the computed functional; indeed,
here we see that $\theta_2$ slowly increases.
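The two effectivity indices are simple ratios and can be recomputed from the tabulated data. The sketch below uses the first and last rows of Table 1 as input; the function name is hypothetical, and because the tabulated values are rounded, the recomputed indices agree with the table only to within that rounding.

```python
def effectivity_indices(true_error, error_representation, error_bound):
    """Return (theta_1, theta_2) for one refinement step."""
    theta_1 = error_representation / true_error  # signed ratio, ideally close to one
    theta_2 = error_bound / abs(true_error)      # over-estimation factor of the bound
    return theta_1, theta_2


# Rows of Table 1 with 63 and 2559 elements
coarse = effectivity_indices(-1.534e-01, -1.299e-01, 9.2e-01)
fine = effectivity_indices(-2.628e-04, -2.395e-04, 6.2e-01)
```

The first index improves from about 0.85 to about 0.91 between these two rows, while the second grows by more than two orders of magnitude, which is the behaviour described in the text.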
As a final remark, we note that in all of our computations, we employed
a (damped) Newton iteration to solve the system of nonlinear equations arising
from the discontinuous Galerkin discretization of the compressible Navier-Stokes
equations. Within each Newton step, GMRES with Block-Gauss-Seidel
preconditioning is exploited to solve the resulting linearized problem involving
the Jacobian matrix, cf. [9] for further details. For each of the three mesh
refinement algorithms employed above, the nonlinear solver proceeds as follows:
starting on the coarsest mesh with free-flow conditions the nonlinear problem
is solved by the Newton iteration described above. When the nonlinear
residual converges below $10^{-8}$, the mesh is refined once; then the discrete
solution is interpolated onto the new mesh and is thereby taken as the starting
solution for the Newton iteration on the newly refined mesh. In Table 2 we
present the history of this solution process on a sequence of uniformly refined
computational meshes; these results are also summarized in Fig. 2(a). On the
coarsest mesh, with free-flow values, the Newton iteration requires a few steps
until the iterative solution reaches the range of quadratic convergence. Indeed,
after only 6 Newton steps the nonlinear residual is below the given tolerance
of $10^{-8}$ and the mesh is refined; on subsequent meshes, the Newton iteration
requires only 4 steps to reduce the residual below the given tolerance.
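A damped Newton iteration with the stopping tolerance $10^{-8}$ can be sketched on a small model problem as follows. This is not the solver of [9]: a dense direct solve replaces the preconditioned GMRES iteration, the damping rule (step halving until the residual norm decreases) is one simple choice among many, and the two-equation test system is hypothetical. The rates are computed as ratios of consecutive residual norms, which appears to be the convention used in Table 2.

```python
import numpy as np


def damped_newton(residual, jacobian, x0, tol=1e-8, max_steps=50):
    """Newton iteration with step halving; returns the iterate and the residual history."""
    x = np.asarray(x0, dtype=float)
    history = [np.linalg.norm(residual(x))]
    for _ in range(max_steps):
        if history[-1] < tol:
            break
        step = np.linalg.solve(jacobian(x), -residual(x))
        damping = 1.0
        # Halve the step until the residual norm decreases (simple backtracking)
        while damping > 1e-4 and np.linalg.norm(residual(x + damping * step)) >= history[-1]:
            damping *= 0.5
        x = x + damping * step
        history.append(np.linalg.norm(residual(x)))
    return x, history


def toy_residual(x):
    """Hypothetical system: circle of radius 2 intersected with the line x0 = x1."""
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]])


def toy_jacobian(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])


solution, history = damped_newton(toy_residual, toy_jacobian, [3.0, 1.0])
rates = [history[k - 1] / history[k] for k in range(1, len(history)) if history[k] > 0]
```

Once the iterates are close to the solution the rates grow rapidly from step to step, which mirrors the onset of quadratic convergence visible in Table 2.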
The convergence behaviour of the Newton iteration is analogous even when
locally refined meshes are employed. Indeed, in Fig. 2(b) we plot the nonlinear
residuals against the number of Newton steps employed for each of the meshes
generated using the empirical error indicator. Here, we again see that on the
first mesh 6 Newton steps are required to satisfy the convergence criterion,
while only 4 are necessary on subsequent meshes. Analogous behaviour is also
observed on the adaptive meshes generated using the weighted error indicator
$|\eta_\kappa|$; for brevity, these results have been omitted.
References

1. Arnold, D., Brezzi, F., Cockburn, B., Marini, D. (2002): Unified analysis of
discontinuous Galerkin methods for elliptic problems. SIAM J. Numer. Anal., 39(5),
1749-1779
2. Baumann, C.E., Oden, J.T. (1999): A discontinuous hp finite element method
for the solution of the Euler and Navier-Stokes equations. Int. J. Numer. Meth.
Fluids, 31, 79-95
3. Bassi, F., Rebay, S. (1997): A high-order accurate discontinuous finite element
method for the numerical solution of the compressible Navier-Stokes equations.
J. Comput. Phys., 131, 267-279
4. Becker, R., Rannacher, R. (2001): An optimal control approach to a-posteriori
error estimation in finite element methods. In: Iserles, A. (ed), Acta Numerica,
pp. 1-102, CUP
5. Capon, P.J. (1995): Adaptive Stable Finite Element Methods for the Compressible
Navier-Stokes Equations. Ph.D. Thesis, University of Leeds
6. Dolejsi, V. (2002): On the discontinuous Galerkin method for the numerical
solution of the Euler and the Navier-Stokes equations. Int. J. Numer. Meth. Fluids,
(submitted)
7. Dolejsi, V., Feistauer, M., Schwab, Ch. (2002): On discontinuous Galerkin methods
for nonlinear convection-diffusion problems and compressible flow. Math.
Bohem., 127(2), 163-179
8. Harriman, K., Houston, P., Senior, B., Süli, E. (2003): hp-Version discontinuous
Galerkin methods with interior penalty for partial differential equations with
nonnegative characteristic form. In: Shu, C.-W., Tang, T., Cheng, S.-Y. (eds), Recent
Advances in Scientific Computing and Partial Differential Equations. Contemporary
Mathematics, 330, 89-119
9. Hartmann, R. (2004): The role of the Jacobian in the adaptive discontinuous
Galerkin method for the compressible Euler equations. In: Warnecke, G. (ed),
Analysis and Numerical Methods for Conservation Laws, Springer-Verlag
(submitted)
10. Hartmann, R., Houston, P. (2002): Adaptive discontinuous Galerkin finite element
methods for nonlinear hyperbolic conservation laws. SIAM J. Sci. Comp.,
24(3), 979-1004
11. Hartmann, R., Houston, P. (2002): Adaptive discontinuous Galerkin finite element
methods for the compressible Euler equations. J. Comp. Phys., 183, 508-532
12. Numerical Simulation of Compressible Navier-Stokes Equations - External 2D
Flows Around a NACA0012 Airfoil. GAMM Workshop, Dec. 4-6, 1985, Nice,
France (Edt. INRIA, Centre de Rocquencourt, de Rennes et de Sophia-Antipolis,
1986)
13. Süli, E., Houston, P. (2002): Adaptive finite element approximation of hyperbolic
problems. In: Barth, T., Deconinck, H. (eds) Error Estimation and Adaptive
Discretization Methods in Computational Fluid Dynamics. Lecture Notes in
Computational Science and Engineering, Volume 25, pp. 269-344, Springer-Verlag
On a Novel Technique for Parallel
Unstructured Mesh Generation in 3D

Jan Haskovec 1 and Pavel Solin 2

1 Faculty of Mathematics and Physics, Charles University, Sokolovská 83, Prague,
Czech Republic jan. [email protected]
2 Department of Computational and Applied Mathematics, Rice University, 6100
Main Street, Houston, TX 77251-1892 [email protected]

Summary. We present a novel approach to unstructured tetrahedral mesh generation
in 3D with minimum geometrical input. The boundary of the domain is identified
adaptively with adjustable precision. The grid points of the mesh are distributed by
means of an algorithm based on the analogy with a system of electrically charged
particles. A-priori local refinements of the mesh are possible. Surface and volume
meshes are built using advancing-front-type algorithms. All steps of the algorithm
are suitable for efficient parallelisation. An application of the presented approach is
shown.

1 Motivation
Nowadays, computational mathematics is becoming part of new challenging
interdisciplinary computer-assisted technologies in medicine, natural sciences,
industry and elsewhere¹. Specifically for mesh generation this means that
simplification and automation of geometrical inputs become a crucial issue.
Traditional algorithms, among which the most popular ones are based on Solid
Modeling Techniques (SMT) and CAD modellers, have not been designed to
process machine-generated information such as outputs of MRI scans, image
recognition devices, physical measurements, geographic databases, etc. It is our
aim to design a fully automatic parallel generator of unstructured tetrahedral
meshes with this capability.

The example geometry in Fig. 1 is given by the formula

$$\Omega = \left\{(x, y, z);\ (x - R(\phi)\cos\phi)^2 + (y - R(\phi)\sin\phi)^2 + (z - [\ldots])^2 \le r^2(\phi),\ \phi \in (0, 4\pi)\right\}$$

with

$$R(\phi) = 1 - \frac{\phi}{8\pi}, \qquad r(\phi) = 0.3\left(1 - \frac{\phi}{8\pi}\right).$$

Fig. 1. Example of a 3D geometry whose optimal computer representation is a
non-trivial issue

¹ The authors acknowledge the financial support of the Grant Agency of the Czech
Republic under Grant No. GP102/01/D114.

The minimum information necessary for the definition of the geometry of a 3D
domain $\Omega$ is its characteristic function $\chi_\Omega$,

$$\chi_\Omega(x) = \begin{cases} 1, & x \in \Omega, \\ 0, & x \notin \Omega. \end{cases} \qquad (1)$$

In this paper we introduce a first working version of a C++ mesh generator
XGEN3D that has been designed to process this form of geometrical input.
The algorithm consists of four steps,

step 1: adaptive identification of the boundary $\partial\Omega$,
step 2: iterative distribution of grid points,
step 3: generation of surface mesh,
step 4: generation of volume mesh,

that will be discussed in the following sections.


Mesh generation is a scientific discipline with a very long tradition, and due
to its variety and diversity it is difficult to select a few references on which our
work is exactly based. One of the best places where mesh-generation-related
information of virtually any kind can be found, and one that we used intensively, is
Robert Schneiders' web page [3].

2 Adaptive identification of the boundary $\partial\Omega$


In order to keep the computer implementation at a reasonable level of
complexity, the user is requested to provide the following parameters in addition
to the characteristic function $\chi_\Omega$: co-ordinates of a (reasonably small)
cube $C \subset \mathbb{R}^3$ (initial cube) such that $\Omega \subset C$ and tolerances $TOL_\Omega$, $TOL_{\partial\Omega}$,
$TOL_{\partial\Omega} \le TOL_\Omega$, with which the geometry of $\Omega$ will be approximated. The
meaning of these parameters will be explained later. Moreover the user defines
a mesh density function $h(x) : C \to \mathbb{R}$ that at every point $x \in \Omega$ indicates the
desired mean edge length.
The algorithm produces a continuous, piecewise triangular approximation
of the boundary $\partial\Omega$. It proceeds recursively, starting by dividing the initial
cube into eight smaller sub-cubes (hence oct-tree), and using the function $\chi_\Omega$
to determine whether a sub-cube intersects with the boundary $\partial\Omega$ or not. The
adaptive splitting process is continued until the size of the sub-cubes reaches
the parameter $TOL_\Omega$. The resulting sub-cubes are called basic cubes. Splitting
of the basic cubes further continues until $TOL_{\partial\Omega}$ is reached. The resulting
sub-cubes are called boundary cubes.
At this point the complete (discrete) information about the boundary $\partial\Omega$
is contained in the boundary cubes. A simple algorithm allows for accurate
location of points (boundary points) where the edges of the boundary cubes
intersect with the boundary $\partial\Omega$. For $TOL_{\partial\Omega}$ sufficiently small, the boundary
$\partial\Omega$ can be locally approximated by a plane. In other words, for each boundary
cube the corresponding boundary points form a planar polygon with 3 to 7
vertices $P_1, P_2, \ldots, P_k$.
If $k = 3$, we already have a portion of the desired triangular approximation
of $\partial\Omega$. If $k \ge 4$, we need to construct a triangulation of the polygon. This is
most easily achieved by calculating its center of gravity,

$$P_c = \frac{1}{k}\sum_{i=1}^{k} P_i,$$

and defining $k$ triangles $(P_1, P_2, P_c), (P_2, P_3, P_c), \ldots, (P_{k-1}, P_k, P_c), (P_k, P_1, P_c)$.
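The triangulation of a boundary polygon about its center of gravity can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that the vertices are supplied in order around the polygon; the closing triangle that joins the last vertex back to the first is included so that the polygon is covered by exactly k triangles.

```python
import numpy as np


def fan_triangulation(points):
    """Split an ordered planar polygon P_1..P_k into triangles sharing the centroid."""
    pts = np.asarray(points, dtype=float)
    k = len(pts)
    if k == 3:
        return [tuple(tuple(p) for p in pts)]  # already a triangle
    centre = pts.mean(axis=0)  # center of gravity of the vertices
    return [(tuple(pts[i]), tuple(pts[(i + 1) % k]), tuple(centre)) for i in range(k)]


# Hypothetical boundary polygon: unit square in the plane z = 0
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
triangles = fan_triangulation(square)
```

For a convex polygon the centroid lies inside it, so the triangles do not overlap and their areas add up to the area of the polygon.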
In summary, the recursive algorithm of approximating the boundary $\partial\Omega$
can be written as follows:

Boundary identification algorithm:

1. Begin with the initial cube C.
2. Split the current cube into 8 sub-cubes.
3. With each sub-cube:
   3.1. Does it intersect with the boundary?
        NO: Is the size of its edge smaller than $TOL_\Omega$?
            YES: STOP.
            NO: Continue with step 2.
        YES: Is the size of its edge smaller than $TOL_{\partial\Omega}$?
            YES: Build an approximation to $\partial\Omega$ inside of the cube and STOP.
            NO: Continue with step 2.
4. STOP.
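The recursive splitting listed above can be sketched in Python as follows. This is an illustration only and not the XGEN3D implementation (which is written in C++): the characteristic function of a unit ball, the function names and the corner-sampling test for deciding whether a cube meets the boundary are all assumptions. In particular, comparing the values of the characteristic function at the eight corners is a heuristic that can miss a boundary passing through a cube without separating its corners.

```python
import itertools


def chi_sphere(x, y, z):
    """Hypothetical characteristic function: closed unit ball centred at the origin."""
    return 1 if x * x + y * y + z * z <= 1.0 else 0


def corners(origin, size):
    """The eight corners of the axis-aligned cube with the given origin and edge length."""
    ox, oy, oz = origin
    return [(ox + dx * size, oy + dy * size, oz + dz * size)
            for dx, dy, dz in itertools.product((0, 1), repeat=3)]


def crosses_boundary(chi, origin, size):
    """Heuristic test: the cube is cut by the boundary if chi differs between corners."""
    return len({chi(*c) for c in corners(origin, size)}) == 2


def identify_boundary(chi, origin, size, tol_domain, tol_boundary, out):
    """Split a cube recursively; boundary cubes with edge < tol_boundary are appended to out."""
    half = size / 2.0
    for sub_origin in corners(origin, half):  # origins of the eight sub-cubes
        if crosses_boundary(chi, sub_origin, half):
            if half < tol_boundary:
                out.append((sub_origin, half))  # boundary cube
            else:
                identify_boundary(chi, sub_origin, half, tol_domain, tol_boundary, out)
        elif half >= tol_domain:
            identify_boundary(chi, sub_origin, half, tol_domain, tol_boundary, out)
    return out


# Initial cube [-1.5, 1.5]^3 with hypothetical tolerances 0.5 and 0.25
cubes = identify_boundary(chi_sphere, (-1.5, -1.5, -1.5), 3.0, 0.5, 0.25, [])
```

Each returned entry holds the origin and edge length of one boundary cube; building the local triangular approximation inside these cubes is a separate step.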

The output of this algorithm is a piecewise triangular approximation to the
boundary $\partial\Omega$. The algorithm can analyse as complicated geometry as desired
and approximate it as accurately as desired. The only limiting factor is the
amount of memory and CPU-time consumed by the software. Let us point
out that the resulting set of triangles cannot be used as a surface mesh yet
- the size and shape of the triangles vary irregularly and their diameter has
nothing in common with the user-specified mesh density function $h(x)$.
Fig. 2, 3 and 4 show the approximation of a part of the domain $\Omega$ from
Fig. 1 with three different values of $TOL_{\partial\Omega}$.

low accuracy    optimal accuracy    high accuracy

Fig. 2. Approximation of $\partial\Omega$ with various levels of accuracy: $TOL_{\partial\Omega}$ = 0.1, 0.03
and 0.015, respectively

3 Distribution of grid points


First we need to estimate the total number N of grid points. With the shape
of $\Omega$ known from the previous step, we numerically integrate the mesh density
function $h(x)$ over the domain. An average edge length is defined as

$$h_a = \frac{1}{|\Omega|}\int_\Omega h(x)\,dx.$$

Finally we use the information about the volume of a regular tetrahedron
with the edge length $h_a$ to compute the approximate number of tetrahedra in
the mesh and the optimal number of grid points N.
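The estimate can be illustrated by a simple midpoint-rule integration over a box containing the domain. The sketch assumes that the average edge length is the volume average of h over the domain, uses the volume a^3 / (6 sqrt(2)) of a regular tetrahedron with edge a, and stops at the number of tetrahedra, since the text does not state how the number of grid points is derived from it. All names and the test geometry are hypothetical.

```python
import math


def estimate_mesh_size(chi, h, lower, upper, samples_per_axis=40):
    """Midpoint-rule estimates of the volume, the average edge length and the tetrahedron count."""
    steps = [(b - a) / samples_per_axis for a, b in zip(lower, upper)]
    cell = steps[0] * steps[1] * steps[2]
    volume = 0.0
    h_integral = 0.0
    for i in range(samples_per_axis):
        x = lower[0] + (i + 0.5) * steps[0]
        for j in range(samples_per_axis):
            y = lower[1] + (j + 0.5) * steps[1]
            for k in range(samples_per_axis):
                z = lower[2] + (k + 0.5) * steps[2]
                if chi(x, y, z):
                    volume += cell
                    h_integral += h(x, y, z) * cell
    h_avg = h_integral / volume
    tet_volume = h_avg ** 3 / (6.0 * math.sqrt(2.0))  # regular tetrahedron with edge h_avg
    return volume, h_avg, volume / tet_volume


# Hypothetical test: unit cube inside the box [0, 2]^3 with constant density h = 0.1
volume, h_avg, n_tets = estimate_mesh_size(
    lambda x, y, z: x <= 1.0 and y <= 1.0 and z <= 1.0,
    lambda x, y, z: 0.1,
    (0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
```

In an actual generator the sampling would reuse the oct-tree from the boundary identification step rather than a uniform grid.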
Next we place N geometrical points with pseudo-random positions into
the domain $\Omega$. By pseudo-random we mean that the positions are generated
randomly, but we do not allow any two points to lie too close to each other.
When all of the N points are generated, we start optimizing their positions.
For this we adopt the approach [4] that minimizes a suitable global potential $\Phi$
defined on the set of all grid points. One way to choose $\Phi$ is to look for analogy
with a system of electrically charged particles. From a potential one can derive
forces that act on the particles, and one can apply a time stepping procedure
in order to let the system converge to a state with minimum energy.
With a suitable choice of the repulsing force $F_{ij}$ of particles $P_i$, $P_j$, steady
state of the system is usually achieved after a few iterations. An example of
the repulsing force F that respects the variable mesh density h is

$$F_{ij} = F(P_i, P_j) = h\left(\frac{P_i + P_j}{2}\right)\frac{P_i - P_j}{|P_i - P_j|^3}.$$

The total force working on each particle Pi is computed from contributions
of particles that lie in a limited distance from Pi. In this way we use the
oct-tree structure available from Step 1 to avoid quadratic complexity of the
algorithm. One further uses the magnitude of forces working at each particle
together with the mesh density function in order to calculate a suitable time
step. The time step is defined in such a way that any particle Pi displaces at
most h(Pi)/D, D > o. The default value of the parameter D in the code is
D = 5. If D is chosen too large, the system will evolve very slowly and one
will have to perform a large number of iterations before the steady state is
reached. On the other hand, a too small value of D would cause instabilities
and prevent the system from converging at all.
In addition to that one has to withdraw the kinetic part of the total energy
from the system after each time step by resetting the velocities of all particles to
zero. This action is analogous to cooling of a physical system that is necessary
in order to reach the bottom of the potential well.
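
The force law and the step-size rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the names `repulsive_force` and `relaxation_step` are hypothetical, neighbours are found by a brute-force search within an assumed `cutoff` radius instead of the oct-tree, and the displacement is modelled as `dt` times the force because velocities are reset to zero after every step.

```python
import math

def repulsive_force(p_i, p_j, h):
    """Force on P_i due to P_j: h((P_i + P_j)/2) * (P_i - P_j) / |P_i - P_j|^3."""
    midpoint = tuple((a + b) / 2.0 for a, b in zip(p_i, p_j))
    diff = tuple(a - b for a, b in zip(p_i, p_j))
    dist = math.sqrt(sum(d * d for d in diff))  # assumes distinct points
    scale = h(midpoint) / dist ** 3
    return tuple(scale * d for d in diff)

def relaxation_step(points, h, cutoff, D=5.0):
    """Move all particles once; no velocity is carried over (the 'cooling')."""
    forces = []
    for i, p_i in enumerate(points):
        total = (0.0, 0.0, 0.0)
        for j, p_j in enumerate(points):
            # Brute-force neighbour search (assumption); the paper uses the oct-tree.
            if i != j and math.dist(p_i, p_j) <= cutoff:
                f = repulsive_force(p_i, p_j, h)
                total = tuple(t + c for t, c in zip(total, f))
        forces.append(total)
    # Largest pseudo-time step for which every displacement is at most h(P_i)/D.
    dt = math.inf
    for p, f in zip(points, forces):
        norm = math.sqrt(sum(c * c for c in f))
        if norm > 0.0:
            dt = min(dt, h(p) / (D * norm))
    if math.isinf(dt):
        return list(points)
    return [tuple(a + dt * c for a, c in zip(p, f)) for p, f in zip(points, forces)]
```

With two particles one unit apart and a constant density h = 1, a single call moves each particle by h/D = 0.2 away from the other, which is the largest displacement the rule allows for D = 5.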
When moving a particle, we have to check whether it crosses the boundary
∂Ω on its way. If so, we compute the intersection of the boundary with its
trajectory and place the particle at the point of intersection. From this
moment on the particle is a boundary particle and is not allowed to leave
the boundary anymore; it is moved along the boundary only. We still compute
the force acting on it, but consider only its projection onto the plane
tangential to ∂Ω. The outline of the algorithm is as follows:

Algorithm for the distribution of grid points:


1. Compute the total number of grid points N.
2. Place the points into Ω in a pseudo-random way.
3. Until steady state is reached, repeat:
   3.1. For each particle Pi do:
      3.1.1. Calculate the force Fi acting on Pi.
      3.1.2. Does Pi lie on the boundary ∂Ω?
         YES: Compute the projection of Fi to the boundary.
              Move the particle according to the projection.
         NO: Does the last trajectory of Pi intersect with ∂Ω?
            YES: Place Pi to the point of intersection.
            NO: Move the point Pi according to Fi.
4. STOP.
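
The projection used for boundary particles in step 3.1.2 amounts to removing the normal component of the force. A minimal sketch, assuming that a (not necessarily unit) normal vector of the boundary at the particle position is available from the geometry description; the function name is hypothetical:

```python
import math

def project_to_tangent_plane(force, normal):
    """Return force minus its component along the boundary normal."""
    length = math.sqrt(sum(c * c for c in normal))
    unit = [c / length for c in normal]
    normal_part = sum(f * c for f, c in zip(force, unit))
    return tuple(f - normal_part * c for f, c in zip(force, unit))
```

For a boundary whose normal points in the z-direction, the force (1, 2, 3) is reduced to (1, 2, 0), so the particle can only slide within the tangent plane.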
On a Novel Technique for Parallel Unstructured Mesh Generation in 3D 425

Let us remark that the mesh density function h(x) should not vary too
fast; otherwise tetrahedra with obtuse angles would arise in the
corresponding areas of the domain, which would lower the quality of the
resulting mesh. It is difficult to state any quantitative conditions, but we can
roughly say that the function h(x) should be Lipschitz-continuous with a
Lipschitz constant not greater than 1/3. However, this value depends on the
domain geometry. Generally speaking, the function h(x) should be chosen as
smooth as possible.
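
One simple way to check a candidate density function against the rough bound 1/3 is to sample it and take the largest difference quotient. The sketch below is an assumption-laden illustration (the function name and the sampling strategy are not from the paper), and it yields only a lower bound on the true Lipschitz constant:

```python
import itertools
import math

def lipschitz_estimate(h, samples):
    """Largest |h(x) - h(y)| / |x - y| over all pairs of sample points."""
    best = 0.0
    for x, y in itertools.combinations(samples, 2):
        dist = math.dist(x, y)
        if dist > 0.0:
            best = max(best, abs(h(x) - h(y)) / dist)
    return best

# Hypothetical density that grows linearly in the first coordinate.
samples = [(0.5 * i, 0.5 * j, 0.0) for i in range(3) for j in range(3)]
estimate = lipschitz_estimate(lambda x: 0.1 + 0.25 * x[0], samples)
```

For this linear density the estimate is 0.25, which stays below 1/3.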

4 Surface meshing
The set of boundary grid points is used to generate a surface mesh - this is
a standard task that virtually all mesh generators have in common. We apply
an Advancing Front (AF) algorithm [4] for this purpose. See, e.g., [2] for a
more general description of AF algorithms.
Because of some temporary technical difficulties that we hope to overcome
soon, we request the user to partition the surface of the domain into M
subregions ∂Ω_1, ∂Ω_2, ..., ∂Ω_M of simpler shapes. The function Numbering(x,y,z)
then returns the index k of the subregion ∂Ω_k that the point [x, y, z] belongs to.
Ordered chains of abscissas (i.e., line segments) that separate a subregion ∂Ω_k
from the rest of ∂Ω are called frontiers.
We generate the surface triangulation for each subregion ∂Ω_k separately
using an AF technique. The k-th frontier is the starting point. We copy all its
abscissas to the list of abscissas A. Then, for the first abscissa (P, Q) in A, we
search for a boundary grid point Z = [x, y, z] such that Numbering(x,y,z) = k and
the angle (P, Z, Q) between the abscissas (P, Z) and (Q, Z) is maximal. In other
words, we look for the maximizer Z of the angle (P, Z, Q) in the set of all
boundary grid points with Numbering = k. The fact that the angle (P, Z, Q) is
maximal ensures that there is no boundary grid point lying inside the triangle
(P, Q, Z). If the mesh density function h(x) is Lipschitz-continuous with a
reasonably small Lipschitz constant and we have reached a steady
state in the process of iterative grid point distribution, we can always find
a maximizer Z such that the triangle (P, Q, Z) obeys the minimal-angle rule,
and this results in a high quality mesh.
When the point Z is found, we add the newly created triangle into the list
of surface triangles Ts and remove the abscissa (P, Q) from the list A. Then
we check if the abscissas (P, Z) and (Q, Z) are already contained in A; if so,
we remove them from the list. In the opposite case we add them to A. We
repeat the search for a maximizer for the next abscissa in the list until the list
becomes empty, which means that the subregion ∂Ω_k is covered with a surface
triangulation. We can sketch the algorithm as follows:

Surface triangulation algorithm:


1. For each subregion ∂Ω_k ⊂ ∂Ω:
   1.1. Copy its frontier abscissas into a list A.
   1.2. Repeat until A is empty:
      1.2.1. Find a boundary grid point Z ∈ ∂Ω_k that maximizes
             the angle (P, Z, Q) for the first abscissa (P, Q) ∈ A.
      1.2.2. Add the triangle (P, Q, Z) to the surface mesh Ts.
      1.2.3. Remove the abscissa (P, Q) from A.
      1.2.4. Is abscissa (P, Z) already contained in A?
             YES: Remove it from A.
             NO: Add it to A.
      1.2.5. Is abscissa (Q, Z) already contained in A?
             YES: Remove it from A.
             NO: Add it to A.
2. STOP.
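
The front update in steps 1.2.1 to 1.2.5 can be tried out on a small planar example. The following Python sketch is an illustration under simplifying assumptions, not the authors' implementation: the points lie in a plane, every point other than P and Q is a candidate for Z (no subregion test via Numbering), and a hypothetical `max_steps` guard is added because termination is not guaranteed for arbitrary point sets.

```python
import math

def angle_at(z, p, q):
    """Angle (P, Z, Q) at the vertex Z, in radians."""
    v = [a - b for a, b in zip(p, z)]
    w = [a - b for a, b in zip(q, z)]
    dot = sum(a * b for a, b in zip(v, w))
    norm = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(a * a for a in w))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def advancing_front(points, frontier, max_steps=1000):
    """points: coordinate tuples; frontier: list of (i, j) index pairs."""
    front = [tuple(sorted(edge)) for edge in frontier]   # the list A
    triangles = []
    steps = 0
    while front and steps < max_steps:
        steps += 1
        p, q = front.pop(0)                  # first abscissa, removed (1.2.3)
        candidates = [k for k in range(len(points)) if k not in (p, q)]
        z = max(candidates,                  # maximizer of the angle (1.2.1)
                key=lambda k: angle_at(points[k], points[p], points[q]))
        triangles.append((p, q, z))          # step 1.2.2
        for edge in (tuple(sorted((p, z))), tuple(sorted((q, z)))):
            if edge in front:                # steps 1.2.4 and 1.2.5
                front.remove(edge)
            else:
                front.append(edge)
    return triangles

# Unit square with one interior point; the frontier is the square boundary.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
mesh = advancing_front(square, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

For the square, the centre point sees every boundary edge under a right angle while the corners see it under 45 degrees, so the centre is chosen four times and the front empties after four triangles.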

The result of this algorithm is the surface triangulation list Ts. Examples of
two different surface meshes on the domain Ω from Fig. 1 are shown in Fig. 3.
The quality of the mesh strongly depends on the distribution
of grid points. In general, it is extremely difficult to prove stability
of the meshing algorithm or to make rigorous statements about the quality
of the mesh. According to our experience, if the distribution of grid points is
reasonable, the meshing algorithm is stable and produces meshes of very good
quality, consisting predominantly of almost equilateral triangles.

5 Volume meshing
We use a generalization of a two-dimensional algorithm [4] for the construction
of the volume mesh Tv. The input parameters are the list Ts of surface triangles
and the set of inner grid points. The idea of the volume meshing algorithm is
similar to that of the surface meshing algorithm and is again based on the AF
technique. We start with the first triangle (P, Q, R) in the list Ts. For this
triangle we find a grid point Z such that the sum of the angles
(P, Z, Q) + (Q, Z, R) + (R, Z, P) is maximal. According to our experience, this
criterion works best for the meshing algorithm. Next we add the newly created tetrahedron
(P, Q, R, Z) into the list Tv of tetrahedra and delete the triangle (P, Q, R) from
the list Ts. We check if the triangles (P, Q, Z), (P, R, Z) and (Q, R, Z) are con-
tained in Ts. If so, we remove them from the list. In the opposite case we add
them to the list. Then we proceed to the next triangle in Ts and so on, until
Ts becomes empty. The algorithm is written as follows:

Volume meshing algorithm:


1. Repeat until Ts is empty:
   1.1. Find a maximizer Z for the first triangle (P, Q, R) ∈ Ts.
   1.2. Add the tetrahedron (P, Q, R, Z) into the list Tv.
   1.3. Remove the triangle (P, Q, R) from Ts.
   1.4. Is triangle (P, Q, Z) contained in Ts?
        YES: Remove it from Ts.
        NO: Add it to Ts.
   1.5. Is triangle (P, R, Z) contained in Ts?
        YES: Remove it from Ts.
        NO: Add it to Ts.
   1.6. Is triangle (Q, R, Z) contained in Ts?
        YES: Remove it from Ts.
        NO: Add it to Ts.
2. STOP.

Fig. 3. Meshes on the example geometry from Fig. 1. The first one is gradually
refined towards the thinner end of the spiral and has 1460 surface elements. The
other consists of 1872 uniformly-sized elements. In both cases geometry analysis
parameters TOL_Ω = 0.15 and TOL_∂Ω = 0.03 were used.

According to our experience, the algorithm is stable and it produces tetrahedral
meshes of very good quality.
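
The volume meshing loop follows the same pattern as the surface algorithm, with triangles in place of abscissas and the angle-sum criterion in place of the single angle. The sketch below is a hedged illustration only: the function names are hypothetical, every point not belonging to the current triangle is a candidate, degenerate choices (a point lying in the plane of the triangle) are not excluded, and a `max_steps` guard is an added assumption.

```python
import math

def angle_at(z, p, q):
    """Angle (P, Z, Q) at the vertex Z, in radians."""
    v = [a - b for a, b in zip(p, z)]
    w = [a - b for a, b in zip(q, z)]
    dot = sum(a * b for a, b in zip(v, w))
    norm = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(a * a for a in w))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def angle_sum(z, p, q, r):
    """Criterion from the text: (P, Z, Q) + (Q, Z, R) + (R, Z, P)."""
    return angle_at(z, p, q) + angle_at(z, q, r) + angle_at(z, r, p)

def volume_front(points, surface, max_steps=1000):
    """points: coordinate tuples; surface: list of (i, j, k) index triples (Ts)."""
    front = [tuple(sorted(tri)) for tri in surface]
    tetrahedra = []
    steps = 0
    while front and steps < max_steps:
        steps += 1
        p, q, r = front.pop(0)               # steps 1.1 and 1.3
        candidates = [k for k in range(len(points)) if k not in (p, q, r)]
        z = max(candidates, key=lambda k: angle_sum(
            points[k], points[p], points[q], points[r]))
        tetrahedra.append((p, q, r, z))      # step 1.2
        for face in ((p, q, z), (p, r, z), (q, r, z)):
            face = tuple(sorted(face))       # steps 1.4 to 1.6
            if face in front:
                front.remove(face)
            else:
                front.append(face)
    return tetrahedra

# Regular tetrahedron with its centroid as the only inner grid point.
pts = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0),
       (-1.0, -1.0, 1.0), (0.0, 0.0, 0.0)]
tets = volume_front(pts, [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)])
```

In this example the centroid has an angle sum of about 328 degrees for every face, against 180 degrees for the opposite vertex, so the loop produces four tetrahedra that all share the centroid and then stops with an empty front.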

6 Parallelisation
All algorithms used for the adaptive oct-tree analysis of the boundary ∂Ω,
distribution of grid points and generation of the surface and volume meshes
are extremely well suited for running in parallel. Since the parallelization of
the code is still in progress, let us only give a brief description of the
techniques we use.

6.1 Parallel adaptive identification of the boundary ∂Ω

The initial cube C, Ω ⊂ C, is split into eight subcubes in the initial step. The
algorithm is then run on each of the subcubes recursively, fully independently
of the other instances. Since the resulting subdomains of Ω are disjoint, one
can run eight instances of the boundary analysis algorithm without any risk of
potential collisions. If there are more than eight CPUs, it is possible
to start the parallel run on further levels of recursion.
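
The splitting of the initial cube into eight independent tasks can be sketched as follows. This is an illustration only: the paper does not say which parallel mechanism is used, so the thread pool from the Python standard library stands in for it, and `analyse_subcube` is a hypothetical placeholder that merely returns the volume of its subcube instead of running the boundary analysis.

```python
from concurrent.futures import ThreadPoolExecutor

def split_cube(lo, hi):
    """Split the axis-aligned cube [lo, hi] into its eight octants."""
    mid = tuple((a + b) / 2.0 for a, b in zip(lo, hi))
    cubes = []
    for ix in (0, 1):
        for iy in (0, 1):
            for iz in (0, 1):
                sel = (ix, iy, iz)
                sub_lo = tuple(mid[d] if sel[d] else lo[d] for d in range(3))
                sub_hi = tuple(hi[d] if sel[d] else mid[d] for d in range(3))
                cubes.append((sub_lo, sub_hi))
    return cubes

def analyse_subcube(cube):
    """Placeholder for the recursive boundary analysis of one subcube."""
    lo, hi = cube
    return (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])

def run_parallel(lo, hi, workers=8):
    """Run one independent instance per subcube and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyse_subcube, split_cube(lo, hi)))
```

Because the subcubes are disjoint, the tasks share no data and need no locking. For compute-bound work in CPython a process pool or a message-passing library would be the more realistic choice; the thread pool is used here only to keep the sketch short.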

6.2 Parallel distribution of grid points

When distributing the grid points, the most time-consuming part of the algo-
rithm is the computation of the total force acting on the particles. With Ncpu
processors at our disposal, one divides Ω into Ncpu subdomains. No domain
decomposition method is needed since splitting the initial cube C does the
job. An independent instance of the algorithm is run for each of the resulting
subdomains.

6.3 Parallel generation of surface and volume meshes

The surface of the domain is again split into subregions of simpler shapes in
the same spirit as in Section 4. The construction of the surface triangulation
in every subregion is a task independent of events in the other subregions.
For the sake of technical simplicity, in our conception the number of CPUs is
limited by the number of physical areas. This simplification will be eliminated
later.
A similar idea as above is applied to the volume meshing - the initial cube
is split and independent instances of the volume meshing algorithm are run on
the subcubes. After the algorithm finishes independent runs in the subcubes,
one final sweep is done to generate tetrahedra that lie across the boundaries
of the subcubes. Since these are not many, a single process is sufficient to
accomplish this task.

References

1. Cheung, Y.K., Lo, S.H., Leung, A.Y.T. (1996): Finite Element Implementation,
Blackwell Science, Oxford
2. George, P.L. (1991): Automatic Mesh Generation: Application to Finite Element
Methods, Wiley, New York
3. Schneiders, R.: http://www-users.informatik.rwth-aachen.de/~roberts/meshgeneration.html
4. Solin, P., (2000): On a Mesh Generation Technique Based on a Special Smoothing
Procedure for Uniform Inner Point Distribution. Acta Technica ASCR, 45, 397-
417.
Adaptive Finite Element Methods for
Turbulent Flow

Johan Hoffman¹ and Claes Johnson²

¹ Courant Institute of Mathematical Sciences, New York University, 251 Mercer
Street, New York, NY 10012-1185, USA
² Department of Computational Mathematics, Chalmers University, SE 412 96,
Göteborg, Sweden

Summary. We present recent results using adaptive finite element methods, based
on a posteriori error estimates, to compute various output functionals for incom-
pressible flow problems in 3d, for both laminar and turbulent flows. The a posteriori
error estimates are based on the solution of an associated dual problem with data
connected to the output functional we want to compute.

1 Introduction

We present recent results from [15, 14], extending earlier results in [17, 13],
where we use adaptive finite element methods, based on a posteriori error esti-
mates, to compute various output functionals in incompressible flow problems
in 3d, for both laminar and turbulent flows. The a posteriori error estimates are
based on the solution of an associated linearized dual problem that contains
information about error propagation in space-time.
The idea of using duality arguments in a posteriori error estimation goes
back to Babuška and Miller [2] in the context of postprocessing 'quantities of
physical interest' in elliptic model problems. A framework for more general
situations has since been systematically developed, in particular by Eriksson
& Johnson and Becker & Rannacher with coworkers, see e.g. [6, 4, 19, 20].
Applications to incompressible flow have been increasingly advanced with com-
putation of functionals such as the drag coefficient for 2d stationary benchmark
problems in [3,9], and drag and lift coefficients and pressure differences for 3d
stationary benchmark problems in [15]. In [17] time dependent problems in 3d
are considered, and the extension to Large Eddy Simulation (LES) of turbulent
flow is investigated in [13]. In [14] a temporal mean of the drag coefficient of
a surface mounted cube in a turbulent channel flow is computed using a LES.
If we use a subgrid model in a LES, the subgrid modeling error is included
in the a posteriori error estimates, which opens the possibility of comparing
the error using different subgrid models. Altogether, the a posteriori error esti-
mates open the possibility of adaptively choosing both an optimal mesh and an
optimal subgrid model. This approach to a posteriori error estimation with
respect to the averaged solution, using duality techniques, in terms of a modeling
error and a discretization error was developed for convection-diffusion-reaction
equations in [10, 18, 16, 11, 12]. Related approaches with a posteriori error
estimates in terms of a modeling and a discretization contribution to the total
error have been suggested. For example, more recently in [5] similar ideas are
presented with applications to 2d convection-diffusion-reaction problems.
Due to the local nature of turbulence in many applications, in particular
for the problem in [14], we stress the possibilities of adaptive mesh refinement
for such problems. In [14] we are able to locally resolve scales of motion
corresponding to a Reynolds number of about 1000, and in theory we would be
able to locally resolve scales corresponding to a Reynolds number of about
$10^5$, using an ordinary PC or laptop computer.

2 Turbulent flow and LES


The incompressible Navier-Stokes equations for a spatial domain $\Omega \subset \mathbb{R}^3$ take
the form:

$$\dot{u} + (u \cdot \nabla) u - \nu \Delta u + \nabla p = f, \quad \nabla \cdot u = 0, \quad \text{in } \Omega \times I, \tag{1}$$

where $u(x, t) = (u_i(x, t))$ is the velocity vector and $p(x, t)$ the pressure of the
fluid at $(x, t)$, $f$ is a given driving force, $\nu$ is the kinematic viscosity, and
$I = (0, T)$ is a time interval. We assume that (1) is normalized so that the
reference velocity and typical length scale are both equal to one. The Reynolds
number Re is then equal to $\nu^{-1}$.
For low Re we may have time independent solutions that satisfy the
stationary Navier-Stokes equations, where we simply drop the time derivative in
(1). For higher Re we have time dependent solutions, and for sufficiently high
Re we get turbulent solutions.
In a turbulent flow we are typically not able to resolve all scales of motion
computationally. We may instead aim at computing a running average $u^h$ of
$u$ on a scale $h$, defined by

$$u^h(x, t) = \frac{1}{h^3} \int_{Q_h} u(x + y, t) \, dy, \tag{2}$$

where $h = h(x, t)$ is a parameter related to the local resolution of the problem
and $Q_h = \{ y \in \mathbb{R}^3 : |y_i| \le h/2 \}$. In the LES literature it is common to define
the averaging operator through convolution with a certain filter function, and
a multitude of filter functions are in use. Though we only consider
the filter corresponding to (2) in this paper, the techniques for
a posteriori error estimation are general and apply to other filters, possibly
with modifications for commutation errors associated with such filters.
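
To make the averaging operator (2) concrete, the following sketch approximates the cube average at a fixed time by a midpoint rule. It is an illustration under assumptions not taken from the paper: the function name, the number `n` of quadrature points per direction, and the restriction to a scalar field at a single time level are all hypothetical choices.

```python
def running_average(u, x, h, n=4):
    """Midpoint-rule approximation of (1/h^3) * integral over Q_h of u(x + y) dy."""
    # Midpoints of n equal subintervals of [-h/2, h/2].
    offsets = [(-0.5 + (k + 0.5) / n) * h for k in range(n)]
    total = 0.0
    for a in offsets:
        for b in offsets:
            for c in offsets:
                total += u((x[0] + a, x[1] + b, x[2] + c))
    # Each subcube has volume (h/n)^3, so the factor 1/h^3 leaves 1/n^3.
    return total / n ** 3
```

For a linear field the average over the symmetric cube reproduces the point value, whereas for a convex field such as the square of the first coordinate it is strictly larger. This difference between the average of a product and the product of averages is what gives rise to the Reynolds stresses when (1) is averaged.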
By an extension of $(u, p, f)$ to $\mathbb{R}^3$ by reflection for all $x \notin \Omega$, the averaging
operator (2) commutes with space and time differentiation. If we take the
running average of the equations (1), corresponding to a LES, we obtain the
following equations for $u^h$:

$$\dot{u}^h + (u^h \cdot \nabla) u^h - \nu \Delta u^h + \nabla p^h + \nabla \cdot \tau^h(u) = f^h, \quad \nabla \cdot u^h = 0, \quad \text{in } \Omega \times I, \tag{3}$$

where $\tau^h_{ij}(u) = (u_i u_j)^h - u_i^h u_j^h$ is the Reynolds stress tensor. The closure
problem of LES is how to model $\tau^h(u)$ in terms of $u^h$ in a subgrid model
$\hat{\tau}^h(u^h)$. In this paper we focus on the computation of chosen output
functionals for the problem (3) using adaptive finite element methods. We refer
to [8, 24] and the references therein for work on subgrid modeling for LES, and to
[14] for details on the mathematical formulation of the problems (1) and (3).
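
A one-dimensional, scalar analogue of the Reynolds stress $(u_i u_j)^h - u_i^h u_j^h$ can be computed with a discrete box filter. The sketch below is only an illustration: the periodic grid, the odd filter width and the function names are assumptions made here, not part of the paper.

```python
import math

def box_filter(values, width):
    """Periodic running mean over `width` consecutive samples (width odd)."""
    n = len(values)
    half = width // 2
    return [sum(values[(i + k) % n] for k in range(-half, half + 1)) / width
            for i in range(n)]

def reynolds_stress_1d(u, width):
    """Discrete analogue of (u u)^h - u^h u^h for a scalar periodic signal."""
    uu_bar = box_filter([v * v for v in u], width)
    u_bar = box_filter(u, width)
    return [a - b * b for a, b in zip(uu_bar, u_bar)]

# One period of a sine wave sampled at 64 points, filtered over 5 samples.
n = 64
signal = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
tau = reynolds_stress_1d(signal, 5)
```

With a box filter the result is the local variance of the signal inside the filter window, so it is non-negative and vanishes where the signal is constant on the scale of the filter.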

3 Adaptive finite element methods

An adaptive algorithm includes feedback from computation to achieve the
computational goal with minimal computational cost. In an adaptive finite
element method this feedback from computation relies on a posteriori error
estimates.
In [15, 14] we compute approximations $g(U, P)$ of functionals $g(u, p)$, where
$(U, P)$ is a numerical approximation of $(u, p)$, and we prove a posteriori error
estimates of the form

$$|g(u, p) - g(U, P)| \le \sum_{K \in \mathcal{T}_k} \mathcal{E}_K, \tag{4}$$

where

$$\mathcal{E}_K = \sum_i \int_K |R_i^k \cdot \omega_i^k| \, dx \tag{5}$$

is an error indicator for element $K$ in the mesh $\mathcal{T}_k$, with $R_i^k$ residuals, and
$\omega_i^k$ dual weights from the solution of an associated linearized dual problem, at
iteration $k$. An adaptive algorithm for computing approximations $g(U, P)$, to
a tolerance TOL, then takes the form:
Algorithm 1 (Adaptive mesh refinement) Start at $k = 0$, then do
(1) compute approximation to the primal problem on $\mathcal{T}_k$,
(2) compute approximation to the dual problem on $\mathcal{T}_k$,
(3) if $\sum_{K \in \mathcal{T}_k} \mathcal{E}_K < \mathrm{TOL}$ then STOP, since $|g(u, p) - g(U, P)| \le \mathrm{TOL}$, else
(4) refine a fixed fraction of the elements in $\mathcal{T}_k$ with largest $\mathcal{E}_K$ to obtain $\mathcal{T}_{k+1}$,
(5) set $k = k + 1$, then go to (1).
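
The control flow of Algorithm 1 can be illustrated on a toy problem. In the sketch below the "mesh" is a list of intervals, refinement is bisection, and the error indicator is a hypothetical closed-form stand-in; in the actual method the indicators come from the primal and dual finite element solutions in steps (1) and (2), which are omitted here. The function names, the fraction 0.3 and the iteration cap are assumptions.

```python
def mark_fixed_fraction(indicators, fraction):
    """Indices of the given fraction of elements with the largest indicators."""
    count = max(1, round(fraction * len(indicators)))
    order = sorted(range(len(indicators)),
                   key=lambda i: indicators[i], reverse=True)
    return set(order[:count])

def adaptive_refinement(cells, indicator, tol, fraction=0.3, max_iter=50):
    """cells: list of (a, b) intervals; indicator(a, b) plays the role of E_K."""
    for k in range(max_iter):
        # Steps (1)-(2): a real code would solve the primal and dual problems.
        errors = [indicator(a, b) for a, b in cells]
        if sum(errors) < tol:                             # step (3)
            return cells, sum(errors), k
        marked = mark_fixed_fraction(errors, fraction)    # step (4)
        refined = []
        for i, (a, b) in enumerate(cells):
            if i in marked:
                mid = 0.5 * (a + b)
                refined += [(a, mid), (mid, b)]
            else:
                refined.append((a, b))
        cells = refined                                   # step (5)
    raise RuntimeError("tolerance not reached within max_iter iterations")

# Toy indicator: squared cell size times a weight that grows towards x = 1.
def toy_indicator(a, b):
    return (b - a) ** 2 * (1.0 + 10.0 * 0.5 * (a + b))

start = [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]
final_cells, final_sum, iterations = adaptive_refinement(start, toy_indicator, tol=0.05)
```

Because the toy weight grows towards the right end of the interval, the loop concentrates the small cells there, which mirrors how the dual weights steer refinement towards the regions that matter for the output functional.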

4 Numerical examples

We now present two different applications of Algorithm 1 to incompressible
flow in 3d; first a stationary flow from [15], and then a turbulent flow from
[14]. For details of the computations and the a posteriori error estimates we
refer to [15, 14].

4.1 Stationary benchmark problems in 3d

In [25], computational results for a collection of benchmark problems for
laminar flow around a cylinder in 2d and 3d are presented, with contributions
from 17 research groups. We consider the case of 3d stationary flow
around a cylinder with square cross-section $D \times D$, with $D = 0.1$, centered
at $(0.5, 0.2, 0.205)$ and aligned in the $x_3$-direction, in a channel of dimensions
$2.5 \times H \times H$, with $H = 0.41$. We have no slip boundary conditions
on the cylinder and the channel walls. At the outflow boundary we use a
transparent outflow condition, see [23], and the inflow condition is given by
$u(0, x_2, x_3) = (16 U_m x_2 (H - x_2) x_3 (H - x_3) / H^4, 0, 0)$. The kinematic viscosity
is $\nu = 10^{-3}$ and $U_m = 0.45$, which gives a Reynolds number $Re = \bar{U} D / \nu = 20$,
with $\bar{U} = 4 U(0, H/2, H/2) / 9$.
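
The stated Reynolds number can be checked directly from the inflow profile. The short computation below uses only the values given in the text; the function name `inflow_velocity` is a hypothetical label for the first component of the inflow condition.

```python
def inflow_velocity(x2, x3, u_m, H):
    """First component of the inflow profile at the point (0, x2, x3)."""
    return 16.0 * u_m * x2 * (H - x2) * x3 * (H - x3) / H ** 4

H, D, nu, u_m = 0.41, 0.1, 1.0e-3, 0.45
u_center = inflow_velocity(H / 2, H / 2, u_m, H)  # peak value, equals U_m
u_bar = 4.0 * u_center / 9.0                      # mean velocity, 0.2
reynolds = u_bar * D / nu                         # 0.2 * 0.1 / 0.001 = 20
```

At the centre of the inflow plane the profile reduces to $U_m$, so $\bar{U} = 4 \cdot 0.45 / 9 = 0.2$ and $Re = 0.2 \cdot 0.1 / 10^{-3} = 20$, in agreement with the text.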
We consider the computation of the drag coefficient, and the computation of
a pressure difference upstream and downstream of the cylinder, using a cG(1)
method (continuous piecewise linear trial and test functions) on tetrahedral
meshes for both the primal and the dual problems.
To evaluate the performance of the duality based error indicator (5) as
a refinement criterion in Algorithm 1, we compare with a commonly used
alternative error indicator

(6)

where $\| \cdot \|_K$ is a norm on the element $K$, based only on the size of the residuals,
coupling to energy estimates (see e.g. [1]).

Computation of the drag coefficient. The computational goal is to
approximate the drag coefficient $c_D$, defined in [25] by

$$c_D = \frac{2 F_D(u, p)}{\bar{U}^2 D H}, \tag{7}$$

where $F_D(u, p)$ is the drag force on the cylinder. Based on the results in [25] we
choose $c_D = 7.6$ as our reference value.
In Figure 1 we compare the convergence rates of the two error indicators
(5) and (6) with respect to the reference value $c_D = 7.6$. The
refinement criterion (5), based on both the residual and the solution to the
dual problem, clearly does a better job than the refinement criterion (6), based
solely on the residuals without any information from the dual problem relating
the residual to the error in the drag coefficient $c_D$.

Fig. 1. Convergence rates for the computation of the drag coefficient $c_D$ (left), and
the pressure difference $\Delta p$ (right), for duality based refinement ('o') and residual
based refinement ('*'), as a log-log plot of number of unknowns versus relative errors.
We then evaluate the a posteriori error estimates as a stopping criterion
for the adaptive algorithm by introducing the notion of an effectivity index
$I_{eff}$, defined by

$$I_{eff} = \text{estimated error} / \text{true error}, \tag{8}$$

and in Table 1 we present $I_{eff}$ as a function of the number of unknowns. The
a posteriori error estimates in this case are quite sharp. After a few initial
refinements the error estimates are off by less than a factor 2, and may thus be
useful as a stopping criterion.

Table 1. Effectivity indices $I_{eff}$ = estimated error/true error for computing the
drag coefficient $c_D$ (left), and the pressure difference $\Delta p$ (right), as functions of the
number of unknowns.

   # dof    c_D: I_eff       # dof    Δp: I_eff
    5656       3.36           5656       0.85
    7456      13.54           8620       0.99
   11996       4.32          14044       0.62
   18336       2.53          21636       0.71
   33120       2.26          36872       0.88
   62252       1.41          67412       0.92
  116616       1.27          80392       1.11
  225588       0.92         222756       1.14
  436444       0.76         426612       1.20
  844956       0.66         797940       1.08
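
The statement that the estimates are off by less than a factor 2 after a few refinements can be read off from Table 1. The sketch below copies the tabulated effectivity indices and measures how far each one is from unity in either direction; the helper names are hypothetical.

```python
i_eff_drag = [3.36, 13.54, 4.32, 2.53, 2.26, 1.41, 1.27, 0.92, 0.76, 0.66]
i_eff_dp = [0.85, 0.99, 0.62, 0.71, 0.88, 0.92, 1.11, 1.14, 1.20, 1.08]

def off_factor(i_eff):
    """Factor by which the estimate over- or underestimates the true error."""
    return max(i_eff, 1.0 / i_eff)

def first_sharp_index(values, factor=2.0):
    """Smallest index from which all later estimates are within `factor`."""
    k = len(values)
    while k > 0 and off_factor(values[k - 1]) < factor:
        k -= 1
    return k

drag_start = first_sharp_index(i_eff_drag)
dp_worst = max(off_factor(v) for v in i_eff_dp)
```

For the drag coefficient the estimates stay within a factor 2 from the sixth mesh onward (index 5, counting from zero), while for the pressure difference every tabulated estimate is within a factor of about 1.6.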

Computation of a pressure difference. Next we consider the problem
of computing the pressure difference between two points upstream and
downstream of the cylinder, respectively, defined by $\Delta p = p(x^d) - p(x^u)$, with
$x^u = (0.45, 0.20, 0.205)$ and $x^d = (0.55, 0.20, 0.205)$. Based on the results in
[25] we use $\Delta p = 0.176$ as a reference value.
In Figure 1 we compare the error indicators (5) and (6), and we find that
the duality based approach again is the better one. In Table 1 we present
effectivity indices for the a posteriori error estimates, which we find to be quite
sharp with $I_{eff}$ close to unity.

4.2 Turbulent flow around a surface mounted cube

We now use Algorithm 1 to compute the temporal mean of the drag coefficient
$c_D$ over a time interval $I = [T_0, T]$, with $T_0 = 10$ and $T = 20$, defined by

$$\bar{c}_D = \frac{1}{|T - T_0|} \int_{T_0}^{T} c_D(t) \, dt, \tag{9}$$

where $c_D(t)$ is the drag coefficient at time $t$ for a surface mounted cube in
a turbulent channel flow, using a cG(1)cG(1) method (continuous piecewise
linears in space-time) for both the primal and the dual problem, on tetrahedral
meshes $\mathcal{T}_k$ that we choose to be constant in time for each iteration $k$. In the
definition of the LES for the adaptive step $k$ we let $h = h(x)$ be defined to be
the piecewise constant function that equals the diameters of the finite elements
in the computational mesh $\mathcal{T}_k$.
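
Given samples of the drag coefficient at discrete time levels, the temporal mean (9) can be approximated with the trapezoidal rule. This is a sketch under assumptions: the paper does not state how the time integral is evaluated, and the sample values below are made up purely to exercise the function.

```python
def temporal_mean(times, values):
    """Trapezoidal approximation of (1/|T - T0|) * integral of c_D(t) dt."""
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (values[k] + values[k - 1]) * (times[k] - times[k - 1])
    return total / abs(times[-1] - times[0])

# Hypothetical samples on [10, 20]: a linear drift around the value 1.4.
times = [10.0 + 0.5 * k for k in range(21)]
samples = [1.4 + 0.01 * (t - 15.0) for t in times]
mean_drag = temporal_mean(times, samples)
```

The trapezoidal rule is exact for linear data, so the symmetric drift cancels and the computed mean equals 1.4 up to rounding.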
In our computational model we use the Navier-Stokes equations to model
the incompressible fluid around a cubic body of dimension $H \times H \times H$ that sits
on the floor of a rectangular channel of length 15H, height 2H, and width 7H,
centered at (3.5H, 0.5H, 3.5H). At the inlet we use a velocity profile
interpolated from experiments. We use no slip boundary conditions on the body and
the vertical boundaries, slip boundary conditions on the lateral boundaries,
and a transparent outflow boundary condition. The viscosity $\nu$ is chosen to
give a Reynolds number $Re = U_b H / \nu = 40\,000$, where we have used $U_b = 1.0$.
We use no subgrid model in the computations, but we use the following
scale similarity subgrid model

(10)

from [22] to estimate the modeling residual, see [14], measuring the small scale
influence on the resolved scales.
In Figure 2 we plot the mean drag coefficient as a function of the number of
degrees of freedom. We find that even though we do not reach full convergence
using the available number of degrees of freedom, the value for the mean drag
coefficient seems to asymptotically approach a value between 1.45 and 1.5. We
know of no experimental reference values of $\bar{c}_D$, but in [21] $\bar{c}_D$ is approximated
computationally. The computational setup in [21] is similar to the one in [14]
except for the numerical method, the length of the time interval, and that we in
[14] use a channel of length 15H, compared to a channel of length 10H in [21].
Using different meshes and subgrid models, approximations of $\bar{c}_D$ in the interval
[1.14, 1.24] are presented in [21].

Fig. 2. Mean drag coefficient $\bar{c}_D$ over the time interval [10, 20] as a function of the
number of degrees of freedom (left), discretization error $e_D$ ('o') and modeling error
$e_M$ ('*') after 13 adaptive mesh refinements as functions of the length of the time
interval $[T_0, T]$, with $T$ fixed and $T_0$ varying, assuming $U_h(T_0) = u^h(T_0)$ (middle),
and a posteriori error estimates of the discretization error $e_D$ ('o') and the modeling
error $e_M$ ('*') for the time interval [10, 20], as functions of the number of degrees of
freedom in a $\log_{10}$-$\log_{10}$ plot (right).
The diameter of the smallest element in the mesh $\mathcal{T}_{14}$ is about $10^{-3}$ (with
$H = 0.1$), which corresponds to a local Reynolds number
$Re_{loc} \approx (2H/h)^{4/3} \approx 1200$ (with channel height $2H$), using standard
Kolmogorov arguments of turbulent flow [7], or $Re_{loc} \approx h^{-1} = 1000$, assuming
the numerical viscosity of the cG(1)cG(1) method is acting as a term
$h(\nabla U_h, \nabla U_h)$. That is, we are locally
able to resolve scales corresponding to a Reynolds number of about 1000, even
though this would be impossible globally at a similar computational cost. Since
turbulence often is a local phenomenon, adaptive methods are ideal for
computation of turbulence. In theory, if we refine the same elements in each step of
the algorithm we would get a finest $h \approx H \times (1/2)^{13} \approx 10^{-5}$, corresponding to
$Re_{loc} \approx 10^5$. That is, we would be able to locally resolve flows corresponding to
a Reynolds number of $10^5$ in a Direct Numerical Simulation using an ordinary
PC or laptop computer.
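
The order-of-magnitude estimates in this paragraph can be reproduced with a few lines of arithmetic, using only the values quoted in the text (the variable names are hypothetical):

```python
H = 0.1          # cube size
h = 1.0e-3       # diameter of the smallest element

re_kolmogorov = (2.0 * H / h) ** (4.0 / 3.0)  # (2H/h)^(4/3), about 1.2e3
re_numerical = 1.0 / h                        # h^(-1) = 1000

h_finest = H * 0.5 ** 13                      # 13 successive bisections
re_finest = 1.0 / h_finest                    # about 8e4, of the order 1e5
```

Here $(2H/h)^{4/3} = 200^{4/3} \approx 1170$, which the text rounds to 1200, and $H \times (1/2)^{13} = 0.1 / 8192 \approx 1.2 \times 10^{-5}$, which gives $h^{-1} \approx 8 \times 10^4$, i.e. of the order $10^5$.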
After 13 adaptive mesh refinements we plot the a posteriori error estimates
of the discretization and the modeling errors in Figure 2 as functions of the
length of the time interval $[T_0, T]$ ($T$ fixed, $T_0$ varying), where we have assumed
that the initial solution is exact for each $T_0$, so that $U_h(T_0) = u^h(T_0)$. We find
that the error at first increases with the length of the time interval, but when
the interval exceeds a certain length the error does not increase significantly
beyond a certain level, and thus the computational cost of computing $\bar{c}_D$ is
relatively constant for time intervals longer than a certain length.
In Figure 2 we also plot the discretization and the modeling errors for $\bar{c}_D$
over the time interval [10, 20] as functions of the number of degrees of freedom,
where we note an expected decrease in the estimates of the discretization error
as we refine the mesh. We also note that the estimates of the modeling error, on
the other hand, increase. This might at first seem alarming, but is in fact to be
expected since in this case we have used the simple model (10) to estimate the
Reynolds stresses in the modeling residual. Even though the true Reynolds
stresses are smaller for a finer resolution $h$ of the problem, the model (10)
will in fact first increase as we resolve more scales of motion since it is solely
based on the resolved velocity fluctuations on the scale $2h$. This is of course a
problem, and in a continuation of this study we seek sharper estimates of the
Reynolds stresses based on scale extrapolation.

Remark 1. The use of a stabilized Galerkin finite element method in the com-
putations may be viewed as a type of subgrid model in itself, since we then
in fact solve a modified set of equations using a standard Galerkin method.
We will further investigate this relation between numerical stabilization and
subgrid modeling in a continuation of this work. In this paper we only
consider the stabilization to be part of the numerical method and not an explicit
subgrid model.

5 Summary

In this paper we have presented results from [15, 14], extending earlier results
in [17, 13], where we use adaptive finite element methods based on a posteriori
error estimates to compute approximations of output functionals in incom-
pressible fluids, for both laminar and turbulent flow. The a posteriori error
estimates are based on the solution of an associated linearized dual problem,
and are used as error indicators for the adaptive mesh refinement algorithm.
In the problem of computing the mean drag coefficient in a turbulent chan-
nel flow, we emphasize the local nature of turbulence that makes adaptive
methods ideal for efficient and accurate computations. Due to the computa-
tional goal of approximating the mean drag coefficient we refine the mesh
according to the corresponding a posteriori error estimate, resolving scales of
motion corresponding to local Reynolds numbers of about 1000, and in
theory we would be able to resolve local scales of motion corresponding to local
Reynolds numbers of the order $10^5$ at a similar computational cost.
In continuations of this study we will address methods for sharp estimation
of the modeling residual based on scale extrapolation, as well as adaptive
strategies to combine numerical stabilization with subgrid modeling for LES.

Acknowledgments

The first author would like to acknowledge the support by DOE grant
DE-FG02-88ER25053.

References
1. M. AINSWORTH AND J. T. ODEN, A posteriori error estimation in finite element
analysis, Computat. Meth. Appl. Mech. Eng., 142 (1997), pp. 1-88.
2. I. BABUŠKA AND A. D. MILLER, The post-processing approach in the finite
element method, III: A posteriori error estimation and adaptive mesh selection, Int.
J. Numer. Meth. Eng., 20 (1984), pp. 2311-2324.
3. R. BECKER AND R. RANNACHER, A feed-back approach to error control in adap-
tive finite element methods: Basic analysis and examples, East-West J. Numer.
Math., 4 (1996), pp. 237-264.
4. - - , A posteriori error estimation in finite element methods, Acta Numerica,
10 (2001), pp. 1-103.
5. M. BRAACK AND A. ERN, A posteriori control of modeling errors and discretiza-
tion errors, SIAM J. Multiscale Modeling and Simulation, (2003).
6. K. ERIKSSON, D. ESTEP, P. HANSBO, AND C. JOHNSON, Introduction to adaptive
method for differential equations, Acta Numerica, 4 (1995), pp. 105-158.
7. C. FOIAS, O. MANLEY, R. ROSA, AND R. TEMAM, Navier-Stokes Equations and
Turbulence, Cambridge University Press, 2001.
8. T. GATSKI, M. Y. HUSSAINI, AND J. LUMLEY, Simulation and Modeling of
Turbulent Flow, Oxford University Press, 1996.
9. M. GILES, M. LARSON, M. LEVENSTAM, AND E. SULI, Adaptive error control
for finite element approximations of the lift and drag coefficients in viscous flow,
Technical Report NA-76/06, Oxford University Computing Laboratory, (1997).
10. J. HOFFMAN, On dynamic computational subgrid modeling, Numerical Analysis
Group Preprint, Oxford University, (1999).
11. - - , Dynamic Subgrid Modeling for Scalar Convection-Diffusion-Reaction
Equations with Fractal Coefficients, Multiscale and Multiresolution Methods:
Theory and Application (Ed. T. J. Barth, T. Chan and R. Haimes), Lecture
Notes in Computational Science and Engineering, Springer-Verlag Publishing,
Heidelberg, 2001.
12. - - , Dynamic subgrid modeling for time dependent convection-diffusion-
reaction equations with fractal solutions, International Journal for Numerical
Methods in Fluids, Vol 40, (2001), pp. 583-592.
13. - - , Adaptive finite element methods for les: Duality based a posteriori error
estimation in various norms and linear functionals, submitted to SIAM Journal
of Scientific Computing, (2002).
14. - - , Adaptive finite element methods for les: Computation of the drag coef-
ficient in a turbulent flow around a surface mounted cube, submitted to SIAM
Journal of Scientific Computing, (2003).
15. - - , Computation of functionals in 3d incompressible flow for stationary bench-
mark problems using adaptive finite element methods, submitted to Mathematical
Models and Methods in Applied Sciences (M3AS), (2003).
16. - - , Subgrid modeling for convection-diffusion-reaction in 2 space dimensions
using a haar multiresolution analysis, to appear in Mathematical Models and
Methods in Applied Sciences, (2003).
17. J. HOFFMAN AND C. JOHNSON, Adaptive finite element methods for incompress-
ible fluid flow, Error Estimation and Solution Adaptive Discretization in Com-
putational Fluid Dynamics (Ed. T. J. Barth and H. Deconinck), Lecture Notes
in Computational Science and Engineering, Springer-Verlag Publishing, Heidel-
berg, 2002.

18. J. HOFFMAN, C. JOHNSON, AND S. BERTOLUZZA, Subgrid modeling for
convection-diffusion-reaction in 1 space dimension using a Haar multiresolution
analysis, to appear in Computer Methods in Applied Mechanics and Engineering,
(2003).
19. C. JOHNSON AND R. RANNACHER, On error control in CFD, Int. Workshop
Numerical Methods for the Navier-Stokes Equations (F.K. Hebeker et al. eds.),
Vol. 47 of Notes Numer. Fluid Mech., Vieweg, Braunschweig, 1994, pp. 133-144.
20. C. JOHNSON, R. RANNACHER, AND M. BOMAN, Numerics and hydrodynamic
stability: Toward error control in cfd, SIAM J. Numer. Anal., 32 (1995), pp. 1058-
1079.
21. S. KRAJNOVIĆ AND L. DAVIDSON, Large-eddy simulation of the flow around a
bluff body, AIAA Journal, 40 (2002), pp. 927-936.
22. S. LIU, C. MENEVEAU, AND J. KATZ, On the properties of similarity subgrid-
scale models as deduced from measurements in turbulent jet, J. Fluid Mech., 275
(1994), pp. 83-119.
23. R. RANNACHER, Finite element methods for the incompressible navier stokes
equations, Preprint Institute of Applied Mathematics, Univ. of Heidelberg,
(1999).
24. P. SAGAUT, Large Eddy Simulation for Incompressible Flows, Springer-Verlag,
Berlin, Heidelberg, New York, 2001.
25. M. SCHAFER AND S. TUREK, Benchmark computations of laminar flow around
a cylinder, Flow Simulation with High-Performance Computers II: Notes on Nu-
merical Fluid Mechanics, 52 (1996), pp. 547-566.
Numerical Solution of a Nonlinear Evolution
Equation Describing Amorphous Surface
Growth of Thin Films

Ronald H.W. Hoppe 1,2 and Eva Nash 3

1 Department of Mathematics, University of Houston, Houston, TX 77204-3008, U.S.A.
2 Institute for Mathematics, University of Augsburg, D-86159 Augsburg, Germany
3 Infineon Technologies AG, D-81541 Munich, Germany

Summary. We consider a nonlinear parabolic partial differential equation that
describes the evolution of the surface morphology in the deposition of thin glassy films
by molecular beam epitaxy. The dynamics of the growth process exhibits some
unexpected initial linear behavior, before the nonlinear dynamics sets in. Therefore,
for the numerical solution we suggest a combined spectral element/finite element
approach. Results of numerical simulations are given that show a good agreement
with experimental measurements.

1 Introduction
We consider the deposition of thin glassy films on the surface of substrates
such as silicon by molecular beam epitaxy. Such processes play an important
role in materials science with regard to the coating of surfaces in order to
obtain specific surface properties (cf., e.g., [11]).
In particular, we assume that the particle beam is impinging perpendicularly
to the surface of the substrate (cf. Fig. 1).
Denoting by $\Omega := [0, L]^2 \subset \mathbb{R}^2$ the surface of the substrate, the deposition
process can be described by the temporal and spatial distribution of the height
profile $u(x, t)$, $x \in \Omega$, $t \geq 0$, as given by

$$u(x, t) = H(x, t) - F t , \tag{1}$$

where $H(x, t)$ is the absolute height and $F$ refers to the deposition rate which
is assumed to be constant.
As far as the development of an appropriate mathematical model is concerned,
the deposition evolves according to

$$\frac{\partial u}{\partial t}(x, t) = g(u(x, t)) , \quad x \in \Omega , \ t \geq 0 , \tag{2}$$

where the right-hand side $g$ describes the surface growth. There have been
many attempts to establish appropriate models for the morphology of deposition
processes featuring amorphous surface growth (cf., e.g., [1, 4, 13]).

[Figure 1: a particle beam impinging perpendicularly on a film of height H(x, t) deposited on a substrate (e.g., silicon)]

Fig. 1. Schematic representation of the deposition of thin films

Here, following [5, 10] and [11], we take three major growth mechanisms into
account:
The first one describes surface growth due to particle interaction

(3)

where S stands for the influence of interatomic and van der Waals forces.
The second one is curvature induced surface relaxation

Here, D refers to the material dependent diffusion coefficient.


Finally, the third mechanism is due to structure coarsening in the sense that
particles at locations with high gradients of the height profile move to locations
with lower gradients. Here, we follow the model suggested by Moske (see [11])

where C denotes the mean surface mobility.


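As long as the particle beam impinges perpendicularly onto the surface of the
substrate, we have $|\nabla u| \ll 1$. In this case, the functions $g_i = g_i(u)$, $1 \leq i \leq 3$,
in (3), (4), and (5) simplify to

$$g_1(u) := a_1 \Delta u , \quad g_2(u) := a_2 \Delta^2 u , \quad g_3(u) := a_3 \Delta(|\nabla u|^2) ,$$

where $a_i < 0$, $1 \leq i \leq 3$. The evolution equation takes the form

$$\frac{\partial u}{\partial t} = a_1 \Delta u + a_2 \Delta^2 u + a_3 \Delta(|\nabla u|^2) , \quad x \in \Omega , \ t \geq 0 , \tag{6}$$

with an initial condition $u(x, 0) = u_0(x)$, $x \in \Omega$, and either periodic boundary
conditions or homogeneous Neumann boundary conditions on $\Gamma = \partial\Omega$.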
The constants $a_1$, $a_2$ and $a_3$ in the evolution equation are usually determined
by parameter identification with respect to experimentally obtained measurements
by using Auger spectroscopy and scanning electron microscopy.
However, if the particle beam does not impinge perpendicularly, there are
overhangs in the profile and even topological changes due to the formation
of inclusions. In this case, we must use the original form of the functions
$g_i = g_i(u)$, $1 \leq i \leq 3$, as given by (3), (4), (5), and resort to other techniques
as, for instance, level set methods (cf., e.g., [8, 9]; see also [3]).
We note that the nonlinear 4th order evolution equation (6) resembles the
well-known Cahn-Hilliard equation which describes spinodal decomposition,
i.e., phase separation in binary alloys (cf., e.g., [6]).
The paper is organized as follows: In section 2, we will briefly address the
dynamics of the growth process which features some unexpected initial linear
behavior. This motivates the use of a combined spectral element/finite ele-
ment approach for the numerical solution of the nonlinear evolution equation
(6) that is described in sections 3 and 4. Finally, in section 5 we will give some
simulation results in terms of visualizations of the height profile for different
film thicknesses.

2 Dynamics of the growth process

The solution of the nonlinear evolution equation exhibits some unexpected
initial linear behavior. This can be explained by an appropriate decomposition
of the spectrum $\sigma \subset \mathbb{R}$ of the associated linearized operator which is
self-adjoint and sectorial. In particular, we specify three constants

$$\gamma^- < 0 < \gamma^+ < \gamma^{++} < 1$$

such that, referring to $\lambda_{\max}$ as the maximum eigenvalue, the spectrum is
decomposed into the four parts

$$\sigma^{--} := (-\infty, \gamma^- \lambda_{\max}) , \quad \sigma^{-} := [\gamma^- \lambda_{\max}, \gamma^+ \lambda_{\max}) ,$$

$$\sigma^{+} := [\gamma^+ \lambda_{\max}, \gamma^{++} \lambda_{\max}) , \quad \sigma^{++} := [\gamma^{++} \lambda_{\max}, +\infty) .$$

We further denote by $X^{--}$, $X^{-}$, $X^{+}$ and $X^{++}$ the subspaces spanned by
the corresponding eigenfunctions:

$$X^{--} := \mathrm{span}\,\{\varphi(\lambda) \mid \lambda \in \sigma^{--}\} , \quad X^{-} := \mathrm{span}\,\{\varphi(\lambda) \mid \lambda \in \sigma^{-}\} ,$$

$$X^{+} := \mathrm{span}\,\{\varphi(\lambda) \mid \lambda \in \sigma^{+}\} , \quad X^{++} := \mathrm{span}\,\{\varphi(\lambda) \mid \lambda \in \sigma^{++}\} .$$

Then, the direct sum of $X^{+}$ and $X^{++}$ can be shown to be a dominant subspace
which determines the dynamical behavior of solutions to the nonlinear
evolution equation in the following sense:
[Figure 2: sketch of the solution u staying close to the dominant subspace Y = X^+ ⊕ X^{++}]

Fig. 2. Illustration of the initial linear behavior of the solution

Theorem 1. Assume $u(\cdot, 0) \in H^3(\Omega)$ with

$$\|u(\cdot, 0)\|_{2,\Omega} \leq r := C L^{3-\alpha} , \quad \alpha > 0 ,$$

and set

$$\bar{u}_0 := L^{-2} \int_\Omega u_0(x) \, dx .$$

Then there exists $t^* > 0$ such that the solution $u = u(x, t)$, $x \in \Omega$, $t \in [0, t^*)$,
of (6) stays with probability 1 in a vicinity of the dominant subspace

$$Y := \bar{u}_0 + X^{+} \oplus X^{++} \tag{7}$$

until at $t = t^*$ it leaves a ball $B_R(0)$ with radius $R > r$.
Proof. For a proof of this result we refer to [2].

Figure 2 illustrates the initial linear behavior of the solution in case $\bar{u}_0 = 0$.
We note that a related result for the Cahn-Hilliard equation has been estab-
lished in [12].
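
To make the decomposition concrete, the following Python sketch evaluates the eigenvalues of the linearization about a constant state on a periodic square and sorts the Fourier modes into the four bands. It assumes that the linearized operator is $a_1\Delta + a_2\Delta^2$ (the gradient nonlinearity has zero linearization at a constant state). The parameter values, the mode cutoff `n_max`, and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def linear_growth_rates(a1, a2, L, n_max):
    """Eigenvalues mu = -a1*lam + a2*lam**2 of a1*Laplace + a2*Laplace^2 on the
    periodic square [0, L]^2, where lam >= 0 are the eigenvalues of -Laplace."""
    idx = np.arange(-n_max, n_max + 1)
    kx, ky = np.meshgrid(idx, idx, indexing="ij")
    lam = (2.0 * np.pi / L) ** 2 * (kx**2 + ky**2)
    return -a1 * lam + a2 * lam**2

def split_spectrum(mu, g_minus, g_plus, g_pp):
    """Count the modes in sigma^{--}, sigma^-, sigma^+, sigma^{++} (hypothetical helper).
    Requires a positive maximum eigenvalue so that the band edges are increasing."""
    mu_max = mu.max()
    edges = [g_minus * mu_max, g_plus * mu_max, g_pp * mu_max]
    band = np.digitize(mu.ravel(), edges)   # 0..3 for the four half-open intervals
    return np.bincount(band, minlength=4)

# Illustrative values only (not the identified film parameters).
mu = linear_growth_rates(a1=-1.0, a2=-1.0, L=2.0 * np.pi * np.sqrt(2.0), n_max=8)
counts = split_spectrum(mu, g_minus=-0.5, g_plus=0.5, g_pp=0.9)
```

With negative $a_1$ and $a_2$ only a band of long-wavelength modes has positive growth rate, which is why a small number of modes near $\lambda_{\max}$ can dominate the early dynamics.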

3 Spectral Galerkin approximation


For the spectral Galerkin approximation we consider the weak formulation of
the implicitly in time discretized nonlinear evolution equation which involves
the Sobolev space $V := H^2_{per}(\Omega)$ in case of periodic boundary conditions and
$V := H^2(\Omega)$, if homogeneous Neumann boundary conditions are imposed.
Using the backward Euler scheme and denoting by $u^m \in V$ an approximation
of $u(\cdot, t_m)$ at time $t_m$ and by $\tau_m := t_m - t_{m-1}$ the time step from level $m-1$
to level $m$, the problem is as follows: Find $u^m \in V$ such that for all $\chi \in V$

$$\int_\Omega u^m \chi \, dx = \int_\Omega u^{m-1} \chi \, dx + \tau_m \int_\Omega [a_1 u^m + a_2 \Delta u^m + f(u^m)] \Delta\chi \, dx , \tag{8}$$

where the nonlinearity $f(u^m)$ is given by $f(u^m) := a_3 |\nabla u^m|^2$.
We refer to $\lambda_i$, $1 \leq i \leq N$, as the first $N$ eigenvalues of $-\Delta$ and denote by
$V_N := \mathrm{span}\,\{\varphi_i \mid 1 \leq i \leq N\}$ the finite dimensional subspace of $V$ spanned by
the associated orthonormal eigenfunctions.
The spectral Galerkin approximation is then a linear combination
$u_N = \sum_{i=1}^{N} u_{N,i} \varphi_i \in V_N$ so that (8) with $V$ replaced by $V_N$ gives rise to the
nonlinear system

$$g_i(u_N) = (1 + a_1 \tau_m \lambda_i - a_2 \tau_m \lambda_i^2) u_{N,i} + \tau_m \lambda_i \int_\Omega f\Bigl(\sum_{k=1}^{N} u_{N,k} \varphi_k\Bigr) \varphi_i \, dx - u_{N,i}^{m-1} = 0 . \tag{9}$$

The solution of that nonlinear system by Newton's method would be quite
expensive, since it requires the computation of the Jacobian at each iteration
step. It turns out that it is sufficient to use the method of successive iterations
which corresponds to the approximation of the original problem by the
semi-implicit Euler scheme: For $\nu \geq 0$ compute $u_N^{m,\nu} \in V_N$ as the solution of

$$\int_\Omega u_N^{m,\nu} \chi_N \, dx = \int_\Omega u_N^{m-1} \chi_N \, dx + \tau_m \int_\Omega [a_1 u_N^{m,\nu} + a_2 \Delta u_N^{m,\nu} + f(u_N^{m,\nu-1})] \Delta\chi_N \, dx , \tag{10}$$

where $u_N^{m,-1} := u_N^{m-1}$. In this case, we can explicitly solve for the components
of the new iterate

$$u_{N,i}^{m,\nu} = \frac{u_{N,i}^{m-1} - \tau_m \lambda_i f_i^{m,\nu-1}}{1 + a_1 \tau_m \lambda_i - a_2 \tau_m \lambda_i^2} , \tag{11}$$

where

$$f_i^{m,\nu-1} := \int_\Omega f\Bigl(\sum_{k=1}^{N} u_{N,k}^{m,\nu-1} \varphi_k\Bigr) \varphi_i \, dx .$$

We only have to evaluate the nonlinear terms $f_i^{m,\nu-1}$ which can be efficiently
done by the Fast Fourier Transform in case of periodic boundary conditions and
by the Fast Cosine Transform for homogeneous Neumann boundary conditions.
In both cases, this is done with respect to an equidistant grid consisting of $M^2$
grid points where $M$ has to be chosen larger than the dimension of the trial
space $V_N$ in order to avoid aliasing effects.
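
The successive iteration can be sketched in a few lines of Python for the periodic case. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the fixed number of sweeps `n_sweeps`, and the square mode cutoff `n_modes` are hypothetical, and the componentwise update is the one obtained by testing (10) with the Fourier eigenfunctions of $-\Delta$.

```python
import numpy as np

def semi_implicit_euler_step(u_prev, tau, a1, a2, a3, L, n_modes, n_sweeps=3):
    """One time step of the successive-iteration scheme on a periodic M x M grid.

    u_prev  : real array (M, M), grid values of u^{m-1}
    n_modes : retained Fourier modes per dimension (should be well below M)
    """
    M = u_prev.shape[0]
    idx = np.rint(np.fft.fftfreq(M) * M)             # integer mode numbers
    k = 2.0 * np.pi / L * idx                        # angular wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    ix, iy = np.meshgrid(idx, idx, indexing="ij")
    keep = (np.abs(ix) <= n_modes // 2) & (np.abs(iy) <= n_modes // 2)
    lam = kx**2 + ky**2                              # eigenvalues of -Laplace
    # Assumed nonzero; for a1, a2 < 0 this holds if tau < 4*|a2|/a1**2.
    denom = 1.0 + a1 * tau * lam - a2 * tau * lam**2

    u_hat_prev = np.fft.fft2(u_prev) * keep          # truncate to the trial space
    u_hat = u_hat_prev.copy()                        # sweep starts from u^{m-1}
    for _ in range(n_sweeps):
        ux = np.fft.ifft2(1j * kx * u_hat).real      # gradient on the fine grid
        uy = np.fft.ifft2(1j * ky * u_hat).real
        f_hat = np.fft.fft2(a3 * (ux**2 + uy**2)) * keep
        u_hat = (u_hat_prev - tau * lam * f_hat) / denom
    return np.fft.ifft2(u_hat).real
```

Because the linear part is diagonal in Fourier space, each sweep costs only a handful of FFTs; evaluating the quadratic term on a grid with `M` clearly larger than `n_modes` is what keeps aliasing out of the retained modes.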
The semi-implicit Euler scheme is carried out with an automatic step-size
control. The error due to the time discretization is estimated by

where $\hat{u}^m \in V$ is the solution of the semi-implicit trapezoidal rule

$$\int_\Omega \hat{u}^m \chi \, dx = \int_\Omega u^{m-1} \chi \, dx + \frac{\tau_m}{2} \int_\Omega \bigl[a_1 (\hat{u}^m + u^{m-1}) + a_2 \Delta(\hat{u}^m + u^{m-1}) + (f(u^m) + f(u^{m-1}))\bigr] \Delta\chi \, dx , \quad \chi \in V . \tag{12}$$

If $V = V_N$, the solution of (12) can be easily computed according to

$$(\hat{u}_N)_i = (u_N^{m-1})_i - \frac{\tau_m}{2} \bigl[a_1 \lambda_i ((\hat{u}_N)_i + (u_N^{m-1})_i) - a_2 \lambda_i^2 ((\hat{u}_N)_i + (u_N^{m-1})_i) + \lambda_i (f_i^m + f_i^{m-1})\bigr] , \quad 1 \leq i \leq N .$$

In the practical realization of the step-size control, we additionally take into
account the error due to the discretization in space. The error $\|u^m - u_N\|_{2,\Omega}$
is caused by the neglect of those eigenmodes $i > N$ that have not been
considered in the spectral Galerkin approximation, but are relevant for the
exact computation of the nonlinearities. Therefore, for sufficiently large $P$, we
set

Given a tolerance tol > 0, we check for convergence:

(13)

where $\sigma < 1$ is an appropriate weighting factor. If (13) is satisfied we proceed
with the new time-step

(j tol IluNl12,n - leujV I


(14)
IluN - uN 112,n
Otherwise, we repeat the previous time-step with $\tau_m$.

4 The finite element method

The spectral Galerkin method becomes inefficient when the nonlinear dynamics
sets in, i.e., when the solution leaves the dominant subspace $Y$ as given by
(7). Although a theoretical bound for the exit time $t^*$ is known (cf., e.g., [2]),
this bound is an overestimate and hence not practicable. Therefore, we stop
the spectral approach and switch to a finite element method, if the convergence
test (13) fails for several consecutive time-steps.
The finite element approximation is based on a reformulation of the 4th order
equation (6) as a system of two 2nd order equations

$$\left. \begin{aligned} \partial u / \partial t &= \Delta w \\ w &= a_1 u + a_2 \Delta u + a_3 |\nabla u|^2 \end{aligned} \right\} \quad \text{in } Q := \Omega \times [0, \infty) . \tag{15}$$
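
As a quick consistency check (a formal computation, assuming enough smoothness to apply the Laplacian termwise), eliminating $w$ from the two second-order equations gives back a single fourth-order equation:

```latex
\frac{\partial u}{\partial t}
  = \Delta w
  = \Delta\bigl(a_1 u + a_2 \Delta u + a_3 |\nabla u|^2\bigr)
  = a_1 \Delta u + a_2 \Delta^2 u + a_3 \Delta\bigl(|\nabla u|^2\bigr) .
```

The auxiliary variable $w$ is what allows continuous, piecewise linear elements to be used, since only first derivatives of $u$ and $w$ appear after integration by parts.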
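We discretize in time by the implicit Euler method and in space by continuous,
piecewise linear finite elements with respect to a simplicial triangulation $\mathcal{T}_h$ of
$\Omega$. Denoting by $S_1(\Omega; \mathcal{T}_h)$ the associated finite element space, for each
time-step we have to solve the nonlinear system of equations:
Find $(u_h^m, w_h^m) \in S_1(\Omega; \mathcal{T}_h) \times S_1(\Omega; \mathcal{T}_h)$ such that

$$\int_\Omega \frac{u_h^m - u_h^{m-1}}{\tau_m} \chi_h \, dx + \int_\Omega \nabla w_h^m \cdot \nabla\chi_h \, dx = 0 , \quad \chi_h \in S_1(\Omega; \mathcal{T}_h) , \tag{16}$$

$$a_1 \int_\Omega u_h^m \psi_h \, dx - a_2 \int_\Omega \nabla u_h^m \cdot \nabla\psi_h \, dx + a_3 \int_\Omega |\nabla u_h^m|^2 \psi_h \, dx - \int_\Omega w_h^m \psi_h \, dx = 0 , \quad \psi_h \in S_1(\Omega; \mathcal{T}_h) . \tag{17}$$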

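Providing a hierarchy $(\mathcal{T}_{h_i})_{i=0}^{\ell}$ of triangulations, we solve the nonlinear system
(16), (17) on the finest grid by Newton-Multigrid:
Given $(u_\ell^{m,\nu}, w_\ell^{m,\nu}) \in S_1(\Omega; \mathcal{T}_{h_\ell}) \times S_1(\Omega; \mathcal{T}_{h_\ell})$, we compute

$$u_\ell^{m,\nu+1} = u_\ell^{m,\nu} + \delta_{u_\ell}^{m,\nu} , \quad w_\ell^{m,\nu+1} = w_\ell^{m,\nu} + \delta_{w_\ell}^{m,\nu} ,$$

where the Newton increment $(\delta_{u_\ell}^{m,\nu}, \delta_{w_\ell}^{m,\nu})$ is the solution of the linear system

$$\int_\Omega \delta_{u_\ell}^{m,\nu} \chi_\ell \, dx + \tau_m \int_\Omega \nabla\delta_{w_\ell}^{m,\nu} \cdot \nabla\chi_\ell \, dx = \int_\Omega u_\ell^{m-1} \chi_\ell \, dx - \int_\Omega u_\ell^{m,\nu} \chi_\ell \, dx - \tau_m \int_\Omega \nabla w_\ell^{m,\nu} \cdot \nabla\chi_\ell \, dx , \quad \chi_\ell \in S_1(\Omega; \mathcal{T}_{h_\ell}) , \tag{18}$$

$$\int_\Omega \delta_{w_\ell}^{m,\nu} \psi_\ell \, dx - a_1 \int_\Omega \delta_{u_\ell}^{m,\nu} \psi_\ell \, dx + a_2 \int_\Omega \nabla\delta_{u_\ell}^{m,\nu} \cdot \nabla\psi_\ell \, dx - \int_\Omega f'(u_\ell^{m,\nu}) \delta_{u_\ell}^{m,\nu} \psi_\ell \, dx$$

$$\qquad = - \int_\Omega w_\ell^{m,\nu} \psi_\ell \, dx + a_1 \int_\Omega u_\ell^{m,\nu} \psi_\ell \, dx - a_2 \int_\Omega \nabla u_\ell^{m,\nu} \cdot \nabla\psi_\ell \, dx + \int_\Omega f(u_\ell^{m,\nu}) \psi_\ell \, dx , \quad \psi_\ell \in S_1(\Omega; \mathcal{T}_{h_\ell}) . \tag{19}$$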
The system (18), (19) is solved by linear multigrid using incomplete LU
decomposition both as smoother on all levels $1 \leq i \leq \ell$ as well as an iterative solver
for the coarse grid correction equation on level $i = 0$.
For the finite element approach we use a similar step-size control as in case
of the spectral Galerkin method, except that we replace the estimation of the
error due to the discretization in space by a residual-type a posteriori error
estimator.
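
For reference, the derivative $f'(u)\delta$ entering the Newton system (19) can be written out explicitly. The following is a short formal computation for $f(u) = a_3 |\nabla u|^2$, with $\delta$ denoting a generic increment:

```latex
f(u + \varepsilon\delta)
  = a_3 |\nabla u|^2 + 2\varepsilon\, a_3\, \nabla u \cdot \nabla\delta + \varepsilon^2 a_3 |\nabla\delta|^2
\quad\Longrightarrow\quad
f'(u)\,\delta = \frac{d}{d\varepsilon} f(u + \varepsilon\delta)\Big|_{\varepsilon = 0}
  = 2 a_3\, \nabla u \cdot \nabla\delta .
```

For piecewise linear elements $\nabla u$ is constant on each simplex, so this term contributes a first-order (convection-like) coupling with elementwise constant coefficients to the linearized system.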

5 Simulation results

We have used the combined spectral element/finite element approach for the
numerical simulation of the deposition of the metallic glassy film ZrAlCu on
silicon substrates.
The computational domain $\Omega$ has been chosen as a square of length $L = 200$ nm
in each direction. Periodic boundary conditions have been imposed and the
initial height profile $u_0(x)$, $x \in \Omega$, has been determined randomly with $\bar{u}_0 = 0$.
In the spectral element approach we have used 125, 200, and 250 modes per
dimension, whereas for the computation of the nonlinearities by the Fast Fourier
Transform a uniform grid with $M = 400$ grid points in each direction has been
employed which is sufficiently large to avoid aliasing effects.
In the finite element method we used a hierarchy $(\mathcal{T}_{h_i})_{i=0}^{4}$ of five simplicial
triangulations with $h_0 = 1/25$ and $h_4 = 1/400$.

Fig. 3. Computed height profile for different film thicknesses [100 nm (left), 360 nm
(middle), and 480 nm (right)]

Figure 3 displays the computed height profiles for different film thicknesses in
a grey scale ranging from black (0 nm) to white (4 nm). One clearly observes
the effect of structure coarsening: a surface pattern with a mesa-like structure
evolves featuring hills with flat plateaus that are separated by narrow deep
valleys. We note that already for 125 modes per dimension the computed profiles
are both qualitatively and quantitatively in good agreement with experimentally
obtained data (cf., e.g., [11]).
Acknowledgement. The work of the authors has been supported by the
German National Science Foundation (DFG) within the Collaborative Research
Center SFB 438 and the Graduate School GK 283.

References

1. A.-L. Barabási, H.E. Stanley (1995): Fractal Concepts in Surface Growth.
Cambridge Univ. Press, Cambridge
2. D. Blömker (2000): Stochastic Partial Differential Equations and Surface Growth.
Wißner, Augsburg
3. R.H.W. Hoppe, W.G. Litvinov, S.J. Linz (2003): On solutions of certain classes
of evolution equations for surface morphologies. Journal of Nonlinear Phenomena
in Complex Systems 6, 582-591
4. M. Kardar, G. Parisi, Y.-C. Zhang (1986): Dynamic scaling of growing interfaces.
Phys. Rev. Lett. 56, 889-892
5. S.J. Linz, M. Raible, P. Hänggi (2000): Stochastic field equation for amorphous
surface growth. Lecture Notes in Physics 557, 473-483
6. S. Maier-Paape, T. Wanner (2000): Spinodal decomposition for the Cahn-Hilliard
equation in higher dimensions: Nonlinear dynamics. Arch. Rat. Mech. Anal. 151,
187-219
7. E.M. Nash (2001): Finite-Elemente und Spektral-Galerkin Verfahren zur
numerischen Lösung der Cahn-Hilliard Gleichung und verwandter nichtlinearer
Evolutionsgleichungen. Shaker, Aachen
8. Ch.-D. Nguyen (2003): Level set methods for nonlinear deposition equations.
Dissertation. Institute for Mathematics, University of Augsburg
9. Ch.-D. Nguyen, R.H.W. Hoppe (2003): Amorphous surface growth via a level set
approach. submitted to Nonlinear Analysis: Theory, Methods, and Applications
10. M. Raible, S.J. Linz, P. Hänggi (2000): Amorphous thin film growth: Minimal
deposition equation. Phys. Rev. E 62, 1691-1705
11. M. Raible, S.G. Mayr, S.J. Linz, M. Moske, P. Hänggi, K. Samwer (2000):
Amorphous thin film growth: theory compared with experiment. Europhys. Lett. 50,
61-67
12. E. Sander, T. Wanner (2000): Unexpectedly linear behavior for the Cahn-Hilliard
equation. SIAM J. Appl. Math. 60, 2182-2202
13. D.E. Wolf, J. Villain (1990): Growth with surface diffusion. Europhys. Lett. 13,
389-394
Constrained Mountain Pass Algorithm for the
Numerical Solution of Semilinear Elliptic
Problems

Jiří Horák

University of Basel, Department of Mathematics, Rheinsprung 21, 4051 Basel,
Switzerland
Current address: University of Cologne, Department of Mathematics, Weyertal
86-90, 50923 Cologne, Germany [email protected]

Summary. A new numerical algorithm for solving semilinear elliptic problems is
presented. A variational formulation is used and critical points of a $C^1$-functional
subject to a constraint given by a level set of another $C^1$-functional (or an intersection
of such level sets of finitely many functionals) are sought. First, constrained local
minima are looked for, then constrained mountain pass points. The approach is
based on the mountain pass theorem in a constrained setting.

Weak solutions of semilinear elliptic partial differential equations can typically
be represented as critical points of nonlinear functionals. The easiest
approach in constructing critical points is via a minimizing sequence. It leads
to the method of steepest descent which yields local minimizers. Another
tool used to prove existence of critical points is the mountain pass theorem
of Ambrosetti and Rabinowitz [1], repeated in Sec. 1. Choi and McKenna
[4] introduced a method based on a constructive form of this theorem - the
mountain pass algorithm. It is able to find numerical approximations to critical
points of mountain pass type (typically, saddle points at which the second
derivative of the functional has exactly one negative eigenvalue, i.e., there is
just one direction in the function space at which the functional decreases on
both sides of the critical point).
The main objective of the work presented in this contribution is to design
a numerical method (constrained mountain pass algorithm) that can find
numerical approximations of more complicated saddle type critical points (typically,
with more negative eigenvalues of the second derivative of the functional,
i.e., more "directions of decrease"). A similar question was posed in [5]. The
"high-linking algorithm" presented by the authors can, however, only be
applied to a narrow family of problems. The current work presents a different
approach that appears to be more universal.
We illustrate the main idea on a simple example that was used in both [4]
and [5]:

$$\begin{aligned} -\Delta u &= u^3 && \text{in } \Omega = (0,1) \times (0,1) , \\ u &= 0 && \text{on } \partial\Omega . \end{aligned} \tag{1}$$

Weak solutions of this problem correspond to critical points of the functional

$$I(u) = \int_\Omega \Bigl[\tfrac{1}{2} |\nabla u|^2 - \tfrac{1}{4} u^4\Bigr] \, dx$$

defined on $H_0^1(\Omega)$. It is not difficult to see that $u \equiv 0$ is not only a solution of
(1) but also a local minimum of $I$. Since the steepest descent method would
likely yield this trivial solution (or diverge since $I$ is not bounded from below),
a different method needs to be used to approximate nontrivial solutions. In [4]
a mountain pass solution (Fig. 1(a)) was found numerically using the mountain
pass algorithm. Later, the high-linking algorithm of [5] provided a more
complicated saddle type solution (Fig. 1(b)).

Fig. 1. Solutions of (1). Approximate interval of values of $u(\Omega)$: (a) (0, 6.62],
(b) [-12.88, 14.98], (c) [-16.24, 16.22], (d) [-12.93, 19.75]

The idea of the current approach is to "reduce" the number of the "directions
of decrease" of the functional by introducing constraints on admissible
functions. Roughly speaking, we could expect to reduce mountain pass points
to constrained local minima, or more complicated saddle type points to, for
example, constrained mountain pass points.
Define a constraint given by a new functional $J$. Let
$S = \{u \in H_0^1(\Omega) \setminus \{0\} \mid J(u) := \int_\Omega [|\nabla u|^2 - u^4] \, dx = 0\}$. By testing (1) with $u$ we find that all
nontrivial weak solutions of (1) belong to $S$. Instead of looking for critical points
of $I$ we will look for critical points of $I$ with respect to $S$, i.e., we will solve
$I'(u) - \lambda J'(u) = 0$, where $\lambda \in \mathbb{R}$ is a Lagrange multiplier. It can be shown [6]
that any solution $(u, \lambda)$, $u \neq 0$, of this equation satisfies $\lambda = 0$, hence $u$ also
solves (1). It should be noted that it is not always possible or even desirable to
find constraints such that the Lagrange multipliers vanish. As applications in
Sec. 3 show, these multipliers can be a part of the formulation of the problem.
In the constrained setting, the solution in Fig. 1(a) can also be found as a
local minimum of $I$ with respect to the constraint $S$ by the method of Sec. 2.1.
Similarly, the solution of [5] in Fig. 1(b) can also be found as a mountain pass
point of $I$ with respect to $S$ by the method of Sec. 2.2. The solutions shown in
Figs. 1(c,d) are also constrained mountain pass points found by our method;
they were, however, not found by the method of [5]. Hence already this simple
example shows advantages of the current approach. But it is the problems to
which the high-linking method cannot be applied (e.g., those in Sec. 3) that
make the approach unique.
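
A short formal computation is consistent with the vanishing of the multiplier in this example (the rigorous argument is in [6]). It assumes that $I$ is the standard energy functional $I(u) = \int_\Omega [\tfrac{1}{2}|\nabla u|^2 - \tfrac{1}{4}u^4]\,dx$ associated with (1). Testing $I'(u) - \lambda J'(u) = 0$ with $u \in S$ itself gives:

```latex
I'(u)u = \int_\Omega \bigl[|\nabla u|^2 - u^4\bigr]\,dx = J(u) = 0 ,
\qquad
J'(u)u = \int_\Omega \bigl[2|\nabla u|^2 - 4u^4\bigr]\,dx
       = 2J(u) - 2\int_\Omega u^4\,dx
       = -2\int_\Omega u^4\,dx ,
\\[4pt]
0 = I'(u)u - \lambda\, J'(u)u = 2\lambda \int_\Omega u^4\,dx .
```

Since $u \neq 0$, the last integral is positive, which forces $\lambda = 0$.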
The outline of the rest of the paper: Section 1 presents a summary of known
theoretical results - the mountain pass theorem and its constrained version.
In Section 2 a description of the constrained steepest descent method (CSDM)
and the constrained mountain pass algorithm (CMPA) is given. Finally, Sec-
tion 3 shows the application of the method to two problems which cannot be
handled by the high-linking algorithm of [5]: a second order problem with two
constraints, a fourth order problem on an unbounded domain.

1 Theoretical Background

The mountain pass algorithm of [4] is based on the classical mountain pass
theorem of [1]. The constrained mountain pass algorithm presented in this
contribution is based on the constrained mountain pass theorem. We review
both theorems in this section.
Let $B$ be a real Banach space and $I \in C^1(B, \mathbb{R})$ a continuously Fréchet
differentiable functional.
Definition 1. $I$ satisfies the Palais-Smale condition at the level $a \in \mathbb{R}$ if any
sequence $\{u_n\} \subset B$ such that $I(u_n) \to a$ and $I'(u_n) \to 0$ possesses a convergent
subsequence.

Theorem 1 (mountain pass). Let $e_1, e_2$ be two distinct points in $B$. Define

$$c = \inf_{\gamma \in \Gamma} \max_{u \in \gamma([0,1])} I(u) , \tag{2}$$

where $\Gamma = \{\gamma \in C([0, 1], B) \mid \gamma(0) = e_1, \ \gamma(1) = e_2\}$.
If $c > \max\{I(e_1), I(e_2)\}$ and $I$ satisfies the Palais-Smale condition at the
level $c$, then $c$ is a critical value of $I$, i.e., there exists $u \in B$ such that $I'(u) = 0$
and $I(u) = c$.
Let us now introduce $k$ constraints. Assume that $J_i \in C^1(B, \mathbb{R})$, $i \in
\{1, \ldots, k\}$. We are interested in finding real numbers $\lambda_1, \ldots, \lambda_k$ and a function
$u \in B$ that satisfy

$$I'(u) - \sum_{i=1}^{k} \lambda_i J_i'(u) = 0 . \tag{3}$$

Such a function $u$ is a critical point of $I$ with respect to the set

Equation (3) is a general formulation of problems that we want to solve
numerically in this contribution. We assume further that if $u \in S$, then $J_i'(u) \neq 0$
for all $i$ and $\{J_i'(u)\}_{i=1}^{k}$ are linearly independent.
The constrained mountain pass theorem [6] is based on the work of Bonnet [2].
In order to use his definition we denote
$\|I'|_S u\| = \inf_{\alpha_1, \ldots, \alpha_k \in \mathbb{R}} \|I'(u) - \sum_{i=1}^{k} \alpha_i J_i'(u)\|$
and $\mathrm{tg}(u) = \inf_{\alpha_1^2 + \ldots + \alpha_k^2 = 1} \|\sum_{i=1}^{k} \alpha_i J_i'(u)\|$.
Definition 2. $I$ and $J_1, \ldots, J_k$ satisfy the Palais-Smale condition at the level
$a \in \mathbb{R}$ if for any sequence $\{u_n\} \subset B$ such that $J_1(u_n) \to 1, \ldots, J_k(u_n) \to 1$ and
$I(u_n) \to a$ and either $\|I'|_S u_n\| \to 0$ or $\mathrm{tg}(u_n) \to 0$ there exists a convergent
subsequence.

Theorem 2 (constrained mountain pass). Let $e_1 \neq e_2$ belong to a
path-connected component of $S$. Define $c$ by (2), where
$\Gamma = \{\gamma \in C([0, 1], S) \mid \gamma(0) = e_1, \ \gamma(1) = e_2\}$.
If $c > \max\{I(e_1), I(e_2)\}$ and $I$ and $J_1, \ldots, J_k$ satisfy the Palais-Smale
condition at the level $c$, then $c$ is a critical value of $I$ on $S$, i.e., there exists
a solution of equation (3) with $u \in S$ and $I(u) = c$.

2 Description of the Algorithm

2.1 Constrained Steepest Descent Method

Let from now on $B = H$ be a Hilbert space. In order to apply Theorem 2,
convenient points $e_1, e_2 \in S$ need to be found. Choosing $e_1$ and $e_2$ as local
minima of $I$ on $S$ seems reasonable. They can be found using a constrained
steepest descent method.
The method solves numerically the following initial value problem:

$$\frac{d}{dt} \zeta(t) = -P_{\zeta(t)} \nabla I(\zeta(t)) , \quad \zeta(0) = \zeta_0 \in S , \tag{4}$$

where $\nabla I(u)$ is defined as the Riesz representation of the Fréchet derivative
$I'(u)$ and $P_u$ is the orthogonal projection on the tangent space of $S$ at $u \in S$
and is given by

$$P_u v = v - \sum_{j=1}^{k} \alpha_j \nabla J_j(u) , \quad v \in H ,$$

where the coefficients $\alpha_j$ solve the linear algebraic system

$$\sum_{j=1}^{k} \langle \nabla J_i(u), \nabla J_j(u) \rangle \, \alpha_j = J_i'(u) v \quad \text{for } i \in \{1, \ldots, k\} .$$

Properties of (4) are studied in [6]. If $\zeta$ is a solution, then $\zeta(t) \in S$ on its
interval of existence and $I(\zeta(t))$ is a nonincreasing function of $t$.
Equation (4) is discretized in two steps:
Equation (4) is discretized in two steps:

Step 1 Choose $\Delta t_n > 0$ small, define $\bar{u}_{n+1} = u_n - \Delta t_n P_{u_n} \nabla I(u_n)$.
Finding the gradients of $I$ and $J_1, \ldots, J_k$ at $u_n$ usually means solving a linear
partial differential equation by a convenient numerical method (e.g., finite
element method).

Step 2 - Projection Since $\bar{u}_{n+1}$ lies in the tangent space of $S$ at $u_n$ but not
necessarily on $S$ we need to approximate it by some $u_{n+1} \in S$. This is usually
accomplished by scaling (cf. examples of Sec. 3).

The size of $\Delta t_n$ in Step 1 is chosen to be smaller than a prescribed maximum.
After Step 2 we check whether $I(u_{n+1}) < I(u_n)$. If not, we halve the size of
$\Delta t_n$ and repeat both steps. If the value of $I$ cannot be decreased any further,
we stop.
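
The two steps and the step-halving rule can be tried out on a small finite-dimensional toy problem. The Python sketch below is an illustration under stated assumptions, not the paper's PDE setting: $H = \mathbb{R}^4$ with the Euclidean inner product, $I(x) = x^T A x$, and two hypothetical constraints $J_1(x) = x_1^2 + x_2^2 = t$, $J_2(x) = x_3^2 + x_4^2 = 1 - t$ chosen so that the projection back to $S$ is a blockwise scaling. All names and parameter values are invented for the example.

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0, 4.0])   # toy functional I(x) = x^T A x
T = 0.2                             # constraint levels: J1 = T, J2 = 1 - T

def I(x):
    return float(x @ A @ x)

def grad_I(x):
    return 2.0 * (A @ x)

def constraint_grads(x):
    """Rows are grad J_1(x) and grad J_2(x)."""
    g1 = np.array([2.0 * x[0], 2.0 * x[1], 0.0, 0.0])
    g2 = np.array([0.0, 0.0, 2.0 * x[2], 2.0 * x[3]])
    return np.stack([g1, g2])

def tangent_projection(x, v):
    """P_x v = v - sum_j alpha_j grad J_j(x), alpha from the Gram system."""
    G = constraint_grads(x)
    alpha = np.linalg.solve(G @ G.T, G @ v)
    return v - G.T @ alpha

def back_to_S(x):
    """Step 2: scale each block so that both constraints hold exactly."""
    y = x.copy()
    y[:2] *= np.sqrt(T / (y[:2] @ y[:2]))
    y[2:] *= np.sqrt((1.0 - T) / (y[2:] @ y[2:]))
    return y

def csdm(x0, dt_max=0.1, dt_min=1e-12, max_iter=10000):
    x = back_to_S(np.asarray(x0, dtype=float))
    for _ in range(max_iter):
        d = tangent_projection(x, grad_I(x))
        dt, improved = dt_max, False
        while dt > dt_min and not improved:
            y = back_to_S(x - dt * d)        # Step 1 followed by Step 2
            improved = I(y) < I(x)
            dt *= 0.5                        # halve the step if I did not decrease
        if not improved:
            break                            # I cannot be decreased any further
        x = y
    return x

x_min = csdm([1.0, 1.0, 1.0, 1.0])
```

For this diagonal toy problem the constrained minimum value is $t \cdot 1 + (1 - t) \cdot 3 = 2.6$, which gives a simple check of the iteration.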

2.2 Constrained Mountain Pass Algorithm

Assume we work in a finite dimensional approximating subspace of $H$
(conveniently chosen in 2.1). We take a discretized path in $S$ connecting $e_1$ and $e_2$,
two local minima obtained by the constrained steepest descent method. We
find the maximum of $I$ along the path. The point at which the maximum
occurs is moved a small distance in the direction of the projection of the steepest
descent of $I$ to the tangent space to $S$, and then projected back to $S$. Hence the
path has been deformed in $S$ and the maximum of $I$ lowered. The deforming
of the path is repeated until the maximum along the path cannot be lowered
any more - a critical point is reached.

Path Initialization A path in $S$ connecting $e_1$ and $e_2$ is represented by
a collection of $P$ points $z_0, \ldots, z_P \in S$. There is no general rule for choosing
these points but in many cases the following obvious choice suffices:

$$\bar{z}_j = e_1 + \frac{j}{P} (e_2 - e_1) , \quad j \in \{0, \ldots, P\} ,$$

$z_j \in S$ is an approximation of $\bar{z}_j$ as in Step 2 of 2.1.
Main Loop First, find the maximum of $I$ on the path, i.e., find $j_m$ with
$I(z_{j_m}) \geq I(z_j)$ $\forall j$. Use interpolation to improve the maximum by moving $z_{j_m}$
closer to $z_{j_m+1}$ or $z_{j_m-1}$.
Second, update $z_{j_m}$ by moving it in the direction of the steepest descent of
$I$ projected to the tangent space of $S$ at $z_{j_m}$ to decrease the value of $I(z_{j_m})$. In
fact, this amounts to the application of the two steps of the constrained steepest
descent method 2.1 (with $z_{j_m}$ instead of $u_n$; $u_{n+1}$ is then the updated $z_{j_m}$).
Repeat these two steps until one of the following occurs:
1. the value of $I(z_{j_m})$ cannot be decreased any further,
2. in several recent consecutive repetitions of the loop the index $j_m$ has always
been the same - infinite loop.
In the first situation $\|P_{z_{j_m}} \nabla I(z_{j_m})\| = \|I'|_S z_{j_m}\|$ is too small, i.e., $z_{j_m}$ is an
approximation of the desired critical point, the algorithm stops. In the second
situation the path needs to be refined.

Refining the Path We prescribe a number of points to be inserted between
$z_{j_m}$ and $z_{j_m \pm 1}$ (for example two). At the same time we remove the same
number of points from both ends of the path so that the number $P$ of points
on the path stays the same.

It is sometimes useful and more efficient to design a Newton scheme for problem
(3) and to use the solution obtained by CMPA as an initial guess for this
scheme. The approximate constrained mountain pass solution does not need
to be very precise. This means we can stop the CMPA early and do not need
to refine the path that many times.
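
The core of the main loop can be illustrated on a toy problem in Python. This is a deliberately simplified sketch and not the algorithm in full: it omits the interpolation step and the path refinement, and it relies on a symmetric initial path so that one node starts exactly on the ridge separating the two minima. The setting is hypothetical: $I(x) = x^T A x$ on the unit sphere in $\mathbb{R}^3$ with $A = \mathrm{diag}(1, 2, 3)$, whose constrained minima are $\pm e_1$ (value 1) and whose constrained mountain pass points are $\pm e_2$ (value 2). Since the straight segment from $e_1$ to $-e_1$ passes through the origin, the initial path is routed through an auxiliary point `via`.

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])   # toy functional I(x) = x^T A x on S = {|x|^2 = 1}

def I(x):
    return float(x @ A @ x)

def projected_gradient(x):
    g = 2.0 * (A @ x)
    return g - (x @ g) * x          # P_x grad I(x) for the sphere constraint

def back_to_S(x):
    return x / np.linalg.norm(x)    # projection onto S by scaling

def descend(z, dt):
    """Steps 1 and 2 of the constrained steepest descent, halving dt on failure."""
    g = projected_gradient(z)
    while dt > 1e-12:
        z_new = back_to_S(z - dt * g)
        if I(z_new) < I(z):
            return z_new, True
        dt *= 0.5
    return z, False

def cmpa(e1, e2, via, P=20, dt=0.1, tol=1e-6, max_iter=20000):
    # Initial path: piecewise linear e1 -> via -> e2, scaled back to S.
    path = []
    for j in range(P + 1):
        s = 1.0 - 2.0 * j / P
        z = max(s, 0.0) * e1 + max(-s, 0.0) * e2 + (1.0 - abs(s)) * via
        path.append(back_to_S(z))
    for _ in range(max_iter):
        jm = max(range(P + 1), key=lambda j: I(path[j]))   # maximum along the path
        if np.linalg.norm(projected_gradient(path[jm])) < tol:
            break                                          # critical point reached
        path[jm], moved = descend(path[jm], dt)
        if not moved:
            break
    return path[jm]

e1 = np.array([1.0, 0.0, 0.0])
z_star = cmpa(e1, -e1, via=np.array([0.0, 1.0, 1.0]))
```

In a generic setting no node sits exactly on the ridge, the nodes on either side slide toward the two minima, and the refinement step described above is what keeps the path resolved near its maximum.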

3 Examples

3.1 Fučík Spectrum of the Laplacian

$$\begin{aligned} -\Delta u &= \mu u^+ - \nu u^- && \text{in } \Omega \subset \mathbb{R}^2 , \\ u &= 0 && \text{on } \partial\Omega \end{aligned} \tag{5}$$

with $\int_\Omega u^2 \, dx = 1$, where $\Omega = \{(x_1, x_2) \in \mathbb{R}^2 \mid x_2 > 0, \ x_2 < 2 - 4x_1, \ x_2 < 2 + 4x_1\}$
is an isosceles triangle with base 1 and height 2, $u^\pm = \max\{\pm u, 0\}$.
A point $(\mu, \nu) \in \mathbb{R}^2$ is called a Fučík eigenvalue if (5) has a weak solution $u \in
H_0^1(\Omega)$. The Fučík spectrum is then the collection of all Fučík eigenvalues. It
has been studied for various domains both analytically and numerically in [8].
The numerical investigation was based on a continuation method starting at
some known solution. Such a solution can be numerically obtained by CSDM
and CMPA.
For $t \in (0, 1)$ define a variational problem with two constraints:
$$I(u) = \int_\Omega |\nabla u|^2 \, dx , \quad J_1(u) = \int_\Omega (u^+)^2 \, dx , \quad J_2(u) = \int_\Omega (u^-)^2 \, dx ,$$

$$S = \{u \in H_0^1 \mid J_1(u) = t , \ J_2(u) = 1 - t\} .$$

Critical points of $I$ on $S$ are solutions of (5) with $(\mu, \nu)$ obtained as Lagrange
multipliers. It is shown in [6] that $I$ and $J_1$, $J_2$ satisfy the Palais-Smale
condition at any level.
We apply the steepest descent method of Sec. 2.1. The scaling in the
projection step is performed in the following way: for $\bar{u} \in H_0^1$, $\bar{u}^+ \neq 0$, $\bar{u}^- \neq 0$
find $t_+, t_- > 0$ such that $u = t_+ \bar{u}^+ - t_- \bar{u}^- \in S$. Figure 2 shows two local
minimizers of $I$ on $S$ for $t = 0.2$.
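
The scaling in the projection step has a closed form: since $\bar{u}^+$ and $\bar{u}^-$ have disjoint supports, $J_1(u) = t_+^2 J_1(\bar{u})$ and $J_2(u) = t_-^2 J_2(\bar{u})$, so $t_+ = \sqrt{t / J_1(\bar{u})}$ and $t_- = \sqrt{(1 - t) / J_2(\bar{u})}$. A minimal Python sketch follows, assuming that the integrals are approximated by a nodal quadrature with positive weights (a hypothetical discretization; the function name and arguments are invented for illustration).

```python
import numpy as np

def scale_to_constraints(u_bar, weights, t):
    """Return u = t_plus*u_bar^+ - t_minus*u_bar^- with J1(u) = t and J2(u) = 1 - t.

    u_bar   : nodal values of a function with nonzero positive and negative parts
    weights : positive quadrature weights approximating the integral over Omega
    """
    u_pos = np.maximum(u_bar, 0.0)
    u_neg = np.maximum(-u_bar, 0.0)
    j1 = np.sum(weights * u_pos**2)          # discrete J_1(u_bar)
    j2 = np.sum(weights * u_neg**2)          # discrete J_2(u_bar)
    t_plus = np.sqrt(t / j1)
    t_minus = np.sqrt((1.0 - t) / j2)
    return t_plus * u_pos - t_minus * u_neg
```

Because the two parts are rescaled independently, the sign pattern of $\bar{u}$ is preserved and both constraints are met exactly in the discrete sense.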

Fig. 2. Solutions of (5) found by CSDM. Approximate values of $(\mu, \nu)$ and $u(\Omega)$:
(a) (52.6, 47.8), [-2.50, 1.66], (b) (65.4, 34.3), [-2.19, 1.91]

These local minimizers are used as endpoints of the path in CMPA. The
algorithm converges to the solution in Fig. 3(b). During the run of CMPA the
size of $\|I'|_S z_{j_m}\| = \|P_{z_{j_m}} \nabla I(z_{j_m})\|$ is checked and if it is small for a number of
iterations but later grows again, we may use the point $z_{j_m}$ as an initial guess
in Newton's method. If this method converges, we obtain a numerical solution
that is most likely different from the one to which CMPA eventually converges.
The solution in Fig. 3(a) was obtained this way.
A finite element method with piecewise linear functions on a triangular
grid with 11,097 nodes and 21,760 triangles was used, the path in CMPA had
$P = 50$ points.

Fig. 3. Solutions of (5) found by CMPA. Approximate values of $(\mu, \nu)$ and $u(\Omega)$:
(a) (103.2, 45.8), [-2.55, 1.90], (b) (105.7, 43.4), [-2.41, 1.84]

3.2 Fourth Order Problem in $\mathbb{R}^2$

$$\Delta^2 u + c^2 \frac{\partial^2 u}{\partial x_1^2} + u + g(u) = 0 , \quad x \in \mathbb{R}^2 , \tag{6}$$

$$u(x) \to 0 \quad \text{as } |x| \to \infty ,$$

where $g(u) = e^u - 1 - u$, $c^2 \in (0,2)$. It has been studied in [7]; its solutions
represent traveling waves in the direction of the $x_1$-axis with speed $c$ in a model
of a nonlinearly supported plate. The authors present a variational existence
proof based on the mountain pass theorem for a certain class of nonlinearities
$g$. Weak solutions of (6) in $H^2 = W^{2,2}(\mathbb{R}^2)$ are constructed as critical points of
a functional. Since the functional does not satisfy the Palais-Smale condition
of Theorem 1, additional work has to be done in order to recover some form of
compactness. This turns out to be difficult for the above-mentioned exponential
nonlinearity and hence the proof does not cover this case.
The numerical results of [7] give, however, strong evidence of the existence
of mountain pass solutions even with the exponential nonlinearity. Moreover,
a comparison is made with a one-dimensional ODE version of problem (6),
for which a wide variety of numerical solutions was computed by a shooting
method [3]. Hence the question arose whether there is a way of finding more
numerical solutions of the two-dimensional PDE problem than just those found
in [7] using the mountain pass algorithm. We will show that the constrained
mountain pass algorithm yields additional numerical solutions.
Define functionals $I, J \in C^1(H^2, \mathbb{R})$ by

$G(\xi) = \int_0^\xi g(t)\,dt$,

where $C > 0$ is a constant. Critical points of $I$ on $S$ are weak solutions of (6)
with $c^2$ obtained as a Lagrange multiplier. Let the inner product on $H^2$ be
$(\phi, \psi) = \int_{\mathbb{R}^2} [(\Delta\phi)(\Delta\psi) + \phi\psi]\,dx$. It is shown in [7] that the corresponding norm
is equivalent to the standard norm on $H^2$.

Although the domain is the whole plane $\mathbb{R}^2$, for numerical purposes
we can work on a large enough (but bounded) rectangle because our solutions
decay to zero as $|x| \to \infty$. Further, any translation of a solution of (6) is also a
solution with the same values of $I$ and $J$. These translations can be prevented
by assuming symmetries of solutions (as in [3, 7]): $u(x_1, x_2) = u(-x_1, x_2)$
and $u(x_1, x_2) = u(x_1, -x_2)$ $\forall (x_1, x_2) \in \mathbb{R}^2$. Hence we can work on a rectangle
$[-K_1, 0] \times [-K_2, 0]$. On the boundaries $x_1 = -K_1$ and $x_2 = -K_2$ the conditions
$u = 0$ and $\Delta u = 0$ are implemented, on $x_1 = 0$ and $x_2 = 0$ the symmetry
conditions.

Fig. 4. Solutions of (6) found by CSDM: (a) $c \approx 1.247$, (b) $c \approx 1.313$

CSDM with $C = 150$ yields the numerical solutions shown in Fig. 4 (the profiles
of the waves have been highlighted; only a part of the computational domain is
shown). These types of solutions were found in [7] by the unconstrained
mountain pass algorithm. The projection step (Sec. 2.1) in the numerical methods is
again accomplished by scaling: for $\bar u \in H^2 \setminus \{0\}$ find $t > 0$ such that $u = t\bar u \in S$.
The two numerical local minima can then be used as end points of the
path in the constrained mountain pass algorithm. The algorithm converges to
the solution shown in Fig. 5(b). As already noted in Sec. 3.1, by stopping the
algorithm early, if $\|\nabla|_S I(z_{j_m})\|$ stays small for a number of iterations, and by
applying Newton's method, a new numerical solution may be obtained, here
the function in Fig. 5(a).
A finite difference discretization was used with $K_1 = 70$, $K_2 = 50$ and the
step size $\Delta x_1 = \Delta x_2 = 0.2$. The number of points on the path in CMPA was
$P = 50$.

Fig. 5. Solutions of (6) found by CMPA: (a) $c \approx 1.384$, (b) $c \approx 1.365$

References
1. Ambrosetti, A., Rabinowitz, P.H. (1973): Dual variational methods in critical
point theory and applications. J. Functional Analysis, 14, 349-381
2. Bonnet, A. (1993): A deformation lemma on a $C^1$ manifold. Manuscripta Math.,
81(3-4), 339-359
3. Champneys, A.R., McKenna, P.J., Zegeling, P.A. (2000): Solitary waves in
nonlinear beam equations: stability, fission and fusion. Nonlinear Dynam., 21(1),
31-53
4. Choi, Y.S., McKenna, P.J. (1993): A mountain pass method for the numerical
solution of semilinear elliptic problems. Nonlinear Anal., 20(4), 417-437
5. Ding, Z., Costa, D., Chen, G. (1999): A high-linking algorithm for sign-changing
solutions of semilinear elliptic equations. Nonlinear Anal., 38(2, Ser. A: Theory
Methods), 151-172
6. Horak, J. (2003): Constrained mountain pass algorithm for the numerical solution
of semilinear elliptic problems. Preprint 2003-14, University of Basel
7. Horak, J., McKenna, P.J. (2003): Traveling waves in nonlinearly supported beams
and plates. In: Nonlinear Equations: Methods, Models and Applications,
volume 54 of Progress in Nonlinear Differential Equations and their Applications,
pages 197-215. Birkhäuser Verlag, Basel
8. Horak, J., Reichel, W. (2003): Analytical and numerical results for the Fucik
spectrum of the Laplacian. To appear in J. of Computational and Applied
Mathematics
Optimal Shape Design of Diesel Intake Ports
with Evolutionary Algorithm*

Andras Horvath 1 and Zoltan Horvath 2

1 Department of Physics, Szechenyi Istvan University, Gyor, Hungary,
[email protected]
2 Department of Mathematics, Szechenyi Istvan University, Gyor, Hungary,
[email protected]

Summary. Intake port shape significantly affects the quality of an engine. In this
paper, we present a method for improving an existing geometry by
evolutionary-algorithm-type optimization. Characteristic parameters of different port
shapes are calculated with a self-developed CFD program. Although only small
deformations of the original design were allowed, a significant improvement was
achieved. The proposed robust parallel evolutionary algorithm seems to be suitable
for other optimization problems on heterogeneous, non-reliable clusters of
workstations.

1 Introduction

Design parameters (geometrical structure, injection parameters, etc.) of Diesel
engines affect the quality of the engine in a very complex way. Engine designers
invest a large development effort in optimizing these parameters. Modeling a
complete engine is a very difficult task both experimentally and numerically;
therefore a complex optimization cannot be performed for the whole system.
Hence the traditional way is to split the engine into several main parts,
find some parameters that characterize the quality of each part and perform
the optimization on these parts separately. After these optimizations, however,
we have to put the parts together, check whether the independently optimized
parts can be assembled, and check the overall behaviour of the system.
One of these main parts is the intake port system. Its geometry strongly
affects the main parameters (e.g. power, efficiency, pollutant emission) of the
engine. (See Fig. 1 for illustrations of the intake port shape.) The essential reason
for this strong dependence is that the geometry of the intake port affects the amount
and the initial velocity distribution of the air drawn into the cylinder. The amount of
fresh air influences the maximum mass of fuel burnt in one cycle, while the initial
velocity distribution affects the air-fuel mixture formation process, which has
a high impact on the quality of burning.
* This paper was supported under the Hungarian Grant for Scientific Research
OTKA T43177.
Fig. 1. An opaque view of the surface and a cross section of the intake port shape
(labels in the figure: cylinder, outlet)

It is obvious that the more air is drawn in, the better. The dependence on the
velocity distribution is more complex, but engineering practice gives us a
plausible guideline: the most important property of the velocity distribution in the
cylinder is the angular momentum per unit mass, which characterizes the global
rotation. Both too small and too large global rotation are detrimental to efficient
air-fuel mixture formation. There are commonly accepted optimal rotation
ranges for different types of engines, coming from experiments.
The traditional way to describe these two important attributes of an intake
port is to calculate two non-dimensional parameters: the flux coefficient and
the swirl coefficient, denoted by $C_f$ and $C_s$ respectively; for a more detailed
definition see Section 2.2.
Our task was to improve an existing intake port geometry, namely to increase
$C_f$ while keeping $C_s$ near the original value. To avoid a complete
redesign of the existing and working engine head structure, only small
deformations of the original geometry were allowed. Although this constraint means
that only a small improvement is to be expected, even a 1% improvement in $C_f$
can be important from an engineering point of view.

2 The air flow in the intake port

2.1 CFD calculations

At the core of the optimization process we needed reliable and accurate CFD
software that can calculate the two characteristic flow parameters for port
shapes derived from the original shape by small deformations. For this purpose
we used self-developed software based on a classical FVM with some
improvements. A detailed description can be found in [3]. Here
we give only a short summary of its properties:

- FVM with flux vector splitting (Vijayasundaram-type, [7])
- solves the compressible Euler or Navier-Stokes equations
- conservative in mass, momenta and energy
- explicit (first order) time steps
- local time steps for faster calculations
- uses an unstructured, conforming tetrahedral mesh
- capable of handling small deformations of the mesh

A tetrahedral mesh with 147775 elements was generated with CAD software
and was used as the discretization of the original shape. (See Fig. 2.)

Fig. 2. Cut of tetrahedral mesh and stationary velocity field near the valves

2.2 Characteristic parameters

Intake ports are characterized in the following standard way:
Let the pressure be constant at the inlet and the outlet ($p_{in}$ and $p_{out}$). Let us
measure the mass flux ($m$) and the flux of angular momentum ($T$) at the outlet (in the
cylinder) in the stationary case.
The characteristic parameters of intake ports are the flux coefficient ($C_f$)
and the swirl coefficient ($C_s$) (see [4]):

$$C_f = \frac{m/\rho_0}{A\, v_0}, \tag{1}$$

where $\rho_0$ is the density of air at the inlet, $A$ is the approximate area of the
smallest intake cross section (two times the valve inner seat area),
$v_0 = \sqrt{2(p_{in} - p_{out})/\rho_0}$ is the characteristic velocity based on the pressure drop and
$B$ is the cylinder bore.
The larger $C_f$ the better, while the values of $C_s$ have an "optimum interval"
coming from engineering experience. (See the introduction.)
$m$ and $T$ can be calculated from the numerical model by approximating the
following integrals on the outlet face ($S_{out}$):

$$m = \int_{S_{out}} \rho\, v \, dS, \qquad T = \int_{S_{out}} \rho\,(v \times (r - r_0)) \, dS,$$

where $r_0$ is an arbitrary point on the symmetry axis of the cylinder, and $\rho$ and $v$ are the
calculated density and velocity of air.
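A minimal sketch of how the flux coefficient of formula (1) could be evaluated from face data on the outlet. The function names and the face data layout are assumptions made for illustration. The mass flux integral is read here as the flux of $\rho v$ through oriented faces and approximated by a sum of $\rho\,(v \cdot n)\,|S|$ over the outlet faces; the paper's own code may discretize it differently.

```python
import math

def characteristic_velocity(p_in, p_out, rho0):
    """v0 = sqrt(2 (p_in - p_out) / rho0), based on the pressure drop."""
    return math.sqrt(2.0 * (p_in - p_out) / rho0)

def mass_flux(faces):
    """Approximate the outlet mass flux by a sum over faces.

    `faces` is an iterable of (rho, v, n, area) with v and n 3-tuples and
    n a unit normal of the face (hypothetical data layout).
    """
    total = 0.0
    for rho, v, n, area in faces:
        v_normal = sum(vi * ni for vi, ni in zip(v, n))
        total += rho * v_normal * area
    return total

def flux_coefficient(m, rho0, area, v0):
    """C_f = (m / rho0) / (A v0), formula (1)."""
    return (m / rho0) / (area * v0)
```

For instance, two outlet faces of area 0.25 each with unit density and unit normal velocity give m = 0.5; with A = 0.5 and a pressure drop chosen so that v0 = 2, the resulting coefficient is 0.5.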
We found that calculations with this model can reproduce the experimentally
determined characteristic parameters within an acceptable relative difference
even in the case of non-viscous calculations. For 4 different values of $p_{out}$ we
could reproduce the measured values of $C_f$ to within 1% and those of $C_s$ to within
10% relative difference. (See [3] for details.) Six small deformations of the port
shape were also realized and measured experimentally: a similar correspondence
was observed between computed and measured parameters.
Therefore we used our self-developed CFD code for the evaluation of deformed
port shapes.

3 The optimization strategy

The main steps of finding a shape optimization strategy for our problem were
the following:

1. Find and parametrize a set of small deformations.
2. Find a suitable object function which measures the quality of a deformed
shape.
3. Find and implement a method to maximize the object function.

In the following subsections we will go through these steps.

3.1 Parametrization of small deformations

We used local, smooth deformations of the original grid. This way we did
not have to generate a tetrahedral mesh for each deformed shape. Instead of
this very time-consuming step we modified only the coordinates of the vertices
and recalculated the geometrical parameters (volume, face normals, etc.) of
the tetrahedra.
The overall deformation of the shape was composed of elementary
deformations. Each elementary deformation shifted the vertices only inside a
specified cylinder whose axis is parallel to the surface normal. (See Fig. 3.) The
shift of the vertices is a fourth order polynomial of the distance from the axis of
the cylinder and of the depth relative to the deformation center.

Fig. 3. A smooth local deformation on a simple tetrahedral mesh

This choice also fits the engineering requirements: we could ensure that
the deformations remain small and that some places (e.g. the valves, the top of
the cylinder) remain unchanged.
We used combinations of such elementary deformations. Thus the
parametrization of a deformation consists of a list of parameters of elementary
deformations, i.e. a list of deformation centers, radii and sizes. (The height of the
deformation cylinder is not a free parameter: it does not affect the shape, only the
tetrahedral mesh. We have to choose its value carefully depending on the
deformation parameters.)
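The elementary deformation described in this subsection can be illustrated with a short sketch. The exact polynomial used by the authors is not given in the text; the weight $(1 - s^2)^2$ below is an assumed fourth order polynomial that equals 1 at the center and vanishes with zero slope at the edge of the cylinder, and all names are hypothetical.

```python
import math

def elementary_shift(vertex, center, normal, radius, height, size):
    """Return the vertex displaced by one elementary deformation.

    `normal` is the unit surface normal at the deformation center and also
    the axis direction of the deformation cylinder. Vertices outside the
    cylinder (radius `radius`, half-height `height`) are left untouched.
    """
    rel = [p - c for p, c in zip(vertex, center)]
    # Signed depth along the axis and distance from the axis.
    depth = sum(r * n for r, n in zip(rel, normal))
    radial = [r - depth * n for r, n in zip(rel, normal)]
    dist = math.sqrt(sum(x * x for x in radial))
    if dist >= radius or abs(depth) >= height:
        return tuple(vertex)
    # Assumed weight: fourth order in each variable, smooth at the edge.
    weight = (1.0 - (dist / radius) ** 2) ** 2 * (1.0 - (depth / height) ** 2) ** 2
    return tuple(p + size * weight * n for p, n in zip(vertex, normal))
```

Applying several such shifts one after another to the vertex list gives a combined deformation; after that, the geometrical data of the affected tetrahedra have to be recomputed, as the text describes.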

3.2 The object function

Our goal is to maximize $C_f$ while keeping $C_s$ close to the original value.
Therefore we calculated the parameters with no deformations and used them as
reference values $C_f^{ref}$ and $C_s^{ref}$.
Our object function was:

$$Obj = \frac{C_f}{C_f^{ref}} - D \left( 1 - \frac{C_s}{C_s^{ref}} \right)^2 \tag{2}$$
That is, we have a single object function with a penalty term depending on
the change in $C_s$. We found $D = 1.0$ adequate in our case. Thus a
10% change in $C_s$ matches a 1% change in $C_f$. (Remember that $C_s$ has no strict
optimal value but rather an optimal interval, so a change of a few percent is not
significant.) For example, Obj = 1.02 means at least a 2% improvement in mass
flux (depending on the change in swirl number).
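The object function, i.e. the ratio $C_f / C_f^{ref}$ minus the penalty $D\,(1 - C_s / C_s^{ref})^2$, is short enough to write down directly. The sketch below uses hypothetical argument names and illustrative input values only.

```python
def object_function(cf, cs, cf_ref, cs_ref, d=1.0):
    """Relative flux coefficient minus a quadratic penalty on the swirl change."""
    return cf / cf_ref - d * (1.0 - cs / cs_ref) ** 2
```

With D = 1.0, a 10% change in the swirl coefficient costs $0.1^2 = 0.01$, the same as a 1% loss in the flux coefficient, which is the trade-off stated in the text.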
During the CFD calculations we observed significant "noise" in $C_f$ and $C_s$.
For example, we ported our C code to two different hardware architectures and
found a difference of the order of $10^{-5}$ in $C_f$ and $10^{-3}$ in $C_s$. Similar differences
appeared on the same architecture in the case of very small deformations.
This noise derives from rounding errors and from the nature of the calculations: the
convergence to the stationary flow is slow in $C_s$.
This means that the object function has many false local maxima as the deformation
parameters vary. For this reason we decided to use a genetic-type algorithm
for the optimization. This is a common choice in shape optimization problems. (See
e.g. [6].)

3.3 Optimization method

Genetic (GA) or evolutionary (EA) type algorithms have many variants. (See
[2], [8].) Depending on the problem and the available hardware, different
versions of GA/EA should be used.
The main peculiarities of our problem and resources are the following:
- The workstations we can use have different CPU speeds.
- The calculation time of the object function (a complete CFD simulation) depends
on the geometry and takes 3-4 hours on our fastest workstations.
- We can use approximately 20-30 workstations for a few weeks; therefore
2000-4000 object function evaluations are possible.
- There is a significant probability of hardware errors during the calculations.
It is obvious that a classical GA/EA with master-worker type parallelization
is not suitable in our circumstances. One reason is the significant probability
of hardware errors; the other is the varying evaluation time. Both can lead
to a significant loss of efficiency due to the synchronization stages.
Furthermore, an island type parallelization with one island per workstation
is not a good choice either, because each workstation can calculate only 2-4
full generations with a reasonable population size. Since new chromosomes from
other islands can be included only at the end of a complete generation cycle,
the frequency of communication between different subpopulations is too low for
efficient parallelization.

A Robust Parallel Evolutionary Algorithm. The peculiarities of our problem
and hardware resources led us to a special type of parallelization strategy.
The main idea is to use a separate population on each workstation with an
evolutionary algorithm which tries to send and receive messages about evaluated
chromosomes after each object function evaluation and, if it receives a new evaluated
chromosome from another machine, incorporates it into the population immediately.
The communication is executed in two stages: the EA-process writes the
evaluated chromosomes and object function values to the local hard disk and
checks it for evaluated chromosomes that have arrived from other machines. "Good"
chromosomes are moved between EA-units by "agent" programs which
run independently of the EA-processes.
This way we achieved stability against hardware failures and a strong
connection between the EA-units with zero synchronization loss. Clearly,
if an EA-unit fails, the others can continue working; if the connection between
units is lost (network error), the units can continue evaluating object
functions; and if the workstation executing the agent program fails, the agent can be
restarted on an arbitrary workstation. Even after a global failure the EA-units
can read back the evaluated chromosomes and continue working. Although
each failure decreases the efficiency, the whole calculation does not stop, and
after repair the calculation can continue at full capacity.
Certainly we need a special EA to guarantee that new results
coming from other EA-units are used as soon as possible. In a classical GA/EA
cycle this is not possible. Therefore we searched for a more flexible approach
and found the concept of "Flexible Evolution Agents" (FEAs, see [9]).
FEAs do not have a strictly prescribed sequence of genetic operators;
instead a central "decision engine" decides which genetic operator (mutation,
crossover, etc.) will be executed in the next step. This kind of flexibility
is used to make the EA adaptive: a learning engine collects statistics
about the success of the operators and the decision engine uses this information to
choose the operators to be executed next. (See [9].)
We did not implement the adaptivity of FEA because each EA-unit
executes only 100-200 genetic operators in our case, which is too few for
reliable statistics. However, the non-deterministic order of genetic operators
allowed us to read and use the results of object function evaluations of other
units provided by the communication agent program.
The skeleton of a robust and parallel EA-unit:

1. Population=empty
2. sort the population with niching
3. if ( there are new chromosomes on disk )
       read new values
4. if ( Population.size < Size_min )
       generate and evaluate a random chromosome
       --> step 2
5. if ( Population.size > Size_max )
       truncate Population
6. if ( the best of Population changed )
       try one line search step between
       old and new best value
       --> step 2
7. find a new chromosome by a random elementary step
8. evaluate the new chromosome
9. add/replace new chromosome to Population
10. --> step 2
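The skeleton above can be turned into a small runnable sketch. Everything here is a simplified illustration under stated assumptions, not the authors' C implementation: niching is replaced by a plain sort, the "line search step" is a single extrapolation from the old best towards the new best, reading from disk is abstracted into a `read_inbox` callable, and `gaussian_step` is a stand-in for the random elementary step.

```python
import random

def gaussian_step(population, rng, sigma=0.3):
    """Stand-in elementary step: perturb every gene of a random member."""
    _, parent = rng.choice(population)
    return [g + rng.gauss(0.0, sigma) for g in parent]

def run_ea_unit(evaluate, new_random, read_inbox, step=gaussian_step,
                size_min=4, size_max=12, max_evals=200, seed=0):
    """Maximize `evaluate`; the population is a list of (value, chromosome)."""
    rng = random.Random(seed)
    population = []                                    # step 1
    evals = 0
    old_best = None

    def add(chrom):
        nonlocal evals
        population.append((evaluate(chrom), chrom))    # steps 8-9
        evals += 1

    while evals < max_evals:
        population.extend(read_inbox())                # step 3: foreign results
        population.sort(key=lambda p: p[0], reverse=True)  # step 2, no niching
        if len(population) < size_min:                 # step 4
            add(new_random(rng))
            continue
        del population[size_max:]                      # step 5
        best = population[0]
        if old_best is not None and best[0] > old_best[0]:  # step 6
            # Assumed line search: one extrapolated point beyond the new best.
            trial = [b + 0.5 * (b - a) for a, b in zip(old_best[1], best[1])]
            old_best = best
            add(trial)
            continue
        if old_best is None:
            old_best = best
        add(step(population, rng))                     # step 7
    return max(population, key=lambda p: p[0])
```

Each pass through the loop performs exactly one object function evaluation, so foreign results are picked up after every evaluation, which is the property the text asks for.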

The possible elementary steps we used:

- mutate a randomly chosen chromosome
- single point crossover between two different chromosomes selected by
binary tournament
- examine a random chromosome: if it lies within a specified distance of a
better chromosome, then mutate it (constrained mutation)
The set of elementary steps was assembled using the experience of
test calculations on classical test problems. Hill climbing was not used because
it would prevent the immediate reading and use of the results of other units. Instead,
we implemented step 6 and "constrained mutation" (see above), which
search near the existing good chromosomes and have a similar effect.
The probabilities of the elementary steps are chosen so that in a long calculation
their numbers are equal to the numbers of such steps in a classical
GA.
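The three elementary steps listed above might look as follows for real-valued chromosomes. This is a sketch under assumptions: the mutation is a clipped Gaussian perturbation of one gene, the distance is Euclidean, and the population is a list of (value, chromosome) pairs with chromosomes of length at least 2; none of these details are specified in the text.

```python
import math
import random

def mutate(chrom, lower, upper, rng, sigma=0.1):
    """Perturb one randomly chosen gene and clip it to its limits."""
    child = list(chrom)
    i = rng.randrange(len(child))
    spread = sigma * (upper[i] - lower[i])
    child[i] = min(upper[i], max(lower[i], child[i] + rng.gauss(0.0, spread)))
    return child

def binary_tournament(candidates, rng):
    """Pick two different candidates at random and return the better one."""
    if len(candidates) == 1:
        return candidates[0]
    a, b = rng.sample(candidates, 2)
    return a if a[0] >= b[0] else b

def single_point_crossover(population, rng):
    """Cross two different parents, each selected by binary tournament."""
    p1 = binary_tournament(population, rng)
    rest = [p for p in population if p is not p1]
    p2 = binary_tournament(rest, rng)
    cut = rng.randrange(1, len(p1[1]))
    return list(p1[1][:cut]) + list(p2[1][cut:])

def constrained_mutation(population, lower, upper, rng, max_dist):
    """Mutate a random chromosome only if a better one lies within max_dist."""
    value, chrom = rng.choice(population)
    for other_value, other in population:
        if other_value > value and math.dist(chrom, other) < max_dist:
            return mutate(chrom, lower, upper, rng)
    return None  # no better neighbour: this step produces nothing
```

Selecting the second parent from the population without the first one guarantees two different chromosomes, which requires at least two members in the population.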

Test results. We performed several test calculations with different agent
strategies on classical test problems (Rastrigin and Keane functions, see [5]).
First we present results with a very simple agent strategy, called "uniform
distribution", which means that the agent program periodically collects the best
two chromosomes from each unit and sends them to all the other units. We
found that the period of collecting and distributing the best results should
be longer than 5-10 object function evaluation times but shorter than one
tenth of the total calculation time. Within this wide range we found that
the convergence of the best object function value does not depend on the
redistribution frequency of the best results.
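For readers who want to repeat such tests, the two test functions can be written as below. The definitions are the forms commonly quoted in the optimization literature; whether [5] uses exactly these variants (bounds, constraint handling) is not stated in the text, and returning 0.0 for infeasible points of Keane's function is an assumption made here.

```python
import math

def rastrigin(x):
    """Rastrigin function (to be minimized); global minimum 0 at x = 0."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x)

def keane(x):
    """Keane's bump function (to be maximized), 0 < x_i < 10.

    Constraints: prod(x) > 0.75 and sum(x) < 7.5 n. Infeasible points get
    the value 0.0 (assumed penalty).
    """
    n = len(x)
    if any(xi <= 0.0 or xi >= 10.0 for xi in x):
        return 0.0
    if math.prod(x) <= 0.75 or sum(x) >= 7.5 * n:
        return 0.0
    num = abs(sum(math.cos(xi) ** 4 for xi in x)
              - 2.0 * math.prod(math.cos(xi) ** 2 for xi in x))
    den = math.sqrt(sum((i + 1) * xi * xi for i, xi in enumerate(x)))
    return num / den
```

Since the EA-unit maximizes its object function, the Rastrigin function would be used with a minus sign in such a test.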
In Figure 4 we present the results for the 20-variable Keane function problem
with 1, 4 and 16 EA-units using the uniform distribution strategy. One can observe
decreased efficiency in the case of 16 EA-units. Such a decay is a well-known property
of parallel algorithms.
We tested the robustness of our parallel EA strategy on test problems. In
a test calculation with 16 EA-units we randomly stopped the agent program
for significant intervals to simulate network errors. It is clearly seen that during
the "network errors" the growth of the best object function values slowed down, but
when the communication was restored, the results increased rapidly.
This shows that without communication the EA-units continued working and
produced a high variety of chromosomes, and when the communication was
restored, the units could use them to produce good individuals. (See Fig. 4,
right.)
To circumvent the efficiency loss mentioned above, we divided the EA-units into
groups of 5-10 elements and used uniform distribution within them. Another
agent program was applied to realize the communication between groups. This
multilevel strategy is closely related to the island model; in our case
a group of EA-units corresponds to an island.

Fig. 4. Test calculations with 20-variable Keane-function. Left: N = 1, N = 4 and
N = 16 EA-units; right: N = 16 reference run and N = 16 with simulated errors.
Horizontal axes: object function evaluations.
In the real CFD shape optimization problem we could use at most 32
workstations. Following the multilevel agent strategy, we divided them into
3 groups (A, B and C) with approximately equal numbers of members and implemented
a non-symmetric data flow in the top-level agent program: the best results
of groups B and C were sent to group A periodically. This way groups B and
C evolved separately from the other groups, which is a good strategy to maintain
diversity, while group A could combine all the best chromosomes.
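The two agent levels can be sketched in memory as follows. In the paper the agents are shell scripts moving files between workstations; here the disks are replaced by dictionaries, and the function names, the number of chromosomes sent at the top level and the choice to deliver them to every unit of group A are assumptions.

```python
def best_of(population, count=2):
    """Return the `count` best (value, chromosome) pairs."""
    return sorted(population, key=lambda p: p[0], reverse=True)[:count]

def uniform_distribution(inboxes, populations, group):
    """Send the best two chromosomes of each unit to all other group units."""
    for src in group:
        for item in best_of(populations[src]):
            for dst in group:
                if dst != src:
                    inboxes[dst].append(item)

def top_level_exchange(inboxes, populations, groups, target="A", count=2):
    """Non-symmetric flow: best results of the other groups go to `target`."""
    for name, members in groups.items():
        if name == target:
            continue
        pooled = [p for unit in members for p in populations[unit]]
        for item in best_of(pooled, count):
            for dst in groups[target]:
                inboxes[dst].append(item)
```

Both functions would be called periodically by the agent; nothing flows back from group A to groups B and C, so those groups keep evolving independently.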

4 Results of intake port optimization

4.1 About technical aspects

Both the CFD and the EA code were written in standard C. The agents were Bourne
shell scripts. The calculations were performed on Linux workstations at
Szechenyi Istvan University. The maximum number of workstations was 32.
The CPU speeds were between 1.5 and 2.4 GHz.
A typical evaluation took 3-4 hours (on 2.4 GHz Pentium 4 machines).
There were 6 significant hardware problems during the calculations (power
outages, hard disk problems, network problems, etc.). This fact shows that the
robustness of the EA method was of critical importance in our case.

4.2 Preparation

Based on the previously mentioned calculations, the most sensitive parts of the intake
system geometry were chosen. (E.g. large pressure gradients on the surface indicate
large resistance, high velocity values near the boundaries indicate important
parts, etc.)
Small deformations of the sensitive parts were selected with 5-90 mm radius
and (-15)-(+15) mm maximum size.
The goal of the optimization process was to find the optimal deformation size
values at fixed locations with fixed radii. Thus we used the floating point
values of the deformation sizes as genes.

4.3 Calculations

In the first calculation 20 deformation points were used. On each EA-unit
Size_min=40, Size_max=120 were set. The optimal value we found in 2000
function evaluations was 1.016. Although this is significant, we wanted to get
a higher improvement.
The study of the optimal chromosomes led to important conclusions:
- There were 7 genes where the optimal value was at the lower or upper limit of
that deformation size (extremal points).
- There were 5 genes where the absolute value of the optimal deformation size
was less than 1 mm (irrelevant points).
Using this result, a new set of deformations with new limits on the deformation
sizes was chosen and a completely new optimization process was started: the 5
irrelevant points were dropped and 3 new (hopefully not irrelevant) points were added.
The limits for the 7 extremal points were modified. We present the results of
the optimization of these 18 parameters. Figure 5 shows the object function
values during the calculations. One can observe very small improvements in
the best values and decreasing diversity at the end of the calculations. These symptoms
indicate that there is no reason to continue the calculations further.
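The screening of genes into extremal and irrelevant points, as used between the two optimization runs, can be expressed in a few lines. The label names and the tolerance for "at the limit" are assumptions; the 1 mm threshold is the one quoted in the text.

```python
def classify_genes(best, lower, upper, tol=1e-6, irrelevant_below=1.0):
    """Label each gene (deformation size in mm) of the best chromosome."""
    labels = []
    for g, lo, hi in zip(best, lower, upper):
        if abs(g - lo) <= tol or abs(g - hi) <= tol:
            labels.append("extremal")    # optimum sits on a limit
        elif abs(g) < irrelevant_below:
            labels.append("irrelevant")  # optimum is nearly no deformation
        else:
            labels.append("regular")
    return labels
```

Extremal genes suggest widening the limits, while irrelevant genes are candidates for removal from the parameter set.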

Fig. 5. Object function values during optimization process. Curves: groups A, B
and C; horizontal axes: time in days.

The best chromosome we found has an object function value of 1.024. This corresponds to
a 2.5% improvement in $C_f$ and a 3% decrease in $C_s$. In practice it may result in
e.g. approximately 2.5% extra power with similar quality of burning.
We present the most important differences between the original and optimal
shapes in Figure 6. The changes are plausible, but the deformation sizes could not
have been determined by hand.

Fig. 6. The resulting "optimal" shape. Black: modified, optimal shape; light
gray: original shape; mid gray: unchanged parts

We examined the flow in the optimal shape and found significant differences.
For example, the pressure gradient at the surface decreased notably.

5 Conclusions

The robust parallel EA method proved to be useful under non-reliable hardware
circumstances. In future work we will focus on optimizing the EA-units and
the agent strategies.
With small deformations (less than 12 mm in size) a significant improvement
was achieved. With a larger set of deformations some further improvement is to
be expected, but a much higher improvement probably requires a major redesign.
We can conclude that our system is suitable for real 3D compressible fluid
flow shape optimization tasks.

References

1. Feistauer, M. (1993): Mathematical Methods in Fluid Dynamics. Pitman
Monographs and Surveys in Pure and Applied Mathematics, Longman Scientific &
Technical, UK.
2. Goldberg, D. (1989): Genetic algorithms in search, optimization and machine
learning. Addison-Wesley Publ.
3. Horvath, A. and Horvath, Z. (2003): Application of CFD numerical simulation
for intake port shape design of a diesel engine. J. Comput. and App. Mech., 4,
129-146.
4. Kriis, H. (1993): Numerical simulation and experimental verification of DI diesel
intake port designs. 4th Int. Conf. on Vehicle and Traffic Systems Technology,
Strasbourg.
5. Marco-Blaszka, N., Desideri, J. (1999): Numerical solution of optimization
test-cases by genetic algorithms. INRIA research report 3622.
6. Marco, N. and Lanteri, S. (2000): A two-level parallelization strategy for genetic
algorithms applied to optimum shape design. Parallel Computing, 26, 377-397.
7. Vijayasundaram, G. (1986): Transonic flow simulations using an upstream
centered scheme of Godunov in finite elements. J. Comput. Phys. 63, 416-433.
8. Whitley, D. (2001): An overview of evolutionary algorithms: practical issues and
common pitfalls. Information and Software Technology, 43, 817-831.
9. Winter, G., Galvan, B., Alonso, S., Gonzalez, B. (2002): Evolving from genetic
algorithms to flexible evolution agents. In: Late Breaking Papers at the Genetic
and Evolutionary Computation Conference (GECCO-2002).
Numerical Simulation of Compressible Fluids
with Moving Boundaries: An Effective Method
with Applications*

Zoltan Horvath 1 and Andras Horvath 2

1 Department of Mathematics, Szechenyi Istvan University, Gyor, Hungary
[email protected]
2 Department of Physics, Szechenyi Istvan University, Gyor, Hungary
[email protected]

Summary. In this paper we present a numerical algorithm for the solution of the
equations of compressible nonviscous fluids on domains with moving (translating)
boundaries. Our moving mesh algorithm, defined on special tetrahedral meshes,
avoids global interpolation and re-meshing; thus it works quite efficiently on problems
with strongly deforming domains. As an illustration we give some computational
results obtained when our algorithm is applied to the simulation of gas flow in a
high-voltage circuit breaker.

1 The engineering problems and the scope of the paper

The simulation of many industrial processes requires the numerical solution of
compressible fluid flow on moving domains. Examples include airfoil
oscillation, mixture formation in the combustion chamber and the cylinder of
a Diesel engine, and flow development in a circuit breaker. In several problems
the flow domain is strongly compressed and/or stretched in a certain
interval of the simulation time. So, to avoid small time step sizes due to distorted
cells, re-meshing of the flow domain and, accordingly, interpolation of the state
variables are necessary from time to time, at least in the case of explicit methods.
However, global re-meshing and interpolation are time-consuming; moreover,
the latter introduces additional numerical and non-conservativity errors.
Further, even if the state variables are interpolated in a conservative way, the
conservation errors of some other important conserved quantities, such as
total angular momentum, usually increase, see e.g. [2].
In this paper we present our method, which we applied successfully
to real-life problems such as the simulation of the flow in a Diesel engine
and in a high-voltage circuit breaker (this can be considered a domain with
several pistons and valves). In both problems we have to compute compressible
(multicomponent) gas flow in a strongly deforming 3D domain where the
deformation is induced by translating boundary parts. In Section 2 we pose
our mathematical model, the multicomponent compressible Euler equations
on moving domains. Then in Section 3 we introduce a first order finite volume
method based on Vijayasundaram's numerical flux function with explicit time
stepping. The core of the numerical algorithm is the moving tetrahedral mesh
algorithm called snapper, the basic idea of which goes back to the snapper
algorithm for hexahedral meshes given in [2]. This results in a method which
requires re-meshing and interpolation only in the very close neighbourhood of
the moving objects and, moreover, these are done very efficiently.
This numerical method is coded and in Section 4 we show applications to a 3D
(academic) test problem and to the industrial problem of gas flow in a high-voltage
circuit breaker. We conclude that comparisons of the computations
with the exact solution of the test problem and with actual physical
measurements for the latter problem show good agreement.
* This paper was supported under the Hungarian Scientific Research Fund
OTKA T43177.

2 The mathematical model

We have selected the multicomponent Euler equations of gas dynamics on moving
domains as our mathematical model of the fluid flow problems described
in Section 1. This model is suitable for the modelling of flows where the effect
of viscosity is not significant compared to that of convection.

$$\frac{\partial u}{\partial t} + \operatorname{div} f(u) = 0 \quad \forall t \in [0, t_{max}],\ x \in D(t) \tag{1}$$
$$u(0, x) = u_0(x) \quad \forall x \in D(0) \tag{2}$$
+ BC (boundary conditions) (3)
+ EOS (equations of state) (4)

where
- $u : \bigcup_{t \in [0, t_{\max}]} \{t\} \times D(t) \to \mathbb{R}^{K+4}$ is the state variable with
  $u = (\rho_1, \ldots, \rho_K, \rho v^T, e)$, where $\rho_k = \rho_k(t,x)$ $(k = 1, \ldots, K)$, $\rho := \sum_m \rho_m$,
  $v = v(t,x) = (v_1, v_2, v_3)^T$ and $e = e(t,x)$ are respectively the densities of the fluid
  components, the density of the fluid, the fluid velocity and the total energy density
  (i.e. the total energy per unit volume of the fluid);
- $f = (f_1, f_2, f_3)^T$ is the flux vector with
  $f_i(u) = (\rho_1 v_i, \ldots, \rho_K v_i, (\rho v_i v + p e_i)^T, v_i (e + p))^T$ $(i = 1, 2, 3)$, where $e_i$ is the
  $i$th coordinate unit vector; $\operatorname{div} f(u) = \sum_{i=1}^{3} \partial f_i(u) / \partial x_i$;
- EOS denotes the set of the equations of state of a non-ideal gas mixture:
\[ p = R T \sum_{m=1}^{K} \frac{\rho_m}{W_m}, \qquad e = \sum_{m=1}^{K} \rho_m I_m(T) \tag{5} \]
  where $T = T(t,x)$ denotes the temperature of the fluid; $I_m(T)$ is the specific
  internal energy given a priori by interpolation formulas based on tabulated
  values, $R$ is the universal gas constant and $W_m$ is the molecular weight of
  the $m$th fluid component;
- BC: linearly consistent boundary conditions formulated with the help of ghost
  cells (for more details see e.g. [3] pp. 457-460, [11] pp. 222-224 with an
  emphasis on moving boundaries; see also [2]);
- $D(t)$ is the time-dependent flow domain defined in the following way: $D(0)$
  is a given initial domain, which deforms according to the mapping
  $\varphi : [0, t_{\max}] \times D(0) \to \mathbb{R}^3$, i.e. $D(t) := \{ x = \varphi(t, \xi) \in \mathbb{R}^3 \mid \xi \in D(0) \}$; we
  suppose that $\varphi_t := \varphi(t, \cdot)$ is one-to-one for all $t$; then the velocity of the
  points of $D$ is given by $\kappa(t,x) := \frac{\partial \varphi}{\partial t}(t, \varphi_t^{-1}(x))$ for all $t \in [0, t_{\max}]$ and all $x \in D(t)$.
Note that <p is not unique if we prescribe only the deformation of the boundary
of D, which is the case in the situations we are focussing on in this paper.
The following lemma of the calculus called Reynolds' transport theorem is
the basic tool to obtain an integral formulation for the fluid flow on a moving
domain.
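Lemma 1. Let $V$ be a moving subdomain of $D$, i.e. $V(0) \subset D(0)$ and $V(t) := \varphi_t(V(0))$.
Then for any differentiable function $\psi : \mathbb{R}^4 \to \mathbb{R}$ we have

\[ \int_{V(t)} \frac{\partial \psi}{\partial t}(t,x)\,dx = \frac{d}{dt} \int_{V(t)} \psi(t,x)\,dx - \int_{\partial V(t)} \psi(t,s)\,\kappa(t,s) \cdot n(t,s)\,ds. \]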
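
The identity of Lemma 1 can be checked numerically in one space dimension. The following is a minimal sketch under assumptions that are not part of the paper: the moving interval V(t) = [0, 1 + 0.5 t], generated by the mapping phi(t, xi) = xi (1 + 0.5 t), and the test function psi(t, x) = exp(-t) cos(x) are chosen purely for illustration, and all function names are hypothetical.

```python
import math

def midpoint_integral(func, a, b, n=4000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def right_end(t):
    """Right end point of the assumed moving interval V(t) = [0, 1 + 0.5 t]."""
    return 1.0 + 0.5 * t

def psi(t, x):
    """Assumed smooth test function."""
    return math.exp(-t) * math.cos(x)

def dpsi_dt(t, x):
    """Partial time derivative of psi."""
    return -math.exp(-t) * math.cos(x)

def reynolds_sides(t, dt=1e-5):
    """Return the left- and right-hand sides of the identity in Lemma 1."""
    b = right_end(t)
    lhs = midpoint_integral(lambda x: dpsi_dt(t, x), 0.0, b)

    def total(s):
        return midpoint_integral(lambda x: psi(s, x), 0.0, right_end(s))

    # d/dt of the integral over V(t), by a central finite difference
    d_total = (total(t + dt) - total(t - dt)) / (2.0 * dt)
    # boundary term: kappa = 0.5, n = +1 at x = b; kappa = 0 at x = 0
    boundary = psi(t, b) * 0.5 * 1.0
    return lhs, d_total - boundary
```

Both returned values should agree up to quadrature and finite-difference error; for this choice of psi they also agree with the closed form -exp(-t) sin(1 + 0.5 t).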

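Integrating (1) over a moving subdomain $V$ we get, by applying Lemma 1,
a weak formulation of the Euler equations on moving domains as

\[ \frac{d}{dt} \int_{V(t)} u\,dx + \int_{\partial V(t)} \bigl( f(u) \cdot n - \kappa \cdot n\, u \bigr)\,ds = 0 \quad \forall V \subset D \text{ moving subdomain of } D \tag{6} \]

or, integrating in time and using the notation

\[ \bar u = \bar u_V(t) := \frac{1}{|V(t)|} \int_{V(t)} u(t,x)\,dx, \]

\[ |V(t_b)|\,\bar u(t_b) - |V(t_a)|\,\bar u(t_a) + \int_{t_a}^{t_b} \int_{\partial V(t)} \bigl( f(u) \cdot n - \kappa \cdot n\, u \bigr)\,ds\,dt = 0 \quad \forall [t_a, t_b] \subset [0, t_{\max}],\ \forall V \subset D. \tag{7} \]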

3 The numerical methods


For the construction of a numerical method for the problem posed in Section 2
we consider a moving tetrahedral mesh of $D(t)$. This means that we have
a face-to-face partition of $D$ into tetrahedra $T_j$, $j = 1, \ldots, N$, such that each
tetrahedron is moving according to $\varphi$, the transformation function of $D$. We
allow that at certain time-points, called snapping (or re-meshing) points of time,
the structure of the partition and even $N$ can be changed.
Further, we denote by $S_{jl}$ the $l$th side of $T_j$ $(l = 1, \ldots, 4)$ according to
a fixed convention, which does not change between subsequent snapping
points. Then the tetrahedron neighbouring $T_j$ and sharing the face $S_{jl}$ will
be denoted by the local indexing $T_{jl}$ as well. ($T_{jl}$ denotes a "ghost" tetrahedron
if $S_{jl} \subset \partial D$.) Further, let $n_{jl}$ denote the outer unit normal vector of $T_j$
corresponding to the face $S_{jl}$.
For the time discretization we suppose that an adaptively defined subdivision
of $[0, t_{\max}]$ as $0 = t_0 < \ldots < t_{n-1} < t_n < \ldots < t_m = t_{\max}$ is given,
where each snapping point belongs to $\{t_n\}_{n=0}^{m}$; further, $\tau_n := t_n - t_{n-1}$ is the
$n$th time step-size.
Now we are in a position to introduce our numerical method. Suppose
that $u$ is a weak solution of the problem posed in Section 2. Applying (7)
with $V = T_j$ on $[t_{n-1}, t_n]$ we obtain the explicit scheme for the values
$u_j^n \approx \frac{1}{|T_j(t_n)|} \int_{T_j(t_n)} u(t_n, x)\,dx$:

\[ |T_j(t_n)|\,u_j^n - |T_j(t_{n-1})|\,u_j^{n-1} + \tau_n \sum_l \bar\nu_{jl}\, g\bigl(u_j^{n-1}, u_{jl}^{n-1}, \bar n_{jl}, \bar\kappa_{jl}\bigr) = 0, \quad n = 1, 2, \ldots \tag{8} \]

where $g$ is the numerical flux function on sides and $\bar n_{jl}$, $\bar\kappa_{jl}$, $\bar\nu_{jl}$ approximate
in some sense $n_{jl}$, $\kappa_{jl}$ and $|S_{jl}|$, respectively, such that

\[ \int_{t_{n-1}}^{t_n} \int_{S_{jl}(t)} \bigl( f(u) \cdot n - \kappa \cdot n\, u \bigr)\,ds\,dt \approx \tau_n \bar\nu_{jl}\, g\bigl(u_j^{n-1}, u_{jl}^{n-1}, \bar n_{jl}, \bar\kappa_{jl}\bigr). \tag{9} \]

The initial values for (8) are defined by $u_j^0 := \frac{1}{|T_j(0)|} \int_{T_j(0)} u_0(x)\,dx$. For
a time-stepping scheme we have to define the moving mesh algorithm, the
numerical flux function $g$ and the geometrical parameters. We devote the
following three subsections to the definition of these.

3.1 The moving mesh algorithm: the snapper

Here we give an algorithm for an efficient discretization of the moving domain
$D = D(t)$ which is strongly deforming due to some translating parts
of the boundary. At first we divide $D$ into non-overlapping blocks (moving
subdomains) according to the type of the movement/deformation and discretize
the blocks separately, taking care of the face-to-face property on the common
parts of the boundary of the blocks; the union of the tetrahedra of the blocks
finally constitutes the tetrahedral mesh $\{T_j\}$. The type of a block $B$ can be
fixed (if $\kappa \equiv 0$), shifting (if $\kappa(t,x) \equiv \beta(t) b$) or deforming. We call $B$ of the
latter type if $B$ is translation invariant in the sense that there exists a surface
$S_0 \subset \mathbb{R}^3$ such that $B_0 := \bigcup_t B(t)$, the frame of $B$, equals the volume swept by
shifts of $S_0$, i.e. $B_0 = \bigcup_{\theta \in [0,1]} (S_0 + \theta b)$, and, moreover, there exists a moving
part of the boundary $M \subset \partial B$ such that $\kappa(t,x) = \beta(t) b$ whenever $x \in M(t)$
and $\kappa \cdot n = 0$ otherwise.
The discretization of the blocks of the first or the second type can be an
arbitrary tetrahedral mesh fitting the geometry at time $t = 0$ and this is left
unchanged in time (first type) or simply shifted by $\beta(t) b$ (second type). Let
us now assume that $B$ is of deforming type and for ease of presentation
suppose that $S_0 \subset \partial B(0)$ and $M(0) \subset S_0$ (e.g. the block of a valve in its
bottom dead center). The task we have to solve is the discretization of $B$
at time points $t_n$ given by the flow calculation (the "hydrocode") such that
$B(t_n)$ and its mesh is derived by the mapping $\varphi$ from the given $B(t_{n-1})$ and
its mesh $(n = 1, 2, \ldots)$; this step is called mesh modification. But first we need
the discretization of the frame $B_0 = B(0)$.

Layered tetrahedral mesh generation. The steps of the discretization of
$B_0$ are the following. (For an illustration see Figure 1.)
1. Triangulate $M(0) \subset S_0$ and extend this triangulation to that of $S_0$.
2. Translate this triangulation by $\frac{1}{\ell} b$, where $\ell$ is a positive
integer, to obtain a layer of prisms.
3. Divide the prisms into 3 tetrahedra each to get a face-to-face tetrahedral
mesh of this layer.
4. Translate the last layer, together with its mesh, further in the direction
of $b$ as long as necessary.
Note that Step 3 is not a trivial task; for a solution see [6].
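
One common way to make Step 3 face-to-face is to choose the diagonal of every quadrilateral prism side from the global node indices of its bottom edge. The sketch below uses this index-ordering rule as a stand-in; it is not claimed to be the construction of [6]. It assumes that the top node above bottom node `i` has index `i + offset`, and the names `split_prism`, `offset` and `tet_volume` are hypothetical.

```python
import numpy as np

def split_prism(tri, offset):
    """Split the prism over the bottom triangle `tri` into 3 tetrahedra.

    tri    : the 3 global node indices of the bottom triangle
    offset : assumed index shift between a bottom node and the node above it
    On the quadrilateral side over a bottom edge (i, j) with i < j, the
    diagonal always joins bottom node i with top node j + offset, so two
    prisms sharing that side split it in the same way.
    """
    a0, a1, a2 = sorted(tri)
    b0, b1, b2 = a0 + offset, a1 + offset, a2 + offset
    return [(a0, a1, a2, b2), (a0, a1, b1, b2), (a0, b0, b1, b2)]

def tet_volume(points):
    """Volume of a tetrahedron given as a (4, 3) array of vertex coordinates."""
    return abs(np.linalg.det(points[1:] - points[0])) / 6.0
```

For a unit square split into the triangles (0, 1, 2) and (0, 2, 3) and extruded by one layer with `offset = 4`, the six tetrahedra fill the unit cube and the two prisms induce the same pair of triangles on their common quadrilateral side.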
We remark that this algorithm was successfully applied to the discretization
of non-deforming and non-regular blocks as well, in the following way: first
we enclosed the block to be meshed in a layered mesh (in these steps we
allowed "biased" translations), then omitted the tetrahedra not intersecting
the block, and finally dragged the boundary nodes of the union of the
remaining tetrahedra to the boundary of the block.

The algorithm of the mesh modification. Suppose that we are given the
mesh of $B(t_{n-1})$, $\varphi$ and $\tau_n$. We shall call the layer in the direction of $b$ from
$M(t_{n-1})$ the moving layer.
1. Compute first the new position of the moving part, i.e. $M(t_{n-1} + \tau_n)$.
2. If the height of the moving layer is smaller than half of the original height
(the original height being $\frac{1}{\ell}|b|$), or even worse, $M(t_{n-1} + \tau_n)$ does not belong
to the moving layer, reduce $\tau_n$ so that in the new position of the moving part
the height of the layer is exactly half of the original height, and take
$M(t_n) = M(t_{n-1} + \tau_n)$.
3. Inherit the topological structure of the mesh of $B(t_{n-1})$ for the mesh of $B(t_n)$;
only update the coordinates of the nodes of $M(t_n)$ according to the prescribed
mapping $\varphi$, and re-calculate the geometrical data (area of sides, etc.). In this
case we assign the node velocities as mean velocities; for example, for the
node $A$ on the moving part we take $\kappa(t, x_A) = (x_A(t_n) - x_A(t_{n-1})) / \tau_n$.
4. If the height of the moving layer is exactly half of the original, we shall call
the neighbouring layer (in the direction of $b$) the new moving layer, and
its nodes lying back in the direction of $b$ are snapped to the corresponding
nodes on $M(t_{n-1})$. The nodes of $M(t_{n-1})$ are snapped back to the initial position
of the former active layer and are deactivated (i.e. marked as no longer
belonging to the flow domain). Of course, the geometrical
data have to be updated, and since the tetrahedra in the new moving layer
corresponding to the moving part are derived by joining two layers, the state
variables have to be interpolated (discussed below).

Fig. 1. Layered tetrahedral mesh generation (panels: triangulation of S, layer of prisms, layer of tetrahedra, repeated layers)
Step 4 in the algorithm above is called the snapping step and the whole
algorithm is the snapper (cf. [2]). For an illustration of this algorithm in the 2D
case see Figure 2. A whole 3D tetrahedral mesh has too complex a structure
to illustrate the snapping on it; however, the idea can be understood in 2D,
and in Figure 3 we present the basic blocks of a 3D mesh, before and after
a snapping.
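
The time-step control of Step 2 and the snapping trigger of Step 4 can be sketched for a single moving layer as follows. This is only an illustration under simplifying assumptions that are not taken from the paper: the moving part is assumed to compress the layer at a constant speed during the step, and the names `limit_time_step`, `height`, `original_height` and `closing_speed` are hypothetical.

```python
def limit_time_step(height, original_height, closing_speed, tau):
    """Return (tau_used, snap) for one mesh modification step.

    height          : current height of the moving layer
    original_height : height of an uncompressed layer
    closing_speed   : assumed constant speed at which the layer is compressed
    tau             : time step proposed by the flow solver
    The step is shortened so that the layer is never compressed below half
    of its original height; snap is True when exactly half is reached.
    """
    half = 0.5 * original_height
    if closing_speed <= 0.0:
        # the layer is not being compressed: no restriction, no snapping
        return tau, False
    tau_to_half = (height - half) / closing_speed
    if tau_to_half <= tau:
        return max(tau_to_half, 0.0), True
    return tau, False
```

A call such as `limit_time_step(0.6, 1.0, 2.0, 0.1)` shortens the step to about 0.05 and signals that a snapping step should follow.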

Fig. 2. Sketch for the snapper algorithm (stages: deform, snap, deform; legend: solid, type 1 snap, type 2 snap)

Due to the mesh generation and the mesh modification algorithms given
above, the tetrahedral meshes before and after snapping have good properties,
which enable us to do the interpolation of the state variables in an effective
and conservative way. One can observe that the grids before and after snapping
can be divided into "snapping blocks" of 2 or 3 prisms in such a way that these
blocks before and after snapping are coincident.
It is obvious that snappings must be handled in a different way if the base
triangle of a prism of the moving layer is in the interior of $M(t_{n-1})$ (type 1
snap, 2-prism snapping blocks) or if there are only one or two vertices of the
base triangle on $M(t_{n-1})$ (type 2 snap, 3-prism snapping blocks). Type 1 and
type 2 snapping blocks in 2D are marked in Figure 2. In Figure 3 we present the
structure of type 1 and type 2 snapping blocks in 3D. We have to remark
that, depending on which vertices of the base triangle are on $M(t_{n-1})$, there are 6
different cases of type 2 snappings.

Fig. 3. Corresponding snapping blocks in type 1 (left) and type 2 (right) cases

The interpolation of the state variables can be done locally, inside the
snapping blocks. This means interpolation between 3 new and 6 old tetrahedra
in a type 1 snap, and between 9 new and 9 old tetrahedra in a type 2 snap.
Since our method is of first order and the state variables are densities
assigned to tetrahedra, this interpolation can be implemented in a natural way:
the state variables after snapping are linear combinations of the pre-snapping
values, and the elements of the transformation matrices can be calculated by
determining the overlapping volume ratios of the tetrahedra. For example, in
a type 1 snap the interpolation matrix relating the 3 new tetrahedra to the
6 old ones is the following (with an appropriate ordering of the tetrahedra):

\[ \begin{pmatrix} 1/3 & 2/9 & 4/27 & 8/27 & 0 & 0 \\ 0 & 1/9 & 4/27 & 8/27 & 4/9 & 0 \\ 0 & 0 & 1/27 & 2/27 & 2/9 & 2/3 \end{pmatrix} \]

Notice that this snapping algorithm, including the elements of the interpolation
matrices, is independent of the triangulation of the moving surface.
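
The type 1 matrix can be checked for consistency with exact rational arithmetic. The sketch below assumes, as an interpretation that the text does not state explicitly, that each row gives the fractions of one new tetrahedron covered by the six old ones, that the first three old tetrahedra belong to the layer compressed to half height (volume 1/6 each, in units of the original prism volume), that the last three belong to the uncompressed neighbouring layer (volume 1/3 each), and that each new tetrahedron has volume 1/2.

```python
from fractions import Fraction as F

# Type 1 interpolation matrix as printed in the text (3 new x 6 old tetrahedra)
M = [
    [F(1, 3), F(2, 9), F(4, 27), F(8, 27), F(0), F(0)],
    [F(0), F(1, 9), F(4, 27), F(8, 27), F(4, 9), F(0)],
    [F(0), F(0), F(1, 27), F(2, 27), F(2, 9), F(2, 3)],
]

# Assumed volumes, in units of the volume of one uncompressed prism
old_volumes = [F(1, 6)] * 3 + [F(1, 3)] * 3
new_volumes = [F(1, 2)] * 3

def interpolate(old_values):
    """New cell averages as linear combinations of the old cell averages."""
    return [sum(m * u for m, u in zip(row, old_values)) for row in M]

def total(volumes, values):
    """Integral of a piecewise constant density over the snapping block."""
    return sum(v * u for v, u in zip(volumes, values))
```

Under these assumptions every row sums to one, so constant states are reproduced, and `total(new_volumes, interpolate(old))` equals `total(old_volumes, old)` for any old values, which is the conservation property mentioned above.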

3.2 The geometrical parameters for the scheme

In order to obtain appropriate geometrical parameters $\bar n_{jl}$, $\bar\kappa_{jl}$ and $\bar\nu_{jl}$, the
discrete version of the geometric conservation law (GCL) gives us a guideline.
For the concept and importance of the discrete GCL condition consult e.g. [4].
The GCL condition is derived from the fact that the constant flow $u(t,x) \equiv u^* = \text{const.}$
is a solution of (1) and also of its weak form (7), whenever a suitable
BC is prescribed. Hence we have for all moving subdomains $V$ and $[t_{n-1}, t_n]$
(cf. (7))

\[ \bigl( |V(t_n)| - |V(t_{n-1})| \bigr) u^* + f(u^*) \cdot \int_{t_{n-1}}^{t_n} \int_{\partial V(t)} n\,ds\,dt - \Bigl( \int_{t_{n-1}}^{t_n} \int_{\partial V(t)} \kappa \cdot n\,ds\,dt \Bigr) u^* = 0, \]

i.e., employing the identity $\int_{\partial V(t)} n\,ds = 0$,

\[ |V(t_n)| - |V(t_{n-1})| - \int_{t_{n-1}}^{t_n} \int_{\partial V(t)} \kappa \cdot n\,ds\,dt = 0. \tag{10} \]

It is a natural requirement that the discretization of the problem should preserve
this property, called the discrete GCL, i.e. the deformation of the mesh alone
should not change a constant flow, which means that, for all $u^* = \text{const.}$, if
$u_j^{n-1} \equiv u^*$ for all $j$ then $u_j^n \equiv u^*$ for all $j$.
Lemma 2. If the numerical flux function $g$ is conservative and consistent,
i.e. for all $u, v, n, \kappa$ there holds $g(u, v, n, \kappa) = -g(v, u, -n, \kappa)$ and
$g(u, u, n, \kappa) = f(u) \cdot n - \kappa \cdot n\, u$, respectively, then the discrete GCL holds
for the method (8) whenever

\[ \sum_l \bar\nu_{jl} \bar n_{jl} = 0 \quad \text{and} \quad |T_j^n| - |T_j^{n-1}| = \tau_n \sum_l \bar\nu_{jl}\, \bar\kappa_{jl} \cdot \bar n_{jl} \quad \forall j, \tag{11} \]

where $|T_j^n| := |T_j(t_n)|$.

Proof. Substituting $u_j^{n-1} = u_j^n = u^*$ into (8), the statement follows from the
consistency and conservativity of the method. $\square$

Lemma 3. The method (8) with a conservative and consistent numerical flux
$g$ and the "snapper" mesh deformation respects the discrete GCL whenever

\[ \bar n_{jl} = n_{jl}\Bigl( \frac{t_{n-1} + t_n}{2} \Bigr), \qquad \bar\nu_{jl} = \Bigl| S_{jl}\Bigl( \frac{t_{n-1} + t_n}{2} \Bigr) \Bigr|, \qquad \bar\kappa_{jl} := \frac{1}{3} \sum_{A:\ \text{node of } S_{jl}} \kappa(t, x_A). \tag{12} \]

Proof. The first relation of (11) follows from the fact that the left-hand side
of the equation equals the integral of the outer normal vector over the surface
of $T_j((t_{n-1} + t_n)/2)$, which is the zero vector. To prove the second relation of
(11) it is enough to check by (10) that $\tau_n \sum_l \bar\nu_{jl}\, \bar\kappa_{jl} \cdot \bar n_{jl} = \int_{t_{n-1}}^{t_n} \int_{\partial V(t)} \kappa \cdot n\,ds\,dt$ with $V = T_j$; but
this follows now from the actual choice of parameters and the definition of $\kappa$. $\square$
In the algorithms actually applied to the problems reported in this paper we used
(12).
We remark that, besides being a natural requirement, the discrete GCL condition
has been proven to guarantee first-order accuracy of the scheme (8),
provided the method is in addition accurate on fixed meshes; see [4].
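
The role of the discrete GCL can be illustrated with a one-dimensional scalar analogue of scheme (8). The sketch below is not the authors' 3D code: it assumes linear advection $f(u) = a u$, an upwind flux of the form $g(u, v, n, \kappa) = c^+ u + c^- v$ with $c = (a - \kappa) n$, a mesh whose nodes follow the hypothetical law $x_i(t) = \xi_i (1 + 0.3 \sin t)$, and ghost values copied from the adjacent cell. With mean node velocities (as in Step 3 of the mesh modification) a constant state is preserved up to rounding; with node velocities evaluated at the beginning of the step it is not.

```python
import numpy as np

def flux(u, v, n, kappa, a=1.0):
    """Upwind flux for f(u) = a*u on a face moving with velocity kappa."""
    c = (a - kappa) * n
    return max(c, 0.0) * u + min(c, 0.0) * v

def step(u, x_old, x_new, tau, kappa):
    """One step of the 1D analogue of scheme (8); faces are points, so nu = 1."""
    ug = np.concatenate(([u[0]], u, [u[-1]]))  # ghost cells copy the neighbours
    new = np.empty_like(u)
    for j in range(len(u)):
        out_right = flux(ug[j + 1], ug[j + 2], +1.0, kappa[j + 1])
        out_left = flux(ug[j + 1], ug[j], -1.0, kappa[j])
        vol_old = x_old[j + 1] - x_old[j]
        vol_new = x_new[j + 1] - x_new[j]
        new[j] = (vol_old * u[j] - tau * (out_right + out_left)) / vol_new
    return new

def nodes(t, xi):
    """Assumed node motion x_i(t) = xi_i * (1 + 0.3 sin t)."""
    return xi * (1.0 + 0.3 * np.sin(t))

def run(use_mean_velocity, steps=50, tau=0.01):
    """Advance the constant state u = 2 on the moving mesh."""
    xi = np.linspace(0.0, 1.0, 11)
    u = np.full(10, 2.0)
    for n in range(steps):
        t0, t1 = n * tau, (n + 1) * tau
        x_old, x_new = nodes(t0, xi), nodes(t1, xi)
        if use_mean_velocity:
            kappa = (x_new - x_old) / tau    # mean velocity: GCL satisfied
        else:
            kappa = xi * 0.3 * np.cos(t0)    # velocity at t0: GCL violated
        u = step(u, x_old, x_new, tau, kappa)
    return u
```

Comparing `run(True)` and `run(False)` with the exact constant 2 shows the effect: the first stays constant to machine precision, the second drifts by an amount that shrinks only with the time step.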

3.3 The numerical flux function: Vijayasundaram's function

For the numerical flux function $g$ we employ Vijayasundaram's numerical flux
function (see [12] and also [3], [7]), which proved accurate in our earlier
applied problems as well (see [5]).
Lemma 4. Let $u \in \mathbb{R}^{K+4}$, $n \in \mathbb{R}^3$ be arbitrary and $f$ defined as in Section 2.
Then we have $f(u) \cdot n = C(u, n)\,u$ with $C(u, n) := \sum_{i=1}^{3} f_i'(u)\, n_i$. Moreover,
the eigenvalues of $C(u, n)$ are

where Y:= (pl/P"",PK/p)T, c:= yT 8( 8p


Pl, ... , PK
) + (H _llvI12)~,
ue
H:=
(e+p)/p.
The eigenvectors of C(u, n) corresponding to AK+3 and AKH are

right: ( v ±yyen )
H±yev·n

left: (~T yevn, ... , a~t Tyevn, -¥re vT ± yen, ¥re) .


In the formulas above we have, with the specific heat capacity at constant
volume $c_{v,m}(T) = \partial I_m(T) / \partial T$,

\[ \frac{\partial p}{\partial e} = \frac{R \sum_m \rho_m / W_m}{\sum_m \rho_m c_{v,m}(T)}, \qquad \frac{\partial p}{\partial \rho_m} = \frac{R T}{W_m} - \frac{\partial p}{\partial e}\, I_m(T). \]

Proof. The results follow from tedious computations and can be validated by
checking the definitions. $\square$
In (9) we have $f(u) \cdot n - \kappa \cdot n\, u = \tilde C(u, n, \kappa)\,u$ with $\tilde C(u, n, \kappa) := C(u, n) - \kappa \cdot n\, I$
($I$ is the $(K+4)$-by-$(K+4)$ identity matrix). Then we chose for $g$ the function

\[ g(u, v, n, \kappa) = g_{\text{Vijaya}} := \tilde C^{+}\Bigl( \frac{u+v}{2}, n, \kappa \Bigr) u + \tilde C^{-}\Bigl( \frac{u+v}{2}, n, \kappa \Bigr) v. \tag{13} \]

Note that the computation of $\tilde C^{+}$ (and $\tilde C^{-}$) is done by using the diagonalization
$\tilde C^{\pm} = Q^{-1} \tilde D^{\pm} Q$, where $Q$ can be computed from the eigenvectors of $C$ and
$\tilde D = \operatorname{diag}(\lambda_i(C) - n \cdot \kappa)$.
It is clear that Vijayasundaram's flux function is conservative and consistent;
therefore, as a consequence of Lemma 2, the method (8) with the snapper
mesh deformation and geometrical parameters (12) respects the discrete GCL
condition.
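
The structure of the flux (13) can be made concrete in a reduced setting. The following sketch assumes a one-dimensional, single-component ideal gas with $\gamma = 1.404$ (the value used in Section 4.1) instead of the multicomponent 3D system, and it obtains the positive and negative parts from a numerical eigendecomposition rather than from the closed-form eigenvectors of Lemma 4. All function names are hypothetical.

```python
import numpy as np

GAMMA = 1.404  # ideal gas exponent, as in the test problem of Section 4.1

def physical_flux(u):
    """1D Euler flux for the state u = (rho, rho*v, e)."""
    rho, m, e = u
    v = m / rho
    p = (GAMMA - 1.0) * (e - 0.5 * rho * v * v)
    return np.array([m, m * v + p, (e + p) * v])

def jacobian(u):
    """Jacobian f'(u); for an ideal gas f(u) = f'(u) u holds."""
    rho, m, e = u
    v = m / rho
    p = (GAMMA - 1.0) * (e - 0.5 * rho * v * v)
    H = (e + p) / rho
    g1 = GAMMA - 1.0
    return np.array([
        [0.0, 1.0, 0.0],
        [0.5 * (GAMMA - 3.0) * v * v, (3.0 - GAMMA) * v, g1],
        [v * (0.5 * g1 * v * v - H), H - g1 * v * v, GAMMA * v],
    ])

def split(matrix):
    """Positive and negative parts of a matrix with real eigenvalues."""
    lam, R = np.linalg.eig(matrix)
    lam, R = lam.real, R.real
    R_inv = np.linalg.inv(R)
    plus = R @ np.diag(np.maximum(lam, 0.0)) @ R_inv
    minus = R @ np.diag(np.minimum(lam, 0.0)) @ R_inv
    return plus, minus

def vijayasundaram_flux(u, v, n, kappa):
    """Flux (13) in 1D: n is +1 or -1, kappa is the scalar face velocity."""
    w = 0.5 * (u + v)
    c_tilde = n * jacobian(w) - kappa * n * np.eye(3)
    plus, minus = split(c_tilde)
    return plus @ u + minus @ v
```

In this setting one can verify numerically the two properties required by Lemma 2: `vijayasundaram_flux(u, u, n, kappa)` reproduces `n * physical_flux(u) - kappa * n * u`, and exchanging the two states together with the sign of `n` changes the sign of the flux.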

4 Applications
We implemented the numerical algorithm, including the snapper
mesh generation, in the ANSI C programming language and applied it to several
problems. Here we display some results for two of these problems. Based on our
experience we may say that the code performs well for the considered test and
real-life problems.

4.1 Test problem: rectangular block with an oscillating wall

We consider a rectangular block $D(t) = [0,1] \times [0,0.2] \times [0, 0.5 - 0.32 \cos(100t)]$,
$t \in [0, 0.25]$, and a one-component flow with initial data $v_0 = 0$, $\rho_0 = 1.3$,
$T_0 = 293\,\mathrm{K}$; ideal gas EOS: $\gamma = 1.404$, $p = (\gamma - 1)\bigl(e - \frac{1}{2\rho}|\rho v|^2\bigr)$;
and slip BC. For reference we computed exactly the total energy

\[ E(t) = \int_{D(t)} e\,dx \approx 9759 \Bigl( \frac{0.18}{0.5 - 0.32 \cos(100t)} \Bigr)^{0.404} \]

and the components of the total momentum $P_i(t) := \int_{D(t)} \rho v_i\,dx$, which turned out
to be identically 0 for $i = 1, 2$, and $P_3 \approx -0.7488 \sin 100t$.
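
The amplitude of the reference momentum can be reproduced by simple arithmetic. The sketch below rests on an assumption that the text does not spell out: the gas is compressed uniformly, so that the velocity in the third direction varies linearly from zero at the fixed wall to the wall speed at the moving wall. It reproduces only the magnitude 0.7488; the sign depends on orientation conventions that cannot be recovered here. All names are hypothetical.

```python
import math

RHO0 = 1.3          # initial density from the test problem
LX, LY = 1.0, 0.2   # fixed edge lengths of the block

def height(t):
    """Position of the oscillating wall."""
    return 0.5 - 0.32 * math.cos(100.0 * t)

def wall_speed(t):
    """Time derivative of height(t)."""
    return 32.0 * math.sin(100.0 * t)

def mass():
    """Total mass, which is conserved: rho0 times the initial volume."""
    return RHO0 * LX * LY * height(0.0)

def momentum_magnitude(t):
    """Momentum in the third direction for a linear velocity profile."""
    return mass() * 0.5 * wall_speed(t)
```

With these values the mass is 0.0468 and the momentum amplitude is 0.0468 times 16, that is 0.7488, in agreement with the reference value quoted above.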
We tested both our method, which respects the GCL property, and the explicit
Euler method for time stepping (i.e. all geometrical data are evaluated at the
beginning of the time step; we know that this method does not respect the GCL).
We had a mesh of $6 \cdot (20 \times 5 \times 20) = 12000$ tetrahedra (at most).
We found numerically that there is a very small error in $P_3$ and $E(t)$ (with
relative error at most $10^{-3}$). Moreover, our method produced one fifth of the
error of the explicit Euler method in $P_2$, underlining the importance of the
discrete GCL property.

4.2 Application: gas flow in a high voltage circuit breaker

We investigated the gas flow in a high voltage circuit breaker. In these
investigations the gas flow was induced only by the mechanical constraints of
a configuration of pistons and valves, i.e. the current was taken to be
identically zero. The fluid flow domain is not exactly axisymmetric, but
employing its rotational symmetry of 90 degrees and its plane symmetry it is
sufficient to compute the flow in one eighth of it. Hence we did not assume
a priori the usual axisymmetric formulations (cf. [8]). The gas was a mixture
of two components that were originally separated. For an illustration of our
results see Figure 4. Our code performed very well: the computed and measured
pressure and density values were compared at two control points and the
relative errors between the measured and computed quantities were at most 5%.
However, in certain small parts of the fluid domain we experienced spurious
pressure oscillations, but these occurred only for a short period during the
simulation time. This was to some extent expected (see e.g. [1], [9]), and
clipping negative values from the energy approximations cured the code.
Finally we remark that the flow proved to be significantly non-axisymmetric
(cf. Figure 4), which justifies our model.

Fig. 4. Graph of $Y_1 := \rho_1 / \rho$ at two points of time in two perpendicular plane sections
(the plane of the cross section and the direction of movement of parts are marked)

References
1. Abgrall, R. (1996): How to prevent pressure oscillations in multicomponent flow
calculations: A quasi conservative approach. J. Comput. Phys., 125, 150-160.
2. Amsden, A.A., O'Rourke, P.J., Butler, T.D. (1989): KIVA-II: A Computer Pro-
gram for Chemically Reactive Flows with Sprays. Report LA-11560-MS, Los
Alamos National Laboratory.
3. Feistauer, M. (1993): Mathematical Methods in Fluid Dynamics. Longman, Lon-
don
4. Guillard, H., Farhat, C. (2000): On the significance of the geometric conservation
law for computations on moving meshes. Comput. Methods Appl. Mech. Engrg., 190,
1467-1482.
5. Horvath, A., Horvath, Z. (2003): Application of CFD numerical simulation for
intake port shape design of a diesel engine. Journal of Computational and Applied
Mechanics, 4, 129-146
6. Horvath, A., Horvath, Z., Krizek, M.: Tetrahedralization of partitions formed by
prisms. (in preparation)
7. Kröner, D. (1997): Numerical Schemes for Conservation Laws. Wiley and Teubner,
Chichester Stuttgart
8. Yan, J.D., Fang, M.T.C. and Hall, W. (1999): The development of PC based CAD
tools for auto-expansion circuit-breaker design. IEEE Trans. on Power Delivery,
14, 176-181.
9. Liu, X.D., Fedkiw, R.P., Osher, S., (1998): A quasi-conservative approach to the
multiphase Euler equations without spurious pressure oscillations. CAM Reports,
98-11, UCLA
10. Schenk, K., Bader, G., Berti, G. (1998): Analysis and Approximation of Multi-
component Gas Mixtures. Technical Report, TU-Cottbus
11. Toro, E.F. (1999): Riemann Solvers and Numerical Methods for Fluid Dynamics:
a Practical Introduction. Springer, Berlin Heidelberg
12. Vijayasundaram, G. (1986): Transonic flow simulations using an upstream cen-
tered scheme of Godunov in finite elements. J. Comput. Phys., 63, 416-433
Discontinuous Galerkin Methods for the
Time-Harmonic Maxwell Equations

Paul Houston 1, Ilaria Perugia 2, Anna Schneebeli 3 and Dominik Schötzau 4

1 Department of Mathematics, University of Leicester, Leicester LE1 7RH, United
Kingdom, email: [email protected]. Supported by the EPSRC (Grant
GR/R76615).
2 Dipartimento di Matematica, Università di Pavia, Via Ferrata 1, 27100 Pavia,
Italy, email: [email protected].
3 Department of Mathematics, University of Basel, Rheinsprung 21, 4051 Basel,
Switzerland, email: [email protected]. Supported by the SNSF (Project
21-068126.02).
4 Mathematics Department, University of British Columbia, Vancouver, BC V6T
1Z2, Canada, email: [email protected].

Summary. Interior penalty discontinuous Galerkin methods for the time-harmonic
Maxwell equations in the frequency domain, together with their stability and
convergence properties, are reviewed. A new set of numerical tests carried out on a model
problem with a singular analytical solution validates the theoretical error estimates
of the presented method for the high-frequency case.

1 Introduction

In this paper, we review recent work on discontinuous Galerkin (DG) methods
for the discretization of the time-harmonic Maxwell equations: find the electric
field $u$ such that

\[ \nabla \times (\mu^{-1} \nabla \times u) - \omega^2 (\varepsilon - i \omega^{-1} \sigma) u = j \quad \text{in } \Omega, \tag{1} \]
\[ n \times u = 0 \quad \text{on } \Gamma = \partial\Omega. \tag{2} \]

Here, $\Omega$ is a simply-connected Lipschitz polyhedron in $\mathbb{R}^3$ with connected
boundary $\Gamma = \partial\Omega$ and outward normal unit vector $n$. The function $j$ is a given
source term in $L^2(\Omega)^3$. The temporal frequency is denoted by $\omega > 0$. The
real-valued functions $\mu$, $\varepsilon$ and $\sigma$ are the magnetic permeability, the electric
permittivity, and the electric conductivity, respectively.
The main motivation for using a DG approach for the numerical approximation
of the above problem is that DG methods, being based on discontinuous
finite element spaces, can easily handle meshes with hanging nodes and
local spaces of different orders. This renders DG methods ideally suited for
hp-adaptive algorithms. Moreover, the implementation of discontinuous elements
can be based on standard shape functions; a convenience that is particularly
advantageous for high-order elements and that is not straightforwardly shared
by standard edge elements commonly used in computational electromagnetics
(see [1,3,14] and the references therein for hp-adaptive edge element methods).
On the other hand, in the hp-context, since most of the degrees of freedom are
in the interior of the elements, the increase in the total number of degrees of
freedom with respect to the corresponding conforming method is not dramatic.
In this paper, we focus on interior penalty DG discretizations for (1)-(2) in
the low-frequency and high-frequency cases (the term $\omega^2 \varepsilon$ in (1) is neglected
in the former case, whereas the term $i \omega \sigma$ is neglected in the latter).
In the low-frequency case, problem (1)-(2) has to be completed by a
divergence-free constraint in the subdomain $\Omega_0 \subseteq \Omega$ covered by insulating material
where $\sigma = 0$ (additional scalar constraints arise if $\partial\Omega_0$ is not connected, see,
e.g., [13]). This results in the following system:

\[ \nabla \times (\mu^{-1} \nabla \times u) + i \omega \sigma u = j \ \text{ in } \Omega, \qquad \nabla \cdot (\varepsilon u) = 0 \ \text{ in } \Omega_0, \qquad n \times u = 0 \ \text{ on } \Gamma. \tag{3} \]

The main difficulty here is the incorporation of the divergence constraint in
the DG framework. Following [8], we show that this can be achieved using a
mixed approach where the constraint is accounted for by a suitable Lagrange
multiplier. We present the underlying theoretical properties, as well as the
energy norm a priori and a posteriori estimates that were derived in [8] and [6].
For further numerical tests, we also refer to [9].
In the high-frequency case, the problem consists in finding the electric
field $u$ such that

\[ \nabla \times (\mu^{-1} \nabla \times u) - \omega^2 \varepsilon u = j \ \text{ in } \Omega, \qquad n \times u = 0 \ \text{ on } \Gamma. \tag{4} \]

Here, we assume that $\omega^2$ is not an eigenvalue of the underlying Maxwell
eigenproblem. While the design of interior penalty DG methods is straightforward,
the key difficulty for (4) arises in the numerical analysis of the methods due to
the indefiniteness caused by the zero order term. We present the main results
of a novel error analysis that was recently developed in [4]. These results show
that DG methods for (4) yield optimal rates of convergence in the energy norm
and the $L^2$-norm. We further present a new set of numerical results on a test
problem with a singular solution.
To simplify the presentation in this article, we assume that $\mu$ and $\varepsilon$ are
constants. However, the analysis in [8] covers the case of piecewise smooth
material coefficients, whereas the theoretical results in [4] hold for smooth
coefficients $\mu$ and $\varepsilon$ only.

2 Discontinuous Galerkin Discretizations

In this section, we introduce interior penalty DG methods for the two model
problems in (3) and (4), and review their theoretical properties.

2.1 Meshes, Trace Operators, Finite Element Spaces and DG Norms

We consider shape-regular affine meshes $\mathcal{T}_h$ that partition the domain $\Omega$ into
tetrahedra $\{K\}$; the parameter $h$ denotes the mesh size of $\mathcal{T}_h$, given by
$h = \max_{K \in \mathcal{T}_h} h_K$, where $h_K$ is the diameter of the element $K \in \mathcal{T}_h$. We denote
by $\mathcal{F}_h^I$ the set of all interior faces of elements in $\mathcal{T}_h$, by $\mathcal{F}_h^B$ the set of all
boundary faces, and set $\mathcal{F}_h := \mathcal{F}_h^I \cup \mathcal{F}_h^B$. We define the local mesh size $\mathrm{h}$ on $\mathcal{F}_h$
by setting $\mathrm{h}(x) := \max\{ h_{K^+}, h_{K^-} \}$ if $x$ is in the interior of $\partial K^+ \cap \partial K^-$, and
by $\mathrm{h}(x) := h_K$ if $x \in \partial K$ is on the boundary.
For piecewise smooth vector-valued and scalar-valued functions $v$ and $q$,
respectively, we introduce the following trace operators. On an interior face
$f \in \mathcal{F}_h^I$ shared by two neighboring elements $K^+$ and $K^-$ with unit outward
normal vectors $n^{\pm}$, respectively, denoting by $v^{\pm}$ and $q^{\pm}$ the traces of $v$ and $q$
taken from within $K^{\pm}$, respectively, we define the jumps and averages across
$f$ by $[v]_T := n^+ \times v^+ + n^- \times v^-$, $[q]_N := q^+ n^+ + q^- n^-$, $\{\{v\}\} := (v^+ + v^-)/2$
and $\{\{q\}\} := (q^+ + q^-)/2$, respectively. On a boundary face $f \in \mathcal{F}_h^B$, we set
$[v]_T := n \times v$, $\{\{v\}\} := v$ and $[q]_N := q n$.
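
The trace operators on an interior face translate directly into code. The sketch below evaluates them at a single point of a face, using only the relation $n^- = -n^+$; the function names are hypothetical and the inputs are assumed to be NumPy arrays of length 3 (or scalars for $q$).

```python
import numpy as np

def tangential_jump(v_plus, v_minus, n_plus):
    """[v]_T = n+ x v+ + n- x v-, with n- = -n+ on an interior face."""
    n_minus = -n_plus
    return np.cross(n_plus, v_plus) + np.cross(n_minus, v_minus)

def normal_jump(q_plus, q_minus, n_plus):
    """[q]_N = q+ n+ + q- n-, a vector for scalar traces q+ and q-."""
    return q_plus * n_plus + q_minus * (-n_plus)

def average(a_plus, a_minus):
    """{{a}} = (a+ + a-) / 2 for scalar or vector traces."""
    return 0.5 * (a_plus + a_minus)
```

In particular, the tangential jump vanishes whenever the two traces differ only in their normal component, which is why $[v]_T$ measures the lack of tangential continuity only.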
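For a given partition $\mathcal{T}_h$ of $\Omega$ and an approximation order $\ell \geq 1$, we
introduce the following discontinuous finite element spaces:

\[ V_h := \{ v \in L^2(\Omega)^3 : v|_K \in \mathcal{P}^{\ell}(K)^3 \ \forall K \in \mathcal{T}_h \}, \qquad Q_h := \{ q \in L^2(\Omega) : q|_K \in \mathcal{P}^{\ell+1}(K) \ \forall K \in \mathcal{T}_h \}, \tag{5} \]

where $\mathcal{P}^k(K)$ denotes the space of polynomials of total degree at most $k$ on $K$.
Finally, denoting by $\| \cdot \|_{s,D}$, for $s \geq 0$ and $D$ a bounded domain in $\mathbb{R}^2$
or $\mathbb{R}^3$, the standard norm in the Sobolev space $H^s(D)^d$, $d \geq 1$, we define the
following DG norms with which we will measure the approximation errors:

\[ \|v\|_{V(h)}^2 = \|\varepsilon^{1/2} v\|_{0,\Omega}^2 + \|\mu^{-1/2} \nabla_h \times v\|_{0,\Omega}^2 + \|\mu^{-1/2} \mathrm{h}^{-1/2} [v]_T\|_{0,\mathcal{F}_h}^2, \]
\[ \|q\|_{Q(h)}^2 = \|\varepsilon^{1/2} \nabla_h q\|_{0,\Omega}^2 + \|\varepsilon^{1/2} \mathrm{h}^{-1/2} [q]_N\|_{0,\mathcal{F}_h}^2. \]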
2.2 DG Discretization of the Low-Frequency Problem (Insulating Materials)

We consider the case of insulating materials, i.e., $\Omega_0 = \Omega$, since all the key
difficulties in the numerical treatment of (3) are already present in this
particular case. The DG method for the discretization of (3) with $\Omega_0 = \Omega$ and
divergence-free source term $j$ is based on the following mixed formulation of
the problem:

\[ \nabla \times (\mu^{-1} \nabla \times u) - \varepsilon \nabla p = j \ \text{ in } \Omega, \qquad \nabla \cdot (\varepsilon u) = 0 \ \text{ in } \Omega, \qquad n \times u = 0, \ p = 0 \ \text{ on } \Gamma. \tag{6} \]

Here, $p$ is the Lagrange multiplier related to the divergence constraint. The
standard variational formulation of (6) is well-posed in $H_0(\mathrm{curl}; \Omega) \times H_0^1(\Omega)$;
see, e.g., [3,14].
The mixed DG method for (6) then reads as follows: find $(u_h, p_h)$ in $V_h \times Q_h$
such that

\[ a_h(u_h, v) + b_h(v, p_h) = (j, v), \qquad b_h(u_h, q) - c_h(p_h, q) = 0 \tag{7} \]

for all $(v, q) \in V_h \times Q_h$, where the discrete forms $a_h(\cdot,\cdot)$, $b_h(\cdot,\cdot)$ and $c_h(\cdot,\cdot)$
are defined, respectively, by

\[ a_h(u, v) = (\mu^{-1} \nabla_h \times u, \nabla_h \times v) - \int_{\mathcal{F}_h} [u]_T \cdot \{\{\mu^{-1} \nabla_h \times v\}\}\,ds - \int_{\mathcal{F}_h} [v]_T \cdot \{\{\mu^{-1} \nabla_h \times u\}\}\,ds + \int_{\mathcal{F}_h} a \mu^{-1} [u]_T \cdot [v]_T\,ds, \]
\[ b_h(v, p) = -(\varepsilon v, \nabla_h p) + \int_{\mathcal{F}_h} \{\{\varepsilon v\}\} \cdot [p]_N\,ds, \]
\[ c_h(p, q) = \int_{\mathcal{F}_h} c\, \varepsilon\, [p]_N \cdot [q]_N\,ds. \]

Here, and in the following, we denote by $(\cdot,\cdot)$ the standard inner product
in $L^2(\Omega)^d$, $d \geq 1$, and use $\nabla_h$ to denote the elementwise application of the
operator $\nabla$. Further, we use the notation $\int_{\mathcal{F}_h} \varphi\,ds := \sum_{f \in \mathcal{F}_h} \int_f \varphi\,ds$. The
form $a_h(\cdot,\cdot)$ corresponds to the interior penalty discretization of the curl-curl
operator, the form $b_h(\cdot,\cdot)$ discretizes the divergence operator in a DG fashion,
and the form $c_h(\cdot,\cdot)$ is the interior penalty form that weakly enforces the
continuity of $p_h$.
The parameters $a$ and $c$ in $L^{\infty}(\mathcal{F}_h)$ are the usual interior penalty
stabilization functions defined by

(8)
where $\alpha$ and $\gamma$ are positive parameters independent of the mesh size.
The results contained in the following theorem have been proven and
numerically validated in [8].

Theorem 1. There is a parameter $\alpha_{\min} > 0$ only depending on the shape
regularity of the mesh and the polynomial approximation degree $\ell$ such that,
for parameters $\alpha > \alpha_{\min}$ and $\gamma > 0$ in (8), the DG method (7) possesses
a unique solution.
Moreover, assume that the analytical solution $(u, p)$ of (6) satisfies the
smoothness assumptions $\varepsilon u \in H^s(\Omega)^3$, $\mu^{-1} \nabla \times u \in H^s(\Omega)^3$ and $p \in H^{s+1}(\Omega)$,
for an exponent $s > 1/2$, and let $(u_h, p_h)$ be the DG approximation on
conforming meshes defined by (7), with $\alpha > \alpha_{\min}$ and $\gamma > 0$. Then we have the
optimal a priori error bound

\[ \|u - u_h\|_{V(h)} + \|p - p_h\|_{Q(h)} \leq C\, h^{\min\{s, \ell\}} \bigl[ \|\varepsilon u\|_{s,\Omega} + \|\mu^{-1} \nabla \times u\|_{s,\Omega} + \|p\|_{s+1,\Omega} \bigr], \]

with a constant $C$ independent of the mesh size.
The analysis developed in [8] is valid for piecewise smooth coefficients $\mu$
and $\varepsilon$, and the error estimate in Theorem 1 holds for piecewise smooth solutions.
On the other hand, it requires the assumption of conformity of the meshes,
although numerical tests in [9] have shown that the method is robust on meshes
with hanging nodes as well.
Moreover, the following energy norm a posteriori error estimate has been
established and tested in [6].
Theorem 2. We assume that $\nabla \cdot j = 0$ holds, so that $p \equiv 0$. Let $(u_h, p_h)$ be
the DG approximation on conforming meshes defined by (7), with $\alpha > \alpha_{\min}$
and $\gamma > 0$. Then there is a constant $C > 0$ independent of the mesh size such
that

\[ \|u - u_h\|_{V(h)} + \|p - p_h\|_{Q(h)} \leq C \Bigl( \sum_{K \in \mathcal{T}_h} \eta_K^2 \Bigr)^{1/2}, \]

where the elemental error indicator $\eta_K$ is given by
'r/k = hJtllj - Y' X + EY'Phll~,K + h~-lIITK(Uh) - TK(Uh)II~,8K
({-t-1Y' x Uh)
+hI/II{-t![uh]rII~,8K + hKII[mh]NII~,8K\r + hkllY'· (mh)II~,K
+IIE!Y'Phll~,K + hK11IE![Ph]NII~,8K'
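$\sigma \in (1/2, 1]$ is the parameter of the embeddings $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega) \hookrightarrow H^{\sigma}(\Omega)^3$
and $H(\mathrm{curl}; \Omega) \cap H_0(\mathrm{div}; \Omega) \hookrightarrow H^{\sigma}(\Omega)^3$ (see [2]), and $T_K(v)$ is the
numerical flux defined by

\[ T_K(v) = \begin{cases} n_K \times \bigl( \{\{\mu^{-1} \nabla \times v\}\} - \mu^{-1} a\, [v]_T \bigr) & \text{on } \partial K \setminus \Gamma, \\ n_K \times \bigl( \mu^{-1} \nabla \times v - \mu^{-1} a\, (n_K \times v) \bigr) & \text{on } \partial K \cap \Gamma. \end{cases} \]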

2.3 DG Discretization of the High-Frequency Problem

For problem (4), the interior penalty DG method is given by: find $u_h \in V_h$
such that
(9)
for all $v \in V_h$, where the discrete form $a_h(\cdot,\cdot)$ is the same as in Section 2.2.
The interior penalty stabilization function $a \in L^{\infty}(\mathcal{F}_h)$ is defined again by (8),
with $\alpha$ chosen independently of the mesh size and the frequency.
The following a priori error estimates in the energy norm and in the
$L^2$-norm have been proven in [4]. Their proof is based on techniques similar to
those of [12] and [11, Section 7.2], for the energy error bound, and of [10,
Theorem 3.2], for the $L^2$-error bound, combined with novel results that allow
one to approximate a discontinuous function by a conforming one. This result
is instrumental in controlling the non-conformity of the DG method.

Theorem 3. Assume that the analytical solution $u$ of (4) satisfies the
regularity assumptions $\varepsilon u \in H^s(\Omega)^3$ and $\mu^{-1} \nabla \times u \in H^s(\Omega)^3$, for $s > 1/2$, and let $u_h$
be the DG approximation on conforming meshes defined by (9). Then there is
a parameter $\alpha_{\min} > 0$ only depending on the shape regularity of the mesh and
on the polynomial approximation degree $\ell$, and a mesh size $h_0 > 0$ such that,
for $\alpha \geq \alpha_{\min}$ and $0 < h \leq h_0$, we have the optimal a priori error bound

with a constant $C > 0$ independent of the mesh size.
Consequently, for $\alpha \geq \alpha_{\min}$, the DG method (9) admits a unique solution
$u_h \in V_h$, provided that $h \leq h_0$.
Finally, assume that the analytical solution $u$ of (4) satisfies the additional
regularity assumption $u \in H^{s+\sigma}(\Omega)^3$ for $s > 1/2$ and $\sigma \in (1/2, 1]$
the parameter of the embeddings $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega) \hookrightarrow H^{\sigma}(\Omega)^3$ and
$H(\mathrm{curl}; \Omega) \cap H_0(\mathrm{div}; \Omega) \hookrightarrow H^{\sigma}(\Omega)^3$ (see [2]). Then, for $\alpha \geq \alpha_{\min}$, there is
a mesh size $h_2 > 0$ such that, for $0 < h \leq h_2$, we have

with a constant $C > 0$ independent of the mesh size.

The analysis of [4] is based on duality arguments; thus, the result of Theorem 3 can easily be extended to smooth material coefficients $\mu$ and $\varepsilon$. However, the extension to piecewise smooth coefficients requires alternative mathematical tools; this is the subject of ongoing research.

3 Numerical Example

In this section we present a numerical example to highlight the practical performance of the DG method introduced and analyzed in this article for the numerical approximation of the high-frequency indefinite time-harmonic Maxwell equations in (4), considering a model problem with a singular solution. Throughout this section, we take $\mu = \mu_0$ and $\varepsilon = \varepsilon_0$, the permeability and permittivity of free space, respectively, and select the interior penalty parameter $\alpha$ in (8) as follows: $\alpha = 10\,\ell^2$. As is standard in electromagnetic computations, we scale the electric field by $u \to \mu_0 u$ and obtain a problem for the scaled field (that we again denote by $u$) of the form

\[
\nabla \times \nabla \times u - k^2 u = j \quad \text{in } \Omega, \qquad n \times u = 0 \quad \text{on } \Gamma, \tag{10}
\]

with a rescaled right-hand side (again denoted by $j$) and the wave number $k = \omega \sqrt{\mu_0 \varepsilon_0}$.
For simplicity, we restrict ourselves to the two-dimensional analogue of (10). To this end, we let $\Omega$ be the L-shaped domain $(-1,1)^2 \setminus [0,1) \times (-1,0]$ and select $j$ (and suitable non-homogeneous boundary conditions for $u$) so that the analytical solution $u$ to the two-dimensional analogue of (10) is given, in terms of the polar coordinates $(r, \vartheta)$, by

\[
u(x,y) = \nabla S(r,\vartheta), \quad \text{where } S(r,\vartheta) = (kr)^{2/3} \sin(2\vartheta/3). \tag{11}
\]

Here, the boundary conditions are enforced in the usual DG manner by adding boundary terms in the formulation (9); see [5,7] for details. The analytical solution given by (11) then contains a singularity at the re-entrant corner located at the origin of $\Omega$; in particular, we note that $u$ lies in the Sobolev space $H^{2/3-\varepsilon}(\Omega)^2$, $\varepsilon > 0$. This example represents a slight modification of the numerical experiment presented in [4]; cf., also, [1].
We investigate the asymptotic convergence of the DG method on a sequence of successively finer (quasi-uniform) unstructured triangular meshes for $\ell = 1,2,3$ as the wave number $k$ increases. To this end, in Tables 1, 2, 3 and 4 we present numerical experiments for $k = 1,2,4,6$, respectively. In each case we show the number of elements in the computational mesh, the corresponding DG-norm of the error $\|u - u_h\|_{V(h)}$ and the numerical rate of convergence $r$. In view of the scaling we introduced, we have taken $\|u - u_h\|_{V(h)}$ as
\[
\left( \|u - u_h\|_{0,\Omega}^2 + \|\nabla \times (u - u_h)\|_{0,\Omega}^2 + \|h^{-1/2} [\![ u - u_h ]\!]_T\|_{0,\mathcal{F}_h}^2 \right)^{1/2}.
\]
We observe that (asymptotically) $\|u - u_h\|_{V(h)}$ converges to zero at the optimal rate $O(h^{2/3-\varepsilon})$, for each fixed $\ell$ and each $k$, as $h$ tends to zero, as predicted by Theorem 3. In particular, we make two key observations: firstly, we note that for a given fixed mesh and fixed polynomial degree, an increase in the wave number $k$ leads to an increase in the DG-norm of the error in the approximation to $u$. As pointed out in [1], where curl-conforming finite element methods were employed for the numerical approximation of (10), the pre-asymptotic region increases as $k$ increases; this is particularly evident when $k = 6$, cf. Table 4. Secondly, we observe that the DG-norm of the error decreases when either the mesh is refined, or the polynomial degree is increased as we would expect; this is also the case when the DG-norm of the error is compared with the total number of degrees of freedom employed in the underlying finite element space, for each fixed $k$; for brevity these results have been omitted.
Finally, we end this section by considering the rate of convergence of the error in the approximation to $u$ measured in the $L^2$-norm. While for smooth solutions the optimal $L^2$-order has been confirmed numerically (see [4]), the additional regularity assumptions for the $L^2$ estimate in Theorem 3 do not hold in the example considered here. Notwithstanding this, in Figure 1 we plot the $L^2$-norm of the error in the approximation to $u$ against the square root of the number of degrees of freedom in the finite element space $V_h$, for $k = 1$ and $k = 6$. We observe that (asymptotically) $\|u - u_h\|_{0,\Omega}$ converges to zero at the rate $O(h^{2/3})$, for each fixed $\ell$ and $k$, as in the case of the DG-norm of the error.

Table 1. Convergence of $\|u - u_h\|_{V(h)}$ with $k = 1$.

            $\ell = 1$                  $\ell = 2$                  $\ell = 3$
Elements    $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r
24          1.525e-1                    8.881e-2                    6.078e-2
96          8.875e-2             0.78   5.374e-2             0.73   3.744e-2             0.70
384         5.393e-2             0.72   3.331e-2             0.69   2.337e-2             0.68
1536        3.348e-2             0.69   2.085e-2             0.68   1.467e-2             0.67
6144        2.096e-2             0.68   1.310e-2             0.67   9.227e-3             0.67
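
The rates $r$ reported in the tables can be reproduced from the tabulated errors. The following is a minimal sketch, assuming that the mesh size behaves like the inverse square root of the number of elements (reasonable for quasi-uniform two-dimensional meshes); the function name `convergence_rates` is illustrative and not taken from the paper.

```python
import math

# Element counts and DG-norm errors for l = 1, copied from Table 1 (k = 1).
elements = [24, 96, 384, 1536, 6144]
errors = [1.525e-1, 8.875e-2, 5.393e-2, 3.348e-2, 2.096e-2]


def convergence_rates(elements, errors):
    """Observed rates r between consecutive meshes.

    Assumption: h scales like elements**(-1/2), so that
    r = log(e_coarse / e_fine) / log(h_coarse / h_fine).
    """
    rates = []
    for i in range(1, len(errors)):
        h_ratio = math.sqrt(elements[i] / elements[i - 1])
        rates.append(math.log(errors[i - 1] / errors[i]) / math.log(h_ratio))
    return rates


rates = convergence_rates(elements, errors)
print([round(r, 2) for r in rates])  # prints [0.78, 0.72, 0.69, 0.68]
```

Under this assumption the computed values agree with the column $r$ for $\ell = 1$ in Table 1 and approach the predicted rate $2/3$.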

Table 2. Convergence of $\|u - u_h\|_{V(h)}$ with $k = 2$.

            $\ell = 1$                  $\ell = 2$                  $\ell = 3$
Elements    $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r
24          2.272e-1                    1.366e-1                    9.490e-2
96          1.369e-1             0.73   8.428e-2             0.70   5.904e-2             0.69
384         8.464e-2             0.69   5.262e-2             0.68   3.700e-2             0.67
1536        5.290e-2             0.68   3.303e-2             0.67   2.326e-2             0.67
6144        3.322e-2             0.67   2.078e-2             0.67   1.464e-2             0.67

Table 3. Convergence of $\|u - u_h\|_{V(h)}$ with $k = 4$.

            $\ell = 1$                  $\ell = 2$                  $\ell = 3$
Elements    $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r
24          8.727e-1                    3.900e-1                    2.274e-1
96          3.637e-1             1.26   1.853e-1             1.07   1.156e-1             0.98
384         1.793e-1             1.02   9.786e-2             0.92   6.462e-2             0.84
1536        9.670e-2             0.89   5.622e-2             0.80   3.845e-2             0.75
6144        5.611e-2             0.79   3.395e-2             0.73   2.363e-2             0.70

4 Conclusions

In this paper, we have reviewed two interior penalty discontinuous Galerkin methods for the numerical approximation of the time-harmonic Maxwell equations in both the low-frequency and high-frequency regimes. The predicted performance of the DG method for the indefinite problem in the high-frequency case has been confirmed in a new set of numerical experiments carried out on a model problem with a singular solution.

Table 4. Convergence of $\|u - u_h\|_{V(h)}$ with $k = 6$.

            $\ell = 1$                  $\ell = 2$                  $\ell = 3$
Elements    $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r      $\|u-u_h\|_{V(h)}$   r
24          2.535                       8.535e-1                    6.884e-1
96          7.305e-1             1.79   5.972e-1             0.52   2.048e-1             1.75
384         3.230e-1             1.18   1.641e-1             1.86   1.002e-1             1.03
1536        1.879e-1             0.78   8.383e-2             0.97   5.461e-2             0.88
6144        8.275e-2             1.18   4.722e-2             0.83   3.207e-2             0.77

Fig. 1. Convergence of $\|u - u_h\|_{0,\Omega}$ for $k = 1$ and $k = 6$, plotted against the square root of the number of degrees of freedom (reference slope 0.67)

References
1. M. Ainsworth and J. Coyle. Hierarchic hp-edge element families for Maxwell's equations on hybrid quadrilateral/triangular meshes. Comput. Methods Appl. Mech. Engrg., 190:6709-6733, 2001.
2. C. Amrouche, C. Bernardi, M. Dauge, and V. Girault. Vector potentials in three-dimensional non-smooth domains. Math. Methods Appl. Sci., 21:823-864, 1998.
3. L. Demkowicz and L. Vardapetyan. Modeling of electromagnetic absorption/scattering problems using hp-adaptive finite elements. Comput. Methods Appl. Mech. Engrg., 152:103-124, 1998.
4. P. Houston, I. Perugia, A. Schneebeli, and D. Schötzau. Interior penalty method for the indefinite time-harmonic Maxwell equations. Technical Report PIMS-03-15, Pacific Institute for the Mathematical Sciences, University of British Columbia, 2003.

5. P. Houston, I. Perugia, and D. Schötzau. Mixed discontinuous Galerkin approximation of the Maxwell operator. Technical Report 02-16, University of Basel, Department of Mathematics, 2002. Accepted for publication in SIAM J. Numer. Anal.
6. P. Houston, I. Perugia, and D. Schötzau. Energy norm a posteriori error estimation for mixed discontinuous Galerkin approximations of the Maxwell operator. Technical Report 2003/16, University of Leicester, Department of Mathematics, 2003.
7. P. Houston, I. Perugia, and D. Schötzau. hp-DGFEM for Maxwell's equations. In F. Brezzi, A. Buffa, S. Corsaro, and A. Murli, editors, Numerical Mathematics and Advanced Applications ENUMATH 2001, pages 785-794. Springer-Verlag, 2003.
8. P. Houston, I. Perugia, and D. Schötzau. Mixed discontinuous Galerkin approximation of the Maxwell operator: Non-stabilized formulation. Technical Report 2003/17, University of Leicester, Department of Mathematics, 2003.
9. P. Houston, I. Perugia, and D. Schötzau. Nonconforming mixed finite element approximations to time-harmonic eddy current problems. Technical Report 2003/15, University of Leicester, Department of Mathematics, 2003.
10. P. Monk. A finite element method for approximating the time-harmonic Maxwell equations. Numer. Math., 63:243-261, 1992.
11. P. Monk. Finite element methods for Maxwell's equations. Oxford University Press, New York, 2003.
12. P. Monk. A simple proof of convergence for an edge element discretization of Maxwell's equations. In C. Carstensen, S. Funken, W. Hackbusch, R. Hoppe, and P. Monk, editors, Computational electromagnetics, volume 28 of Lect. Notes Comput. Sci. Engrg., pages 127-141. Springer-Verlag, 2003.
13. I. Perugia and D. Schötzau. The hp-local discontinuous Galerkin method for low-frequency time-harmonic Maxwell equations. Math. Comp., 72:1179-1214, 2003.
14. L. Vardapetyan and L. Demkowicz. hp-adaptive finite elements in electromagnetics. Comput. Methods Appl. Mech. Engrg., 169:331-344, 1999.
Mixed hp-Discontinuous Galerkin Finite
Element Methods for the Stokes Problem in
Polygons

Paul Houston^1, Dominik Schötzau^2 and Thomas P. Wihler^3

1 Department of Mathematics, University of Leicester
  Leicester LE1 7RH, UK, Email: [email protected]
  Supported by the EPSRC (Grant GR/R76615)
2 Mathematics Department, University of British Columbia
  Vancouver BC V6T 1Z2, Canada, Email: [email protected]
3 School of Mathematics, University of Minnesota
  Minneapolis MN 55455, USA, Email: [email protected]
  Supported by the Swiss National Science Foundation under project
  PBEZ2-102321

Summary. We consider mixed hp-discontinuous Galerkin finite element methods (DGFEM) for Stokes flow in general polygons. In particular, we show that, on geometrically refined meshes, the hp-DGFEM yields exponential rates of convergence for problems with piecewise analytic input data. Numerical results confirming the exponential convergence rates are presented.

1 Introduction

Over the last few years, several mixed discontinuous Galerkin finite element
methods (DGFEM) have been proposed for the discretization of incompress-
ible fluid flow problems; see, e.g., [4, 6, 7, 9, 12, 16] and the references therein.
The main motivations that led to these schemes are that mixed DGFEM pro-
vide robust and high-order accurate approximations, particularly in transport-
dominated regimes, and that they are considerably flexible in the choice of
velocity-pressure combinations, without excessive numerical stabilization. For
example, no extra stabilization is required to use optimally matched combi-
nations where the approximation degree for the pressure is of one order lower
than that of the velocity; this result was first established in [11] in the context
of linear elasticity of nearly incompressible materials.
The work in [13] presented a unifying framework for the analysis of mixed hp-DGFEM for Stokes flow. For discontinuous $Q^p - Q^{p-1}$ elements on hexahedral meshes, the results there ensure (slightly suboptimal) error bounds for the p-version of the DGFEM, where convergence is obtained by increasing the polynomial approximation order $p$ on a fixed mesh. However, these bounds result in algebraic rates of convergence and are restricted to piecewise smooth solutions - an assumption that is unrealistic in general polygons, due to the presence of corner singularities.
In this note, we report on the recent results in [14] that extend the approach
of [13] to mixed hp-DGFEM for Stokes flow in general polygons. In particular,
we show that, on geometrically refined meshes, the hp-DGFEM yields expo-
nential rates of convergence for problems with piecewise analytic input data;
see [18] for similar results in the context of diffusion problems. We further
present numerical results that confirm the exponential convergence rates as
predicted in [14].

2 The Stokes problem in polygons


In this section, we introduce the Stokes problem and use the results of [10] to
describe the regularity of its solution for piecewise analytic input data.

2.1 The Stokes problem

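Let $\Omega \subset \mathbb{R}^2$ be a bounded polygonal domain with outward unit normal vector $n$ on the boundary $\partial\Omega$. Then, for a given forcing term $f \in H^{-1}(\Omega)^2$ and a Dirichlet datum $g \in H^{1/2}(\partial\Omega)^2$ satisfying the compatibility condition $\int_{\partial\Omega} g \cdot n \, ds = 0$, the Stokes problem is to find a velocity field $u \in H^1(\Omega)^2$ and a pressure $p \in L^2_0(\Omega) := L^2(\Omega)/\mathbb{R}$ such that

\[
\begin{aligned}
-\Delta u + \nabla p &= f && \text{in } \Omega, \\
\nabla \cdot u &= 0 && \text{in } \Omega, \\
u &= g && \text{on } \partial\Omega.
\end{aligned}
\tag{1}
\]
This system is uniquely solvable; see, e.g., [5, 8] for details.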

2.2 Analytic regularity in polygons

In [10] the regularity of the solution $(u, p)$ to the Stokes equations with piecewise analytic data $f$ and $g$ was described in terms of the countably normed Sobolev spaces introduced by Babuška and Guo for closely related diffusion and elasticity problems (see [2, 3, 15], and the references cited therein). To define these spaces, let $\{A_i\}_{i=1}^{M}$ denote the vertices of the domain $\Omega$. To each vertex $A_i$ we assign a weight $\beta_i \ge 0$ and store these numbers in the $M$-tuple $\beta = (\beta_1, \ldots, \beta_M)$. We define $\beta \pm j := (\beta_1 \pm j, \ldots, \beta_M \pm j)$ and use the shorthand notation $C_1 > \beta > C_2$ to mean $C_1 > \beta_i > C_2$ for $i = 1, \ldots, M$. Writing $r_i(x) = \min\{1, |x - A_i|\}$, we define the weight function $\Phi_\beta(x) := \prod_{i=1}^{M} r_i(x)^{\beta_i}$, and introduce the semi-norms

\[
|u|^2_{H^{k,l}_\beta(\Omega)} := \sum_{|\alpha| = l}^{k} \left\| \Phi_{\beta + |\alpha| - l} D^\alpha u \right\|^2_{L^2(\Omega)}, \qquad k \ge l \ge 0.
\]

We denote by $H^{k,l}_\beta(\Omega)$ the completion of $C^\infty(\overline{\Omega})$ with respect to the norm

\[
\|u\|^2_{H^{k,l}_\beta(\Omega)} := \|u\|^2_{H^{l-1}(\Omega)} + |u|^2_{H^{k,l}_\beta(\Omega)}, \qquad l \ge 1,
\]
\[
\|u\|^2_{H^{k,0}_\beta(\Omega)} := \sum_{|\alpha| = 0}^{k} \left\| \Phi_{\beta + |\alpha|} D^\alpha u \right\|^2_{L^2(\Omega)}.
\]
Here, we denote by $\|\cdot\|_{L^2(\Omega)}$ the usual $L^2$-norm. Similarly, $\|\cdot\|_{H^l(\Omega)}$ is the norm on the standard Sobolev space $H^l(\Omega)$.
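
As a small illustration of the weight function $\Phi_\beta$ entering these norms, the following sketch evaluates it at a point. The function name `weight` and the example vertices and weights are hypothetical choices made only for this illustration.

```python
import math


def weight(x, vertices, beta):
    """Evaluate Phi_beta(x) = prod_i r_i(x)**beta_i with r_i(x) = min(1, |x - A_i|).

    x        : point (x1, x2)
    vertices : list of vertices A_i as (a1, a2) tuples
    beta     : list of weights beta_i >= 0, one per vertex
    """
    value = 1.0
    for (a1, a2), b in zip(vertices, beta):
        r = min(1.0, math.hypot(x[0] - a1, x[1] - a2))
        value *= r ** b
    return value


# Hypothetical data: two vertices, with a nonzero weight only at the first one.
vertices = [(0.0, 0.0), (1.0, 0.0)]
beta = [0.5, 0.0]
print(weight((0.25, 0.0), vertices, beta))  # prints 0.5
```

The weight tends to zero only near vertices with $\beta_i > 0$ and equals one at distance at least one from all such vertices, which is how the weighted norms permit derivatives that blow up at the corners.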

Definition 1. For an $M$-tuple $\beta = (\beta_1, \ldots, \beta_M)$ and $l \ge 0$, the countably normed space $B^l_\beta(\Omega)$ consists of all functions $u$ for which $u \in H^{k,l}_\beta(\Omega)$ for $k \ge l$ and

$|\alpha| = k \ge l$,
for constants $C > 0$, $d \ge 1$ independent of $k$. Moreover, for $l \ge 1$, the space $B^{l-1/2}_\beta(\partial\Omega)$ is the space of traces of functions in $B^l_\beta(\Omega)$.
Functions in $B^l_\beta(\Omega)$ (or their traces) are referred to as piecewise analytic functions. Indeed, they are analytic in any interior domain $\Omega_{\mathrm{int}} \subset \Omega$ with $\overline{\Omega}_{\mathrm{int}} \subset \overline{\Omega} \setminus \{A_i\}_{i=1}^{M}$, but develop singularities at the corners $\{A_i\}_{i=1}^{M}$. Globally, we have $B^l_\beta(\Omega) \subset H^{l-1}(\Omega)$, but $B^l_\beta(\Omega) \not\subset H^l(\Omega)$. The following regularity result was proved in [10].

Theorem 1. There exists a weight vector $0 \le \beta_{\min} < 1$ depending on the opening angles of $\Omega$ at the vertices $\{A_i\}_{i=1}^{M}$ such that for weight vectors $\beta$ with $\beta_{\min} < \beta < 1$ and piecewise analytic data $(f, g) \in B^0_\beta(\Omega)^2 \times B^{3/2}_\beta(\partial\Omega)^2$, the solution $(u,p)$ of the Stokes system (1) satisfies $(u,p) \in B^2_\beta(\Omega)^2 \times B^1_\beta(\Omega)$.

In the rest of the paper, we assume that $(f, g) \in B^0_\beta(\Omega)^2 \times B^{3/2}_\beta(\partial\Omega)^2$ for a weight vector $\beta$ with $\beta_{\min} < \beta < 1$, in order to ensure the piecewise analyticity of the solution $(u,p)$, as stated in Theorem 1.

3 Discontinuous Galerkin methods

In this section, we introduce discontinuous Galerkin methods for the Stokes problem (1) and review their well-posedness, using the recent results in [13].

3.1 Meshes

Throughout, we assume that the domain $\Omega$ can be subdivided into shape-regular affine meshes $\mathcal{T}_h = \{K\}$ consisting of parallelograms $K$. For each $K \in \mathcal{T}_h$, we denote by $n_K$ the outward unit normal vector to the boundary $\partial K$, and by $h_K$ the elemental diameter. Further, we assign to each element $K \in \mathcal{T}_h$ an approximation order $k_K \ge 1$. The local quantities $h_K$ and $k_K$ are stored in the vectors $h = \{h_K\}_{K \in \mathcal{T}_h}$ and $k = \{k_K\}_{K \in \mathcal{T}_h}$, respectively. An interior edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior of $\partial K^+ \cap \partial K^-$, where $K^+$ and $K^-$ are two adjacent elements of $\mathcal{T}_h$. Similarly, a boundary edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior of $\partial K \cap \partial\Omega$ which consists of entire edges of $\partial K$. We denote by $\mathcal{E}_I$ the union of all interior edges of $\mathcal{T}_h$, by $\mathcal{E}_D$ the union of all boundary edges, and set $\mathcal{E} = \mathcal{E}_I \cup \mathcal{E}_D$. We allow for meshes with 1-irregular hanging nodes, and further assume that there is a constant $\varkappa > 0$ such that $\varkappa k_K \le k_{K'} \le \varkappa^{-1} k_K$, whenever $K$ and $K'$ share a common edge.

3.2 Averages and jumps

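Next, we define average and jump operators. To this end, let $K^+$ and $K^-$ be two adjacent elements of $\mathcal{T}_h$ and $x$ be an arbitrary point on the interior edge $e = \partial K^+ \cap \partial K^- \subset \mathcal{E}_I$. Moreover, let $q$, $v$, and $\underline{\tau}$ be scalar-, vector-, and matrix-valued functions, respectively, that are smooth inside each element $K^\pm$. By $(q^\pm, v^\pm, \underline{\tau}^\pm)$ we denote the traces of $(q, v, \underline{\tau})$ on $e$ taken from within the interior of $K^\pm$, respectively. Then, we define the averages at $x \in e$ by:

\[
\{\!\{q\}\!\} = (q^+ + q^-)/2, \qquad \{\!\{v\}\!\} = (v^+ + v^-)/2, \qquad \{\!\{\underline{\tau}\}\!\} = (\underline{\tau}^+ + \underline{\tau}^-)/2.
\]

Similarly, the jumps at $x \in e$ are given by

\[
[\![q]\!] = q^+ n_{K^+} + q^- n_{K^-}, \qquad [\![v]\!] = v^+ \cdot n_{K^+} + v^- \cdot n_{K^-},
\]
\[
[\![\underline{v}]\!] = v^+ \otimes n_{K^+} + v^- \otimes n_{K^-}, \qquad [\![\underline{\tau}]\!] = \underline{\tau}^+ n_{K^+} + \underline{\tau}^- n_{K^-}.
\]
On boundary edges $e \subset \mathcal{E}_D$, we set $\{\!\{q\}\!\} = q$, $\{\!\{v\}\!\} = v$, $\{\!\{\underline{\tau}\}\!\} = \underline{\tau}$, as well as $[\![q]\!] = q n$, $[\![v]\!] = v \cdot n$, $[\![\underline{v}]\!] = v \otimes n$, and $[\![\underline{\tau}]\!] = \underline{\tau} n$.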
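
To make the trace operators concrete, the following sketch evaluates them at a single point of an interior edge, using the fact that the two outward normals satisfy $n_{K^-} = -n_{K^+}$. The helper names and the sample trace values are hypothetical; the averages are taken as arithmetic means of the two traces.

```python
def average(a_plus, a_minus):
    """Average of two scalar traces (applied component-wise for vectors)."""
    return 0.5 * (a_plus + a_minus)


def jump_scalar(q_plus, q_minus, n_plus):
    """Vector-valued jump q+ n+ + q- n- of a scalar, with n- = -n+."""
    return [q_plus * n - q_minus * n for n in n_plus]


def jump_normal(v_plus, v_minus, n_plus):
    """Scalar jump v+ . n+ + v- . n- of a vector, with n- = -n+."""
    return sum((vp - vm) * n for vp, vm, n in zip(v_plus, v_minus, n_plus))


def jump_tensor(v_plus, v_minus, n_plus):
    """Matrix-valued jump v+ (x) n+ + v- (x) n- with entries (v+_i - v-_i) n+_j."""
    return [[(vp - vm) * n for n in n_plus] for vp, vm in zip(v_plus, v_minus)]


# Hypothetical traces at a point of an edge with normal n+ = (1, 0).
n_plus = [1.0, 0.0]
print(average(3.0, 1.0))                              # 2.0
print(jump_scalar(3.0, 1.0, n_plus))                  # [2.0, 0.0]
print(jump_normal([1.0, 2.0], [0.0, 5.0], n_plus))    # 1.0
print(jump_tensor([1.0, 2.0], [0.0, 5.0], n_plus))    # [[1.0, 0.0], [-3.0, 0.0]]
```

All jumps vanish when the two traces coincide, so penalizing them enforces continuity only weakly.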

3.3 Mixed hp-DGFEM

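Given a mesh $\mathcal{T}_h$ and a degree vector $k = \{k_K\}$, $k_K \ge 1$, we wish to approximate the Stokes problem (1) by finite element functions $(u_h, p_h) \in V_h \times Q_h$, where

\[
V_h = \{ v \in L^2(\Omega)^2 : v|_K \in Q_{k_K}(K)^2, \ K \in \mathcal{T}_h \},
\]
\[
Q_h = \{ q \in L^2_0(\Omega) : q|_K \in Q_{k_K - 1}(K), \ K \in \mathcal{T}_h \}.
\]

Here, $Q_k(K)$ is the space of polynomials of degree at most $k$ in each variable on $K$. Thereby, we consider the following mixed method: find $(u_h, p_h) \in V_h \times Q_h$ such that
\[
\begin{cases}
A_h(u_h, v) + B_h(v, p_h) = F_h(v), \\
-B_h(u_h, q) = G_h(q)
\end{cases}
\tag{2}
\]
for all $(v, q) \in V_h \times Q_h$. The forms $A_h$ and $B_h$ are discontinuous Galerkin forms that discretize the Laplacian and the incompressibility constraint, respectively, with corresponding right-hand sides $F_h$ and $G_h$. These forms are given by

\[
\begin{aligned}
A_h(u, v) &= \int_\Omega \nabla_h u : \nabla_h v \, dx - \int_{\mathcal{E}} \left( \{\!\{\nabla_h v\}\!\} : [\![\underline{u}]\!] + \{\!\{\nabla_h u\}\!\} : [\![\underline{v}]\!] \right) ds + \int_{\mathcal{E}} c \, [\![\underline{u}]\!] : [\![\underline{v}]\!] \, ds, \\
B_h(v, q) &= -\int_\Omega q \, \nabla_h \cdot v \, dx + \int_{\mathcal{E}} \{\!\{q\}\!\} [\![v]\!] \, ds, \\
F_h(v) &= \int_\Omega f \cdot v \, dx - \int_{\mathcal{E}_D} (g \otimes n) : \nabla_h v \, ds + \int_{\mathcal{E}_D} c \, g \cdot v \, ds, \\
G_h(q) &= -\int_{\mathcal{E}_D} q \, g \cdot n \, ds.
\end{aligned}
\]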

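Here, $\nabla_h$ and $\nabla_h \cdot$ denote the discrete gradient and divergence operator, respectively, taken element-wise. The function $c \in L^\infty(\mathcal{E})$ is the so-called discontinuity stabilization function that is chosen as follows. Define the functions $h \in L^\infty(\mathcal{E})$ and $k \in L^\infty(\mathcal{E})$ by

\[
h(x) := \begin{cases}
\min\{h_K, h_{K'}\}, & x \in e = \partial K \cap \partial K' \subset \mathcal{E}_I, \\
h_K, & x \in e = \partial K \cap \partial\Omega \subset \mathcal{E}_D,
\end{cases}
\]
\[
k(x) := \begin{cases}
\qquad\qquad & x \in e = \partial K \cap \partial K' \subset \mathcal{E}_I, \\
\qquad\qquad & x \in e = \partial K \cap \partial\Omega \subset \mathcal{E}_D.
\end{cases}
\]
Then we set
\[
c = \gamma h^{-1} k^2,
\]
with a parameter $\gamma > 0$ that is independent of $h$ and $k$.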
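
The stabilization function can be evaluated edge by edge. The sketch below is only an illustration: the function name `penalty` is hypothetical, and the rule used for the local degree on an interior edge (the larger of the two neighbouring degrees) is an assumption of this sketch, not a statement taken from the text above.

```python
def penalty(gamma, h_K, k_K, h_nb=None, k_nb=None):
    """Edge value of c = gamma * k**2 / h.

    h_K, k_K   : diameter and degree of the element K
    h_nb, k_nb : diameter and degree of the neighbour K' across the edge;
                 leave both as None on a boundary edge.
    """
    if h_nb is None or k_nb is None:
        h, k = h_K, k_K                # boundary edge: values of K itself
    else:
        h = min(h_K, h_nb)             # smaller diameter, as in the definition of h(x)
        k = max(k_K, k_nb)             # ASSUMPTION: larger degree on interior edges
    return gamma * k ** 2 / h


print(penalty(10.0, 0.5, 2))             # boundary edge: 80.0
print(penalty(10.0, 0.5, 2, 0.25, 3))    # interior edge: 360.0
```

The scaling $k^2/h$ makes the penalty stronger on small elements and for high polynomial degrees, which is what allows $\gamma$ itself to stay independent of both.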
Remark 1. The form Ah corresponds to the so-called symmetric interior penalty
(IP) discretization of the Laplace operator; see [1] and [13] where the presen-
tation and analysis of several different DG methods were unified for diffusion
problems and the Stokes system, respectively. All the results presented in this
paper hold true verbatim for all the mixed DG methods investigated in [13].

Remark 2. For piecewise analytic data $f$ and $g$ as in Theorem 1, the forms $F_h$ and $G_h$ are well-defined.

3.4 Well-posedness

Well-posedness of the discrete system (2) was established in [13]. Indeed, by endowing $V_h$ with the broken norm

the forms $A_h$ and $B_h$ are continuous on $V_h \times V_h$ and $V_h \times Q_h$, respectively, with continuity constants $C > 0$ independent of $h$ and $k$. Furthermore, there exists a parameter $\gamma_{\min} > 0$ independent of $h$ and $k$ such that for any $\gamma \ge \gamma_{\min}$ there exists a coercivity constant $C > 0$ independent of $h$ and $k$ with

Finally, the following discrete inf-sup condition holds true:

\[
\inf_{0 \ne q \in Q_h} \ \sup_{0 \ne v \in V_h} \frac{B_h(v, q)}{\|v\|_h \|q\|_{L^2(\Omega)}} \ge C |k|^{-1} > 0,
\]
with a constant $C > 0$ that is independent of $h$ and $k$. Here, $|k| := \max_{K \in \mathcal{T}_h} \{k_K\}$.
The above properties of the forms $A_h$ and $B_h$, combined with the continuity of the forms $F_h$ and $G_h$, guarantee the well-posedness of the formulation (2) for $\gamma \ge \gamma_{\min}$.

4 Exponential rates of convergence


In this section, we present the main result of [14], namely exponential rates of
convergence for mixed hp-DGFEM on geometrically refined meshes.

4.1 Geometric meshes


We first define geometric meshes on $\widehat{Q} = (0,1)^2$.
Definition 2. Fix $n \in \mathbb{N}_0$ and $\sigma \in (0,1)$. On $\widehat{Q}$, the geometric mesh $\Delta_{n,\sigma}$ with $n + 1$ layers and grading factor $\sigma$ is created recursively as follows: If $n = 0$, $\Delta_{0,\sigma} = \{\widehat{Q}\}$. Given $\Delta_{n,\sigma}$ for $n \ge 0$, $\Delta_{n+1,\sigma}$ is generated by subdividing the square $K$ with $0 \in \overline{K}$ into four smaller rectangles by dividing its sides in a $\sigma : (1 - \sigma)$ ratio.
An example of a geometric mesh $\Delta_{n,\sigma}$ on $\widehat{Q}$ is shown in Figure 1. We denote the elements in the basic geometric mesh by $\{K_{ij}\}$, as indicated. We say that the elements $K_{1j}$, $K_{2j}$ and $K_{3j}$ constitute layer $j$ for $j \ge 2$. The element at the origin is denoted as $K_{11}$.
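
The recursive construction of Definition 2 can be sketched in a few lines. The function name `geometric_mesh` and the representation of a rectangle as a tuple `(x0, y0, x1, y1)` are choices made for this illustration only.

```python
def geometric_mesh(n, sigma):
    """Rectangles (x0, y0, x1, y1) of the geometric mesh on (0,1)^2.

    Starting from the unit square, the square touching the origin is split
    n times at the fraction sigma of its side length; the three outer
    rectangles of each split form one layer.
    """
    if not 0.0 < sigma < 1.0:
        raise ValueError("grading factor sigma must lie in (0, 1)")
    elements = []
    size = 1.0  # side length of the square that currently touches the origin
    for _ in range(n):
        s = sigma * size
        elements.append((s, 0.0, size, s))     # rectangle to the right
        elements.append((s, s, size, size))    # rectangle on the diagonal
        elements.append((0.0, s, s, size))     # rectangle above
        size = s
    elements.append((0.0, 0.0, size, size))    # element at the origin
    return elements


mesh = geometric_mesh(3, 0.5)
print(len(mesh))   # 10 elements: 3 per outer layer plus the corner element
print(mesh[-1])    # (0.0, 0.0, 0.125, 0.125)
```

With $n = 3$ and $\sigma = 0.5$ this gives $3n + 1 = 10$ elements, and the element at the origin has side length $\sigma^n$, so the mesh size decreases geometrically towards the corner.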
Definition 3. A geometric mesh $\mathcal{T}_{n,\sigma}$ on the polygon $\Omega \subset \mathbb{R}^2$ is obtained by mapping the geometric meshes $\Delta_{n,\sigma}$ on $\widehat{Q}$ affinely to a vicinity of each convex corner of $\Omega$. At reentrant corners three suitably scaled copies of $\Delta_{n,\sigma}$ are used (as shown in Figure 2). The remainder of $\Omega$ is subdivided with a fixed affine and quasi-uniform partition.
In Figure 2 this local geometric refinement is illustrated. For ease of exposition, we only consider mesh patches that are identically refined with the same parameters $\sigma$ and $n$, although different grading factors and numbers of layers may be used for the different corner patches.

Fig. 1. The basic geometric mesh $\Delta_{n,\sigma}$ with $n = 3$ and $\sigma = 0.5$

Fig. 2. Local geometric refinement towards the vertices of $\Omega$. In all corners, $n = 3$ and $\sigma = 0.5$

Definition 4. A polynomial degree distribution $k$ on a geometric mesh $\mathcal{T}_{n,\sigma}$ is called linear with slope $\mu > 0$ if the elemental polynomial degrees are layerwise constant in the geometric patches and given by $k_j := \max(2, \lfloor \mu j \rfloor)$ in layer $j$, $j = 1, \ldots, n + 1$. In the interior of the domain the elemental polynomial degree is set to $\max(2, \lfloor \mu (n + 1) \rfloor)$.
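
The linear degree distribution of Definition 4 is easy to tabulate. The following sketch uses a hypothetical helper name `linear_degrees` and simply evaluates the two formulas of the definition.

```python
import math


def linear_degrees(n, mu):
    """Layerwise degrees k_j = max(2, floor(mu * j)), j = 1, ..., n + 1,
    together with the degree max(2, floor(mu * (n + 1))) used in the interior."""
    if mu <= 0:
        raise ValueError("slope mu must be positive")
    layers = [max(2, math.floor(mu * j)) for j in range(1, n + 2)]
    interior = max(2, math.floor(mu * (n + 1)))
    return layers, interior


print(linear_degrees(3, 1.0))   # ([2, 2, 3, 4], 4)
print(linear_degrees(3, 0.5))   # ([2, 2, 2, 2], 2)
```

The degree is lowest in the small elements next to the corner, where the solution is singular, and grows linearly with the layer index, where the solution is analytic.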

4.2 Exponential convergence


Our main result is the exponential convergence of the mixed hp-DGFEM for problems with piecewise analytic data. Its detailed proof can be found in [14].
Theorem 2. Assume that the analytical solution $(u,p)$ of the Stokes problem (1) is piecewise analytic as stated in Theorem 1. Further, let $(u_h, p_h) \in V_h \times Q_h$ denote the DGFEM approximation defined in (2) with $\gamma \ge \gamma_{\min}$ obtained on geometric meshes $\mathcal{T}_{n,\sigma}$. Then there exists a parameter $\mu_0 = \mu_0(\sigma, \beta) > 0$, such that for linear degree vectors $k$ with slope $\mu \ge \mu_0$, there holds
\[
\|u - u_h\|_h + \|p - p_h\|_{L^2(\Omega)} \le C \exp(-b N^{1/3}),
\]
with constants $C, b > 0$ independent of $N = \dim(V_h) \approx \dim(Q_h)$.
Remark 3. If the polynomial degree is chosen to be constant throughout the mesh, i.e., $k_K = k$ for all $K \in \mathcal{T}_h$, exponential convergence is still obtained by choosing $k$ proportional to the number of layers $n$.

5 Numerical experiment
The goal of this section is to numerically confirm the exponential convergence result stated in Theorem 2. To this end, let $\Omega$ be the L-shaped domain shown in Figure 3. As in [17, p. 113], we select the right-hand side $f = 0$ and take the Dirichlet boundary datum $g$ in such a way that, in the polar coordinates $(r, \varphi)$, the exact solution $(u,p)$ to the Stokes problem (1) is given by

\[
u(r,\varphi) = r^{\lambda} \begin{pmatrix}
(1 + \lambda) \sin(\varphi) \Psi(\varphi) + \cos(\varphi) \Psi'(\varphi) \\
\sin(\varphi) \Psi'(\varphi) - (1 + \lambda) \cos(\varphi) \Psi(\varphi)
\end{pmatrix},
\]
\[
p = -r^{\lambda - 1} \left[ (1 + \lambda)^2 \Psi'(\varphi) + \Psi'''(\varphi) \right] / (1 - \lambda),
\]
where

\[
\begin{aligned}
\Psi(\varphi) = {} & \sin((1 + \lambda)\varphi) \cos(\lambda\omega)/(1 + \lambda) - \cos((1 + \lambda)\varphi) \\
& - \sin((1 - \lambda)\varphi) \cos(\lambda\omega)/(1 - \lambda) + \cos((1 - \lambda)\varphi).
\end{aligned}
\]
Furthermore, $\omega = 3\pi/2$, and the exponent $\lambda$ is the smallest positive solution of the equation $\sin(\lambda\omega) + \lambda \sin(\omega) = 0$; thereby, $\lambda \approx 0.54448373678246$.
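
The quoted value of the exponent $\lambda$ can be checked numerically. The following is a minimal sketch using plain bisection; the bracket $[0.1, 1]$ is an assumption of this sketch, chosen because the function is positive on $(0, 0.1]$ and changes sign exactly once on the bracket.

```python
import math

OMEGA = 3.0 * math.pi / 2.0  # opening angle of the reentrant corner


def f(lam):
    """Left-hand side of sin(lam * omega) + lam * sin(omega) = 0."""
    return math.sin(lam * OMEGA) + lam * math.sin(OMEGA)


def bisect(func, a, b, tol=1e-15, max_iter=200):
    """Bisection for a root of func in [a, b]; func(a) and func(b) must differ in sign."""
    fa, fb = func(a), func(b)
    if fa * fb > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        fm = func(mid)
        if fm == 0.0 or (b - a) < tol:
            return mid
        if fa * fm < 0:
            b = mid
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


lam = bisect(f, 0.1, 1.0)
print(f"{lam:.12f}")  # 0.544483736782
```

Since $\lambda < 1$, the velocity gradient and the pressure behave like $r^{\lambda - 1}$ and are unbounded at the origin.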
We emphasize that the solution $(u,p)$ is in fact the strongest corner singularity of the Stokes operator in the domain $\Omega$; it is piecewise analytic, that is, analytic in $\overline{\Omega} \setminus \{0\}$, but both $\nabla u$ and $p$ are strongly singular at the origin. Indeed, here $u \notin H^2(\Omega)^2$ and $p \notin H^1(\Omega)$; thus, this example reflects the typical (singular) behavior that solutions of the Stokes problem exhibit in the vicinity of reentrant corners.

Fig. 3. L-shaped domain $\Omega$

Fig. 4. Performance of the mixed hp-DGFEM (error plotted against a power of the number of degrees of freedom)

Figure 4 shows the performance of the mixed hp-DGFEM for the above problem, on meshes that are geometrically refined towards the origin and for polynomial degree distributions that are linearly increasing away from the origin. In our computations we used the grading factor $\sigma = 0.5$ and the linear slope $\mu = 1$. The interior penalty parameter $\gamma$ was chosen as $\gamma = 10$. The exponential convergence rate, according to Theorem 2, is clearly visible. In addition, we observe that the asymptotic regime is already achieved even with a small number of degrees of freedom.

References

1. D.N. Arnold, F. Brezzi, B. Cockburn, and L.D. Marini. Unified analysis of discontinuous Galerkin methods for elliptic problems. SIAM J. Numer. Anal., 39:1749-1779, 2001.
2. I. Babuška and B.Q. Guo. Regularity of the solution of elliptic problems with piecewise analytic data, I. SIAM J. Math. Anal., 19:172-203, 1988.
3. I. Babuška and B.Q. Guo. Regularity of the solution of elliptic problems with piecewise analytic data, II. SIAM J. Math. Anal., 20:763-781, 1989.
4. G.A. Baker, W.N. Jureidini, and O.A. Karakashian. Piecewise solenoidal vector fields and the Stokes problem. SIAM J. Numer. Anal., 27:1466-1485, 1990.
5. F. Brezzi and M. Fortin. Mixed and hybrid finite element methods. In Springer Series in Computational Mathematics, volume 15. Springer-Verlag, New York, 1991.
6. B. Cockburn, G. Kanschat, and D. Schötzau. The local discontinuous Galerkin method for the Oseen equations. Math. Comp., to appear.
7. B. Cockburn, G. Kanschat, D. Schötzau, and C. Schwab. Local discontinuous Galerkin methods for the Stokes system. SIAM J. Numer. Anal., 40:319-343, 2002.
8. V. Girault and P.A. Raviart. Finite Element Methods for Navier-Stokes Equations. Springer-Verlag, New York, 1986.
9. V. Girault, B. Rivière, and M.F. Wheeler. A discontinuous Galerkin method with non-overlapping domain decomposition for the Stokes and Navier-Stokes problems. Technical Report 02-08, TICAM, UT Austin, 2002.
10. B.Q. Guo and C. Schwab. Analytic regularity of Stokes flow in polygonal domains. Technical Report 2000-18, SAM, ETH Zurich, 2000.
11. P. Hansbo and M.G. Larson. Discontinuous finite element methods for incompressible and nearly incompressible elasticity by use of Nitsche's method. Comput. Methods Appl. Mech. Engrg., 191:1895-1908, 2002.
12. O.A. Karakashian and W.N. Jureidini. A nonconforming finite element method for the stationary Navier-Stokes equations. SIAM J. Numer. Anal., 35:93-120, 1998.
13. D. Schötzau, C. Schwab, and A. Toselli. Mixed hp-DGFEM for incompressible flows. SIAM J. Numer. Anal., 40:2171-2194, 2003.
14. D. Schötzau and T.P. Wihler. Exponential convergence of mixed hp-DGFEM for Stokes flow in polygons. Numer. Math., to appear.
15. C. Schwab. p- and hp-FEM - Theory and Application to Solid and Fluid Mechanics. Oxford University Press, Oxford, 1998.
16. A. Toselli. hp-discontinuous Galerkin approximations for the Stokes problem. Math. Models Methods Appl. Sci., 12:1565-1616, 2002.
17. R. Verfürth. A Review of a Posteriori Error Estimation and Adaptive Mesh-Refinement Techniques. B.G. Teubner, Stuttgart, 1996.
18. T.P. Wihler, P. Frauenfelder, and C. Schwab. Exponential convergence of the hp-DGFEM for diffusion problems. Comput. Math. Appl., to appear.
A Postprocessing of Hopf Bifurcation Points

Dáša Janovská^1 and Vladimír Janovský^2

1 Institute of Chemical Technology, CZ-166 28 Prague 6, Czech Republic,
  [email protected]
2 Faculty of Mathematics and Physics, Charles University, CZ-186 00 Prague 8,
  Czech Republic, [email protected]

Summary. The aim is to initialize a branch of periodic orbits emanating from a Hopf bifurcation point. The second-order predictor of the branch is developed. The problem is discussed in the context of the MATLAB toolbox CL_MATCONT.

1 Introduction

We consider the dynamical system

\[
\dot{u}(t) = F(u(t), \lambda), \tag{1}
\]

where $F : \mathbb{R}^n \times \mathbb{R}^1 \to \mathbb{R}^n$ is a sufficiently smooth mapping; $u \in \mathbb{R}^n$ is a vector of state variables and $\lambda$ is a real parameter. Let $(u_H, \lambda_H)$ be a Hopf bifurcation point. Under generic assumptions, see e.g. [9], Theorem 3.3, there is a branch of periodic orbits (cycles) that emanates from this bifurcation point. One can decide about stability of the cycles provided that the first Lyapunov coefficient $\ell_1$ is nonzero.
The packages [4], [10] or [1] can detect a Hopf bifurcation point $(u_H, \lambda_H)$ and continue the mentioned branch of cycles.
In order to compute $T$-periodic solutions of (1), one usually rescales time and looks for 1-periodic solutions to

\[
-\dot{u}(t) + T F(u(t), \lambda) = 0, \qquad 0 \le t \le 1, \tag{2}
\]
\[
u(0) - u(1) = 0. \tag{3}
\]

$T$ is the period of the actual motion and $\omega \equiv 2\pi/T$ is its frequency. The quoted packages use orthogonal collocation, see [9], to discretize the problem (2) & (3).
The continuation is initialized by a cycle with a small amplitude $h$, see e.g. [1], function init_H_LC:

(4)

where $\xi_H \in \mathbb{C}^n$ and $\omega_H > 0$ satisfy

(5)

It reflects the well known fact that the spectrum of the differential $F_u(u_H, \lambda_H)$ contains a purely imaginary eigenpair. The tuple $(u^0(t), T \equiv 2\pi/\omega_H, \lambda \equiv \lambda_H)$ is considered to be an approximate solution to (2) & (3).
As the predictor for the continuation, it is natural to take

(6)
namely, the differential of the function $u^0$ with respect to $h$. Following this strategy, the differentials of the components $T$ and $\lambda$ with respect to $h$ are zero. Therefore, the predictor for the pathfollowing for the first cycle reads as $(v^0(t), 0, 0)$.
The aim of this contribution is to suggest a more sophisticated cycle initialization than just (4) and (6). The technique is based on Lyapunov-Schmidt reduction at the Hopf bifurcation point $(u_H, \lambda_H)$, see e.g. [11], [5]. We consider the version which is characterized by making use of bordering operators, see [8] and [13].
In Section 2, we review the technique. In Section 3, we give the 2nd-order formula to initiate the branch. Finally, in Section 4, we report on a numerical experiment.

2 Preliminaries
We review the main points of [8]. Instead of solving (2)&(3), we seek $2\pi$-periodic solutions of (1); the reasons are just historical. The problem is formulated as the functional equation $\Phi(u, \lambda, \omega) = 0$ on proper function spaces:

\[
\Phi(u, \lambda, \omega) \equiv -\omega \dot{u} + F(u, \lambda), \qquad T = \frac{2\pi}{\omega}, \tag{7}
\]
\[
\Phi : U \times \mathbb{R} \times \mathbb{R} \to V, \qquad U = C^1(S^1, \mathbb{R}^n), \quad V = C^0(S^1, \mathbb{R}^n), \tag{8}
\]

where $S^1 \equiv \mathbb{R}/2\pi\mathbb{Z}$. The roots of $\Phi$ are, locally and up to a phase shift, in one-to-one correspondence with the roots of a scalar bifurcation equation

(9)

Here $x \in \mathbb{R}$ is a reduced state variable corresponding to $u \in U$. The root

$(x = 0, \ \lambda - \lambda_H = 0, \ \omega - \omega_H = 0)$
of (9) is related to the root $(u_H, \lambda_H, \omega_H)$ of $\Phi$. The nontrivial solutions $x \ne 0$ of (9) are linked with cycles, the trivial solutions $x = 0$ are linked with steady states.
It is natural to expand the periodic solutions $u = u_H + v \in U$ into the Fourier series
\[
u = u_H + [v]_0 + \sum_{k=1}^{+\infty} \left( [v]_k e^{iks} + [v]_{-k} e^{-iks} \right), \tag{10}
\]

where $[v]_k : U \to \mathbb{C}^n$ is the $k$-th Fourier coefficient of $v \in U$.
The reduction procedure depends on a choice of two vectors $M \in \mathbb{C}^{n \times 1}$ and $L \in \mathbb{C}^{1 \times n}$. The requirement is that the bordered matrix

is regular. This is generically satisfied.
We will consider two particular variants: The classical Lyapunov-Schmidt reduction, see e.g. [5], which corresponds to the choice

(12)
where $\xi_H$ is the unit eigenvector, see (5), and $\eta_H$ is an adjoint eigenvector,

(13)

with the scaling $\eta_H^{T} \xi_H = 1$.
As the second variant, we just set

(14)

According to [13], Remark 4, the option (14) leads to an alternative formula for the first Lyapunov index $\ell_1$.
Actually, the periodic orbits are linked with solutions $(x, \lambda - \lambda_H, \omega - \omega_H)$ of the factored bifurcation equation

(15)

We consider the truncated Taylor expansion of (15)

\[
\frac{1}{6} \varphi_{xxx} x^2 + \varphi_{x\lambda} (\lambda - \lambda_H) + \varphi_{x\omega} (\omega - \omega_H) + \text{h.o.t.} = 0. \tag{16}
\]

It consists of low-order terms of the expansion (16), see [5], p. 86. The corresponding cycles $u \in U$ are approximated by the accordingly truncated series (10). In (16), it is understood that the differentials of $\varphi$ are evaluated at the origin, i.e. $\varphi_{xxx} = \varphi_{xxx}(0,0,0)$, etc. An algorithm for computing the chain of relevant differentials is supplied in [8], and to a larger extent in [13].
In Section 3, we summarize the asymptotic formula for a cycle (it is already done in [8]) and provide a formula for the differential of the cycle with respect to $x$. The latter formula is the cycle 'velocity' and serves as the predictor step of the cycle continuation.
We rescale time back to period one, see (3).
A Postprocessing of Hopf Bifurcation Points 505

3 The algorithm

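We set $h := x$, where $x$ is the reduced state, see (9).

The solution $(u(t), T, \lambda)$ to (2)&(3) can be represented as

$$u(t) = u_H + 2h\,\Re\!\left(\exp(2\pi i t)\,[v_x]_1\right) + \delta\lambda\,[v_\lambda]_0 + h^2\left(\tfrac{1}{2}[v_{xx}]_0 + \Re\!\left(\exp(4\pi i t)\,[v_{xx}]_2\right)\right) + O(h^3), \tag{17}$$

$$T = \frac{2\pi}{\omega}, \qquad \omega = \omega_H + \delta\omega + O(h^4), \tag{18}$$

$$\lambda = \lambda_H + \delta\lambda + O(h^4) \tag{19}$$

for a small parameter $h$. The increments $\delta\lambda$ and $\delta\omega$ are the solution of the linear system

$$B\begin{pmatrix}\delta\lambda\\ \delta\omega\end{pmatrix} = -\frac{h^2}{6}\begin{pmatrix}\Re\,\phi_{xxx}\\ \Im\,\phi_{xxx}\end{pmatrix}. \tag{20}$$

The objects $[v_x]_1$, $[v_\lambda]_0$, $[v_{xx}]_0$, $[v_{xx}]_2$ are vectors in $\mathbb{C}^n$ and $\phi_{xxx}$, $\phi_{x\lambda}$, $\phi_{x\omega}$ are complex constants, see [8], p. 1167. They are computed at a particular Hopf bifurcation point $u_H$, $\lambda_H$, $\omega_H$.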
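The linear system (20) is small enough to sketch directly. The snippet below is a minimal illustration, not the authors' m-file: it assumes that $B$ is the real $2\times 2$ matrix obtained by splitting the truncated equation (16) with $x = h$ into its real and imaginary parts, and the numerical values of the complex constants are made up for the demonstration.

```python
import numpy as np

def hopf_increments(phi_xxx, phi_xl, phi_xw, h):
    """Solve the 2x2 system (20) for (delta_lambda, delta_omega).

    Assumption: B holds the real and imaginary parts of phi_xlambda and
    phi_xomega, which is what splitting (16) with x = h produces.
    """
    B = np.array([[phi_xl.real, phi_xw.real],
                  [phi_xl.imag, phi_xw.imag]])
    rhs = -(h**2 / 6.0) * np.array([phi_xxx.real, phi_xxx.imag])
    d_lambda, d_omega = np.linalg.solve(B, rhs)
    return d_lambda, d_omega

# Hypothetical constants, for illustration only.
phi_xxx, phi_xl, phi_xw = -3.0 + 1.5j, 0.8 + 0.2j, 0.0 - 1.0j
h = 0.05
d_lambda, d_omega = hopf_increments(phi_xxx, phi_xl, phi_xw, h)

# The increments should annihilate the truncated complex equation (16).
residual = phi_xxx * h**2 / 6.0 + phi_xl * d_lambda + phi_xw * d_omega
print(d_lambda, d_omega, abs(residual))
```

Since the right-hand side of (20) scales with $h^2$, doubling $h$ multiplies both increments by four, which is a convenient sanity check on any implementation.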
Truncating the higher order terms in (17), (18) and (19) we get the 2nd-order cycle approximation.
Let us differentiate the cycle $(u(t), T, \lambda)$ with respect to $h$. The resulting differentials $\left(\frac{d}{dh}u(t), \frac{d}{dh}\lambda, \frac{d}{dh}T\right)$ can be considered as the cycle velocity. For the components of the cycle velocity we obtain the following asymptotic formulae:

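$$\frac{d}{dh}u(t) = 2\,\Re\!\left(\exp(2\pi i t)\,[v_x]_1\right) + \delta\lambda\,[v_\lambda]_0 + 2h\left(\tfrac{1}{2}[v_{xx}]_0 + \Re\!\left(\exp(4\pi i t)\,[v_{xx}]_2\right)\right) + O(h^2), \tag{21}$$

$$\frac{d}{dh}T = -\frac{2\pi}{\omega^2}\,\frac{d}{dh}\omega, \qquad \frac{d}{dh}\omega = \delta\omega + O(h^3), \tag{22}$$

$$\frac{d}{dh}\lambda = \delta\lambda + O(h^3), \tag{23}$$

where the increments $\delta\lambda$, $\delta\omega$ satisfy

$$B\begin{pmatrix}\delta\lambda\\ \delta\omega\end{pmatrix} = -\frac{h}{3}\begin{pmatrix}\Re\,\phi_{xxx}\\ \Im\,\phi_{xxx}\end{pmatrix}. \tag{24}$$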

Truncating the higher order terms in (21), (22) and (23) we obtain the
1st-order cycle velocity approximation.

Remark 1. Assuming the bordering (12) or (14), it turns out that $[v_x]_1 = \xi_H$.
Actually, for a generic bordering, $[v_x]_1$ is a positive multiple of $\xi_H$.

Truncating the 2nd-order terms in the expansions (17), (18) and (19), and the 1st-order terms in the expansions (21), (22) and (23), we get

$$u^0(t) = u_H + 2h\,\Re\!\left(\exp(2\pi i t)\,[v_x]_1\right), \qquad v^0(t) = 2\,\Re\!\left(\exp(2\pi i t)\,[v_x]_1\right). \tag{25}$$

The claim is that, choosing $L$ properly, we get (4) and (6). In fact, due to
Remark 1, the choice e.g. $L = 2\xi_H$ or $L = 2\eta_H$ would do.
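The predictors of this section are plain trigonometric polynomials in $t$, so they are easy to evaluate. The following sketch evaluates $u^0(t)$ from (25) and, optionally, the 2nd-order terms of (17). It is an illustration only: the function name and the Fourier vector `vx1` are hypothetical, and the weights of the $h^2$ terms follow the reading of (17) in which $[v_{xx}]_0$ carries the factor $1/2$.

```python
import numpy as np

def cycle_predictor(t, h, u_H, vx1, order=1, d_lambda=0.0,
                    vl0=None, vxx0=None, vxx2=None):
    """Evaluate the truncated cycle expansion at time t (period rescaled to one).

    order=1 returns u^0(t) from (25); order=2 adds the delta_lambda and h^2
    terms of (17), in which case vl0, vxx0 and vxx2 must be supplied.
    """
    u = u_H + 2.0 * h * np.real(np.exp(2j * np.pi * t) * vx1)
    if order == 2:
        u = u + d_lambda * np.real(vl0)
        u = u + h**2 * (0.5 * np.real(vxx0)
                        + np.real(np.exp(4j * np.pi * t) * vxx2))
    return u

u_H = np.array([0.7623, 0.3588, 0.0524])            # Hopf point quoted in Section 4
vx1 = np.array([0.5 + 0.1j, -0.2 + 0.3j, 0.05 - 0.4j])  # hypothetical [v_x]_1

h = 0.01
u0 = cycle_predictor(0.25, h, u_H, vx1)
v0 = (u0 - u_H) / h     # equals v^0(0.25) from (25), since u^0 is linear in h
print(u0, v0)
```

For $h = 0$ the predictor collapses to the steady state $u_H$, and it is 1-periodic in $t$ for every $h$.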

4 Numerical tests
We consider

$$\dot{x} = x\left[\mu^{+}\left(\frac{\alpha}{\beta + x} - k\right) - \mu^{-}\gamma\,y z\right],$$
$$\dot{y} = y\,(\varepsilon x - \gamma z - \rho),$$
$$\dot{z} = z\,(\sigma\gamma y - \delta),$$

which depends on ten parameters. This dynamical system models corruption
in democratic societies, see [12]. In [6], p. 317, the system was investigated for
the parameter setting

$\alpha = 1.5$, $\beta = 0.5$, $k = 1$, $\rho = 0.1$, $\mu^{+} = 1$, $\mu^{-} = 10$, $\gamma = 1$, $\sigma = 2$,

leaving two parameters $\varepsilon$ and $\delta$ free. A Generalized Hopf
bifurcation point (GH) was detected as a codim = 2 organizing center, together with a branch of Hopf
bifurcation points containing this GH. We consider the particular Hopf point
with $\varepsilon = 0.2$. The coordinates of this point are $u_H = [0.7623; 0.3588;
0.0524]$ and $\delta_H = 0.7176$. Fig. 1 shows a branch of limit cycles emanating
from $(u_H, \delta_H)$. There is a turning point on the branch (LPC); the detection of
bifurcation points was switched off. A zoom of the first two cycles is shown in
Fig. 2. It illustrates the action of the initialization procedure init_H_LC, see Section
1.
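As a quick plausibility check of the quoted Hopf point, the right-hand side of the model can be evaluated at $(u_H, \delta_H)$ with $\varepsilon = 0.2$. The sketch below is only an illustration: the function and constant names are chosen here, and the residual cannot vanish exactly because the coordinates are quoted to four digits.

```python
import numpy as np

# Parameter values quoted in the text (setting from [6], p. 317).
ALPHA, BETA, K, RHO = 1.5, 0.5, 1.0, 0.1
MU_PLUS, MU_MINUS, GAMMA, SIGMA = 1.0, 10.0, 1.0, 2.0

def corruption_rhs(u, eps, delta):
    """Right-hand side of the three-dimensional corruption model."""
    x, y, z = u
    dx = x * (MU_PLUS * (ALPHA / (BETA + x) - K) - MU_MINUS * GAMMA * y * z)
    dy = y * (eps * x - GAMMA * z - RHO)
    dz = z * (SIGMA * GAMMA * y - delta)
    return np.array([dx, dy, dz])

u_H = np.array([0.7623, 0.3588, 0.0524])
residual = corruption_rhs(u_H, eps=0.2, delta=0.7176)

# Small residual: u_H is a steady state up to the rounding of its coordinates.
print(np.max(np.abs(residual)))
```

The check confirms only that $u_H$ is an equilibrium; locating the Hopf point itself requires the eigenvalues of the Jacobian, which the continuation toolbox handles.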
All computations were performed using CL_MATCONT, see [1] and [2]. The
toolbox works really nicely.
In order to compare the action of init_H_LC with the 2nd-order predictor,
we coded the algorithm from Section 3 as an m-file called init_H_LC_new. It
is supposed to replace init_H_LC in the cycle continuation. Fig. 3 shows
a sequence of the 2nd-order cycle approximations due to init_H_LC_new for
selected values of $h$. The bordering (12) is considered. Fig. 4 compares the actions of
init_H_LC (dotted) with the actions of init_H_LC_new. Note that the dotted
cycles do not advance with the parameter $\delta$.
The parameters of both init_H_LC and init_H_LC_new are the already mentioned
$h$ and the collocation data setting. The latter will be fixed as ntst = 10,
ncol = 4. Let us consider the particular Hopf point $(u_H, \delta_H)$.
Using init_H_LC, one needs to set $h \le 0.0003$ to achieve a successful
continuation. Choosing $h$ too small, say $h \le 0.00001$, the continuation is
not initialized either, due to round-off errors. On the other hand, applying
init_H_LC_new with bordering (14), the continuation is successful for
$h \le 0.002$.

Fig. 1. Continuation of the limit cycle via orthogonal collocations (axes: x, y)

Fig. 2. Zoom - the initial cycle (solid), the next cycle (dotted) (axes: x, y, delta)
Fig. 3. The 2nd-order approximations of cycles; h=0.01:0.01:0.09, h=0.1:0.1:0.3 (axes: x, delta)

Fig. 4. The 1st-order (dotted) vs the 2nd-order (solid) predictors; h=0.08, h=0.09,
h=0.1 (axes: x, delta)

Conclusion: Using the 2nd-order predictor we can afford to take a larger
amplitude $h$. This means that we can skip over the initial stages of the continuation.

Acknowledgments: The authors acknowledge with pleasure the support of
the Grant Agency of the Czech Republic (grant # 201/02/0844, grant #
201/02/0595) and the support of the projects of the Czech Ministry of Education:
MSM 113200007 and MSM 223400007.

References
1. Dhooge, A., Govaerts, W., Kuznetsov, Yu.A., Mestrom, W., Riet, A.M.:
CL_MATCONT: A continuation toolbox in Matlab. Gent University,
http://allserv.rug.ac.be/~ajdhooge/
2. Dhooge, A., Govaerts, W., Kuznetsov, Yu.A. (2003): MATCONT: A Matlab package
for numerical bifurcation analysis of ODEs. ACM Transactions on Mathematical
Software, 31, 141-164
3. Doedel, E.J., Govaerts, W., Kuznetsov, Yu.A.: Computation of periodic solution
bifurcations in ODEs using bordered systems. SIAM J. Numer. Anal., to appear
4. Doedel, E.J., Kernevez, J. (1986): AUTO: Software for continuation problems in
ordinary differential equations with applications. California Institute of Technology,
Applied Mathematics
5. Golubitsky, M., Schaeffer, D.G. (1985): Singularities and Groups in Bifurcation
Theory. Vol. 1. Springer-Verlag, New York
6. Govaerts, W. (2000): Numerical methods for bifurcation of dynamical equilibria.
SIAM, Philadelphia
7. Govaerts, W., Kuznetsov, Yu.A., Sijnave, B. (2000): Numerical methods for the
generalized Hopf bifurcation. SIAM J. Numer. Anal., 38, 329-346
8. Janovsky, V., Plechac, P. (1996): Local numerical analysis of Hopf bifurcation.
SIAM J. Numer. Anal., 33, 1150-1168
9. Kuznetsov, Yu.A. (1995): Elements of Applied Bifurcation Theory. Springer-Verlag,
New York
10. Kuznetsov, Yu.A., Levitin, V.V. (1977): CONTENT: Integrated
environment for analysis of dynamical systems. CWI, Amsterdam,
ftp://ftp.cwi.nl/pub/CONTENT
11. Marsden, J.E., McCracken, M. (1976): The Hopf Bifurcation and Its Applications.
Springer-Verlag, New York
12. Rinaldi, S., Feichtinger, G., Wirl, F. (1998): Corruption dynamics in democratic
societies. Complexity, 3, 53-64
13. Xu, H., Janovsky, V., Werner, B. (1998): Numerical computation of degenerate
Hopf points. ZAMM Z. Angew. Math. Mech., 78, 807-821
Givens' Reduction of Quaternion-Valued
Matrices to Upper Hessenberg Form

Drahoslava Janovská¹ and Gerhard Opfer²

¹ Institute of Chemical Technology, CZ-166 28 Prague 6, Czech Republic,
janovskd@vscht.cz
² University of Hamburg, Department of Mathematics, Bundesstraße 55,
D-20146 Hamburg, Germany, [email protected]

Summary. The number of real operations necessary to reduce a quaternion-valued
matrix into a similar upper Hessenberg matrix by making use of quaternion-valued
Givens' transformation matrices is relatively large. Two possibilities of how to reduce
the columns of a quaternion-valued matrix more effectively are presented.

1 Introduction

In the literature, several applications of quaternions and quaternion-valued
matrices can be found. In most of them, the quaternion-valued matrix has to be
factored. One conceivable reason is that factorizations yield more time-efficient
processing schemes. Another reason is that factorizations can be chosen so as
to provide necessary geometric relationships in certain applications.
Let us give an example, which arises in quantum mechanics, see [1], [2].
In particular, solving the Secular Equation, we have to find an eigensystem of
a matrix A. If the nonrelativistic or spin-orbitless problem is considered, the
matrix A is real and symmetric. The effect of including the spin-orbit coupling
into the model is to replace each scalar matrix element with a 2 x 2 complex
matrix of the form

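As a consequence, the matrix $A$ becomes complex, in general non-Hermitean,
and its size is doubled, i.e. the smallest eigenvalue ($\sim 10^{-9}$) occurs twice. In
numerical calculations, computational noise is dramatically increased.
Let $\widetilde{\mathbb{H}}$ be the space of such complex $2 \times 2$ matrices:

$$\widetilde{\mathbb{H}} := \left\{ h = \begin{pmatrix} \alpha & \beta\\ -\bar\beta & \bar\alpha \end{pmatrix} : \alpha = a_1 + i a_2,\ \beta = a_3 + i a_4,\ a_1, a_2, a_3, a_4 \in \mathbb{R} \right\}$$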

and let IHI be the space of quaternions:



Since $\mathbb{H}$ and $\widetilde{\mathbb{H}}$ are isomorphic, see [4], we can replace $2 \times 2$ complex matrices
with the corresponding quaternions and use quaternion arithmetic in calculations.
The arithmetic of quaternions is, of course, more complicated, but it allows us
to economize on storage and to increase the accuracy of the results, see [1].
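The isomorphism can be made concrete with a few lines of code. The sketch below assumes the matrix form $\begin{pmatrix}\alpha & \beta\\ -\bar\beta & \bar\alpha\end{pmatrix}$ with $\alpha = a_1 + i a_2$, $\beta = a_3 + i a_4$, and checks that the images of the quaternion units satisfy Hamilton's relations $i^2 = j^2 = k^2 = ijk = -1$; the function name is chosen here for illustration.

```python
import numpy as np

def to_complex_matrix(a1, a2, a3, a4):
    """Map the quaternion (a1, a2, a3, a4) to its 2x2 complex representation."""
    alpha, beta = complex(a1, a2), complex(a3, a4)
    return np.array([[alpha, beta],
                     [-beta.conjugate(), alpha.conjugate()]])

one = to_complex_matrix(1, 0, 0, 0)
I = to_complex_matrix(0, 1, 0, 0)
J = to_complex_matrix(0, 0, 1, 0)
K = to_complex_matrix(0, 0, 0, 1)

# The images of i, j, k obey the defining relations of the quaternion units.
relations_hold = all(np.allclose(M, -one)
                     for M in (I @ I, J @ J, K @ K, I @ J @ K))
print(relations_hold)
```

The storage argument is visible here as well: a quaternion is four real numbers, whereas its $2 \times 2$ complex image stores eight.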
In this paper, our aim is to develop an algorithm for the reduction of a
given quaternion-valued matrix to upper Hessenberg form. First, we
briefly review the algebra of quaternions, see [6]. We also briefly mention the
algorithm of Givens' reduction of a quaternion-valued vector $x \in \mathbb{H}^2$. It can be
found in [3], where all necessary facts are also proved in detail. Then, Givens'
transformation matrices for a vector $x \in \mathbb{H}^n$ are defined. We discuss some
possibilities of reducing the number of necessary real operations. Finally, the
number of real operations for the reduction of a quaternion-valued matrix to
a similar matrix in upper Hessenberg form is calculated. Our aim is to
show in particular that it is necessary to develop an algorithm which reduces
the large number of operations. A numerical example of Givens' reduction of
a quaternion-valued matrix to a similar upper Hessenberg form is presented.

2 Short review of the algebra of quaternions

Let $\mathbb{H} = \mathbb{R}^4$ be equipped with the ordinary vector space structure and with
an additional multiplicative operation $\mathbb{H} \times \mathbb{H} \to \mathbb{H}$, which can most easily be
defined by a multiplication of the four basis elements

$$(1,0,0,0) = 1, \quad (0,1,0,0) = i, \quad (0,0,1,0) = j, \quad (0,0,0,1) = k,$$

$$i^2 = j^2 = k^2 = ijk = -1. \tag{1}$$

An element $x = (x_1, x_2, x_3, x_4) \in \mathbb{H}$ has the representation

$$x = x_1 + x_2 i + x_3 j + x_4 k, \tag{2}$$

where $x_1, x_2, x_3, x_4 \in \mathbb{R}$, and $\Re x = x_1$ is the real part of $x$. We will identify
the quaternion $x = (x_1, 0, 0, 0)$ with the real number $x_1$; the quaternion $x =
(x_1, x_2, 0, 0)$ will be identified with the complex number $x_1 + i x_2$. For $x =
(x_1, x_2, x_3, x_4) \in \mathbb{H}$, $y = (y_1, y_2, y_3, y_4) \in \mathbb{H}$ it follows from (1) that

$$xy = (x_1 y_1 - x_2 y_2 - x_3 y_3 - x_4 y_4)\,1 + (x_1 y_2 + x_2 y_1 + x_3 y_4 - x_4 y_3)\,i$$
$$\qquad + (x_1 y_3 - x_2 y_4 + x_3 y_1 + x_4 y_2)\,j + (x_1 y_4 + x_2 y_3 - x_3 y_2 + x_4 y_1)\,k.$$

Obviously, in general, the multiplication is not commutative.
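The product formula translates directly into code. The following minimal sketch stores a quaternion as a 4-tuple $(x_1, x_2, x_3, x_4)$ (a representation chosen here for illustration) and shows the non-commutativity on the basis elements.

```python
def qmul(x, y):
    """Hamilton product of two quaternions given as 4-tuples."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (x1*y1 - x2*y2 - x3*y3 - x4*y4,
            x1*y2 + x2*y1 + x3*y4 - x4*y3,
            x1*y3 - x2*y4 + x3*y1 + x4*y2,
            x1*y4 + x2*y3 - x3*y2 + x4*y1)

one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

print(qmul(i, j))   # (0, 0, 0, 1), i.e. k
print(qmul(j, i))   # (0, 0, 0, -1), i.e. -k
```

One quaternion product costs 16 real multiplications and 12 real additions in this form, which is the origin of the figure of 28 real flops used in Section 5.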
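Given $x$ according to (2), the conjugate $\bar{x}$ of $x$ is defined to be

$$\bar{x} = x_1 - x_2 i - x_3 j - x_4 k. \tag{3}$$
Let us note that conjugation obeys the rules

$$\overline{xy} = \bar{y}\,\bar{x}, \qquad \bar{\bar{x}} = x.$$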

We define the absolute value of $x$ by

$$|x| = \sqrt{x_1^2 + x_2^2 + x_3^2 + x_4^2}. \tag{4}$$

We will use the following properties of $|x|$:

$$|xy| = |yx| = |x|\,|y|. \tag{5}$$

Here, we use the convention that a real number $x$ may be identified with the
quaternion $(x, 0, 0, 0)$.
The space $\mathbb{H}$ is a normed vector space over $\mathbb{H}$, where the norm is introduced
in (4).
Let us remark that for any $x \in \mathbb{H}\setminus\{0\}$ an inverse quaternion $x^{-1}$ is defined,

$$x^{-1} = \frac{\bar{x}}{|x|^2}, \tag{6}$$

(7)
an equation which will be used later.
Two quaternions $x$ and $y$ are called equivalent, denoted by $x \sim y$, if $y =
a^{-1} x a$ for some $a \in \mathbb{H}\setminus\{0\}$. For fixed $x \in \mathbb{H}$ the set

$$[x] = \{\, y \in \mathbb{H} : y = a^{-1} x a \ \text{for}\ a \in \mathbb{H}\setminus\{0\} \,\} \tag{8}$$

is called the equivalence class of $x$.

Lemma 1. Two quaternions $x$ and $y$ are equivalent if and only if

$$\Re x = \Re y \quad \text{and} \quad |x| = |y|. \tag{9}$$

is the only complex element in $[x]$ with nonnegative imaginary part.

Let $x = (x_1, x_2, \dots, x_n)^T \in \mathbb{H}^n$ and define the norm $\|x\|$ of $x$ by

$$\|x\| = \sqrt{\sum_{j=1}^{n} |x_j|^2}. \tag{10}$$

The space $\mathbb{H}^n$ becomes a normed vector space over $\mathbb{H}$ with the norm defined
in (10). For $x \in \mathbb{H}^n$, we denote by $x^*$ the transpose of the entrywise conjugate
of $x$.
We define the conjugate transposition of the matrix $B = (b_{ij}) \in \mathbb{H}^{m \times n}$ as
$B^* = (\bar{b}_{ji}) \in \mathbb{H}^{n \times m}$. The square quaternion-valued matrix $B \in \mathbb{H}^{n \times n}$ is called
Hermitean if $B = B^*$, and $B$ is positive definite if it is Hermitean and

$$x^* B x > 0 \quad \forall\, x \in \mathbb{H}^n \setminus \{0\}.$$

A matrix $B \in \mathbb{H}^{n \times n}$ is said to be unitary if $B^* B = I$.
Theorem 1. A matrix $B \in \mathbb{H}^{n \times n}$ is unitary if and only if $\|Bx\| = \|x\|$ for
all $x \in \mathbb{H}^n$.

Definition 1. Let $B \in \mathbb{H}^{n \times n}$. If there exist a vector $x \in \mathbb{H}^n \setminus \{0\}$ and a quaternion
$\lambda \in \mathbb{H}$ such that
$$Bx = x\lambda, \tag{11}$$
we call $\lambda$ an eigenvalue of $B$ and $x$ an eigenvector corresponding to $\lambda$.

The number of eigenvalues of a quaternion-valued matrix $B \in \mathbb{H}^{n \times n}$ is, in
general, not finite. If $\lambda$ is an eigenvalue, one can easily show that the whole
equivalence class $[\lambda]$ consists of eigenvalues. If $\lambda$ is real, then $[\lambda] = \{\lambda\}$; if $\lambda$
is not real, then according to Corollary 1 there exists exactly one complex
number $\tilde\lambda \in [\lambda]$ with positive imaginary part. It can also be proved that the
number of nonequivalent eigenvalues is at most $n$.

Theorem 2. Let $A \in \mathbb{H}^{n \times n}$ be Hermitean. Then $A$ has only real eigenvalues
and their number is $n$.

Theorem 3. For any unitary quaternion-valued matrix $A$, the eigenvalues $\lambda$
satisfy
$$|\lambda| = 1.$$
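3 Givens' transformation of a vector $x \in \mathbb{H}^2$

Let $x \in \mathbb{H}^2 \setminus \{0\}$ be given. Let $G$ be a matrix of the form

$$G = \begin{pmatrix} c & s\\ -\bar{s} & \bar{c} \end{pmatrix} \in \mathbb{H}^{2 \times 2}. \tag{12}$$

Our aim is to find $c$ and $s$ such that

(i) $G$ is unitary,
(ii) $G^* x = u e_1$, where $u \in \mathbb{H} \setminus \{0\}$, $e_1 = (1, 0)^T$.
The solution is contained in the following theorem.

Theorem 4. Let $x = (x_1, x_2)^T \in \mathbb{H}^2 \setminus \{0\}$ be given. Define

$$G = \begin{pmatrix} c & s\\ -\bar{s} & \bar{c} \end{pmatrix} \in \mathbb{H}^{2 \times 2}, \qquad s = -\sigma\,\frac{\bar{x}_2}{\|x\|}, \qquad |\sigma| = 1,$$

where $\sigma$ is arbitrary in case $x_1 = \alpha x_2$, $\alpha \in \mathbb{R} \setminus \{0\}$. Otherwise we have to
choose $\sigma \in E$, where

$$E = \left\{ \sigma \in \mathbb{H} : \sigma = \frac{\alpha x_1 + \beta x_2}{|\alpha x_1 + \beta x_2|},\ \alpha, \beta \in \mathbb{R},\ |\alpha| + |\beta| > 0 \right\}.$$

Then $G$ is a unitary matrix,

$$G^* x = u = \sigma \|x\| (1, 0)^T$$
and there are no other unitary quaternion-valued matrices satisfying this condition.

Proof. See [3].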


Remark 1. The set $E$ is not empty, since it contains the subset $\{\pm\operatorname{sgn} x_1, \pm\operatorname{sgn} x_2\}$.
But it is different from the whole unit sphere, since $\pm 1$ do not, in
general, belong to $E$. Let us note that $s$ is real for the choice $\sigma = \pm\operatorname{sgn} x_2$ and
$c$ is real if $\sigma = \pm\operatorname{sgn} x_1$.
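Theorem 4 and Remark 1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' MATLAB code: it takes $G = \begin{pmatrix} c & s\\ -\bar s & \bar c\end{pmatrix}$ with $s = -\sigma \bar{x}_2 / \|x\|$, and it uses $c = x_1 \bar\sigma / \|x\|$, which is the value forced by the requirement $G^* x = \sigma \|x\| e_1$ (the formula for $c$ is not legible in this copy). Quaternions are stored as length-4 NumPy arrays, and $\operatorname{sgn} x = x/|x|$.

```python
import numpy as np

def qmul(x, y):
    """Hamilton product of quaternions stored as length-4 arrays."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return np.array([x1*y1 - x2*y2 - x3*y3 - x4*y4,
                     x1*y2 + x2*y1 + x3*y4 - x4*y3,
                     x1*y3 - x2*y4 + x3*y1 + x4*y2,
                     x1*y4 + x2*y3 - x3*y2 + x4*y1])

def qconj(x):
    """Quaternion conjugate."""
    return x * np.array([1.0, -1.0, -1.0, -1.0])

def givens_cs(x1, x2, sigma):
    """Return c, s and ||x|| for the 2x2 quaternion Givens matrix."""
    norm = np.sqrt(x1 @ x1 + x2 @ x2)
    c = qmul(x1, qconj(sigma)) / norm       # assumed form, see lead-in
    s = -qmul(sigma, qconj(x2)) / norm
    return c, s, norm

x1 = np.array([1.0, 2.0, -1.0, 0.5])
x2 = np.array([0.0, -3.0, 2.0, 1.0])
sigma = x1 / np.linalg.norm(x1)             # sgn x1, an element of E
c, s, norm = givens_cs(x1, x2, sigma)

# G* = [[conj(c), -s], [conj(s), c]] applied to x = (x1, x2)^T.
top = qmul(qconj(c), x1) - qmul(s, x2)
bottom = qmul(qconj(s), x1) + qmul(c, x2)
print(top, bottom)
```

With $\sigma = \operatorname{sgn} x_1$ the entry $c$ comes out real, as Remark 1 states, so products with $c$ cost 4 real multiplications instead of a full quaternion product.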

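4 Givens' reduction of a vector $x \in \mathbb{H}^n$

Let $x = (x_1, \dots, x_n)^T \in \mathbb{H}^n \setminus \{0\}$. For $1 \le i < j \le n$, we define the Givens'
rotation matrix $G^i_j \in \mathbb{H}^{n \times n}$ as follows:

$$G^i_j = \begin{pmatrix}
1 & & & & & & \\
 & \ddots & & & & & \\
 & & c_i & \cdots & s_i & & \\
 & & \vdots & \ddots & \vdots & & \\
 & & -\bar{s}_i & \cdots & \bar{c}_i & & \\
 & & & & & \ddots & \\
 & & & & & & 1
\end{pmatrix}, \tag{13}$$

with the entries $c_i$, $s_i$, $-\bar{s}_i$, $\bar{c}_i$ placed in rows and columns $i$ and $j$,
where
(14)

$\sigma_i \in \mathbb{H}$, $|\sigma_i| = 1$, $\sigma_i$ is arbitrary if $x_i = \alpha x_j$, $\alpha \in \mathbb{R} \setminus \{0\}$. Otherwise, $\sigma_i \in E_i$,

$$E_i = \left\{ \sigma \in \mathbb{H} : \sigma = \frac{\alpha x_i + \beta x_j}{|\alpha x_i + \beta x_j|},\ \alpha, \beta \in \mathbb{R},\ |\alpha| + |\beta| > 0 \right\}. \tag{15}$$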
Then $G^i_j$ is a unitary matrix and

Let us suppose that our aim is to reduce all components of a given
quaternion-valued vector except the first one. We will discuss two possibilities
of the ordering of such a reduction.
Let us reduce components in the order from bottom to top. In order to
decrease the number of operations, all matrices necessary for the reduction
can be multiplied to obtain just one matrix of the reduction.

Theorem 5. For a given vector $x = (x_1, \dots, x_n)^T \in \mathbb{H}^n$, $x_n \neq 0$, let

(16)

Then, there exists $\sigma \in \mathbb{H}$, $|\sigma| = 1$ such that

$$G^* x = (\sigma \|x\|, 0, \dots, 0)^T \in \mathbb{H}^n.$$

The unitary matrix

has upper Hessenberg form.

Proof. All matrices $G^i_j$ in (16) are unitary, so is their product.

Let us set $\rho_i = \sqrt{\sum_{j=i}^{n} |x_j|^2}$, $i = 1, \dots, n-1$. Since $x_n \neq 0$, also $\rho_i \neq 0$ for
all $i$.
Let us start from the first reduction. We construct $G^{n-1}_n$ to reduce the last
component of $x \in \mathbb{H}^n$:

where
$$u_{n-1} = \sigma_{n-1} \sqrt{|x_{n-1}|^2 + |x_n|^2},$$
$$\sigma_{n-1} = \frac{\alpha_{n-1} x_{n-1} + \beta_{n-1} x_n}{|\alpha_{n-1} x_{n-1} + \beta_{n-1} x_n|}, \qquad \alpha_{n-1}, \beta_{n-1} \in \mathbb{R},\ |\alpha_{n-1}| + |\beta_{n-1}| > 0.$$
In the next step we continue:

where

$$\sigma_{n-2} = \frac{\alpha_{n-2} x_{n-2} + \beta_{n-2}\,\sigma_{n-1}\rho_{n-1}}{|\alpha_{n-2} x_{n-2} + \beta_{n-2}\,\sigma_{n-1}\rho_{n-1}|}, \qquad \alpha_{n-2}, \beta_{n-2} \in \mathbb{R},\ |\alpha_{n-2}| + |\beta_{n-2}| > 0.$$
Now, we form the product $(G^{n-2}_{n-1})^* (G^{n-1}_n)^*$ and continue by reducing $u_{n-2}$.
By induction we obtain the following explicit formulae for the matrix $G^* =
(g_{kj}) \in \mathbb{H}^{n \times n}$:
gl,j j = 1, .. . ,n;

gi+l,i i = 1, ... ,n - 2;

_ _ Xk-l(Jk-l(JkXj k_ _
gk,j - , - 2, ... ,n 1, j = k, ... ,n;
i?k-li?k

Xn~
gn,n-l =
i?n-l
Xn-l~
gn,n
i?n-l
gi,j 0, i = 3, ... , n, j = 1, ... , i - 2;

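where
$$\sigma_{n-1} = \frac{\alpha_{n-1} x_{n-1} + \beta_{n-1} x_n}{|\alpha_{n-1} x_{n-1} + \beta_{n-1} x_n|},$$

$$\sigma_i = \frac{\alpha_i x_i + \beta_i\,\sigma_{i+1}\rho_{i+1}}{|\alpha_i x_i + \beta_i\,\sigma_{i+1}\rho_{i+1}|}, \qquad \alpha_i, \beta_i \in \mathbb{R},\ |\alpha_i| + |\beta_i| > 0 \quad \text{for all } i = n-2, \dots, 1,$$

i.e., the matrix $G^*$ has upper Hessenberg form and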

Remark 2. For a given quaternion-valued vector $x \in \mathbb{H}^n$ we can construct
a Householder transformation matrix $H$, which also reduces all components of
$x$ except the first one: $Hx = u e_1$, $u \in \mathbb{H}$, $e_1 = (1, 0, \dots, 0)^T$, but this matrix
$H$ is full, in general.

For a given $x = (x_1, \dots, x_n)^T \in \mathbb{H}^n$, $x_n \neq 0$, let us choose $\sigma_{i-1} = \operatorname{sgn} x_i$ at
each step of the reduction. Such $\sigma_i$ always belongs to the set $E_i$, see Remark
1. If we set $\sigma = \sigma_{n-1}$ we obtain

$$\sigma_i = \sigma \quad \text{for } i = 1, \dots, n-1$$

and $G^* x = (\sigma \|x\|, 0, \dots, 0)^T = (\operatorname{sgn} x_n\, \|x\|, 0, \dots, 0)^T$, where
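$$G^* = \begin{pmatrix}
\dfrac{\sigma \bar{x}_1}{\|x\|} & \dfrac{\sigma \bar{x}_2}{\|x\|} & \dfrac{\sigma \bar{x}_3}{\|x\|} & \cdots & \dfrac{\sigma \bar{x}_{n-1}}{\|x\|} & \dfrac{\sigma \bar{x}_n}{\|x\|}\\[2ex]
\dfrac{\rho_2}{\rho_1} & -\dfrac{x_1 \bar{x}_2}{\rho_1 \rho_2} & -\dfrac{x_1 \bar{x}_3}{\rho_1 \rho_2} & \cdots & -\dfrac{x_1 \bar{x}_{n-1}}{\rho_1 \rho_2} & -\dfrac{x_1 \bar{x}_n}{\rho_1 \rho_2}\\[2ex]
0 & \dfrac{\rho_3}{\rho_2} & -\dfrac{x_2 \bar{x}_3}{\rho_2 \rho_3} & \cdots & -\dfrac{x_2 \bar{x}_{n-1}}{\rho_2 \rho_3} & -\dfrac{x_2 \bar{x}_n}{\rho_2 \rho_3}\\[2ex]
\vdots & \ddots & \ddots & & \vdots & \vdots\\[1ex]
0 & \cdots & 0 & \dfrac{\rho_{n-1}}{\rho_{n-2}} & -\dfrac{x_{n-2} \bar{x}_{n-1}}{\rho_{n-2} \rho_{n-1}} & -\dfrac{x_{n-2} \bar{x}_n}{\rho_{n-2} \rho_{n-1}}\\[2ex]
0 & \cdots & 0 & 0 & \dfrac{x_n \bar\sigma}{\rho_{n-1}} & -\dfrac{x_{n-1} \bar\sigma}{\rho_{n-1}}
\end{pmatrix}$$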

The elements below the main diagonal are real.
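The structure of $G^*$ for the choice $\sigma = \operatorname{sgn} x_n$ can be verified numerically. The sketch below is an illustration, not the authors' implementation. It assumes the entries $g_{1,j} = \sigma \bar{x}_j / \|x\|$, $g_{k+1,k} = \rho_{k+1}/\rho_k$ and $g_{k+1,j} = -x_k \bar{x}_j / (\rho_k \rho_{k+1})$ for $j > k$, with $\rho_i = \sqrt{\sum_{j \ge i} |x_j|^2}$ (so $\rho_n = |x_n|$); the signs and conjugations are a reconstruction that the code then tests. The sample vector is arbitrary.

```python
import numpy as np

def qmul(x, y):
    """Hamilton product of quaternions stored as length-4 arrays."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return np.array([x1*y1 - x2*y2 - x3*y3 - x4*y4,
                     x1*y2 + x2*y1 + x3*y4 - x4*y3,
                     x1*y3 - x2*y4 + x3*y1 + x4*y2,
                     x1*y4 + x2*y3 - x3*y2 + x4*y1])

def qconj(x):
    return x * np.array([1.0, -1.0, -1.0, -1.0])

def build_G_star(x):
    """x: (n, 4) array with x[-1] != 0. Returns G* as an (n, n, 4) array."""
    n = len(x)
    # rho[i] = sqrt(sum_{j >= i} |x_j|^2) with 0-based i, so rho[0] = ||x||.
    rho = np.sqrt(np.cumsum((x**2).sum(axis=1)[::-1])[::-1])
    sigma = x[-1] / rho[-1]                          # sgn x_n
    G = np.zeros((n, n, 4))
    for j in range(n):                               # first row
        G[0, j] = qmul(sigma, qconj(x[j])) / rho[0]
    for k in range(1, n):
        G[k, k - 1, 0] = rho[k] / rho[k - 1]         # real subdiagonal entry
        for j in range(k, n):
            G[k, j] = -qmul(x[k - 1], qconj(x[j])) / (rho[k - 1] * rho[k])
    return G, sigma, rho[0]

def matvec(G, x):
    n = len(x)
    return np.array([sum(qmul(G[i, j], x[j]) for j in range(n)) for i in range(n)])

def gram(G):
    """Entries of G* (G*)^*, the identity when G* is unitary."""
    n = G.shape[0]
    return np.array([[sum(qmul(G[i, j], qconj(G[k, j])) for j in range(n))
                      for k in range(n)] for i in range(n)])

x = np.array([[5, 0, -4, -4], [0, 0, -3, 4], [3, -3, 2, -3],
              [4, -5, -4, 2], [-5, -1, 3, -2]], dtype=float)
Gs, sigma, norm = build_G_star(x)
y = matvec(Gs, x)
print(np.round(y, 12))
```

Only the first component of `y` is nonzero, and it equals $\sigma\|x\|$; the matrix is assembled in $O(n^2)$ quaternion products without forming the individual rotations.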


If we reduce $x$ in the order from top to bottom, a theorem similar to
Theorem 5 can be proved. In particular,

the resulting $G^*$ is not an upper Hessenberg matrix, but it has "nearly triangular form":

$$\begin{pmatrix}
* & * & * & \cdots & *\\
* & * & 0 & \cdots & 0\\
* & * & * & \ddots & \vdots\\
\vdots & \vdots & \vdots & \ddots & 0\\
* & * & * & \cdots & *
\end{pmatrix};$$

the diagonal elements except the first one are real.

5 A reduction of an arbitrary quaternion-valued matrix into upper Hessenberg form

In order to reduce an arbitrary matrix $Y = (y_{ij}) \in \mathbb{H}^{n \times n}$ to upper Hessenberg
form there are $(n-2)(n-1)/2$ elements to be reduced. They are usually
reduced step by step, for example in the order

$$y_{31}, y_{41}, \dots, y_{n1};\ y_{42}, y_{52}, \dots, y_{n2};\ \dots;\ y_{n,n-2}.$$

The corresponding Givens transformations $G^i_j$, see (13), are then applied to the $k$-th
column of $Y$, $k = 1, \dots, n-2$:

$$x := (y_{1k}, \dots, y_{ik}, \dots, y_{jk}, \dots, y_{nk})^T, \qquad i = k+1;\ j = k+2, \dots, n.$$


Let $1 \le i_0 < j_0 \le n$ be fixed. Let the $k$-th column of the matrix $Y$ play the
role of $x$ in (14), (15).
The corresponding Givens' matrix for the reduction of the element $y_{j_0,k}$
has the form

$$G^{i_0}_{j_0} = (g_{ij}) \in \mathbb{H}^{n \times n}, \qquad \begin{pmatrix} g_{i_0 i_0} & g_{i_0 j_0}\\ g_{j_0 i_0} & g_{j_0 j_0} \end{pmatrix}, \qquad g_{ij} = \delta_{ij} \ \text{otherwise}.$$

Here (for a properly chosen $\sigma_{i_0}$),

Let us count the number of real operations necessary for the reduction of
an arbitrary matrix $Y \in \mathbb{H}^{n \times n}$ to upper Hessenberg form. Moreover, let our
resulting upper Hessenberg matrix be similar to the original one, so that we
can use the reduced form to compute the eigenvalues of the matrix $Y$. Let us
remark that one addition of two quaternions needs 4 real flops and one multiplication
of two quaternions needs 28 real flops. For the reduction of one element of $Y$
we have to perform two matrix multiplications:

$V = (G^{i_0}_{j_0})^* Y$: $4n$ multiplications and $2n$ additions of quaternions,
$V G^{i_0}_{j_0}$: $4n$ multiplications and $2n$ additions of quaternions.

Altogether, the reduction of one element of $Y$ needs $240\,n$ real flops. We can
substantially reduce the number of operations (to $144\,n$ real flops) if $s$ or $c$ is
real.
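The operation counts can be reproduced with simple arithmetic. In the sketch below the figures 4 and 28 come from the text, as does the number $(n-2)(n-1)/2$ of elements to be reduced. The breakdown of the $144\,n$ case is an assumption made here: half of the quaternion multiplications are taken to involve the real one of $c$, $s$ and to cost 4 real flops each. The function names are illustrative.

```python
Q_ADD = 4          # real flops per quaternion addition (from the text)
Q_MUL = 28         # real flops per quaternion multiplication (from the text)
REAL_TIMES_Q = 4   # real number times quaternion (assumed cost)

def flops_per_element(n, real_cs=False):
    """Real flops for annihilating one element of an n x n quaternion matrix."""
    mults, adds = 8 * n, 4 * n      # two products, each 4n mults and 2n adds
    if real_cs:
        # Assumption: half of the multiplications are by a real c or s.
        return (mults // 2) * Q_MUL + (mults // 2) * REAL_TIMES_Q + adds * Q_ADD
    return mults * Q_MUL + adds * Q_ADD

def flops_total(n, real_cs=False):
    """Real flops for the whole reduction to upper Hessenberg form."""
    return flops_per_element(n, real_cs) * (n - 1) * (n - 2) // 2

for n in (5, 50):
    print(n, flops_per_element(n), flops_per_element(n, True), flops_total(n))
```

Because the per-element cost grows like $n$ and the number of elements like $n^2/2$, the total grows like $120\,n^3$ real flops, which is why the choice of a real $c$ or $s$ matters.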
To obtain the upper Hessenberg form we have to reduce $(n-2) + (n-3) + \dots + 2 + 1 = \frac{(n-2)(n-1)}{2}$ elements. Let $\Gamma$ be the product of the corresponding
Givens' matrices:

$$\Gamma = G^2_3\, G^2_4 \cdots G^2_n\; G^3_4\, G^3_5 \cdots G^3_n\; \cdots\; G^{n-1}_n.$$

We denote the resulting upper Hessenberg matrix by $W$. After all reductions
we obtain

$$W = \Gamma^* Y \Gamma, \qquad W \sim Y, \qquad W \text{ is an upper Hessenberg matrix.}$$


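Example 1. The elements of the following $5 \times 5$ quaternion-valued matrix $Y$
are chosen randomly (integer elements in $(-5,5)$). Each column of the tables below lists the four real components of one entry, with respect to $1, i, j, k$. Let $Y =$

$$\begin{array}{c|rrrrr|rrrrr}
 & y_{11} & y_{12} & y_{13} & y_{14} & y_{15} & y_{21} & y_{22} & y_{23} & y_{24} & y_{25}\\ \hline
1 & 5 & 4 & 3 & -3 & -2 & 0 & 0 & 1 & 4 & -5\\
i & 0 & 2 & 3 & 4 & 0 & 0 & 0 & -4 & -1 & 0\\
j & -4 & 5 & 4 & 2 & -4 & -3 & 3 & -4 & 2 & 4\\
k & -4 & 3 & -4 & -1 & 0 & 4 & 2 & 1 & -4 & 3
\end{array}$$

$$\begin{array}{c|rrrrr|rrrrr|rrrrr}
 & y_{31} & y_{32} & y_{33} & y_{34} & y_{35} & y_{41} & y_{42} & y_{43} & y_{44} & y_{45} & y_{51} & y_{52} & y_{53} & y_{54} & y_{55}\\ \hline
1 & 3 & 3 & -4 & -2 & 2 & 4 & 1 & -1 & 3 & 3 & -5 & -2 & 5 & -2 & -5\\
i & -3 & -3 & 1 & 0 & -4 & -5 & 4 & -1 & 4 & -1 & -1 & -2 & -2 & -4 & 0\\
j & 2 & -3 & -3 & 3 & 3 & -4 & 0 & -1 & 3 & -1 & 3 & 0 & 3 & -2 & 1\\
k & -3 & 0 & 4 & -2 & 0 & 2 & 0 & 4 & 4 & -3 & -2 & -3 & 4 & 2 & 2
\end{array}$$

Then $W := \Gamma^* Y \Gamma =$

$$\begin{array}{c|rrrrr|rrrrr}
 & w_{11} & w_{12} & w_{13} & w_{14} & w_{15} & w_{21} & w_{22} & w_{23} & w_{24} & w_{25}\\ \hline
1 & 5.0000 & 1.3205 & -2.9935 & 1.7707 & 2.3056 & 10.0000 & 0.0833 & -0.8560 & -3.3842 & -0.4310\\
i & 0 & 2.7692 & 1.6199 & -5.2047 & 1.0482 & 2.0000 & 3.5534 & 4.3072 & 4.6697 & 1.8606\\
j & -4.0000 & 6.5897 & -0.8168 & 0.7781 & 3.7071 & -6.0000 & 1.9352 & -3.9121 & -1.1416 & -3.9023\\
k & -4.0000 & 1.5513 & 5.8488 & 0.7602 & 0.8553 & 4.0000 & -0.1335 & 4.6311 & -0.2449 & 0.6140
\end{array}$$

$$\begin{array}{c|rrrrr|rrrrr|rrrrr}
 & w_{31} & w_{32} & w_{33} & w_{34} & w_{35} & w_{41} & w_{42} & w_{43} & w_{44} & w_{45} & w_{51} & w_{52} & w_{53} & w_{54} & w_{55}\\ \hline
1 & 0 & -4.6098 & -4.4234 & -0.9866 & -1.7903 & 0 & 0 & -2.4908 & -2.4512 & 1.5053 & 0 & 0 & 0 & -2.0766 & 0.7912\\
i & 0 & -3.7619 & 4.5482 & -2.9750 & -2.0253 & 0 & 0 & -2.2011 & 2.3583 & -2.0050 & 0 & 0 & 0 & 2.9598 & 0.6758\\
j & 0 & -3.4546 & -2.0860 & 0.5185 & 3.4334 & 0 & 0 & 0.7713 & 3.6406 & -2.9308 & 0 & 0 & 0 & -4.8980 & 4.6884\\
k & 0 & -6.2755 & 1.5631 & -0.8724 & -2.2988 & 0 & 0 & 6.8124 & 1.1480 & -1.3262 & 0 & 0 & 0 & -5.2663 & -0.0303
\end{array}$$

Thus, the structure of $W$ is

$$\begin{pmatrix}
* & * & * & * & *\\
* & * & * & * & *\\
0 & * & * & * & *\\
0 & 0 & * & * & *\\
0 & 0 & 0 & * & *
\end{pmatrix},$$
just an upper Hessenberg matrix.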

6 Conclusions
We have developed an algorithm for reducing a quaternion-valued matrix to
a similar upper Hessenberg matrix by making use of quaternion-valued Givens'
transformation matrices. The algorithm was written and tested in MATLAB.
It is a generalization of the algorithm given in [3]. The number of real operations
necessary to perform this reduction is still too large. In Section 4, we have suggested
how to reduce the number of operations for each column.
We are convinced that a good way to reduce the number of real operations
is to try to apply the Fast Givens' algorithm, which is used in the real case.
A complex version of the Fast Givens' algorithm can be found in [5] (written
in Chinese). This is the only paper dealing with Fast Givens' transformations
applied to complex vectors that we were able to find.

Acknowledgement
The authors acknowledge with pleasure the support of the Grant Agency of
the Czech Republic (grant # 201/02/0844, grant # 201/02/0595) and support
of the project MSM 223400007 of the Czech Ministry of Education.

References
1. Dongarra, J.J., Gabriel, J.R., Koelling, D.D., Wilkinson, J. H. (1984): Solving
the Secular Equation Including Spin Orbit Coupling for Systems with Inversion
and Time Reversal Symmetry. J. Comput. Phys., 54, 278-288
2. Dongarra, J.J., Gabriel, J.R., Koelling, D.D., Wilkinson, J.H. (1984): The
Eigenvalue Problem for Hermitian Matrices with Time Reversal Symmetry.
Linear Algebra Appl., 60, 27-42
3. Janovská, D., Opfer, G. (2003): Givens' transformation applied to quaternion-valued
vectors. BIT
4. Van der Waerden, B.L. (1960): Algebra I, 5. Aufl. Springer, Berlin, Göttingen,
Heidelberg
5. Xu, L. (1988): Eine schnelle Givens Transformation für komplexe Matrizen (in
Chinese). Journal of East China Normal University (Natural Science), 3, 15-21
6. Zhang, F. (1997): Quaternions and matrices of quaternions. Linear Algebra
Appl., 251, 21-57
Model of Compressible Flow and Transport
in a Time-Dependent Domain*

Pavel Jiránek, Jiří Maryška and Jan Šembera

Technical University of Liberec, Hálkova 6, 461 17 Liberec, Czech Republic
jan. [email protected]

Summary. This contribution introduces a model of compressible flow,
transport of mass and energy, and production of energy in a time-dependent domain,
which is being developed at the Technical University of Liberec. The main focus
is on the finite volume model of advection-diffusion mass transport. The
explicit upwind scheme is used; therefore the computational time step is restricted by
the stability condition and the phenomenon of numerical diffusion may be significant.
A scheme for the reduction of numerical diffusion is proposed and numerical tests are
presented.

1 Introduction

The final aim of our modelling is to predict production of nitrogen oxides from
an internal combustion engine. For this purpose, a precise chemical reaction
model should be developed. Its main inputs would be flow, temperature, and
pressure fields and their development in time. This contribution is devoted to
a model of relevant physical processes, which is being developed at Technical
University of Liberec and whose outputs would form such inputs to the precise
chemical reaction model. It is being built as a model of compressible flow,
transport of mass and energy, and production of energy in a time-dependent
domain.
A short overview of the model is given in the next section. The rest of the
paper discusses in more detail the finite volume mass transport model,
a method for the reduction of numerical diffusion, and its testing.

2 Overview of the model

The volume and shape of the engine cylinder change in time. To simplify the
problem, we discretize it in time and in each time step we split the solution into
two stages, an isochoric and an adiabatic one. All modelled processes are computed
in the isochoric stage, which is assumed to take place in a fixed domain:
- production of mass and energy by chemical reactions,
- compressible flow of gas mixture,
- mass and energy transport.
All of them are discretized in time and space and solved by either the finite volume
or the finite element method. The spatial computational mesh, built up of trilateral
prismatic elements/volumes in layers, is common to all models (see Fig. 1).
The adiabatic stage models an immediate change of volume. Its key procedure
is the change of the computational mesh. A more detailed description of the
setting of the model can be found in [5].

* This work was supported by a grant from the Ministry of Education
of the Czech Republic, project code 242200001.

Fig. 1. A schematic example of computational mesh in four time steps

2.1 Model of chemical processes

All chemical processes are described by a set of stoichiometric equations, each of
which can be written as

where $a_i$ and $b_j$ are stoichiometric coefficients of reagents $A_i$ and products $B_j$,
respectively. During this reaction, the heat $q$ is produced. It can be positive
or negative depending on the type of reaction. Local kinetic equations
for the computation of mass and energy production can be expressed by linear
ordinary differential equations

$$\frac{dc_i}{dt} = -R\, m_i\, a_i, \qquad \frac{dT}{dt} = R\,\frac{q}{C_v},$$
where $m_i$ is the molar mass of reagent $A_i$ and $C_v$ is the heat capacity of the
gas mixture. The simplifying assumption that the
reaction rate $R$ depends only on the mass fraction of a chosen gas component,
together with the results of its calibration, is presented in [4]. One calibration result is
illustrated in Fig. 2.
Fig. 2. Results of calibration of the simplified reaction model. Bold lines are computational
results, thin lines are measured data. (Horizontal axes: rotation of crankshaft [deg].)

2.2 Fluid flow model

The model of flow of the compressible gas mixture is governed by the set of Navier-Stokes
equations, the continuity equation, and the state equation of a perfect
gas:
$$\frac{\partial v}{\partial t} + (v \cdot \nabla) v = \nu \Delta v + \tilde{\nu}\, \nabla(\nabla \cdot v) - \frac{1}{\varrho} \nabla p,$$
$$\frac{\partial \varrho}{\partial t} + \nabla \cdot (\varrho v) = 0,$$
$$p = R T \varrho.$$
Here, $v$ is the velocity vector, $p$ the pressure of the gas, $\varrho$ its density, $\nu$ and $\tilde{\nu}$ are
viscosity coefficients, $R$ is the molar gas constant, and $T$ is the temperature.
The nonlinear system is discretized by the mixed hybrid finite element method and
linearized. The formulation of the model is given in [5]; global behaviour tests were
successfully performed and further testing is in progress.

2.3 Model of mass and energy transport

Let $\Omega \subset \mathbb{R}^3$ represent the domain of the interior of the engine cylinder (since
all processes are solved as isochoric processes in the fixed domain, we also omit
the time dependence of $\Omega$ in our notation). The boundary of $\Omega$ is divided into
two disjoint parts $\Gamma_{in}$ and $\Gamma_{ex}$. The part $\Gamma_{in}$ represents the inlet part of the
boundary, the remaining part is denoted by $\Gamma_{ex}$. The problem is solved in the
time interval $(0, \bar{t})$.
The mass transport is governed by the set of mass balance equations for
each component of the gas mixture [1]:

$$\frac{\partial(\varrho c_i)}{\partial t} + \nabla \cdot (f c_i + j_i) + c_i \gamma^{-} = c_{i,*}\, \gamma^{+} \quad \text{in } \Omega, \quad i = 1, \dots, N \tag{1}$$

in a given flow field $f(x,t)$ for unknown functions $\varrho(x,t)$, $c_i(x,t)$ ($i = 1, \dots, N$).
$N$ is the number of components of the gas mixture, $\gamma^{+}$ denotes the density of
sources (with defined mass fractions $c_{i,*}$) and $\gamma^{-}$ is the (positive) density of
sinks. The diffusion flux $j_i$ is given using the effective diffusivity $D_i$ by the
Fick's-law-like relation $j_i = -\varrho D_i \nabla c_i$, slightly corrected as proposed by Sutton
and Gnoffo in [6].

The energy transport is derived from the balance of the internal energy,
which can be written with respect to the temperature as [1]

Here, T* is the temperature of inflowing fluid. The terms n rev and nirr express
reversible and irreversible rates of conversion between mechanical and internal
energy and can be written as

n rev = - ~: (\7. v), nirr = ~ ~ [avj + aVi _ ~(\7 .


~
Cv i,j=l ax· axJ 3
v)] aVj,
ax
t ,

where $\mu$ is the viscosity coefficient. In these formulas, ideal gas behaviour
and a Newtonian fluid are assumed.
The boundary conditions are of Dirichlet type at the inlet part of the boundary:

$$\left.\begin{aligned}
c_i(x,t) &= c_{i,D}(x,t), \quad i = 1, \dots, N,\\
\varrho(x,t) &= \varrho_D(x,t),\\
T(x,t) &= T_D(x,t),
\end{aligned}\right\} \quad x \in \Gamma_{in},\ t \in (0, \bar{t}), \tag{2}$$

and of Neumann type at the remaining part of the boundary:

$$\left.\begin{aligned}
\nabla c_i(x,t) \cdot n(x) &= 0, \quad i = 1, \dots, N,\\
\nabla T(x,t) \cdot n(x) &= 0,
\end{aligned}\right\} \quad x \in \Gamma_{ex},\ t \in (0, \bar{t}), \tag{3}$$

where $n(x)$ is the outward normal to $\Gamma_{ex}$.

The initial conditions are set as

$$\varrho(x,0) = \varrho_0(x), \qquad c_i(x,0) = c_{0,i}(x), \quad i = 1, \dots, N, \qquad T(x,0) = T_0(x). \tag{4}$$

3 Numerical model of mass transport

In this section, we focus on the numerical model of mass transport. The energy
transport model is very similar.
Before turning to the numerical scheme, a note on the space decomposition.
The model uses meshes consisting of triangular prisms with parallel bases. The
base of the cylinder is decomposed into a set of triangles, and this triangulation
is then extruded along the z axis up to the height of the cylinder. The 2-D mesh
must satisfy certain conditions required for consistency of the numerical scheme;
they can be found, e.g., in [2] as the definition of an admissible mesh.
Using the notation from [2], let us introduce the set of control volumes $\mathcal{T}$,
the set of their faces $\mathcal{E}$ and the set of points $\mathcal{P}$ such that each point $x_K \in \mathcal{P}$ can
be uniquely assigned to one control volume $K \in \mathcal{T}$. Suppose that the mesh
$\{\mathcal{T}, \mathcal{E}, \mathcal{P}\}$ approximates the domain $\Omega$ and is admissible. Let $m(K)$ denote the
volume of $K \in \mathcal{T}$, $m(\sigma)$ the area of $\sigma \in \mathcal{E}$, $N(K)$ the set of control volumes
adjacent to $K \in \mathcal{T}$, and $\mathcal{E}_K$ the set of faces of the volume $K$. For adjacent control
volumes $K$ and $L$ let $K|L \in \mathcal{E}_K$ be their common face. Let $d_{K,\sigma}$ be the distance
between the point $x_K$ and the face $\sigma$, and let $d_{K|L}$ denote the distance between
$x_K$ and $x_L$.

The time discretization is realized by an ascending sequence of time values
$(t_n)_{n\in\mathbb{N}_0}$, $t_0 = 0$, with the time step $\Delta t_n = t_{n+1} - t_n$.
The explicit finite volume scheme for the problem (1)-(4) can be written as
follows:

+ L.-t
'"' pni,K,a + '"' inK,a cni,a,+
~
(5)
a-EEK a-EEK
= m(Khxc?'K,+, i = 1, ... , N, K E T,

where $\varrho^n_{i,K} = \varrho^n_K c^n_{i,K}$ is the partial density of the $i$-th species in the finite volume $K$.
In this notation, e.g., $\varrho^n_K$ approximates $\varrho(x_K, t_n)$. Mass fractions
in the advective and source terms are expressed by the upwind scheme
as

if i'K,a- :::: 0,
if i'K a- < 0, if IX :::: 0, (6)
cr ~ KIL ~ u[l, if IX < 0.
if i'K,a- < 0, cr C lin,

The diffusion term is approximated by

for cr = KIL ~ u[l,

for cr C lin, (7)


for cr c rex.

The final decomposition into the density and mass fractions is computed as follows:

$$\varrho_K^{n+1} = \sum_{i=1}^{N} \varrho_{i,K}^{n+1}.$$
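The decomposition step can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the array layout (species by control volume) and the function name `decompose` are assumptions, and the mass fractions are recovered from the relation $\varrho_{i,K} = \varrho_K c_{i,K}$ stated above.

```python
import numpy as np

def decompose(partial_densities):
    """Split partial densities rho_{i,K} into total density and mass fractions.

    partial_densities: array of shape (N_species, N_volumes), an assumed layout.
    """
    rho = partial_densities.sum(axis=0)   # rho_K = sum_i rho_{i,K}
    c = partial_densities / rho           # c_{i,K} = rho_{i,K} / rho_K
    return rho, c

# Two species in two control volumes (made-up numbers).
rho_i = np.array([[0.2, 0.5],
                  [0.6, 0.5]])
rho, c = decompose(rho_i)
```

By construction the mass fractions in each control volume sum to one.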

The scheme is conservative. Its main disadvantage is the strong restriction on
the choice of the time step $\Delta t_n$. A simple analysis
shows that the condition

for all $K \in \mathcal{T}$ and $i = 1,\dots,N$ ensures stability of the scheme in an incompressible
stationary flow field. It leads to $h^2$-stability of the scheme ($h$ denotes
the largest diameter of the finite volumes of the mesh).
A further disadvantage of the scheme arises from the upwind approximation
of the advective term: numerical diffusion. The analysis of the 1-D
scheme in [3] shows that the 1-D upwind numerical solution of the advection-diffusion
equation with diffusivity $\mathcal{D}$ on an equidistant finite volume mesh
approximates the exact solution of the equation with diffusivity $\mathcal{D} + \mathcal{D}_{num}$, where

$$\mathcal{D}_{num} = \frac{1}{2}\,v h\left(1 - v\,\frac{\Delta t}{h}\right). \qquad (8)$$

Here $v$ is the velocity and $h$ is the length of the finite volumes.
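As a quick numerical illustration of (8), the following sketch evaluates the numerical diffusivity for given velocity, cell length and time step. The function name and the sample values are hypothetical.

```python
def numerical_diffusivity(v, h, dt):
    """Numerical diffusivity of the 1-D upwind scheme, formula (8)."""
    return 0.5 * v * h * (1.0 - v * dt / h)

# Sample values (hypothetical): v = 1, h = 1, dt = 0.5 gives a Courant number of 0.5.
d_num = numerical_diffusivity(1.0, 1.0, 0.5)
# When v * dt == h (Courant number 1) the numerical diffusion vanishes.
d_num_cfl1 = numerical_diffusivity(2.0, 1.0, 0.5)
```

The formula shows that the numerical diffusivity grows linearly with the cell length and disappears only when the Courant number $v\Delta t/h$ equals one.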


Our approach to eliminating numerical diffusion is simply to subtract
the estimated numerical diffusivity from all physical diffusion coefficients. The
numerical diffusivity is estimated separately on each inter-volume face of the mesh.
The subtraction cannot be performed arbitrarily, since the resulting coefficients
must remain non-negative; i.e., in the scheme the coefficient

$$\mathcal{D}_{red} = \max\{\mathcal{D} - \mathcal{D}_{num},\, 0\} \qquad (9)$$

is applied instead of $\mathcal{D}$.
Additionally, the estimate (8) of $\mathcal{D}_{num}$ has to be extended to the
multidimensional case. This is done by using the same formula (8) and estimating
the 1-D parameters $v$ and $h$ in the following way: on each face $\sigma = K|L$, $v$ is
set equal to the magnitude of the dominant velocity in a small neighbourhood,
and $h$ is estimated as the length of the projection of the vector $(x_L - x_K)$ onto
the direction of the dominant velocity.
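The face-wise reduction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the dominant velocity near the face is taken as a given vector `v_vec` (how it is determined in the actual model is not specified here), and the function name is hypothetical.

```python
import numpy as np

def reduced_diffusivity(D, v_vec, x_K, x_L, dt):
    """Reduced diffusivity (9) on the face K|L, using (8) with projected h."""
    v = np.linalg.norm(v_vec)
    if v == 0.0:
        return D                                  # no advection, no numerical diffusion
    direction = v_vec / v
    h = abs(np.dot(x_L - x_K, direction))         # projection of x_L - x_K onto the flow
    D_num = 0.5 * v * (h - v * dt)                # same as 0.5*v*h*(1 - v*dt/h)
    return max(D - D_num, 0.0)                    # clipping keeps the result non-negative

x_K = np.array([0.0, 0.0])
x_L = np.array([0.5, 0.5])
v_vec = np.array([1.0, 0.0])
D_red_diffusive = reduced_diffusivity(0.4, v_vec, x_K, x_L, dt=0.1)
D_red_advective = reduced_diffusivity(0.1, v_vec, x_K, x_L, dt=0.1)
```

In the second call the estimated numerical diffusivity (0.2) exceeds the physical one (0.1), so the coefficient is clipped to zero; this is the situation in which the method can no longer compensate the numerical diffusion fully.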

4 Tests of numerical diffusion

4.1 1-D test

Our 1-D test problem of advection-diffusion transport of coloured fluid is

$$\frac{\partial c}{\partial t}(x,t) + v\,\frac{\partial c}{\partial x}(x,t) - \mathcal{D}\,\frac{\partial^2 c}{\partial x^2}(x,t) = 0, \quad (x,t)\in(-\infty,+\infty)\times(0,\bar t), \qquad (10)$$
$$c(x,0) = M\,\delta(x - x_0), \qquad (11)$$

where $\delta(x)$ denotes the Dirac function. The analytic solution of the problem
is

$$c_{an}(x,t) = \frac{M}{2\sqrt{\pi\mathcal{D}t}}\exp\left[-\frac{(x - x_0 - vt)^2}{4\mathcal{D}t}\right], \qquad (12)$$

where $M$ is the initial mass of coloured fluid in the domain.
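The analytic solution (12) is easy to evaluate and to sanity-check numerically. The sketch below is an illustration with made-up parameter values: it verifies that the total mass stays equal to $M$ and that the peak travels with velocity $v$.

```python
import numpy as np

def c_analytic(x, t, M, D, v, x0):
    """Analytic solution (12) of the 1-D advection-diffusion problem (10)-(11)."""
    return M / (2.0 * np.sqrt(np.pi * D * t)) * np.exp(-(x - x0 - v * t) ** 2 / (4.0 * D * t))

# Hypothetical parameters for the check.
x = np.linspace(-20.0, 40.0, 6001)
c = c_analytic(x, t=2.0, M=1.0, D=0.2, v=1.0, x0=5.0)
dx = x[1] - x[0]
mass = c.sum() * dx                 # rectangle rule; the tails are negligible here
peak_position = x[np.argmax(c)]     # expected at x0 + v*t = 7
```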

Fig. 3. 1-D test mesh

The solution of the problem (10)-(11) is compared with the numerical solution
of its 2-D finite-domain approximation:

$$\frac{\partial c}{\partial t}(x,y,t) + v\cdot\nabla c(x,y,t) - \mathcal{D}\,\Delta c(x,y,t) = 0, \quad (x,y,t)\in(0,d)\times(0,w)\times(0,\bar t), \qquad (13)$$

$$\begin{aligned}
\frac{\partial c}{\partial y}(x,0,t) = \frac{\partial c}{\partial y}(x,w,t) &= 0, && x\in(0,d),\ t\in(0,\bar t),\\
c(0,y,t) = \frac{\partial c}{\partial x}(d,y,t) &= 0, && y\in(0,w),\ t\in(0,\bar t),\\
c(x,y,0) &= \begin{cases} 1, & (x,y)\in K_0,\\ 0, & (x,y)\notin K_0, \end{cases}
\end{aligned} \qquad (14)$$

where $K_0$ is the triangle containing the point $(x_0, w/2)$. The problems are
comparable if the parameter $M$ in (11) equals the area of the finite volume $K_0$ (see
Fig. 3).
The tests were performed with the following parameters: $d = 20$, $x_0 =
d/4 = 5$, $h = d/N$ (where $N$ is the number of control volumes). The mesh
consisted of equilateral triangles, therefore $w = h\cos\frac{\pi}{6}$.
Table 1 summarizes the results of several computations. The error $E$ is
computed as the $L^1$ norm of the difference between the analytical
solution (12) and the numerical solution of (13)-(14): $E = \|c_{num}(x,y,t) -
c_{an}(x,t)\|_{L^1((0,d)\times(0,w))}$, where $c_{num}$ was computed numerically either without
[E(without)] or with [E(with)] the proposed reduction of numerical diffusion
(9).

4.2 2-D test

The other test problem is a 2-D extension of the previous one. It is
given as

$$\frac{\partial c}{\partial t}(x,t) + \nabla\cdot(c(x,t)\,v) - \mathcal{D}\,\nabla^2 c(x,t) = 0, \quad (x,t)\in\mathbb{R}^2\times(0,\bar t), \qquad (15)$$
$$c(x,0) = M\,\delta(|x - x_0|) \qquad (16)$$

with the analytical solution

$$c_{an}(x,t) = \frac{M}{4\pi\mathcal{D}t}\exp\left[-\frac{|x - x_0 - vt|^2}{4\mathcal{D}t}\right].$$

The solution is compared with the numerical solution of the finite-domain
approximation of the problem (15), (16), defined as follows. The governing equation (15)
is assumed to hold in the domain $\Omega$ shown in Fig. 4. The domain is decomposed into
a set of 800 equilateral triangles with side length 1, and $x_0 = (5, 5\sqrt{3}/2)$.
The boundary and initial conditions of the finite-domain
approximation are:

Table 1. Comparison of numerical results of 1-D tests for various input parameters
N (number of elements), v (velocity) and D (diffusivity) performed without and with
the proposed reduction of numerical diffusion

  N    v    D     E(without)    E(with)       E(without)/E(with)
  20   1    0.1   8.49·10^-3    7.61·10^-4    11.15
  20   1    0.2   6.54·10^-3    5.86·10^-4    11.15
  20   1    0.4   4.40·10^-3    3.58·10^-4    12.28
  40   1    0.1   3.21·10^-3    2.06·10^-4    15.58
  40   1    0.2   2.23·10^-3    1.33·10^-4    16.78
  40   1    0.4   1.35·10^-3    7.24·10^-5    18.68
  80   1    0.1   1.11·10^-3    4.83·10^-5    22.96
  80   1    0.2   6.97·10^-4    2.35·10^-5    29.61
  80   5    0.2   1.73·10^-3    8.98·10^-5    19.22
  80   5    0.4   1.26·10^-3    5.76·10^-5    21.87
  80   10   0.4   1.73·10^-3    8.98·10^-5    19.22

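$$\begin{aligned}
\nabla c(x,t)\cdot n(x) &= 0, && x\in\Gamma_{ex},\ t\in(0,\bar t),\\
c(x,t) &= 0, && x\in\Gamma_{in},\ t\in(0,\bar t),
\end{aligned}
\qquad
c(x,0) = \begin{cases} 1, & x\in K_0,\\ 0, & x\notin K_0, \end{cases} \qquad (17)$$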

where $K_0$ is the finite volume containing the point $x_0$, and the boundary is split into
the inlet part and the remaining part according to the direction of the velocity.

Fig. 4. 2-D test mesh

The solution is computed for three different homogeneous velocity fields
differing in the direction of the velocity (denoted $v_1$, $v_2$ and
$v_3$ in Fig. 4), the magnitude of the velocity and the diffusion coefficient. The error is computed
analogously to the 1-D case: $E = \|c_{num}(x,t) - c_{an}(x,t)\|_{L^1(\Omega)}$. The
results are collected in Tab. 2. The notation is the same as in the 1-D case.
Tab. 2 shows that for advection-dominated transport our
method of reducing numerical diffusion is not very efficient. In the case of the

Table 2. Numerical results of 2-D tests for various magnitudes and directions of
velocity v and diffusion coefficients D

  |v|   direction   D     E(without)    E(with)       E(without)/E(with)
  1     1 or 2      0.2   5.66·10^-5    7.81·10^-6    7.25
  1     3           0.2   7.83·10^-5    1.74·10^-5    4.50
  1     1 or 2      0.1   9.21·10^-5    1.59·10^-5    5.79
  1     3           0.1   1.26·10^-4    5.10·10^-5    2.47
  5     1 or 2      0.2   1.42·10^-4    6.97·10^-5    2.04 (34% red.)
  5     3           0.2   2.05·10^-4    1.65·10^-4    1.24 (63% red.)

last two tests, the numerical diffusion coefficients on 34% and 63% of the
mesh faces, respectively, were smaller than the physical diffusion coefficient and could be
fully subtracted; the remaining diffusion coefficients were set to 0 according to (9).

5 Conclusions

This contribution discussed mainly the advection-diffusion model of mass transport
of a gas mixture. The stability condition and a simple method
for the reduction of numerical diffusion were presented. Numerical results show
that the proposed method of reducing numerical diffusion works quite well
in the 1-D and 2-D cases. Since it is not efficient in advection-dominated transport
problems, we are looking for a more sophisticated and successful one. We therefore
plan to concentrate on the development of a more accurate numerical scheme
for this purpose.
Other parts of the model are also being developed. In particular, calibration
tests of the production of energy are continuing with more natural conditions of
ignition and reaction parameters, and local tests of the flow model are being
performed.

References
1. Bird, R.B., Stewart, W.E., Lightfoot, E.N. (2002): Transport phenomena. Wiley,
New York
2. Eymard, R., Gallouet, T., Herbin R. (1994): Finite volume methods. In: Ciar-
let P.G., Lions, J.L.(eds) Handbook of Numerical Analysis, Vol. VII, 713-1020.
North Holland, Amsterdam
3. Odman, M.T. (1997): A Quantitative Analysis of Numerical Diffusion Introduced
by Advection Algorithms in Air Quality Models. Atmospheric Environment, 31,
No. 13, 1933-1940

4. Sembera, J., Maryska, J. (2002): On the Local Model of Energy Production Inside
a Combustion Engine. In: Prihoda, J., Kozel, K. (eds) Proceedings of Seminar
Topical Problems of Fluid Mechanics 2002, 71-74. Institute of Thermomechanics
AS CR, Prague
5. Sembera, J., Maryska, J., Novak, J. (2003): FEM/FVM Modelling of Processes
in Combustion Engine. In: Chen, Z., Glowinski, R., Li, K. (eds) Current Trends
in Scientific Computing. Contemporary Mathematics, 329.
6. Sutton, K., Gnoffo, P.A. (1998): Multi-Component Diffusion with Application To
Computational Aerothermodynamics. 7th AIAA/ ASME Joint Thermophysics
and Heat Transfer Conference, Albuquerque, NM. AIAA 98-2575.
Numerical Study of Convection of
Multi-Component Fluid in Porous Medium

Olga Kantur and Vyacheslav Tsybulin

Rostov State University, Department of Mathematics and Mechanics, ul. Zorge 5,


Rostov-on-Don, 344090 Russia

Summary. The planar problem of convective filtration of an incompressible multi-component
fluid saturating a porous medium is studied. A combined spectral and finite-difference
approach is applied to compute nonstationary regimes and continuous
families of steady states with variable spectrum.

1 Introduction

Convective flows in a porous medium are the subject of many works [1]. For the
convective filtration of a viscous fluid in a porous medium (Darcy model), the
appearance of a one-parameter family of steady states was observed, with a
spectrum that varies along the family [2]. V. Yudovich showed that these
families cannot be the orbits of any symmetry group and developed the theory of
cosymmetry [3, 4]. Continuous families of steady states can be investigated
only experimentally or numerically. Computer modelling of Darcy
convection has been done by the spectral method [5], the finite-difference method [6], and
a combined spectral and finite-difference approach [7, 8]. It was found that preserving
the cosymmetry in the finite-dimensional approximation of the differential equations
is extremely important for the correct computation of continuous families of steady
states [6, 7]. In the present work we extend the approach of [7] to the
convection of a multi-component fluid obeying the Darcy law.

2 Darcy convection problem

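The equations of filtrational convection of a multi-component fluid [9] in
dimensionless form may be written as

$$\beta_r\theta^r_t = \kappa_r\Delta\theta^r + \lambda_r\psi_x + \beta_r J(\psi,\theta^r) \equiv F^r, \quad r = 1,\dots,S, \qquad (1)$$
$$0 = \Delta\psi - \sum_{r=1}^{S}\theta^r_x \equiv G, \qquad (2)$$
$$\theta^r = \varphi_r, \quad r = 1,\dots,S, \qquad \psi = 0 \quad \text{on } \partial\mathcal{D}. \qquad (3)$$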

Here, $x$ and $y$ are Cartesian coordinates in the plane, $t$ is time, $\Delta$ is the
Laplacian, $J(\psi,\theta) = \psi_x\theta_y - \psi_y\theta_x$ denotes the Jacobian operator, $\psi$ is the
stream function, $\theta^1$ is the temperature, and $\theta^r$ ($r = 2,\dots,S$) are concentrations.
The problem is characterized by the following parameters: $\kappa_r = \chi_r/\nu$, $\beta_r$,
$\lambda_r = g\eta_r A_r l^2 K/\nu^2$, $r = 1,\dots,S$, where $\lambda_1$ is the Rayleigh number and $\lambda_r$
($r = 2,\dots,S$) are the corresponding parameters for the concentrations. Here, for each $r$, $\eta_r$ is the
expansion coefficient, $A_r$ the gradient of the distribution of the species, $\chi_r$ the diffusivity
coefficient and $\beta_r$ the kinematic coefficient; $g$ is the acceleration due to gravity, $l$ the
height of the enclosure, $K$ the permeability and $\nu$ the viscosity. We consider the enclosure
$\mathcal{D} = [0,a]\times[0,b]$; the boundary values $\varphi_r$ are given and do not depend on time.
The cosymmetry of the underlying system is given by $(\psi, -\kappa_1\theta^1, \dots, -\kappa_S\theta^S)$. Indeed,
multiply (1) by $\psi$ and (2) by $-\kappa_r\theta^r$, sum and integrate over the domain $\mathcal{D}$. Then,
using integration by parts and Green's formula, we derive

$$\int_{\mathcal{D}}\left(\sum_{r=1}^{S} F^r\psi - G\sum_{r=1}^{S}\kappa_r\theta^r\right)dx\,dy = 0. \qquad (4)$$

3 Method of solution
Spectral and spectral-difference methods are powerful tools for solving
problems in mathematical physics [10]. We apply here the spectral-finite-difference
method in the form derived in [7]. The solution to the problem (1)-(3)
is sought in the form

$$\{\theta^1,\dots,\theta^S,\psi\} = \sum_{j=1}^{m}\{\theta^1_j(x,t),\dots,\theta^S_j(x,t),\psi_j(x,t)\}\sin\frac{\pi j y}{b}. \qquad (5)$$

Substituting (5) into (1)-(3) and performing projections, we obtain [7]:

$$\beta_r\dot\theta^r_j = \kappa_r\theta^{r\,\prime\prime}_j - c_j\kappa_r\theta^r_j + \lambda_r\psi'_j - \beta_r J^r_j \equiv F^r_j, \quad j = 1,\dots,m, \qquad (6)$$
$$0 = \psi''_j - c_j\psi_j - \sum_{r=1}^{S}\theta^{r\,\prime}_j \equiv G_j, \quad j = 1,\dots,m, \qquad (7)$$
$$\theta^r_j(t,0) = \theta^r_j(t,a) = \varphi_r, \qquad \psi_j(t,0) = \psi_j(t,a) = 0, \quad j = 1,\dots,m. \qquad (8)$$

Then we define the grid $\omega = \{x_k = kh,\ k = 0,\dots,n,\ h = a/(n+1)\}$ on
the segment $[0,a]$ and introduce the notation $\theta^r_{j,k} = \theta^r_j(x_k,t)$, $\psi_{j,k} = \psi_j(x_k,t)$,
$J^r_{j,k} = J^r_j(x_k,t)$. Using centered finite-difference operators of second-order
accuracy on a three-point stencil, we obtain a system of ordinary differential
equations

$$\beta_r\dot\theta^r_{j,k} = \kappa_r\frac{\theta^r_{j,k+1} - 2\theta^r_{j,k} + \theta^r_{j,k-1}}{h^2} - c_j\kappa_r\theta^r_{j,k} + \lambda_r\frac{\psi_{j,k+1} - \psi_{j,k-1}}{2h} - \beta_r J^r_{j,k} \equiv \phi^r_{1jk}, \qquad (9)$$

(10)
BT _ BT
+ J,k+l 2h J,k-l =_ 'f'2jk,
A-
r = 1, .. . ,S.

Henceforth, the dot denotes the derivative with respect to $t$, $c_j = j^2\pi^2/b^2$, and $J^r_{j,k}$ is
expressed as

Jjk = 2; (~~j'i'k + ~ (j,i,k) ,

~j,i,k = 2i; j [ds,k(B~+j' 'lj;i)-ds,k(B~, 'lj;i+j)]- ~[da,k(B~+j' 'lj;i)+da,k(B~, 'lj;i+j)],


(j,i,k = j ~ i [ds,k(BL 'lj;j-i) + da,k(B~, 'lj;j-i) - ds,k(Bj_i, 'lj;i) + da,k(Bj_i, 'lj;i)] ,
where the operators $d_{a,k}$ and $d_{s,k}$ are derived in [7] from the requirement that
a discrete version of (4) holds:

$$d_{a,k}(\theta,\psi) = \frac{\theta_{k+1} - \theta_{k-1}}{2h}\,\psi_k - \theta_k\,\frac{\psi_{k+1} - \psi_{k-1}}{2h},$$

$$d_{s,k}(\theta,\psi) = \frac{2\theta_{k+1}\psi_{k+1} + \psi_k(\theta_{k+1} - \theta_{k-1}) + \theta_k(\psi_{k+1} - \psi_{k-1}) - 2\theta_{k-1}\psi_{k-1}}{6h}.$$
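To see what the two difference operators approximate, they can be applied to smooth test functions. The sketch below is an illustration only (the test functions and grid are arbitrary choices): it checks numerically that $d_{a,k}$ approximates $\theta'\psi - \theta\psi'$ and $d_{s,k}$ approximates $(\theta\psi)'$ with second-order accuracy at interior nodes.

```python
import numpy as np

def d_a(theta, psi, h):
    """Antisymmetric operator d_{a,k} at interior nodes k = 1..n-1."""
    return ((theta[2:] - theta[:-2]) * psi[1:-1]
            - theta[1:-1] * (psi[2:] - psi[:-2])) / (2.0 * h)

def d_s(theta, psi, h):
    """Symmetric operator d_{s,k} at interior nodes k = 1..n-1."""
    return (2.0 * theta[2:] * psi[2:]
            + psi[1:-1] * (theta[2:] - theta[:-2])
            + theta[1:-1] * (psi[2:] - psi[:-2])
            - 2.0 * theta[:-2] * psi[:-2]) / (6.0 * h)

# Arbitrary smooth test functions on a uniform grid.
x = np.linspace(0.0, 1.0, 201)
h = x[1] - x[0]
theta, dtheta = np.sin(2 * x), 2 * np.cos(2 * x)
psi, dpsi = np.cos(3 * x), -3 * np.sin(3 * x)

err_a = np.max(np.abs(d_a(theta, psi, h) - (dtheta * psi - theta * dpsi)[1:-1]))
err_s = np.max(np.abs(d_s(theta, psi, h) - (dtheta * psi + theta * dpsi)[1:-1]))
```

Note that $d_{a,k}$ changes sign when its two arguments are swapped, while $d_{s,k}$ is symmetric in them.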
Eliminating the stream function $\psi_{j,k}$, we may rewrite the system (9)-(10) in
vector form

where

The matrix $A$ consists of $m$ tridiagonal submatrices, the nonzero entries of
the skew-symmetric matrix $B = \{b_{sr}\}_{s,r=1}^{nm}$ are given by $b_{s,s+1} = -b_{s+1,s} =
h/2$, $s = 1,\dots,nm-1$, and $L$ represents the nonlinear terms in (9).
The system (11) has a trivial solution with zero velocity and linear temperature
and concentration profiles. The integration is performed by the fourth-order
Runge-Kutta method, and the family is calculated using the algorithm of
[5, 7]. Starting from the vicinity of the unstable zero equilibrium, we integrate the
system (11) up to a point close to a stable equilibrium on the family. The
algorithm for computing the family may then be formulated as the following
sequence of steps. Correct the point using the modified Newton method.
Determine the kernel of the linearization matrix at the given point and predict the
next point on the family. Repeat these steps until a closed curve is obtained.
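The time integration mentioned above uses the classical fourth-order Runge-Kutta method. A generic sketch of one step is given below; the right-hand side used here is a simple test equation, not the system (11), and the function names are hypothetical.

```python
import numpy as np

def rk4_step(rhs, t, y, dt):
    """One step of the classical fourth-order Runge-Kutta method for y' = rhs(t, y)."""
    k1 = rhs(t, y)
    k2 = rhs(t + 0.5 * dt, y + 0.5 * dt * k1)
    k3 = rhs(t + 0.5 * dt, y + 0.5 * dt * k2)
    k4 = rhs(t + dt, y + dt * k3)
    return y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate(rhs, y0, t0, t1, n_steps):
    """Integrate from t0 to t1 with n_steps equal RK4 steps."""
    dt = (t1 - t0) / n_steps
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n_steps):
        y = rk4_step(rhs, t, y, dt)
        t += dt
    return y

# Test equation y' = -y with y(0) = 1, whose exact solution is exp(-t).
y_end = integrate(lambda t, y: -y, [1.0], 0.0, 1.0, 100)
```

For the family computation, such an integrator would only be used to reach the neighbourhood of a stable equilibrium; the Newton correction and kernel-based prediction steps are separate.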

4 Numerical Results

We applied the derived technique to calculate families consisting of both
stable and unstable equilibria. A narrow enclosure ($b/a \ge 1$, $a = 1$) is considered,
and the case $\beta_r = 1$, $\kappa_r = 1$ is analyzed. We found that the scenario of the
appearance of a family of steady states for a multi-component fluid
with co-directed temperature and concentration gradients is similar to the case
of a one-component fluid [11]. First, a family consisting of stable steady states

Fig. 1. Evolution of the family of stationary regimes in the ($Nu_h$, $Nu_v$) plane; S = 2, b = 2, a = 1. Left ($\lambda_2 = -5$):
$\lambda_1$ = 72 (curve 1), 100 (2), 120 (3), 138.1 (4). Right ($\lambda_2 = 5$): $\lambda_1$ = 40 (curve 1), 60
(2), 80 (3), 95.5 (4)

branched off from the state of rest (zero equilibrium) as a result of monotonic
instability. As the Rayleigh parameter $\lambda_1$ increases, the family deforms, and
arcs of unstable equilibria then appear on it. Fig. 1 shows the families for
a two-component fluid and the container with $b = 2$ as
projections onto the plane of $Nu_h$ and $Nu_v$:

At a critical value of $\lambda_1$, two unstable points appear on the family. For
instance, for $\lambda_2 = -5$ we found this instability at $\lambda_1 = 138.1$ (circles A and
E in Fig. 1, left), and for $\lambda_2 = 5$ at $\lambda_1 = 95.5$ (asterisk
in Fig. 1, right). Figs. 2 and 3 present the streamlines, isotherms
and concentration isolines corresponding to the letters in Fig. 1.
For a large negative concentration gradient and small diffusion coefficients,
the state of rest loses stability in an oscillatory manner. Here we have observed
a new scenario of the development of convective regimes. Convective transitions are

Fig. 2. Stream functions (I), temperature (II), concentration (III) for the regimes a-e,
S = 2, b = 2, a = 1, $\lambda_1 = 72$, $\lambda_2 = -5$

organized through a nonstationary regime that branches off from the zero equilibrium (state
of rest), and two families of stationary solutions are born 'out of thin
air'. This scenario is realized for both two-component and three-component
fluids. Fig. 4 shows the development of convective regimes for a
two-component fluid with the parameters $\lambda_2 = -10$, $\kappa_2 = 0.3$
and $b = 2$.
The state of rest is stable up to $\lambda_1 \approx 77$, and simultaneously there exist
two families of steady states originating from an 'out of thin air' bifurcation [12].
Each of these families consists of stable and unstable arcs (here we mean
transversal stability or instability with respect to the family). The state of rest loses
its stability via a Poincaré-Andronov-Hopf bifurcation (oscillatory instability),
and a stable limit cycle is formed (curve 3 in Fig. 4). As $\lambda_1$ increases, both
families become more complicated and at $\lambda_1 \approx 79.4$ they collide with one another.
Two new families then appear: a wholly unstable one (curve 4) and a partially
stable one (curve 5). A further increase of $\lambda_1$ leads to the reduction of the unstable
arcs and their disappearance. After that, on the interval $80 < \lambda_1 < 156.5$
the given family is wholly stable. On a small interval of $\lambda_1$ the limit cycle undergoes

Fig. 3. Stream functions (I), temperature (II), concentration (III) for the regimes A-E,
S = 2, b = 2, a = 1, $\lambda_1 = 138.1$, $\lambda_2 = -5$

a sequence of bifurcations leading to a chaotic regime. The latter collides with
the unstable family (curve 4) and disappears. This family shrinks and vanishes
at $\lambda_1 = 85$ as a result of a collision with the state of rest. This corresponds to the
transition of two eigenvalues from the left half-plane to the right one along the real axis.
We then observe the growth of the main family and the appearance of four arcs
of unstable equilibria as a result of oscillatory instability at $\lambda_1 = 156.6$.

5 Conclusion
The combined spectral and finite-difference approach makes it possible
to compute continuous families of steady states in the planar problem of
convective filtration of an incompressible multi-component fluid saturating a porous
medium. We study the scenario of transformation of convective regimes and,
in particular, the onset of instability on the family of steady states with variable
spectrum.

Acknowledgements

This work was partially supported by the Program "Russian Universities -


Fundamental Studies" (# UR.04.01.063) and Russian Foundation for Basic
Research (# 02-01-00337).

References

1. Nield, D.A., Bejan, A. (1999): Convection in Porous Media. Springer, New York
2. Lyubimov, D. V. (1975): On the convective flows in the porous medium heated
from below. J. Appl. Mech. Techn. Phys., 16, 257-261.
3. Yudovich, V. I. (1991): Cosymmetry, degeneration of solutions of operator equa-
tions, and the onset of filtration convection. Mat. Zametki, 49, 142-148
4. Yudovich, V. I. (1995): Secondary cycle of equilibria in a system with cosymmetry,
its creation by bifurcation and impossibility of symmetric treatment of it. Chaos,
5, 402-441
5. Govorukhin V. N. (1998): Numerical simulation of the loss of stability for
secondary steady regimes in the Darcy plane-convection problem. Doklady Akademii
Nauk, 363, 806-808
6. Karasozen B., Tsybulin V.G. (1999): Finite-difference approximation and cosym-
metry conservation in filtration convection problem. Physics Letters A, 262, 321-
329
7. Kantur O. Yu., Tsybulin V.G. (2002): Spectral-difference method to the compu-
tation of convective motion of the fluid in porous medium and preservation of
cosymmetry. J. Comput. Mathematics and Math. Physics, 42, 878-888
8. Kantur O. Yu., Tsybulin V.G. (2002): Filtration-convection problem: spectral-
difference method and preservation of cosymmetry. ICCS, LNCS 2330, Springer-
Verlag Berlin Heidelberg, 432-441
9. Yudovich V. I. (2001): Cosymmetry and convection of multi-component fluid in
porous medium. Izvestiya Vuzov, SKNTs Vsh., Special Issue, Mat. model., 174-
178
10. Canuto C., Hussaini M.Y., Quarteroni A., Zang T.A. (1988): Spectral methods
in fluid dynamics. Springer-Verlag
11. Kantur O. Yu., Tsybulin V.G. (2003): Computation of steady states family of
filtration convection in tall enclosure. Applied. Mechanics and Technical Physics,
44,228-235
12. Kurakin L.G., Yudovich V.I. (2000): Bifurcations accompanying monotonic
instability of an equilibrium of a cosymmetric dynamical system. Chaos, 10, 311-
330

Fig. 4. The convective flows scenario in the ($Nu_h$, $Nu_v$) plane for $\lambda_1$ = 76.5, 77.5, 79, 79.5, 80 and 83
(six panels): $\lambda_2 = -10$, S = 2, b = 2, $\kappa_2 = 0.3$; cross:
state of rest, curves 1 and 2: families with stable and unstable equilibria, curve 3:
limit cycle, dashed line (curve 4): unstable family, curve 5: joint family with stable
states (solid line) and unstable ones (stars)
Multi-yield Elastoplastic Continuum-Modeling
and Computations

Johanna Kienesberger, Jan Valdman

Special Research Program SFB F013


'Numerical and Symbolic Scientific Computing'
Johannes Kepler University Linz
{johanna.kienesberger, jan.valdman}@sfb013.uni-linz.ac.at

Summary. The quasi-static evolution of an elastoplastic body with a multi-surface
constitutive law of linear kinematic hardening type allows the modeling of curved
stress-strain relations. It generalises classical small-strain elastoplasticity from one
to several plastic phases. First, we briefly recall a mathematical model represented
by an initial-boundary value problem in the form of a variational inequality. The
main concern of this paper is then an efficient numerical implementation of
the one-time-step problem. Based on the minimisation problem, we describe an iterative
non-linear algorithm whose linear subsystems are solved by a geometric multigrid
method. Finally, numerical computations in 2D and 3D are presented.

1 Introduction

In this paper we consider the quasi-static initial-boundary value problem for
small-strain elastoplasticity with a multi-surface constitutive law. We treat
a Prandtl-Ishlinskii model of play type, which goes back in the 1D
case to PRANDTL [Pra28] and ISHLINSKII [Ish54] and in the multidimensional
case to BESSELING [Bes58] and IWAN [Iwa66]. The model extends the classical
linear kinematic hardening model (single-yield model), which goes back to
MELAN [Mel38] and PRAGER [Pra49], in the sense that it operates with several
plastic strains (multi-yield model). Hysteresis properties have been studied
intensively by VISINTIN [Vis94] and KREJČÍ [Kre96], amongst others. Our functional
formulation of the model and its analysis are based on a direct extension of the
work of HAN AND REDDY [HR99] for the linear kinematic hardening model in
terms of a time-dependent variational inequality. Our numerical approximation
of the one-time-step problem uses the formulation of ALBERTY, CARSTENSEN,
AND ZARRABI [ACZ99], extended to a two-yield model, where the unknowns,
i.e., the displacement and two plastic strains, are sought as minimisers
of a convex but non-smooth functional. In our approach we regularise this
functional, so that standard methods can be applied to the quadratic optimisation
problem. The main idea of the algorithm is the use of the Schur complement
form of the discretised problem in the displacements. The arising linear system
is solved by a multigrid-preconditioned conjugate gradient solver.

Fig. 1. Prandtl-Ishlinskii model of play type (left) and its $\sigma$-$\varepsilon$ hysteresis type
behaviour for a periodical stress $\sigma(t) = A\sin(t)$, $t \in (0, 2\pi)$ (right)

The paper is organised as follows: In Section 2, the local material model


is presented, which is the basis for the boundary value problem in Section 3.
The numerical algorithm is designed in Section 4, the numerical experiments
are presented in Section 5. Finally, an outlook on the work still to do is given.

2 The Local Material Model


The constitutive law furnishes the relationship between the stress tensor $\sigma$ and
the strain tensor $\varepsilon$. The model discussed here is the Prandtl-Ishlinskii model
of play type described by VISINTIN [Vis94] and KREJČÍ [Kre96], among others.
It contains finitely many surfaces, and its rheological structure and typical
hysteresis behaviour are depicted in Figure 1. It is local in the sense that for
any given material point $x$ it involves only the time histories $\sigma = \sigma(t)$ and
$\varepsilon = \varepsilon(t)$ at that point. It is given by the following system of equations and an
evolution variational inequality:

$$\varepsilon = e + p, \qquad p = \sum_{r\in I} p_r, \qquad \sigma = \sigma_r^e + \sigma_r^p, \quad r\in I, \qquad (1)$$
$$\sigma = \mathbb{C}e, \qquad (2)$$
$$\sigma_r^e = \mathbb{H}_r p_r, \quad r\in I, \qquad (3)$$
$$\sigma_r^p \in Z_r, \qquad \dot p_r : (\tau_r - \sigma_r^p) \le 0 \quad \text{for all } \tau_r\in Z_r,\ r\in I. \qquad (4)$$

Equation (1) represents the additive decomposition of the strain $\varepsilon$ into its
elastic part $e$ and its plastic part $p$, as well as of the stress $\sigma$ into the backstresses
$\sigma_r^e$ and the plastic stresses $\sigma_r^p$, where $r \in I = \{1,\dots,M\}$. The plastic strain $p$ is
additively decomposed into internal plastic strains $p_r$. Equation (2) is
a linear elastic law; in the isotropic case one has

$$\mathbb{C}\varepsilon = 2\mu\varepsilon + \lambda(\operatorname{tr}\varepsilon)\mathbb{I}, \qquad (5)$$

where the (positive) coefficients $\mu$ and $\lambda$ are called Lamé coefficients. Here $\mathbb{I}$
denotes the second-order identity tensor (an identity matrix) and $\operatorname{tr} : \mathbb{R}^{d\times d} \to
\mathbb{R}$ denotes the trace of a matrix, $\operatorname{tr}\varepsilon := \sum_{j=1}^{d}\varepsilon_{jj}$ for $\varepsilon\in\mathbb{R}^{d\times d}$, where $d$ is the
problem dimension. Equation (3) couples the backstresses $\sigma_r^e$ and the plastic
strains $p_r$ through linear mappings with positive definite hardening matrices
$\mathbb{H}_r$, $r\in I$. A typical choice is $\mathbb{H}_r = h_r\mathbb{I}$, where $h_r > 0$, $r\in I$, are hardening
coefficients. The variational inequality (4) formalises the Prandtl-Reuß normality
law, also called the principle of maximal dissipation. The sets $Z_r \subset \mathbb{R}^{d\times d}_{sym}$, $r\in I$,
describe the admissible (plastic) stresses; their boundaries $\partial Z_r$ are called the
yield surfaces. We will exclusively use the standard von Mises cylinder with
yield stress $\sigma_r^y$,

$$Z_r = \{\sigma\in\mathbb{R}^{d\times d}_{sym} : \|\operatorname{dev}\sigma\| \le \sigma_r^y\}. \qquad (6)$$

Here, $\|a\|^2 = a : a$ and $a : b = \sum_{i,j=1}^{d} a_{ij}b_{ij}$ define the (Frobenius) norm
and the corresponding scalar product, and the deviator of $\sigma$ is defined as
$\operatorname{dev}\sigma := \sigma - \frac{1}{d}(\operatorname{tr}\sigma)\mathbb{I}$. Since this model involves several (namely $M$) yield
stresses $\sigma_r^y$, we classify it as a multi-yield model, or as an $M$-yield model
in order to express the number of yield stresses. If $M = 1$, we speak of
a single-yield model, which is the classical linear kinematic hardening
model.
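The isotropic law (5), the deviator and the von Mises admissibility test (6) translate directly into a few lines of NumPy. The following is a minimal sketch for illustration; the function names and the sample stress values are hypothetical.

```python
import numpy as np

def isotropic_stress(eps, mu, lam):
    """Isotropic linear elastic law (5): C eps = 2 mu eps + lam tr(eps) I."""
    d = eps.shape[0]
    return 2.0 * mu * eps + lam * np.trace(eps) * np.eye(d)

def dev(sigma):
    """Deviator: sigma - (1/d) tr(sigma) I."""
    d = sigma.shape[0]
    return sigma - np.trace(sigma) / d * np.eye(d)

def frobenius_norm(a):
    """Norm induced by the scalar product a : b = sum_ij a_ij b_ij."""
    return np.sqrt(np.sum(a * a))

def is_admissible(sigma, sigma_y):
    """Membership test for the von Mises set (6)."""
    return frobenius_norm(dev(sigma)) <= sigma_y

hydrostatic = 5.0 * np.eye(2)                     # pure pressure: zero deviator
shear = np.array([[0.0, 1.0], [1.0, 0.0]])        # pure shear: norm sqrt(2)
```

A hydrostatic stress is admissible for every non-negative yield stress, since only the deviatoric part enters (6); the pure shear state above is admissible only if the yield stress is at least $\sqrt{2}$.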

3 The Boundary Value Problem


The elastoplastic continuum is assumed to occupy a bounded domain $\Omega\subset\mathbb{R}^d$
with a Lipschitz boundary $\Gamma = \partial\Omega$. The boundary $\Gamma$ is split into a Dirichlet
boundary $\Gamma_D$, a closed subset of $\Gamma$ with positive surface measure, and the
remaining (relatively open and possibly empty) Neumann part $\Gamma_N := \Gamma\setminus\Gamma_D$.
We pose essential and static boundary conditions, namely

$$u = 0 \ \text{ on } \Gamma_D \qquad \text{and} \qquad \sigma\cdot n = g \ \text{ on } \Gamma_N,$$

where $g$ is a given applied surface force and $n$ denotes the outer normal to the
boundary $\Gamma_N$. Our analysis is restricted to a boundary value
problem defined in the following function spaces:

$$H^1_D(\Omega) = \{v\in H^1(\Omega)^d \mid v = 0 \text{ on } \Gamma_D\},$$
$$Q = \{q : q\in\operatorname{dev}\mathbb{R}^{d\times d}_{sym},\ q_{ij}\in L^2(\Omega)\},$$

where $H^1(\Omega)$ and $L^2(\Omega)$ are the usual Sobolev and Lebesgue spaces. The
condition $q\in\operatorname{dev}\mathbb{R}^{d\times d}_{sym}$ in the definition of $Q$ implies that $\operatorname{tr} q = 0$, i.e., $q$ is a
trace-free matrix. It is shown by BROKATE, CARSTENSEN, VALDMAN [BCV03] that
the combination of the system (1)-(4) describing the Prandtl-Ishlinskii model
of play type with the (quasi-static) equilibrium between external forces
(denoted by $f$) and internal forces, i.e.,

$$\operatorname{div}\sigma(x,t) + f(x,t) = 0, \quad x\in\Omega,\ t\in(0,T), \qquad (7)$$

results in the time-dependent variational inequality for the state variable
$w = (u, (p_r)_{r\in I})$:

$$a(w(t), z - \dot w(t)) + \psi(z) - \psi(\dot w(t)) \ge \langle\ell(t), z - \dot w(t)\rangle \quad \text{for all } z\in\mathcal{H}. \qquad (8)$$

Here $w$ is considered to be an element of the Hilbert space $\mathcal{H} = H^1_D(\Omega)\times\prod_{r\in I}Q$
and to satisfy the zero initial condition $w(0) = 0$. Writing $z = (v, (q_r)_{r\in I})$, the
bilinear form $a(\cdot,\cdot)$, the linear functional $\ell(\cdot)$ and the nonlinear functional $\psi(\cdot)$ are
defined as:

$$a : \mathcal{H}\times\mathcal{H}\to\mathbb{R}, \quad a(w,z) = \int_\Omega \mathbb{C}\Big(\varepsilon(u) - \sum_{r\in I}p_r\Big) : \Big(\varepsilon(v) - \sum_{r\in I}q_r\Big)\,dx + \sum_{r\in I}\int_\Omega \mathbb{H}_r p_r : q_r\,dx, \qquad (9)$$

$$\ell(t) : \mathcal{H}\to\mathbb{R}, \quad \langle\ell(t), z\rangle = \int_\Omega f(t)\cdot v\,dx + \int_{\Gamma_N} g(t)\cdot v\,dS(x), \qquad (10)$$

$$\psi : \mathcal{H}\to\mathbb{R}, \quad \psi(z) = \sum_{r\in I}\int_\Omega \sigma_r^y\,\|q_r\|\,dx. \qquad (11)$$
Thus we arrive at the following formulation of the boundary value problem
of quasi-static elastoplasticity.
Problem 1 (BVP of quasi-static multi-surface elastoplasticity).
For given $\ell\in H^1(0,T;\mathcal{H}^*)$ with $\ell(0) = 0$, find $w\in H^1(0,T;\mathcal{H})$ with $w(0) = 0$
such that (8) holds for almost all $t\in(0,T)$.
The unique solvability of Problem 1, under the assumption that the elastic
and hardening tensors are symmetric and positive definite, is based on an
extension of the proof of HAN AND REDDY [HR95, HR99] and can be found in the
works of VALDMAN [Val02] and BROKATE, CARSTENSEN, VALDMAN [BCV03]:

Theorem 1. Let $\ell\in H^1(0,T;\mathcal{H}^*)$ with $\ell(0) = 0$. Then there exists a unique
solution $w\in H^1(0,T;\mathcal{H})$ of Problem 1.

4 Numerical Algorithm
The starting point for the finite element method is the time-discretised form of
the variational problem. Problem 1 is solved by an implicit time discretisation;
we use the implicit Euler scheme with equidistant time intervals.
It was shown by ALBERTY, CARSTENSEN, AND ZARRABI [ACZ99] that
in the single-yield case, i.e., $M = 1$, the time-discretised dual formulation in
each time step is equivalent to an optimisation problem depending only on
the displacement $u$ and the plastic strain $p$. This result was obtained by
functional analytic arguments: the variational inequality is regarded as
a sub-differential, for which the dual sub-differential exists and can be
reformulated. The resulting objective depends on the chosen hardening law (linear
kinematic hardening or isotropic hardening), though the structure remains the
same. In KIENESBERGER [Kie03] an algorithm solving the single-yield
problem was developed using the results and notation of ALBERTY, CARSTENSEN,
AND ZARRABI [ACZ99]. Since the multi-yield hardening model structurally
generalises the linear kinematic hardening model, the authors were able to extend
the original code, using C++ templates, so that the
multi-yield hardening model becomes just another hardening model. For
computational reasons, new parameters $\alpha_r$ are introduced; these are internal
hardening parameters of the same dimension as the plastic strains $p_r$,
defined by $\alpha_r = \mathbb{H}_r p_r$.
The notation is as follows: given the variables with index 0 at an initial time
step $t_0$, the updates of the variables at the time step $t_1 = t_0 + \Delta t$ have to be
determined. The time-discretised generalised optimisation problem for
the multi-yield case in each time step, modified to fit
the single-yield algorithm, reads:

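\[
\begin{aligned}
f(u, p_1, \dots, p_M) := {} & \frac12 \int_\Omega \mathbb{C}\Big(\varepsilon(u) - \sum_{r \in I} p_r\Big) : \Big(\varepsilon(u) - \sum_{r \in I} p_r\Big)\,dx \\
& + \frac12 \int_\Omega \sum_{r \in I} |\alpha_r^0|^2 \,dx
  + \frac12 \int_\Omega \sum_{r \in I} |p_r - p_r^0|^2 \,dx
  + \int_\Omega \sum_{r \in I} \alpha_r^0 : (p_r - p_r^0) \,dx \\
& + \int_\Omega \sum_{r \in I} \sigma_r^y\, |p_r - p_r^0| \,dx - \int_\Omega f u \,dx \longrightarrow \min,
\end{aligned} \tag{12}
\]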
where $\alpha_r^0$ is the internal hardening variable from the initial time step.
The basic idea for solving the quasi-static problem is to use a uniform
time discretisation and to iterate in each time step until the minimisers, i.e.,
the displacement $u$ and the plastic strains $p_r$, are determined. Then these
values and the separately calculated $\alpha_r$ are used as the reference values
with index 0 for the next time step $t_2$.
The fifth term in (12) contains a norm whose sharp bend may cause
trouble, as the function $f$ is not differentiable there. To apply standard
methods, the objective should be differentiable and quadratic; thus the function
is regularised as follows: the modulus $|\cdot|$ is regularised by smoothing the
norm function, i.e.,
\[
|\cdot|_\varepsilon := \begin{cases}
|\cdot| & \text{if } |\cdot| \ge \varepsilon, \\
\frac{1}{2\varepsilon} |\cdot|^2 + \frac{\varepsilon}{2} & \text{if } |\cdot| < \varepsilon.
\end{cases} \tag{13}
\]
For small $\varepsilon$, the quadratised function $f(u, p_1, \dots, p_M)$ is very
similar to the original one, but its properties change enormously. Therefore, it
will be referred to by the new symbol $\bar{f}$.
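The smoothing can be illustrated by a minimal Python sketch. It assumes that (13) has the usual piecewise form (the modulus itself above $\varepsilon$, a quadratic below it); the function names `reg_abs` and `reg_abs_slope` are hypothetical and are not taken from the authors' code.

```python
def reg_abs(x, eps):
    """Smoothed modulus: |x| if |x| >= eps, else |x|^2 / (2 eps) + eps / 2."""
    a = abs(x)
    if a >= eps:
        return a
    return a * a / (2.0 * eps) + eps / 2.0


def reg_abs_slope(x, eps):
    """Derivative of reg_abs: sign(x) outside [-eps, eps], x / eps inside."""
    if abs(x) >= eps:
        return 1.0 if x > 0 else -1.0
    return x / eps


eps = 0.1
for x in (0.0, 0.05, 0.1, 0.5):
    # Value and slope are continuous at |x| = eps, unlike the slope of |x| at 0.
    print(x, reg_abs(x, eps), reg_abs_slope(x, eps))
```

Under this assumption both branches agree in value and slope at $|x| = \varepsilon$, which is what makes the regularised objective differentiable.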
Another simplification is to define the change of $p_r$ by
$\tilde{p}_r = p_r - p_r^0$ and to use it as an argument of the objective
instead of $p_r$.
The spatial discretisation is carried out by the standard finite element
method using linear triangular or tetrahedral finite elements, respectively.
For better readability and coherence, the vector denoting the discretised
displacement $u$ is again called $u$. The same holds for $\tilde{p}_r$ and
$p_r^0$; furthermore, the symmetric matrices are transformed to vectors,
e.g. in 2D
\[
\begin{pmatrix} \tilde p_r^{11} & \tilde p_r^{12} \\ \tilde p_r^{12} & \tilde p_r^{22} \end{pmatrix}
\Longrightarrow
\begin{pmatrix} \tilde p_r^{11} \\ \tilde p_r^{12} \\ \tilde p_r^{22} \end{pmatrix},
\]
such that the objective and the other equations can be written in matrix and
vector notation.
For the derivation of the algorithm and numerical experiments we will
consider only the two-yield case, i.e., M = 2, as it shows the characteristics of
the multi-yield problem and can be extended easily. Now, the objective reads
as

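\[
\begin{aligned}
\bar f(u, \tilde p_1, \tilde p_2) = {} & \frac12
\begin{pmatrix} u \\ \tilde p_1 \\ \tilde p_2 \end{pmatrix}^{T}
\begin{pmatrix}
B^T C B & -B^T C & -B^T C \\
-C B & C + D_1 & C \\
-C B & C & C + D_2
\end{pmatrix}
\begin{pmatrix} u \\ \tilde p_1 \\ \tilde p_2 \end{pmatrix} \\
& + \begin{pmatrix}
-b - B^T C (p_1^0 + p_2^0) \\
C (p_1^0 + p_2^0) + Q \alpha_1^0 \\
C (p_1^0 + p_2^0) + Q \alpha_2^0
\end{pmatrix}^{T}
\begin{pmatrix} u \\ \tilde p_1 \\ \tilde p_2 \end{pmatrix} \\
& + \frac12 C p_1^0 \cdot p_1^0 + \frac12 C p_2^0 \cdot p_2^0
  + \frac12 |\alpha_1^0|^2 + \frac12 |\alpha_2^0|^2 \longrightarrow \min,
\end{aligned} \tag{14}
\]

where $Bu$ denotes the discretised strain $\varepsilon(u)$, $b$ the discretised
load vector, and $Q$ results from regarding the $p_r$ as vectors, i.e., the
matrix norm is given by $|p| = (p^T Q p)^{1/2}$.
$D_1 = Q\,(1 + \sigma_1^y / |\tilde p_1|_\varepsilon)$ is the non-linear iteration
matrix of $\bar f$ with respect to $\tilde p_1$, and analogously for $D_2$ and
$\tilde p_2$. These matrices are computed in every iteration step using the
current $\tilde p_r$, but apart from that the dependencies on
$|\tilde p_r|_\varepsilon$ are neglected. This is not an exact method for
determining the change of the plastic strain, but its error is corrected later
on, as the $\tilde p_r$ are calculated separately and iteratively with the
alternating direction method.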
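The role of $Q$ can be illustrated with a short Python sketch. It assumes that a symmetric 2D tensor is stored in the order $(p^{11}, p^{12}, p^{22})$; under that assumption $Q = \mathrm{diag}(1, 2, 1)$ reproduces the Frobenius norm of the original matrix. The ordering, the value of $Q$ and all function names are illustrative assumptions, not taken from the authors' code.

```python
import math


def sym_to_vec(p):
    """Store the symmetric 2x2 matrix [[p11, p12], [p12, p22]] as (p11, p12, p22)."""
    return [p[0][0], p[0][1], p[1][1]]


# Diagonal of Q: the off-diagonal entry occurs twice in the matrix
# but only once in the vector, hence the weight 2.
Q_DIAG = [1.0, 2.0, 1.0]


def q_norm(v):
    """Vector norm (v^T Q v)^(1/2) for the diagonal matrix Q."""
    return math.sqrt(sum(q * x * x for q, x in zip(Q_DIAG, v)))


def frobenius(p):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in p for x in row))


p = [[3.0, 1.0], [1.0, -2.0]]
print(q_norm(sym_to_vec(p)), frobenius(p))
```

Both printed numbers agree, so norms of the plastic strains can be evaluated directly on the vector representation.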

Fig. 2. Geometry of a beam (left) and the quarter of a ring (right) problems

The matrix in (14) is positive definite, thus the minimiser
$(u, \tilde p_1, \tilde p_2)$ has to fulfil the necessary condition of the
derivative being equal to zero:

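\[
\begin{pmatrix}
B^T C B & -B^T C & -B^T C \\
-C B & C + D_1 & C \\
-C B & C & C + D_2
\end{pmatrix}
\begin{pmatrix} u \\ \tilde p_1 \\ \tilde p_2 \end{pmatrix}
=
\begin{pmatrix}
b + B^T C (p_1^0 + p_2^0) \\
-C (p_1^0 + p_2^0) - Q \alpha_1^0 \\
-C (p_1^0 + p_2^0) - Q \alpha_2^0
\end{pmatrix}. \tag{15}
\]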

Extracting the vector $(\tilde p_1, \tilde p_2)^T$ from the two lower lines in
(15) and inserting it into the first one yields the Schur complement system in $u$.

This linear system is solved by a multigrid preconditioned conjugate gradient
method, see e.g. BRAMBLE [Bra95]. Numerical tests showed that it is not
necessary to use the multigrid preconditioner arising from the plasticity
problem; the preconditioner for the related elasticity problem is
sufficient and much faster.
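The elimination step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block matrices and right-hand sides are made-up data, the names `K_uu`, `K_up`, `K_pp`, `apply_schur` are hypothetical, and a plain (unpreconditioned) conjugate gradient iteration stands in for the multigrid preconditioned solver used in the paper.

```python
def solve(A, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def cg(apply_op, b, tol=1e-12, max_iter=100):
    """Unpreconditioned conjugate gradients for an SPD operator given as a function."""
    x = [0.0] * len(b)
    r = b[:]
    p = r[:]
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = apply_op(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Made-up SPD block system [[K_uu, K_up], [K_pu, K_pp]] (u; p) = (f_u; f_p).
K_uu = [[4.0, 1.0], [1.0, 3.0]]
K_up = [[-1.0, 0.0, -1.0], [0.0, -1.0, 0.0]]
K_pu = [list(col) for col in zip(*K_up)]          # transpose of K_up
K_pp = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
f_u, f_p = [1.0, 2.0], [0.5, -1.0, 0.25]


def apply_schur(u):
    """S u = K_uu u - K_up K_pp^{-1} K_pu u, without forming S explicitly."""
    inner = solve(K_pp, matvec(K_pu, u))
    return [a - c for a, c in zip(matvec(K_uu, u), matvec(K_up, inner))]


rhs = [a - c for a, c in zip(f_u, matvec(K_up, solve(K_pp, f_p)))]
u = cg(apply_schur, rhs)
p = solve(K_pp, [a - c for a, c in zip(f_p, matvec(K_pu, u))])

# Residual of the full block system, as a check on the elimination.
res_u = [a + c - d for a, c, d in zip(matvec(K_uu, u), matvec(K_up, p), f_u)]
res_p = [a + c - d for a, c, d in zip(matvec(K_pu, u), matvec(K_pp, p), f_p)]
print(max(abs(v) for v in res_u + res_p))
```

Applying the Schur complement as an operator, rather than assembling it, mirrors the situation in the text where only the system in $u$ is passed to the iterative solver.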
For the multigrid method, we use one Gauss-Seidel pre- and post-smoothing
step in a V-cycle; the system on the coarse grid is solved exactly. Furthermore,
the nested iteration approach was used, which means that the starting values
for the coarse grid correction are the restrictions of the fine grid functions.

5 Numerical Experiments

The algorithm was implemented in NGSolve - the finite element solver exten-
sion package of the mesh generator tool NETGEN developed in our group.
Finite element basis functions were chosen as piecewise linear for the
displacement $u$ and piecewise constant for the plastic strains $p_1$ and $p_2$.
Furthermore, the full multigrid method was used, i.e., we started with a coarse
grid, solved the problem, refined the grid, solved the problem on the finer
grid, et cetera.

Fig. 3. Plasticity domains in the single-yield (left) and two-yield case (right) of the
beam

The algorithm was tested on two- as well as on three-dimensional domains,
for both the single-yield and multi-yield case, see Figure 2 for the geometries.
The first testing geometry is the 2D beam of Figure 2 with the left edge
fixed and the right edge charged with a force acting in the direction of the
external normal vector. The second geometry tested is the 3D quarter of a
ring from Figure 2 with constant thickness in the z-axis which is the same as
the thickness of the ring in the 2D sketch. The quarter ring is fixed on the lower
face and a force is acting upwards on the right face. The finest uniform mesh
consists of 131 072 triangles (which corresponds to 658 428 degrees of freedom
(DOF) in the calculation of $u$) for the 2D examples and 25 088 tetrahedra
(122 334 DOF) for the 3D example. Figures 3 and 4 show the plasticity domains
in the single-yield and in the multi-yield case. The elastic zones are colored
light grey, the first plastic zones are middle grey, and the second plastic zone
is dark grey.
Fig. 4. Plasticity domains in the single-yield (left) and two-yield case (right) of the
quarter of the ring

6 Conclusions and Future Work

In this paper a multi-yield plasticity model and its numerical treatment
were presented. A nonlinear iterative algorithm that uses a multigrid
preconditioned solver was described, and its performance in 2D and 3D was
demonstrated.
In the future we will extend the solution idea to a quasi-Newton algorithm,
i.e., the Schur-Complement matrix will have some Hessian-type entries in order
to improve the computational performance. This idea is already implemented
for the single-yield case, where the numerical results demonstrate the faster
algorithm performance with linear complexity. We expect the same result for
the multi-yield case.
Another long-term aim is to identify the interfaces between the elastic and
plastic zones and to refine the mesh adaptively in such a way that the interface
is approximated by the mesh. We then expect an even faster performance of
the algorithm.

7 Acknowledgment
The authors are pleased to acknowledge support by the Austrian Science Fund
'Fonds zur Förderung der wissenschaftlichen Forschung (FWF)' under grant
SFB F013/F1306 in Linz, Austria.

References

[ACZ99] J. Alberty, C. Carstensen, and D. Zarrabi. Adaptive numerical analysis
in primal elastoplasticity with hardening. Comput. Methods Appl. Mech.
Eng., 171(3-4):175-204, 1999.
[BCV03] M. Brokate, C. Carstensen, and J. Valdman. A quasi-static boundary value
problem in multi-surface elastoplasticity: Part 1 - analysis. SFB Report
2003-16, Johannes Kepler University Linz, SFB "Numerical and Symbolic
Scientific Computing", 2003. (and submitted).
[Bes58] J.F. Besseling. A theory of elastic, plastic and creep deformations of an
initially isotropic material showing anisotropic strain-hardening, creep re-
covery and secondary creep. J. Appl. Mech., 25:529-536, 1958.
[Bra95] J. H. Bramble. Multigrid methods. Pitman research notes in mathematical
series. Longman, 1995.
[HR95] W. Han and B.D. Reddy. Computational plasticity: the variational ba-
sis and numerical analysis. Computer methods in applied mechanics and
engineering, pages 283-400, 1995.
[HR99] W. Han and B. Reddy. Plasticity: Mathematical Theory and Numerical
Analysis. Springer-Verlag New York, 1999.
[Ish54] A. Ju. Ishlinskii. The general theory of plasticity with linear hardening (in
Russian). Ukrainian Mathematical Journal, 6(3), 1954.
[Iwa66] W.D. Iwan. A distributed-element model for hysteresis and its steady state
dynamic response. J. Appl. Mech., 33:893-900, 1966.
[Kie03] J. Kienesberger. Multigrid preconditioned solvers for some elastoplastic
problems. In Lecture Notes in Computer Science. Springer, 2003. (to ap-
pear).
[Kre96] P. Krejčí. Hysteresis, Convexity and Dissipation in Hyperbolic Equations.
GAKUTO International Series, Mathematical Sciences and Applications,
1996.
[Mel38] E. Melan. Zur Plastizität des räumlichen Kontinuums. Ingenieur-Archiv,
9:116-126, 1938.
[Pra28] L. Prandtl. Ein Gedankenmodell zur kinetischen Theorie der festen Körper.
ZAMM, 8:85-106, 1928.
[Pra49] W. Prager. Recent developments in the mathematical theory of plasticity.
J. Appl. Phys., 9:235-241, 1949.
[Val02] J. Valdman. Mathematical and Numerical Analysis of Elastoplastic Material
with Multi-Surface Stress-Strain Relation. PhD thesis, Christian-Albrechts-
Universität zu Kiel, 2002. Published at www.dissertation.de in Berlin,
Germany, 2002, ISBN 3-89825-501-8, download:
http://www.sfb013.uni-linz.ac.at/~jan/plasticity.pdf.
[Vis94] A. Visintin. Differential models of hysteresis. Springer, 1994.
Celebrating Fifty Years of David M. Young's
Successive Overrelaxation Method

David R. Kincaid

Department of Computer Sciences, University of Texas at Austin, Austin, Texas
78712 USA

Summary. It has been over fifty years since David M. Young's original work on the
successive overrelaxation (SOR) method. This fundamental method now appears
in all textbooks containing an introductory discussion of iterative solution
methods. (Most often the SOR method appears after a presentation of Jacobi iteration
and Gauss-Seidel iteration and before the conjugate gradient iterative method.) We
present a brief survey of some of the research of Professor David M. Young, together
with his students and collaborators, on iterative methods for solving large sparse
linear algebraic equations. This is not a complete survey but just a sampling of various
papers with a focus on some of these publications.
Dr. David M. Young's doctoral thesis [27] was accepted in 1950 by his supervising
Professor Garrett Birkhoff of Harvard University, and his paper [28] based on this
work appeared in 1954. This is one of the landmark contributions in modern
numerical analysis. The red-black ordering for matrices is of great importance in parallel
computing. Gene Golub has said: "It's almost as if David could see into the future!"
David Young celebrated his 80th birthday on October 20, 2003
(http://www.ma.utexas.edu/CNA/photos.html).

1 Introduction

We present a brief survey of some of the work of Professor David M. Young,
together with his students and collaborators, on iterative methods. Dr. David
M. Young was involved in research on iterative methods for solving large
sparse linear algebraic equations for over forty years until his recent retirement.
This is not a complete survey but just a sampling of various projects with
a focus on some of his publications.

2 Successive Overrelaxation

From research first done at Harvard University, Young presented in his Ph.D.
thesis [27], and in a subsequent paper [28], an analysis of the successive
overrelaxation (SOR) method for the case where the coefficient matrix of the linear
algebraic system $Au = b$ is consistently ordered [30]. An elliptic partial
differential equation over a region with grid points numbered in the natural
ordering (left-to-right and up), discretized with the standard five-point
stencil, results in such a matrix system. In fact, any matrix system derived in
this way has Young's Property A [30]. Moreover, a consistently ordered system
can be obtained from one with Property A after a suitable permutation.
For a matrix with Property A, one can permute the rows and corresponding
columns to obtain a red-black system. The red-black ordering corresponds to a
red and black checkerboard ordering of the grid points. When $A$ is a red-black
matrix, it is consistently ordered and Young's equation [27]
$(\lambda + \omega - 1)^2 = \omega^2 \mu^2 \lambda$ gives a relation between the
eigenvalues $\lambda$ of the iteration matrix $\mathcal{L}_\omega$
for the SOR method and the eigenvalues $\mu$ of the iteration matrix $B$ for the
Jacobi method. If $A$ is symmetric positive definite, then the eigenvalues of $B$
are real and less than 1 in absolute value and the optimum or best value of
the acceleration factor $\omega$ is given by
$\omega_b = 2/(1 + \sqrt{1 - S(B)^2})$. Here $S(B)$
is the spectral radius of the Jacobi matrix $B$, which is the magnitude of the
eigenvalue of largest absolute value of the matrix $B$. Moreover, the spectral
radius of the SOR matrix with the optimum relaxation parameter $\omega = \omega_b$ is
given by $S(\mathcal{L}_{\omega_b}) = \omega_b - 1 = r$.
For model problems involving the Poisson equation over a region with mesh
points of grid size $h$, it can be shown that the number of iterations required
for convergence of the SOR method is $n = O(h^{-1})$ whereas the number of
iterations is $n = O(h^{-2})$ using either the Jacobi or Gauss-Seidel methods. In
this situation, the SOR method is faster by an order of magnitude.
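The effect of the optimum relaxation factor can be observed on a small experiment. The sketch below is an illustration only: it assumes the one-dimensional model problem $-u'' = 1$ with zero boundary values (for which the Jacobi spectral radius is $\cos(\pi h)$), and the function names are hypothetical.

```python
import math


def sor_sweeps(n, omega, tol=1e-8, max_sweeps=100000):
    """Count SOR sweeps until the max error for -u'' = 1, u(0) = u(1) = 0, is below tol.

    The three-point scheme is exact for this problem, so the discrete
    solution is u_i = x_i (1 - x_i) / 2 and the true error can be monitored.
    """
    h = 1.0 / (n + 1)
    exact = [0.5 * (i * h) * (1.0 - i * h) for i in range(n + 2)]
    u = [0.0] * (n + 2)                      # boundary values stay zero
    for sweep in range(1, max_sweeps + 1):
        for i in range(1, n + 1):
            gauss_seidel = 0.5 * (u[i - 1] + u[i + 1] + h * h)
            u[i] = (1.0 - omega) * u[i] + omega * gauss_seidel
        if max(abs(a - b) for a, b in zip(u, exact)) < tol:
            return sweep
    return max_sweeps


def optimal_omega(n):
    """omega_b = 2 / (1 + sqrt(1 - S(B)^2)) with S(B) = cos(pi h) for this model problem."""
    s = math.cos(math.pi / (n + 1))
    return 2.0 / (1.0 + math.sqrt(1.0 - s * s))


for n in (15, 31, 63):
    w = optimal_omega(n)
    # Columns: n, omega_b, Gauss-Seidel sweeps (omega = 1), SOR sweeps (omega = omega_b).
    print(n, round(w, 4), sor_sweeps(n, 1.0), sor_sweeps(n, w))
```

Halving $h$ should roughly quadruple the Gauss-Seidel count and roughly double the SOR count, in line with the $O(h^{-2})$ and $O(h^{-1})$ estimates above.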
Work has been done on the choice of the optimum $\omega$ for the case where $A$ is
consistently ordered but not symmetric positive definite and where some of the
eigenvalues of the Jacobi iteration matrix $B$ are complex. Several
programs are available for choosing the optimum $\omega$ if all of the eigenvalues of
$B$ are known or if one knows a convex region containing them. See Young and
Eidson [35] and Young and Huang [39].
An extension of the SOR method is the modified SOR (MSOR) method
for a linear system with a red-black coefficient matrix. The MSOR method
involves the use of relaxation factors
$\omega_1, \omega_1', \omega_2, \omega_2', \dots$, where $\omega_i$ is used for
the red components and $\omega_i'$ is used for the black components, for each $i$.
In Young, Wheeler, and Downing [34], it is shown that there are suitable values
of $\omega_i$ and $\omega_i'$ that are as good as, though not better than, the
choice $\omega_i = \omega_i' = \omega_b$ for all $i$. On the other hand, other
choices are more effective if one measures the effectiveness in terms of certain
norms, as shown in Young and Kincaid [42]. Chapters 8 and 10 of Young [30] cover
the modified SOR method with fixed and variable parameters, respectively.
A number of other modifications and extensions have been made to the
SOR theory. For instance in group or block methods, the unknowns are
grouped into blocks and all values within a block are updated simultaneously.
Usually, each inner iteration of a block method is done by a direct method
since the matrices for the blocks are assumed to be easily solvable. For exam-
ple, these matrices are tridiagonal in the case of the line SOR method when the
five-point finite difference stencil is used. Also, faster convergence is obtained
for line SOR methods than for point SOR methods, in general. Moreover, the
general SOR theory has been applied to group iterative methods in Chapter
14 of Young [30].
Research on norms associated with the SOR method for the red-black
system has resulted in new formulas. It has been shown that the graph of the
$D^{1/2}$-norm function for the SOR matrix $\mathcal{L}_{\omega_b}^m$ is not
monotonically decreasing (it increases and then decreases), but the
$A^{1/2}$-norm is indeed a monotonically decreasing function of $m$; however, it
is still considerably larger than the spectral radius function
$S(\mathcal{L}_{\omega_b}^m) = r^m$. See Young and Kincaid [42] and Chapter 7
in Young [30].
Corresponding to the SOR method is the Symmetric Successive Overre-
laxation (SSOR) method in which an iteration consists of one iteration of
the (forward) SOR method followed by one iteration of the (backward) SOR
method. In the Unsymmetric Successive Overrelaxation (USSOR) method, dif-
ferent parameters may be used in the red and black equations, respectively.
Young [29] presents convergence properties of the symmetric and unsymmetric
successive overrelaxation methods and related methods.

3 Chebyshev Acceleration

The SOR method can be regarded as a way to accelerate the convergence of the
Jacobi method in a certain sense. Another way of speeding up the convergence
of the Jacobi method is to use an extrapolation method or a Chebyshev
acceleration method, which is based on Chebyshev polynomials. These are general
procedures and they can be applied to methods other than just the Jacobi
method as shown in Hageman and Young [7].
Suppose the basic iterative method to be used in the acceleration procedure
has the form $u^{(n+1)} = Gu^{(n)} + k$ where the eigenvalues $\mu$ of $G$ are
bounded such that $m(G) \le \mu \le M(G)$. Here $m(G)$ and $M(G)$ are the
smallest and largest eigenvalues of $G$, respectively. Using the three-term
relation for Chebyshev polynomials, the optimal Chebyshev acceleration method
can be written as
\[
u^{(n+1)} = \rho_{n+1}\{\gamma(Gu^{(n)} + k) + (1 - \gamma)u^{(n)}\} + (1 - \rho_{n+1})u^{(n-1)},
\]
where $\rho_{n+1} = (1 - (\sigma/2)^2\rho_n)^{-1}$ (with $\rho_1 = 1$ and
$\rho_2 = (1 - \sigma^2/2)^{-1}$), $\sigma = [M(G) - m(G)]/[2 - M(G) - m(G)]$,
and $\gamma = 2/[2 - M(G) - m(G)]$.
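The recurrence can be tried out on the Jacobi method for a small model problem. The following Python sketch is an illustration under stated assumptions: the test problem ($-u'' = 1$ in one dimension with zero boundary values), the use of the exact eigenvalue bounds $\pm\cos(\pi h)$, and all function names are chosen here for demonstration and do not come from ITPACK or from [7].

```python
import math


def jacobi_sweep(u, h):
    """Return G u + k for the Jacobi method applied to -u'' = 1, u(0) = u(1) = 0."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h)
    return new


def max_error(u, h):
    # The discrete solution of this model problem is x (1 - x) / 2 at the grid points.
    return max(abs(ui - 0.5 * (i * h) * (1.0 - i * h)) for i, ui in enumerate(u))


def jacobi_iterations(n, tol=1e-8, max_iter=100000):
    h = 1.0 / (n + 1)
    u = [0.0] * (n + 2)
    for it in range(1, max_iter + 1):
        u = jacobi_sweep(u, h)
        if max_error(u, h) < tol:
            return it
    return max_iter


def chebyshev_iterations(n, m_G, M_G, tol=1e-8, max_iter=100000):
    """Chebyshev acceleration of Jacobi with eigenvalue bounds m_G <= mu <= M_G."""
    h = 1.0 / (n + 1)
    gamma = 2.0 / (2.0 - M_G - m_G)
    sigma = (M_G - m_G) / (2.0 - M_G - m_G)
    u_prev = [0.0] * (n + 2)
    u = [0.0] * (n + 2)
    rho = 1.0
    for it in range(1, max_iter + 1):
        if it == 1:
            rho = 1.0                                      # rho_1
        elif it == 2:
            rho = 1.0 / (1.0 - 0.5 * sigma ** 2)           # rho_2
        else:
            rho = 1.0 / (1.0 - 0.25 * sigma ** 2 * rho)    # rho_{n+1} from rho_n
        g = jacobi_sweep(u, h)
        u_next = [rho * (gamma * gi + (1.0 - gamma) * ui) + (1.0 - rho) * pi
                  for gi, ui, pi in zip(g, u, u_prev)]
        u_prev, u = u, u_next
        if max_error(u, h) < tol:
            return it
    return max_iter


n = 31
bound = math.cos(math.pi / (n + 1))      # Jacobi eigenvalues lie in [-bound, bound]
print(jacobi_iterations(n), chebyshev_iterations(n, -bound, bound))
```

In practice the bounds are only estimated, which is why the sensitivity to the estimate of $M(G)$ matters.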
Varga [26] refers to this procedure as the Chebyshev semi-iterative method.
One needs to choose estimates for $M(G)$ and $m(G)$, which may cause
difficulties in some cases. In fact, the behavior of the acceleration procedure is
often sensitive to these estimates and especially the one for $M(G)$. It can be
shown that the optimum Chebyshev acceleration procedure is an order of magnitude
faster than the optimum extrapolated procedure for $\sigma$ close to one [7].
If one applies the Chebyshev acceleration procedure to the Jacobi method
as the basic method for solving a linear system with a red-black coefficient
matrix, then the computation can be simplified. This is done by rewriting
the procedure in terms of only the red points or only the black points for
the Jacobi method. Golub and Varga [6] refer to this as the cyclic Chebyshev
semi-iterative method. The cyclic acceleration method is the original
Chebyshev acceleration method with half of the calculations bypassed. The
resulting method is equivalent to a special case of the modified SOR method with
$\omega_i = \omega_i' = \rho_{n+1}$.
With an adaptive Chebyshev acceleration procedure, one continually revises
the iteration parameters as the iterative method proceeds. The algorithm
fixes the smallest eigenvalue estimate $m_E \le m(G)$ and adaptively modifies the
largest eigenvalue estimate $M_E$ while keeping $M_E \le M(G)$. The iterative
procedure continues using these values of $m_E$ and $M_E$ until the observed
convergence is much slower than expected in a certain sense. By solving a
Chebyshev equation, the algorithm then increases $M_E$ but keeps $M_E \le M(G)$.
This adaptive Chebyshev acceleration procedure is repeated until convergence is
achieved according to the stopping test being utilized. Chebyshev polynomials
are used in the algorithm for choosing these maximum and minimum eigenvalue
estimates. Such a procedure was developed by Hageman and Young [7] and it was
incorporated into the algorithms used in the ITPACK software packages [18].
It has been shown that, in some cases, one can obtain almost as good
convergence as with Chebyshev acceleration by the use of a stationary
second-degree method given by
$u^{(n+1)} = \rho\{\gamma(Gu^{(n)} + k) + (1 - \gamma)u^{(n)}\} + (1 - \rho)u^{(n-1)}$.
Here let $\rho = 1$ when $n = 0$. See some of the papers by Young and/or Kincaid
on second-degree methods [14, 15, 20, 31].

4 Conjugate Gradient Acceleration


Conjugate gradient acceleration,
$u^{(n+1)} = \rho_{n+1}\{\gamma_{n+1} r^{(n)} + u^{(n)}\} + (1 - \rho_{n+1})u^{(n-1)}$,
is similar to Chebyshev acceleration except that the parameters
used involve inner products:
\[
\rho_{n+1} = \Big[1 - \frac{\gamma_{n+1}}{\gamma_n \rho_n}\,
\frac{(r^{(n)}, r^{(n)})}{(r^{(n-1)}, r^{(n-1)})}\Big]^{-1}
\]
(with $\rho_1 = 1$) and $\gamma_{n+1} = (r^{(n)}, r^{(n)})/(r^{(n)}, Ar^{(n)})$.
Here $r^{(n)} = b - Au^{(n)}$ is the residual vector. As with the Chebyshev
acceleration method, the conjugate gradient acceleration method can speed up the
Jacobi method and other methods.
Conjugate gradient acceleration has some advantages over Chebyshev
acceleration [7]. It can be shown that the convergence of conjugate gradient
acceleration, measured in a certain norm, is at least as fast as that of Chebyshev
acceleration. With conjugate gradient acceleration there are no parameter
estimates; however, the basic iterative method may involve a parameter as in
the case when SSOR is used as the basic method. Since the conjugate gradient
acceleration requires the computation of inner products for each iteration,
the work required per iteration may be somewhat greater than for Chebyshev
acceleration. For basic methods that are not symmetrizable, the generalized
conjugate gradient methods can be used to accelerate their convergence [7].
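The three-term form of the conjugate gradient iteration can be written out directly. The Python sketch below is a minimal illustration, assuming the symmetric positive definite test matrix $A = \mathrm{tridiag}(-1, 2, -1)$ and the starting vector $u^{(0)} = 0$; the function names are hypothetical.

```python
import math


def apply_A(u):
    """y = A u for the tridiagonal matrix A = tridiag(-1, 2, -1)."""
    n = len(u)
    return [2.0 * u[i]
            - (u[i - 1] if i > 0 else 0.0)
            - (u[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def cg_three_term(b, tol=1e-10, max_iter=1000):
    """Conjugate gradients in three-term form; returns (solution, iterations)."""
    n = len(b)
    u_prev = [0.0] * n
    u = [0.0] * n
    gamma_prev = rho_prev = rr_prev = 1.0     # placeholders, unused when it == 0
    for it in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, apply_A(u))]    # r = b - A u
        rr = dot(r, r)
        if math.sqrt(rr) < tol:
            return u, it
        gamma = rr / dot(r, apply_A(r))
        if it == 0:
            rho = 1.0                                      # rho_1 = 1
        else:
            rho = 1.0 / (1.0 - (gamma / gamma_prev) * (rr / rr_prev) / rho_prev)
        u_next = [rho * (gamma * ri + ui) + (1.0 - rho) * pi
                  for ri, ui, pi in zip(r, u, u_prev)]
        u_prev, u = u, u_next
        gamma_prev, rho_prev, rr_prev = gamma, rho, rr
    return u, max_iter


n = 20
h = 1.0 / (n + 1)
b = [h * h] * n                      # right-hand side for -u'' = 1
u, iterations = cg_three_term(b)
exact = [0.5 * (i * h) * (1.0 - i * h) for i in range(1, n + 1)]
print(iterations, max(abs(a - e) for a, e in zip(u, exact)))
```

No eigenvalue estimates enter the iteration; the price is the two inner products per step mentioned above.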

We can introduce a nonsingular matrix $W$ as follows:
$[W(I - G)W^{-1}][Wu] = Wk$, which in terms of the original linear system is
$[WQ^{-1}AW^{-1}][Wu] = WQ^{-1}b$. The generalized conjugate gradient
acceleration method [7] corresponding to the basic iterative method is given by
\[
u^{(n+1)} = \rho_{n+1}\{\gamma_{n+1}\delta^{(n)} + u^{(n)}\} + (1 - \rho_{n+1})u^{(n-1)},
\]
where
\[
\rho_{n+1} = \Big[1 - \frac{\gamma_{n+1}}{\gamma_n \rho_n}\,
\frac{(W\delta^{(n)}, W\delta^{(n)})}{(W\delta^{(n-1)}, W\delta^{(n-1)})}\Big]^{-1}
\quad (\text{with } \rho_1 = 1),
\]
and the pseudo-residual vector is $\delta^{(n)} = Gu^{(n)} + k - u^{(n)}$. The
conjugate gradient acceleration method minimizes the
$[W^TW(I - G)]^{1/2}$-matrix-norm of the error as compared with any polynomial
acceleration procedure based on the same basic method. If $A$ and $Q$ are
symmetric positive definite matrices and if $W^TW = Q$, then we
minimize the $A^{1/2}$-matrix-norm of the error as in the conjugate gradient method.
It can be shown that the average rate of convergence for the conjugate gradient
method, when measured in the $[W^TW(I - G)]^{1/2}$-matrix-norm, is at least as
large as that for the corresponding Chebyshev acceleration procedure. (See
Hageman and Young [7].) As in Young [30], we denote the $L$-matrix norm by
$\|Q\|_L = \|LQL^{-1}\|_2$.

5 Nonsymmetric Systems
A difficult problem is solving the linear system when the coefficient matrix
$A$ is not necessarily symmetric positive definite or even symmetric. Three
generalized conjugate gradient acceleration methods called ORTHODIR,
ORTHOMIN, and ORTHORES were considered by Young and Jea [40]. It was
shown that under fairly general conditions these methods converge, in exact
arithmetic, in at most $N$ iterations, where $N$ is the order of the matrix. Also
in Jea and Young [9], the biconjugate gradient (BCG) method as well as other
forms of Lanczos methods were considered as generalized conjugate gradient
acceleration methods corresponding to certain double linear systems
involving $A$ and $A^T$.
The generalized minimum residual (GMRES) method [25] is a widely used
method for solving nonsymmetric linear systems. The method is generally very
reliable although stagnation may occur in some cases. Moreover, for
nonsymmetric systems, the amount of work required per iteration usually increases
as the number of iterations increases. Recently, Young working with Chen
and Kincaid developed various generalizations of the GMRES method and
combined them with the Lanczos procedure. New iterative methods called
GGMRES, MGMRES, and LAN/MGMRES have been established [2, 3, 4, 5].
The GGMRES method is a slight generalization of the GMRES method. The
GMRES method minimizes a norm of the residual and GGMRES minimizes
a more general norm; both involve a minimization condition. Alternatively,
the MGMRES method is a modification of the GMRES method applied to
a symmetric indefinite linear system using a Galerkin condition. This latter
method is related to the BCG method and to other variants of the Lanczos
method. The LAN/MGMRES method aims at combining the reliability of the
GMRES method with the reduced work of a Lanczos-type method. In
initial numerical experiments on nonsymmetric linear systems arising
from convection-diffusion problems, it was found that LAN/MGMRES was
comparable with a number of other methods that are extensively used.

6 Software for Iterative Methods


Under the direction of Kincaid and Young, several research-oriented software
packages were written as part of the ITPACK Project at the Center for Nu-
merical Analysis. Beginning in the mid-1970s, there was an increased effort
to develop iterative algorithms and portable public domain software. Software
packages, such as the ITPACK 2C package, were developed, which included
automatic procedures for handling choices that were causing difficulties for
users of iterative methods. Automatic procedures were developed for
determining all necessary iteration parameters and for accurate and realistic
stopping tests for iterative algorithms. Algorithms based on these procedures
were described in the book by Hageman and Young [7]. Also, software from
the ITPACK 2C package was modified and incorporated into the ELLPACK
package at Purdue University for solving elliptic partial differential equations.
(See Rice and Boisvert [24].)
While the ITPACK 2C package was intended primarily for linear systems
where the coefficient matrix is symmetric positive definite or nearly so, other
packages such as NSPCG [23] and PCG [10] were developed with the capability
of handling nonsymmetric systems. Other ITPACK Project software includes
ITPACKV 2D [17], ITPACK 3A [45] and ITPACK 3B [22], for example. See
Kincaid and Young [19] for a review of the ITPACK software packages.

7 Alternating-Type Iterative Methods


To construct an alternating-type method for solving $Au = b$, we choose
matrices $H$, $V$, and $E$ such that $A = H + V$, where $E$ is a diagonal matrix with
positive diagonal elements. For any linear system of the form
$(H + \rho E)v = w$ or $(V + \rho E)v = w$, we assume that $H + \rho E$ and
$V + \rho E$ are nonsingular matrices for any positive real number $\rho$ and
that for any vector $w$ one can easily solve for $v$. To define an
alternating-type iterative method, we choose positive numbers $\rho$ and $\rho'$
and, for a given $u^{(n)}$, we determine $u^{(n+1/2)}$ and $u^{(n+1)}$ by
\[
(H + \rho E)u^{(n+1/2)} = b - (V - \rho E)u^{(n)}, \qquad
(V + \rho' E)u^{(n+1)} = b - (H - \rho' E)u^{(n+1/2)}.
\]
Thus, we have $u^{(n+1)} = T_{\rho,\rho'}u^{(n)} + k_{\rho,\rho'}$ where
\[
T_{\rho,\rho'} = (V + \rho' E)^{-1}(H - \rho' E)(H + \rho E)^{-1}(V - \rho E)
= I - (\rho + \rho')(V + \rho' E)^{-1}E(H + \rho E)^{-1}(H + V)
\]
and
\[
k_{\rho,\rho'} = (\rho + \rho')(V + \rho' E)^{-1}E(H + \rho E)^{-1}b
= (I - T_{\rho,\rho'})A^{-1}b.
\]
Examples of alternating-type methods are the alternating-direction implicit
(ADI) method, the symmetric successive overrelaxation (SSOR) method, and the
unsymmetric successive overrelaxation (USSOR) method. With the ADI method, $H$
and $V$ are either tridiagonal or are permutationally similar to tridiagonal
matrices and $E = I$. With the SSOR and USSOR methods, $H$ and $V$ are lower
triangular and upper triangular matrices, respectively, and $E$ is a diagonal
matrix with positive diagonal elements.
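One step of such a method can be sketched in Python for a triangular splitting. The example is an illustration under stated assumptions: it takes $A = \mathrm{tridiag}(-1, 2, -1)$ of order 4, splits it as $A = H + V$ with $H$ lower and $V$ upper triangular (each carrying half of the diagonal), sets $E$ to the diagonal of $A$, and uses $\rho = \rho' = 1/2$, which for this splitting corresponds to a forward followed by a backward Gauss-Seidel sweep. All names and data are hypothetical.

```python
def lower_solve(L, b):
    """Forward substitution for a lower triangular matrix L."""
    x = [0.0] * len(b)
    for i in range(len(b)):
        x[i] = (b[i] - sum(L[i][j] * x[j] for j in range(i))) / L[i][i]
    return x


def upper_solve(U, b):
    """Back substitution for an upper triangular matrix U."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x


def shifted(X, e, c):
    """Return X + c * diag(e)."""
    return [[X[i][j] + (c * e[i] if i == j else 0.0) for j in range(len(X))]
            for i in range(len(X))]


def matvec(X, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in X]


def alternating_step(H, V, e, b, u, rho, rho_p):
    # (H + rho E) u_half = b - (V - rho E) u
    rhs = [bi - wi for bi, wi in zip(b, matvec(shifted(V, e, -rho), u))]
    u_half = lower_solve(shifted(H, e, rho), rhs)
    # (V + rho' E) u_new = b - (H - rho' E) u_half
    rhs = [bi - wi for bi, wi in zip(b, matvec(shifted(H, e, -rho_p), u_half))]
    return upper_solve(shifted(V, e, rho_p), rhs)


# A = tridiag(-1, 2, -1) of order 4, split as A = H + V with
# H = D/2 + (strictly lower part), V = D/2 + (strictly upper part), E = D.
n = 4
H = [[1.0 if i == j else (-1.0 if j == i - 1 else 0.0) for j in range(n)]
     for i in range(n)]
V = [[1.0 if i == j else (-1.0 if j == i + 1 else 0.0) for j in range(n)]
     for i in range(n)]
e = [2.0] * n
b = [1.0, 0.0, 0.0, 1.0]            # chosen so that the solution is (1, 1, 1, 1)

u = [0.0] * n
for _ in range(200):
    u = alternating_step(H, V, e, b, u, 0.5, 0.5)
print(u)
```

With tridiagonal $H$ and $V$ and $E = I$, the two triangular solves would be replaced by tridiagonal solves, which gives the ADI variant.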
In certain cases, the ADI method converges rapidly. For example, with
problems involving Poisson's equation in the rectangle with a grid of mesh
size $h$, the ADI method converges in $n = O(\log h^{-1})$ iterations using the
optimum number of parameters and in $n = O(h^{-1/m})$ iterations using the best
$m$ parameters. Recall that $n = O(h^{-1})$ for the SOR method. The commutative
case is when $HV = VH$, $HE = EH$, $VE = EV$. It holds for certain separable
self-adjoint elliptic partial differential equations defined over rectangles. Given
the commutativity condition and also bounds on the eigenvalues of $H$ and
$V$, necessary and sufficient convergence conditions related to choosing ADI
parameters can be found in Birkhoff, Varga, and Young [1] and in Chapter 17
of Young [30]. Also see Young and Wheeler [48].
With a nonstationary alternating-type iterative method, the parameters $\rho$
and $\rho'$ may vary from iteration to iteration. We seek to determine the
parameters $\{\rho_i\}$ and $\{\rho_i'\}$ so that $u^{(m)}$ is as close to the
true solution $u = A^{-1}b$ as possible. In practice, we seek to make the
spectral radius $S\big(\prod_{i=1}^{m} T_{\rho_i,\rho_i'}\big)$ as small
as possible. As an alternative to the (sequential) nonstationary method, we
consider the parallel alternating-type iterative method. See papers by Young
and Kincaid [43, 44].

8 Books

The classical research monograph Iterative Solution of Large Linear
Systems [30] by Professor Young was published in 1971 and reprinted in 2003.
The book Applied Iterative Methods [7] by David M. Young in collaboration
with Louis A. Hageman appeared in 1981 and was reprinted in 2004.
We should also mention A Survey of Numerical Mathematics, in two volumes
(1972 and 1973), written by David M. Young and Robert T. Gregory [36, 37].

References
1. Birkhoff, G., Varga, R. S., Young, D. M.: Alternating direction implicit methods,
in F. Alt, M. Rubinoff, editors, Advances in Computers, Academic Press, New
York, 189-273, 1962.
2. Chen, J-Y.: Iterative solutions of large sparse nonsymmetric linear systems,
Report CNA-285, Center for Numerical Analysis, University of Texas at Austin,
January 1997.
3. Chen, J-Y., Kincaid, D. R., Young, D. M.: MGMRES iterative method, in J.
Wang, M. B. Allen III, B. M. Chen, T. Mathew, editors, Iterative Methods in
Scientific Computation, IMACS, New Brunswick, 15-20, 1998.
4. Chen, J-Y., Kincaid, D. R., Young, D. M.: GGMRES iterative method, in J.
Wang, M. B. Allen III, B. M. Chen, T. Mathew, editors, Iterative Methods in
Scientific Computation, IMACS, New Brunswick, 21-26, 1998.
5. Chen, J-Y., Kincaid, D. R., Young, D. M.: Generalization and modifications of
the GMRES iterative method, Numerical Algorithms, 21:119-146, 1999.
6. Golub, G. H., Varga, R. S.: Chebyshev semi-iterative methods, successive
overrelaxation iterative methods, and second-degree Richardson iterative methods,
Parts I & II, Numer. Math., 3:147-168, 1961.
7. Hageman, L. A., Young, D. M.: Applied Iterative Methods, Academic Press,
New York, 1981. (Reprinted by Dover, New York, 2004.)
8. Jea, K. C.: Generalized conjugate gradient acceleration of iterative methods,
Report CNA-176, Center for Numerical Analysis, University of Texas at Austin,
February 1982.
9. Jea, K. C., Young, D. M.: On the simplification of generalized conjugate-gradient
methods for nonsymmetrizable linear systems, Linear Algebra and Its Applica-
tions, 52/53:399-417, 1983.
10. Joubert, W. D.: PCG examples manual: A package for the iterative solution of
large sparse linear systems on parallel computers, Report CNA-284, Center for
Numerical Analysis, University of Texas at Austin, July 1996.
11. Kincaid, D. R.: An analysis of a class of norms of iterative methods for systems
of linear equations, Ph.D. thesis, University of Texas at Austin, 1971. Also,
Report CNA-18, Center for Numerical Analysis, University of Texas at Austin,
May 1971.
12. Kincaid, D. R.: Norms of the successive overrelaxation method, Math. Comp.,
26(118):345-357, 1972.
13. Kincaid, D. R.: A class of norms of iterative methods for solving systems of
linear equations, Numer. Math., 20:392-408, 1973.
14. Kincaid, D. R.: On complex second-degree iterative methods, SIAM J. Numer.
Anal., 11(2):211-218, 1974.
15. Kincaid, D. R.: Stationary second-degree iterative methods, Applied Numerical
Mathematics, 16:227-237, 1994.
16. Kincaid, D. R., Hayes, L. J., editors: Iterative Methods for Large Linear Systems,
Academic Press, New York, 1990.
17. Kincaid, D. R., Oppe, T. C., Young, D. M.: ITPACKV 2D User's Guide, Report
CNA-232, Center for Numerical Analysis, University of Texas at Austin, May
1986.
18. Kincaid, D. R., Respess, J. R., Young, D. M., Grimes, R. G.: ITPACK 2C:
A FORTRAN Package for Solving Large Sparse Linear Systems by Adaptive
Accelerated Iterative Methods, ACM Trans. Math. Software 8, 1982.
19. Kincaid, D. R., Young, D. M.: A brief review of the ITPACK project, Journal
of Computational and Applied Mathematics, 24:121-127, 1988.
20. Kincaid, D. R., Young, D. M.: Linear stationary second-degree methods for
solution of large linear systems, in Th. M. Rassias, H. M. Srivastava, A. Yanushauskas,
editors, Topics in Polynomials of One and Several Variables and Their
Applications, World Scientific Publishing Co., River Edge, NJ, 609-629, 1993.
21. Kincaid, D. R., Young, D. M.: Note on parallel alternating-type iterative
methods, in S. D. Margenov, P. S. Vassilevski, editors, Iterative Methods in Linear
Algebra II, IMACS, New Brunswick, 131-139, 1996.
22. Mai, T-Z., Young, D. M.: ITPACK 3B user's guide (preliminary version), Re-
port CNA-201, Center for Numerical Analysis, University of Texas at Austin,
January 1986.
23. Oppe, T. C., Joubert, W. D., Kincaid, D. R.: NSPCG user's guide, version 1.0:
A package for solving large sparse linear systems by various iterative methods,
Report CNA-216, University of Texas at Austin, Center for Numerical Analysis,
April 1988.
24. Rice, J. R., Boisvert, R.: Solving Elliptic Problems Using ELLPACK,
Springer-Verlag, New York, 1985.
25. Saad, Y., Schultz, M. H.: GMRES: A generalized minimal residual algorithm for
solving nonsymmetric linear systems, SIAM J. Scientific and Statistical Com-
puting, 7:856-869, 1986.
26. Varga, R. S.: Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, New
Jersey, 1962.
27. Young, D. M.: Iterative Methods for Solving Partial Difference Equations of
Elliptic Type, Ph.D. thesis, Harvard University, Mathematics Department,
Cambridge, MA, May 1950.
28. Young, D. M.: Iterative methods for solving partial difference equations of el-
liptic type, Trans. Amer. Math. Soc., 76:92-111, 1954.
29. Young, D. M.: Convergence properties of the symmetric and unsymmetric suc-
cessive overrelaxation methods and related methods, Math. Comp., 24(112):793-
807, 1970.
30. Young, D. M.: Iterative Solution of Large Linear Systems, Academic Press, New
York, 1971. (Reprinted by Dover, New York, 2003.)
31. Young, D. M.: Second-degree iterative methods for the solution of large linear
systems, J. Approximation Theory, 5:137-148, 1972.
32. Young, D. M.: On the accelerated SSOR method for solving large linear systems,
Advances in Mathematics, 23(3):215-271, 1977.
33. Young, D. M.: A historical review of iterative methods, in A History of Scientific
Computation, S. G. Nash, editor, Addison-Wesley, Reading, MA, 180-194, 1989.
Also in Report CNA-206, Center for Numerical Analysis, University of Texas
at Austin, February 1987.
34. Young, D. M., Downing, J. A., Wheeler, M. F.: On the use of the modified suc-
cessive overrelaxation method with several relaxation factors, in W. A. Kalenich,
editor, Proceedings of IFIP 65, Spartan Books, Washington, D.C., 1965.
35. Young, D. M., Eidson, H. D.: On the determination of the optimum relaxation
factor for the SOR method when the eigenvalues of the Jacobi method are
complex, Report CNA-1, Center for Numerical Analysis, University of Texas at
Austin, September 1970.
36. Young, D. M., Gregory, R. T.: A Survey of Numerical Mathematics, Volume 1,
Addison-Wesley, New York, 1972. (Reprinted by Dover, New York, 1988.)
37. Young, D. M., Gregory, R. T.: A Survey of Numerical Mathematics, Volume 2,
Addison-Wesley, New York, 1973. (Reprinted by Dover, New York, 1988.)
38. Young, D. M., Hayes, L. J.: The accelerated SSOR method for solving large
linear systems, Report CNA-123, Center for Numerical Analysis, University of
Texas at Austin, May 1977.
39. Young, D. M., Huang, R.: Some notes on complex successive overrelaxation,
Report CNA-185, Center for Numerical Analysis, University of Texas at Austin,
July 1983.
40. Young, D. M., Jea, K. C.: Generalized conjugate gradient acceleration of non-
symmetrizable iterative methods, Linear Algebra and Its Applications, 34:159-
194, 1980.
41. Young, D. M., Jea, K. C., Mai, T-Z.: Preconditioned conjugate gradient algo-
rithms and software for solving large sparse linear systems, Report CNA-207,
Center for Numerical Analysis, University of Texas at Austin, March 1987.
42. Young, D. M., Kincaid, D. R.: Norms of the successive overrelaxation method
and related methods, Report TNN-94, Computation Center, University of Texas
at Austin, September 1969.
43. Young, D. M., Kincaid, D. R.: Parallel implementation of a class of nonstation-
ary alternating-type methods, in D. Bainov, V. Covachev, editors, Proceedings
of the Third International Colloquium on Numerical Analysis, VSP, Utrecht,
The Netherlands, 219-222, 1995.
44. Young, D. M., Kincaid, D. R.: A new class of parallel alternating-type iterative
methods, Journal of Computational and Applied Mathematics, 74:331-344, 1996.
45. Young, D. M., Mai, T-Z.: ITPACK 3A User's Guide (Preliminary Version),
Report CNA-197, Center for Numerical Analysis, University of Texas at Austin,
1984.
46. Young, D. M., Mai, T-Z.: Iterative algorithms and software for solving
large sparse linear systems, Communications in Applied Numerical Methods,
4(3):435-456, 1987. (Also, Report CNA-215, Center for Numerical Analysis,
University of Texas at Austin, 1987.)
47. Young, D. M., Mai, T-Z.: The search for omega, in D. Kincaid, L. Hayes, editors,
[16], 293-311.
48. Young, D. M., Wheeler, M. F.: Alternating direction methods for solving partial
difference equations, in W. F. Ames, editor, Nonlinear Problems of Engineering,
Academic Press, New York, 1964.
On the Relational Database Style Parallel
Numerical Programming *

Bela Kiss 1 and Anna Krebsz 2

1 Department of Mathematics, Széchenyi István University, H-9026 Győr, Egyetem
tér 1., Hungary, E-mail: [email protected]
2 Department of Numerical Analysis, Eötvös Loránd University, H-1117 Budapest,
Pázmány Péter sétány 1/C, Hungary, E-mail: [email protected]

Summary. A simplified relational database model implementation for parallel
numerical computation is presented. The efficiency of this data model is demonstrated
through the example of an optimally parallelized H-matrix (hierarchical-matrix)
multiplier and its application as a 3-dimensional Laplace preconditioning matrix.

1 Introduction

Numerical algorithms have become quite complex and require dynamic
data structures. The general practice is to use a mixed data structure for this
purpose [5]. In this paper we present a simplified version of the well-known
relational database model, developed for parallel numerical computation.
The relational database model was introduced by E. F. Codd [3, 6]. The great
advantage of this model is its flexible, simple and homogeneous data structure.
Moreover, the application of special geometric search trees (oct-tree, ADT,
etc.) [4, 6, 13] can be avoided in this way. The simplified database model is
implemented in an object-oriented manner in the programming language
C++.
The non-parallel version of this simplified relational database model for
non-structural mesh generation is investigated in [11]. This paper demonstrates
the efficiency of the parallel version in the case of an optimally parallelized
H-matrix (hierarchical-matrix) multiplier. The applied H-matrix technique
originates from W. Hackbusch and B. N. Khoromskij [8, 9, 2]. A great
advantage of the presented parallel H-matrix multiplier is that its implementation
is independent of the geometrical complexity of the domains.
Our paper is organized as follows. Section 2 contains the description of
the simplified relational database model. The parallelized H-matrix multiplier
is discussed in Section 3. Section 4 is devoted to its application as a Laplace
preconditioning matrix on a very complex 3-dimensional domain.

* This work was partly supported by the Grants T043258 and T042826 of the
Hungarian National Research Fund.

2 Simplified Relational Database Model

In the classical relational database model the data files are considered
as tables, where the number of columns is fixed and the number of
rows may vary. Each column contains elementary data of the same data
type and can be referred to by a symbolic name. These symbolic names are
called field identifiers. Each row (record) is identified by its row (record)
number and a part of a row (record) by its field identifier.
Numerical problems usually require only one ordering on one data
table. Hence it is worthwhile to use combined data and index tables here, which
store the records in the appropriate key order. This idea is implemented as a
RelDataBase class with the following member functions.

Table 1. Member Functions of the Class RelDataBase

Member Function                              Description
Create(int TableId, int RecSize,
       KeyDesc* pKeyDesc);                   Creates a data table.
Erase(int TableId);                          Erases a data table.
Insert(int TableId, char* pRecord);          Inserts a record.
Update(int TableId, char* pRecord);          Updates a record.
Delete(int TableId, char* pRecord);          Deletes a record.
DeleteAll(int TableId, char* pRecord);       Deletes all records.
GetFirst(int TableId, char* pRecord);        Gets the first.
GetNext(int TableId, char* pRecord);         Gets the next.
GetLast(int TableId, char* pRecord);         Gets the last.
GetPrior(int TableId, char* pRecord);        Gets the previous.
GetCurrent(int TableId, char* pRecord);      Gets the current.
SeekLessEq(int TableId, char* pRecord);      Seeks for less equal.
SeekGreatEq(int TableId, char* pRecord);     Seeks for greater equal.

For the sake of flexibility the records are passed by reference, and the record
pointers are cast to character pointers. The third parameter pKeyDesc of the
member function Create points to a key descriptor structure that identifies
the key segments (data type, beginning position).

struct KeyDesc
{
char KeyFieldType[KEY_SEGMENT_MAX];
char KeyFieldBegPos[KEY_SEGMENT_MAX];
};
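To illustrate how such a key descriptor might drive the ordering of records, the following minimal sketch compares two records segment by segment. The type codes KEY_INT and KEY_DOUBLE, the terminator KEY_END, the value of KEY_SEGMENT_MAX, the record RExample and the function CompareRecords are all assumptions made for this illustration; the paper does not specify them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical constants: their values are not given in the paper.
const int  KEY_SEGMENT_MAX = 8;
const char KEY_END    = 0;    // marks the end of the segment list
const char KEY_INT    = 'i';  // segment holds an int
const char KEY_DOUBLE = 'd';  // segment holds a double

struct KeyDesc
{
    char KeyFieldType[KEY_SEGMENT_MAX];    // data type of each key segment
    char KeyFieldBegPos[KEY_SEGMENT_MAX];  // byte offset of each key segment
};

// Three-way comparison of one key segment read from the raw record bytes.
template <typename T>
int CompareSegment(const char* pA, const char* pB, int nPos)
{
    T a, b;
    std::memcpy(&a, pA + nPos, sizeof(T));  // memcpy avoids unaligned reads
    std::memcpy(&b, pB + nPos, sizeof(T));
    return (a < b) ? -1 : (a > b) ? 1 : 0;
}

// Lexicographic comparison of two records over all key segments.
int CompareRecords(const KeyDesc& Desc, const char* pA, const char* pB)
{
    for (int s = 0; s < KEY_SEGMENT_MAX && Desc.KeyFieldType[s] != KEY_END; ++s)
    {
        const int nPos = Desc.KeyFieldBegPos[s];
        int nCmp = 0;
        if (Desc.KeyFieldType[s] == KEY_INT)
            nCmp = CompareSegment<int>(pA, pB, nPos);
        else if (Desc.KeyFieldType[s] == KEY_DOUBLE)
            nCmp = CompareSegment<double>(pA, pB, nPos);
        if (nCmp != 0) return nCmp;
    }
    return 0;  // all key segments are equal
}

// Example record with the key (nSubDomId, nLevel); dValue is not part of the key.
struct RExample { int nSubDomId; int nLevel; double dValue; };

// Builds the key descriptor of RExample.
KeyDesc MakeExampleKey()
{
    KeyDesc Desc = {};  // zero-initialised, so unused segments equal KEY_END
    Desc.KeyFieldType[0]   = KEY_INT;
    Desc.KeyFieldBegPos[0] = static_cast<char>(offsetof(RExample, nSubDomId));
    Desc.KeyFieldType[1]   = KEY_INT;
    Desc.KeyFieldBegPos[1] = static_cast<char>(offsetof(RExample, nLevel));
    return Desc;
}
```

Because the comparison works on raw bytes and a descriptor, one tree implementation can serve every table, which is presumably why the record pointers are handled as character pointers.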

The integrated data and index tables are realized by the well-known
red-black tree and B-tree balanced tree structures [4, 6]. Both tree structures are
asymptotically optimal and widely used. Each of the tree operations Insert,
Delete, Update and Seek requires at most O(log(n)) arithmetic operations,
where n denotes the number of tree nodes. However, the cost per record
of a sequential read in index order is only O(1) in practice.
We have applied the distributed database technique, using a local simplified
relational database on each process together with the message-passing
standard MPI [7].
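The idea of a combined data and index table can be sketched with std::map, which is typically implemented as a red-black tree. The class MiniTable below and its reduced, typed interface are hypothetical; it mirrors only a few of the member functions of Table 1 and is not the RelDataBase class of the paper.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical, simplified stand-in for one combined data and index table.
// The records are stored in key order inside a balanced search tree.
template <typename Key, typename Record>
class MiniTable
{
public:
    // Inserts a record; returns false if the key already exists. O(log n).
    bool Insert(const Key& key, const Record& rec)
    {
        std::pair<typename std::map<Key, Record>::iterator, bool> res =
            m_Tree.insert(std::make_pair(key, rec));
        if (res.second) m_Cur = res.first;
        return res.second;
    }
    // Positions the cursor on the first record in key order.
    bool GetFirst(Record& rec)
    {
        m_Cur = m_Tree.begin();
        return Fetch(rec);
    }
    // Moves the cursor to the next record in key order.
    bool GetNext(Record& rec)
    {
        if (m_Cur != m_Tree.end()) ++m_Cur;
        return Fetch(rec);
    }
    // Positions the cursor on the first record whose key is >= key. O(log n).
    bool SeekGreatEq(const Key& key, Record& rec)
    {
        m_Cur = m_Tree.lower_bound(key);
        return Fetch(rec);
    }

private:
    // Copies the record under the cursor; returns false at the end of the table.
    bool Fetch(Record& rec)
    {
        if (m_Cur == m_Tree.end()) return false;
        rec = m_Cur->second;
        return true;
    }
    std::map<Key, Record> m_Tree;
    typename std::map<Key, Record>::iterator m_Cur = m_Tree.end();
};
```

A seek followed by repeated GetNext calls walks a key range in order, which is the access pattern that replaces a dedicated geometric search tree when the key contains the level and the block coordinates.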

3 Parallel H-Matrix Multiplier

Let $\Omega \subset \mathbb{R}^3$ be a given domain endowed with a regular,
quasi-uniform tetrahedral mesh. Furthermore let $Q = \times_{i=1}^{3} [a_i, b_i]$ be a
suitably chosen rectangular box which contains $\Omega$. The coordinates of the minimal and
maximal nodes of Q are denoted by dRectMin and dRectMax, respectively.
We consider the H-matrix partitioning of this fictitious domain Q. Two main
levels of the partitioning are introduced. The first one, nLevelSubDom, is
the level of subdomains. This level determines the size of the largest H-matrix
blocks. The second one, nLevelMesh, is the level of the quasi-uniform mesh
on $\Omega$.

3.1 H-Matrix Partitioning

We have applied the following H-matrix partitioning algorithm.

//--------------------------------------------------------------------
#include <cmath>
// Helper routines used below; they are assumed to be defined elsewhere.
double dBlockCentrDistance(const int nRowBlock[3], const int nColBlock[3]);
void HMtxBlockWrite(int nLevelSubDom, int nLevelMesh,
                    double dRectMin[3], double dRectMax[3],
                    int nLevel, int nRowBlock[3], int nColBlock[3]);
void HSubMtxGen(int nLevelSubDom, int nLevelMesh,
                double dRectMin[3], double dRectMax[3],
                int nLevel, int nRowBlock[3], int nColBlock[3]);
//--------------------------------------------------------------------
// Starts the recursive partitioning from the root block (level 0).
void HMtxGen(int nLevelSubDom, int nLevelMesh,
             double dRectMin[3], double dRectMax[3])
{ int nLV, nLevel, nRowBlock[3], nColBlock[3];
  nLevel = 0;
  for(nLV = 0; nLV < 3; nLV++) { nRowBlock[nLV] = 0; nColBlock[nLV] = 0; }
  HSubMtxGen(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
             nLevel, nRowBlock, nColBlock);
}
//--------------------------------------------------------------------
// Writes the block on the mesh level; otherwise splits the row and column
// blocks into 8 sub-blocks each and treats all 64 sub-block pairs.
void HSubMtxGen(int nLevelSubDom, int nLevelMesh,
                double dRectMin[3], double dRectMax[3],
                int nLevel, int nRowBlock[3], int nColBlock[3])
{ int nLVi, nLVj, nLVk, nNextLevel, nRowSubBlocks[8][3], nColSubBlocks[8][3];
  if(nLevel == nLevelMesh)
    HMtxBlockWrite(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
                   nLevel, nRowBlock, nColBlock);
  else
  { nNextLevel = nLevel+1;
    for(nLVi = 0; nLVi < 2; nLVi++)
      for(nLVj = 0; nLVj < 2; nLVj++)
        for(nLVk = 0; nLVk < 2; nLVk++)
        { nRowSubBlocks[4*nLVi+2*nLVj+nLVk][0] = 2*nRowBlock[0]+nLVi;
          nRowSubBlocks[4*nLVi+2*nLVj+nLVk][1] = 2*nRowBlock[1]+nLVj;
          nRowSubBlocks[4*nLVi+2*nLVj+nLVk][2] = 2*nRowBlock[2]+nLVk;
          nColSubBlocks[4*nLVi+2*nLVj+nLVk][0] = 2*nColBlock[0]+nLVi;
          nColSubBlocks[4*nLVi+2*nLVj+nLVk][1] = 2*nColBlock[1]+nLVj;
          nColSubBlocks[4*nLVi+2*nLVj+nLVk][2] = 2*nColBlock[2]+nLVk; }
    for(nLVi = 0; nLVi < 8; nLVi++)
    { for(nLVj = 0; nLVj < 8; nLVj++)
      { // Pairs above the subdomain level and near pairs are subdivided further.
        if((nLevel < nLevelSubDom) ||
           (dBlockCentrDistance(nRowSubBlocks[nLVi],
                                nColSubBlocks[nLVj]) <= (sqrt(3.01)/2.0)))
          HSubMtxGen(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
                     nNextLevel, nRowSubBlocks[nLVi], nColSubBlocks[nLVj]);
        else
          HMtxBlockWrite(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
                         nNextLevel, nRowSubBlocks[nLVi], nColSubBlocks[nLVj]); } }
  }
}
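The listing does not show dBlockCentrDistance. One reading that is consistent with the threshold sqrt(3.01)/2 is that the distance between the centres of two sub-blocks is measured in units of the parent block edge, so that exactly the touching sub-blocks (including diagonal neighbours) count as near. The sketch below implements this reading; both the function body and the helper bIsNear are assumptions, not the authors' code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical reading of dBlockCentrDistance: the distance between the
// centres of two sub-blocks, measured in units of the parent block edge.
// The sub-block coordinates live on the refined level, hence the factor 0.5.
double dBlockCentrDistance(const int nRowSubBlock[3], const int nColSubBlock[3])
{
    double dSum = 0.0;
    for (int nLV = 0; nLV < 3; nLV++)
    {
        const double dDiff = 0.5 * (nRowSubBlock[nLV] - nColSubBlock[nLV]);
        dSum += dDiff * dDiff;
    }
    return std::sqrt(dSum);
}

// True if the block pair is "near" and therefore has to be subdivided further.
bool bIsNear(const int nRowSubBlock[3], const int nColSubBlock[3])
{
    return dBlockCentrDistance(nRowSubBlock, nColSubBlock) <= std::sqrt(3.01) / 2.0;
}
```

Under this reading a diagonal neighbour has distance sqrt(3)/2, and the value 3.01 instead of 3 keeps that case on the near side despite rounding errors.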

3.2 H-Matrix-Vector Multiplication

Two tables are introduced for the matrix-vector computation by H-submatrices.
The table THMtx contains the descriptions of the H-submatrix blocks.
If the row and column blocks of a submatrix block are not too close to each
other, then this table is used for the submatrix-vector product computation
as well. In this implicit case the appropriate variables are determined from the
row and column blocks by using the table TSimp of tetrahedra. A tetrahedron
belongs to an H-submatrix row or column block if its center of gravity belongs
to the given block.
If a block is small and the row and column blocks are close to each other,
then the submatrix block is given explicitly. In this case the submatrix-vector
product is computed by the application of the table TExplHMtxBlocks.
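The membership rule for tetrahedra can be sketched as follows. The function SimpBlockCoords and its argument layout are hypothetical; the sketch assumes that on division level nLevel each edge of the box [dRectMin, dRectMax] is split into 2^nLevel equal parts, which matches the doubling of block coordinates in the partitioning algorithm.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: block coordinates, on division level nLevel, of the
// block of the box [dRectMin, dRectMax] that contains the centre of gravity
// of a tetrahedron given by its four vertices.
void SimpBlockCoords(const double dNodes[4][3], int nLevel,
                     const double dRectMin[3], const double dRectMax[3],
                     int nBlockCoords[3])
{
    const int nBlocks = 1 << nLevel;  // number of blocks per coordinate direction
    for (int nLV = 0; nLV < 3; nLV++)
    {
        // Centre of gravity: mean of the four vertex coordinates.
        const double dCentre = 0.25 * (dNodes[0][nLV] + dNodes[1][nLV] +
                                       dNodes[2][nLV] + dNodes[3][nLV]);
        const double dRel = (dCentre - dRectMin[nLV]) /
                            (dRectMax[nLV] - dRectMin[nLV]);
        int nCoord = static_cast<int>(std::floor(dRel * nBlocks));
        // Clamp so that points on the upper face stay inside the last block.
        if (nCoord < 0) nCoord = 0;
        if (nCoord > nBlocks - 1) nCoord = nBlocks - 1;
        nBlockCoords[nLV] = nCoord;
    }
}
```

Storing these coordinates in the key of TSimp lets all tetrahedra of a block be read with one seek and a sequential scan.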
The record structures of the tables THMtx, TExplHMtxBlocks and TSimp
are the following.

//----------------------------------------------------------------------
// Table:   THMtx
// Content: Geometric search table for H-submatrix blocks.
// Key:     nCSubDomId+nRSubDomId+nSubMtxType+nSubMtxId
//          +nLevel+nColBlockCoords+nRowBlockCoords
struct RHMtx
{
int nRowBlockSubDomId;   // Subdomain identifiers of the row
int nColBlockSubDomId;   // and column blocks.
int nNeighbType;         // Neighb. type of the subdomains (NBT_IDENT, NBT_NEAR, NBT_FAR).
int nSubMtxType;         // Submatrix type (BLOCK_EXPL, BLOCK_IMPL).
int nExplHMtxBlockId;    // Submatrix block id. in the explicit case.
int nLevel;              // Division level.
int nRowBlockCoords[3];  // The block coordinates of the
int nColBlockCoords[3];  // row and column blocks.
};
//----------------------------------------------------------------------
// Table:   TExplHMtxBlocks
// Content: Explicitly given H-submatrix blocks.
// Key:     nExplHMtxBlockId+nRVarId+nCVarId
struct RExplHSubMtx
{
int nExplHMtxBlockId;    // Explicit submatrix block identifier.
int nRVarId;             // Row variable identifier.
int nCVarId;             // Column variable identifier.
double dValue;           // Value.
};
//----------------------------------------------------------------------
// Table:   TSimp
// Content: Geometric search table for tetrahedra.
// Key:     nSubDomId+nLevel+nBlockCoords+nSimpId
struct RSimp
{
int nSimpId;             // Tetrahedron identifier.
int nSubDomId;           // Subdomain identifier.
int nLevel;              // Fine mesh level.
int nBlockCoords[3];     // Search block coordinates.
int nLocation;           // Location (LOC_INSIDE, LOC_BOUNDARY).
int nNodeIds[4];         // Node identifiers.
};

The global matrix-vector product $\underline{y} = H \cdot \underline{x}$ is computed as a sum of
local matrix-vector products of the form
$\underline{y}_{L(i),R(j)} = H_{L(i),R(j),C(k)} \cdot \underline{x}_{L(i),C(k)}$.
Here $\underline{x}_{L(i),C(k)}$ stands for the subvector of $\underline{x}$ determined by the cluster with
parameters level $L(i)$ and column block coordinate $C(k)$. The vector $\underline{y}_{L(i),R(j)}$
is defined similarly. Only uniform H-matrices will be considered. In this case
the order of the approximation is the same on each cluster determined by
H-submatrix blocks. Then the implicitly given H-submatrix blocks can be
expressed as $H_{L(i),R(j),C(k)} = \sum_{l=1}^{M} \sum_{m=1}^{M} \underline{a}_l \, \underline{b}_{l,m}^T$
and the matrix-vector products
$\underline{y}_{L(i),R(j)} = H_{L(i),R(j),C(k)} \cdot \underline{x}_{L(i),C(k)}$ are computed as

$$ \underline{y}_{L(i),R(j)} = \sum_{l=1}^{M} c_l \cdot \underline{a}_l = \sum_{l=1}^{M} \underline{a}_l \cdot c_l
   = \sum_{l=1}^{M} \underline{a}_l \cdot \left( \sum_{m=1}^{M} \underline{b}_{l,m}^T \cdot \underline{x}_{L(i),C(k)} \right), $$

where the constant M depends only on the order of the approximation used.
The distribution of the vectors among the processes is determined by the
subdomain level H-matrix partitioning. The values
$c_l = \sum_{m=1}^{M} \underline{b}_{l,m}^T \cdot \underline{x}_{L(i),C(k)}$
are computed on the subdomain which is identified by the H-submatrix column
block. Three types of local matrix-vector product computations
$\underline{y}_{L(i),R(j)} = H_{L(i),R(j),C(k)} \cdot \underline{x}_{L(i),C(k)}$
by H-submatrix blocks are introduced.
The field nNeighbType of the table THMtx serves this purpose. The possible
values of this field are listed in Table 2.
The compact data exchange denotes the exchange of single and grouped
data together with the values $c_l$. In the grouped case we send only the
parameters level $L(i)$ and block coordinate $R(j)$ of the row block cluster. The costs
of the local and global variants of the compact data exchange are $O(\log(n/p))$
and $O(\log(p) \cdot p)$, respectively. Here n is the number of unknowns and p is the
number of processes.
The computation of the global product $\underline{y} = H \cdot \underline{x}$ needs an additional step
to correct the local $\underline{y}_{L(i),R(j)}$ values at the boundaries of neighbouring
subdomains. For this purpose a simple local data exchange is applied, again using the
MPI functions MPI_Send and MPI_Recv. The cost of this correction
step is $O((n/p)^{2/3})$.
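The two-stage evaluation of an implicitly given block can be sketched as follows. The structure LowRankBlock and both functions are hypothetical; in particular, the vectors w[l] are assumed to hold the column-side factors already combined for each index l, so that the coefficient of index l is a single inner product with the source subvector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

// Hypothetical container for one implicitly given block of rank M:
// H = sum_l a[l] * w[l]^T, where w[l] collects the column-side factors
// that belong to the index l.
struct LowRankBlock
{
    std::vector<Vec> a;  // M vectors of the row cluster dimension
    std::vector<Vec> w;  // M vectors of the column cluster dimension
};

// Stage 1 (on the subdomain of the column block): coefficients c_l = w_l^T x.
Vec CompCoeffs(const LowRankBlock& H, const Vec& x)
{
    Vec c(H.w.size(), 0.0);
    for (std::size_t l = 0; l < H.w.size(); ++l)
    {
        assert(H.w[l].size() == x.size());
        for (std::size_t k = 0; k < x.size(); ++k) c[l] += H.w[l][k] * x[k];
    }
    return c;
}

// Stage 2 (on the subdomain of the row block): y = sum_l c_l * a_l.
Vec CompProduct(const LowRankBlock& H, const Vec& c)
{
    assert(H.a.size() == c.size());
    Vec y(H.a.empty() ? 0 : H.a[0].size(), 0.0);
    for (std::size_t l = 0; l < H.a.size(); ++l)
        for (std::size_t r = 0; r < y.size(); ++r) y[r] += c[l] * H.a[l][r];
    return y;
}
```

Only the M coefficients have to travel between the two subdomains, not the source subvector itself, which is what keeps the data exchange compact.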

Table 2. Possible Values of the Field nNeighbType

Value       Description
NBT_IDENT   The subvectors $\underline{x}_{L(i),C(k)}$ and $\underline{y}_{L(i),R(j)}$ belong to the same
            subdomain.
NBT_NEAR    The subvectors $\underline{x}_{L(i),C(k)}$ and $\underline{y}_{L(i),R(j)}$ belong to neighbouring
            subdomains in the geometric sense. Here compact local data exchange is
            applied by using the table THCommNear and the MPI functions
            MPI_Send and MPI_Recv.
NBT_FAR     The subvectors $\underline{x}_{L(i),C(k)}$ and $\underline{y}_{L(i),R(j)}$ belong to
            non-neighbouring subdomains. Here compact global data exchange is
            applied by using the array ArrHCommFar and the MPI function
            MPI_Allreduce.

Since the computational cost of the values $c_l$ and $\sum_{l=1}^{M} c_l \cdot \underline{a}_l$ is
$O(\log(n/p) \cdot n/p)$ on each subdomain, the total computational cost of the global
matrix-vector product $\underline{y} = H \cdot \underline{x}$ is $O(\log(n/p) \cdot n/p + p)$
per subdomain.
The record structures of the table THCommNear and the array ArrHCommFar
are the following.

//----------------------------------------------------------------------
// Table:   THCommNear
// Content: Data exchange among neighbouring subdomains.
// Key:     nSubDomId+nDataType+nVarId+nLevel+nBlockCoords
struct RHCommNear
{
int nSubDomId;           // Subdomain identifier.
int nDataType;           // Data type (DT_SINGLE, DT_GROUPED).
int nVarId;              // Variable id. (single case).
int nLevel;              // Division level (grouped case).
int nBlockCoords[3];     // Block coordinates (grouped case).
double dValues[M];       // Values.
};
//----------------------------------------------------------------------
// Array:   ArrHCommFar
// Content: The array for data exchange among non-neighbouring
//          subdomains.
pArrCommFarArr = new double[nProcNum*M];
// The number of processes multiplied by M.
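The size nProcNum*M of the array ArrHCommFar suggests one slot of M values per process. The following serial sketch illustrates how such a layout could work with a sum reduction: every process fills only its own slot and the element-wise sum delivers all slots to everybody. The functions PackFar and SumReduce are hypothetical; SumReduce only emulates on one machine the role that MPI_Allreduce with the operation MPI_SUM plays in an MPI program, and the slot layout itself is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills the slot of process nProcId with its M coefficients; all other
// slots of the buffer of length nProcNum*M stay zero.
std::vector<double> PackFar(int nProcId, int nProcNum, const std::vector<double>& c)
{
    const std::size_t M = c.size();
    std::vector<double> Buf(static_cast<std::size_t>(nProcNum) * M, 0.0);
    for (std::size_t l = 0; l < M; ++l)
        Buf[static_cast<std::size_t>(nProcId) * M + l] = c[l];
    return Buf;
}

// Serial stand-in for a global sum reduction: returns the element-wise sum
// of the buffers contributed by all processes.
std::vector<double> SumReduce(const std::vector<std::vector<double> >& Buffers)
{
    std::vector<double> Sum(Buffers.empty() ? 0 : Buffers[0].size(), 0.0);
    for (std::size_t p = 0; p < Buffers.size(); ++p)
    {
        assert(Buffers[p].size() == Sum.size());
        for (std::size_t k = 0; k < Sum.size(); ++k) Sum[k] += Buffers[p][k];
    }
    return Sum;
}
```

Since the slots do not overlap, the sum simply concatenates the contributions, and the buffer length grows only with the number of processes, not with the number of unknowns.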

3.3 H-Matrix Multiplier Object


Our H-matrix multiplier object is implemented in the following C++ class
structure form.
//------------------------------------------------------------------
class HMTX {
public:
  HMTX();
  ~HMTX();
  int SetIOPuff(char* pIOPuff, int nIOPuffSize);
  int SetCommParam(int nProcId, int nProcNum);
  int SetDataBase(RelDataBase *pRDBase);  // Rel. database pointer.

  int SetGlobalTables(int tGSubDom,          // Subdomain descriptions.
                      int tGSubDomVar,       // Subdomain-variable
                      int tGVarSubDom,       // connections.
                      int tGNode,            // Nodes.
                      int tGSimp,            // Simplices.
                      int tGHMtx,            // H-matrices.
                      int tGExplHMtxBlocks,  // Explicit H-matrix blocks.
                      int tGVectSo,          // Source vector.
                      int tGVectTa );        // Target vector.
  int SetLocalTables( int tLSubDom,          // Subdomain descriptions.
                      int tLSubDomVar,       // Subdomain-variable connections.
                      int tLHCommNear,       // Near-communication.
                      int tLNode,            // Nodes.
                      int tLSimp,            // Simplices.
                      int tLHMtx,            // H-matrices.
                      int tLExplHMtxBlocks,  // Explicit H-matrix blocks.
                      int tLVectS,           // Auxiliary vector.
                      int tLVectSo,          // Source vector.
                      int tLVectTa );        // Target vector.

  int ScatSubDomInfo( void );      // Scatters tGSubDom.
  int ScatSubDomVar( void );       // Scatters tGSubDomVar and
                                   // tGVarSubDom.
  int ScatNode( void );            // Scatters tGNode.
  int ScatSimp( void );            // Scatters tSimp.
  int ScatHMtx( void );            // Scatters tHMtx.
  int ScatExplHMtxBlocks( void );  // Scatters tExplHMtxBlocks.

  int ScatVect(int tGVect, int tLVect);  // Scatters tGVect.
  int GathVect(int tGVect, int tLVect);  // Gathers tLVect.

  int CompMtxMult(int rLVectTa, int rLVectSo);

private:
  int CorrVectBlocks(int nProcId1, int nProcId2);       // The member functions
  int SendVectBlock(int nProcId1, int nProcId2);        // of the compact local
  int RecvAndAddVectBlock(int nProcId1, int nProcId2);  // data exchange.

  int CorrVects(int nProcId1, int nProcId2);            // The member functions
  int SendVect(int nProcId1, int nProcId2);             // of the simple local
  int RecvAndAddVect(int nProcId1, int nProcId2);       // data exchange.
};

4 Numerical Results
Our parallel H-matrix multiplier implementation has been tested as a Laplace
preconditioning matrix. The test results reported here refer to quasi-uniform
regular tetrahedral triangulations of the very complex water domain $\Omega$ of the
RABA Euro 2 diesel engine. The shape of this domain can be seen in Figure 1.

Fig. 1. The Test Domain

We have tested the parallel solution of the three-dimensional Poisson equation

$$ -\Delta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega $$

by using the preconditioned conjugate gradient (PCG) [5, 10, 12] algorithm.
The Poisson equation has been discretized by the finite element method,
applying piecewise linear finite elements. The optimal preconditioning matrix
P was given in the form $P = V^2$, where the matrix V is the H-matrix
approximation of the bilinear form

$$ V(u, v) = \int_{\Omega} \int_{\Omega} \frac{u(\underline{x}) \cdot v(\underline{y})}{|\underline{x} - \underline{y}|^{2}} \, d\underline{x} \, d\underline{y}, $$

which is an $H^{-1/2}$ norm representation. It is discretized by piecewise linear
finite elements on the level of the given tetrahedral mesh, and by piecewise
bilinear elements on the clusters, which are determined by the H-submatrix
blocks.
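For orientation, the structure of a preconditioned conjugate gradient iteration can be sketched as follows. This is the textbook algorithm, not the authors' parallel implementation; the function names, the relative residual stopping rule and the small one-dimensional test operator are assumptions made for the illustration. The preconditioner enters only through the callback ApplyP, so a diagonal preconditioner and an H-matrix multiplier can be exchanged without touching the iteration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

typedef std::vector<double> Vec;
typedef std::function<Vec(const Vec&)> LinOp;  // maps a vector to a vector

// Euclidean inner product.
double Dot(const Vec& a, const Vec& b)
{
    double dSum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) dSum += a[i] * b[i];
    return dSum;
}

// PCG iteration for A x = b with A symmetric positive definite, started
// from x = 0. ApplyP applies the preconditioner to a residual. The iteration
// stops when |r| <= dEps * |b|; the number of iterations is returned.
int PCG(const LinOp& ApplyA, const LinOp& ApplyP, const Vec& b, Vec& x,
        double dEps, int nMaxIter)
{
    x.assign(b.size(), 0.0);
    Vec r = b;
    Vec z = ApplyP(r);
    Vec p = z;
    double dRho = Dot(r, z);
    const double dNormB = std::sqrt(Dot(b, b));
    int nIter = 0;
    while (nIter < nMaxIter && std::sqrt(Dot(r, r)) > dEps * dNormB)
    {
        const Vec q = ApplyA(p);
        const double dAlpha = dRho / Dot(p, q);
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            x[i] += dAlpha * p[i];
            r[i] -= dAlpha * q[i];
        }
        z = ApplyP(r);
        const double dRhoNew = Dot(r, z);
        const double dBeta = dRhoNew / dRho;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = z[i] + dBeta * p[i];
        dRho = dRhoNew;
        ++nIter;
    }
    return nIter;
}

// Example operator: 1D finite-difference Laplacian, tridiag(-1, 2, -1).
Vec ApplyLaplace1D(const Vec& v)
{
    const std::size_t n = v.size();
    Vec w(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        w[i] = 2.0 * v[i];
        if (i > 0) w[i] -= v[i - 1];
        if (i + 1 < n) w[i] -= v[i + 1];
    }
    return w;
}

// Example preconditioner: inverse of the main diagonal (all entries equal 2).
Vec ApplyJacobi(const Vec& r)
{
    Vec z(r);
    for (std::size_t i = 0; i < z.size(); ++i) z[i] *= 0.5;
    return z;
}
```

A call such as PCG(ApplyLaplace1D, ApplyJacobi, b, x, 1e-10, 100) solves the small model system; the cost of one iteration is dominated by one application of the operator and one application of the preconditioner.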

Table 3. Test results

Nodes    Tetrahedra  Processes  Computers  Prec. matrix  Iterations  CPU time in sec.
20455    100986       1          1         D             17           92
                      1          1         P              5           87
                      8          8         D             17           54
                      8          8         P              5           18
                     64         16         D             17           23
                     64         16         P              5            6
121441   403944       1          1         D             26          246
                      1          1         P              7          224
                      8          8         D             26          117
                      8          8         P              7           59
                     64         16         D             26           64
                     64         16         P              7           12
Our computer cluster consisted of 16 IBM PC compatible machines with
900 MHz Intel processors, using the message-passing standard MPI under the
Linux operating system.
The computational results given in Table 3 correspond to the case when
the right hand side of the Poisson equation is

$$ f(x, y, z) = \sin(xyz) $$

and the error bound is $\varepsilon = 10^{-4}$. The preconditioning matrix D denotes the
inverse of the main diagonal of the matrix of the discretized Laplace operator.
From our results we can conclude that the matrix P is an efficient
preconditioning matrix, but its computation is fast enough only in a parallel
environment.
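This conclusion can be read off Table 3 directly. Taking the CPU time divided by the iteration count as a rough cost per iteration (an assumption, since the table does not separate setup time), the figures for the smaller mesh with 20455 nodes are:

```latex
\begin{align*}
\text{cost per iteration, 1 process:}\quad
  & D:\ \frac{92}{17} \approx 5.4\ \text{s}, \qquad P:\ \frac{87}{5} = 17.4\ \text{s},\\
\text{speedup from 1 to 64 processes:}\quad
  & D:\ \frac{92}{23} = 4.0, \qquad P:\ \frac{87}{6} = 14.5.
\end{align*}
```

On one process the smaller iteration count of P is almost cancelled by its higher cost per iteration (87 s against 92 s), whereas on 64 processes P is nearly four times faster than D (6 s against 23 s).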

Remark 1. The H-matrix approximation of the inverse Laplace matrix [1]
would probably give better convergence results; however, we found that its
implementation is much more complicated.

References

1. Bebendorf, M., Hackbusch, W. (2002): Existence of H-Matrix Approximation to
the Inverse FE-Matrix of Elliptic Operators with $L^\infty$-Coefficients, Max-Planck-
Institute, Leipzig, Preprint no. 21.
2. Börm, S., Grasedyck, L., Hackbusch, W. (2002): Introduction to Hierarchical
Matrices with Applications, Max-Planck-Institute, Leipzig, Preprint no. 18.
3. Codd, E. F. (1970): A Relational Model of Data for Large Shared Data Banks,
Communications of the ACM, 13(6), 377-387.
4. Cormen, T. H., Leiserson, C. E., Rivest, R. L. (1990): Introduction to Algorithms,
MIT Press.
5. Douglas, C., Haase, G., Langer, U. (2003): A Tutorial on Elliptic PDE Solvers
and Their Parallelisation, SIAM.
6. Garcia-Molina, H., Ullman, J. D., Widom, J. (2000): Database System
Implementation, Prentice Hall.
7. Gropp, W., Lusk, E. (2001): Installation and Users Guide to MPICH, a
Portable Implementation of MPI, Technical Report ANL-01/x, Argonne National
Laboratory.
8. Hackbusch, W. (1999): A Sparse Matrix Arithmetic Based on H-matrices, Part I:
Introduction to H-matrices, Computing 62, 89-108.
9. Hackbusch, W., Khoromskij, B. N. (2000): A Sparse Matrix Arithmetic Based on
H-matrices, Part II: Application to Multidimensional Problems, Computing 64,
21-47.
10. Heinrich, V. (1993): Iterative Methods for Linear Systems of Equations,
Hamburger Beiträge zur Angewandten Mathematik, Hamburg.
11. Kiss, B., Krebsz, A. (2002): On the Relational Database Type Numerical
Programming, In: Proc. of the 3rd Int. Conf. on Engineering Comp. Technology (eds.:
Topping, B. H. V., Bittnar, Z.), Civil-Comp Press, 127-128 (Book + CD-ROM).
12. Samarskij, A. A., Nikolajev, E. S. (1989): Numerical Methods for Grid Equations,
vol. 1 and 2, Birkhäuser, Basel.
13. Samet, H. (1990): Application of Spatial Data Structures, Addison-Wesley,
Reading, MA.
A Dynamical System Describing Evolution of
the Implicit Surfaces in Incompressible Viscous
Liquids

Petr Kloucek 1, Michel V. Romerio 2 and Jennifer L. Wightman 1

1 Department of Computational and Applied Mathematics, Rice University, 6100
Main Street, Houston, TX 77005, USA
[email protected] and [email protected]
2 Institut d'Analyse et Calcul Scientifique, École Polytechnique Fédérale de
Lausanne, 1015 Lausanne, Switzerland
romerio@epfl.ch

Summary. We present an equation capable of describing the evolution of implicit
surfaces (boundaries of gas bubbles) in liquids. The equation does not contain any
convective term and thus it is independent of the velocity and pressure fields, given
by e.g. Navier-Stokes equations. The equation itself is obtained through a series of
variational problems. First, we interpret the implicit Euler time-stepping scheme for
the update of Lagrangian flow maps as a Monge-Kantorovich transference problem.
This approach then provides a variational principle for the updates of an Eulerian
phase Indicatrix in terms of the Wasserstein metric. The Euler-Lagrange equations
corresponding to the latter variational principle provide the sought-after evolution
equation for the implicit surfaces.

1 Classical Approach to Tracking of the Implicit Surfaces


Let us consider a mixture of liquid and gas. We assume that the gas is formed in
bubbles, and we further require that both phases are immiscible. Let us denote
by $X = X(x, t)$ the Lagrangian flow map. Let the Eulerian Indicatrix function
$\chi \in BV(\Omega, \{0, 1\})$ represent the characteristic function of the domain $\Omega_G$
occupied by the gas. We set $\chi(x, t) = 1$ for $x \in \Omega_G$ and $\chi(x, t) = 0$ otherwise.
The connection between the Lagrangian and Eulerian formalism is provided
by the relation

$$ \partial_t X(x, t) = v(X(x, t), t), \quad \text{for all } x \in \Omega,\ t > 0, \tag{1} $$

where $v(x, t)$ is the velocity field in the Eulerian representation. This equation
can be solved for a smooth $v$ by the Cauchy-Lipschitz theorem. The required
immiscibility has the form

$$ \chi(X(x, t), t) = \chi(x, 0) = \chi_0(x), \quad \text{for all } x \in \Omega,\ t > 0. \tag{2} $$

The transport equation for the Indicatrix follows immediately from the
immiscibility requirement (2) and from the equation (1). Namely,

$$ 0 = \frac{d}{dt} \chi(X(x, t), t)
     = \partial_t \chi(X(x, t), t) + v(X(x, t), t) \cdot \nabla \chi(X(x, t), t), \quad x \in \Omega,\ t > 0. \tag{3} $$

The symbol $\nabla$ denotes gradients computed with respect to the configuration
at $t = 0$.
The free surfaces are characterized by the set $S(\nabla\chi(\cdot, t))$, denoting the
points of discontinuities in the gradient of $\chi$. The transport equation (3) can
thus be written formally (note that $\nabla\chi$ lives in the dual to $L^\infty$!) as

$$ \partial_t \chi(x, t) + v(x, t) \cdot \nabla\chi(x, t) = 0 \quad \text{on the free surface } S(\nabla\chi(\cdot, t)). \tag{4} $$

The explicit dependence on the unknown, implicit, surface $S(\nabla\chi(\cdot, t))$ can
be circumvented by introduction of the so-called level set function $G$ [11].
Namely, let $G(x, t)$ be a smooth function measuring the distance to the
interface, being positive inside of a gas domain, and let $\theta$ denote the Heaviside
function. Then $\chi = \theta \circ G$. Substituting this representation of the Indicatrix
into the transport equation (4) yields

$$ \delta_0(G) \left( \partial_t G + v \cdot \nabla G \right) = 0. \tag{5} $$

Assuming that the velocity field is known for any $x \in \Omega$, we obtain from (5)
the following level set formulation [11],

$$ \partial_t G(x, t) + v(x, t) \cdot \nabla G(x, t) = 0, \quad x \in \Omega,\ t > 0. \tag{6} $$
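To make the level set formulation (6) concrete, the following sketch advances a level set function on a periodic one-dimensional grid and recovers the Indicatrix by applying the Heaviside function. The restriction to one space dimension, the constant non-negative velocity and the first-order upwind discretization are assumptions chosen for brevity; the paper does not prescribe a discretization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One explicit first-order upwind step for dG/dt + v dG/dx = 0 on a
// periodic 1D grid with constant velocity v >= 0.
// dCFL = v * dt / dx has to satisfy 0 <= dCFL <= 1 for stability.
std::vector<double> UpwindStep(const std::vector<double>& G, double dCFL)
{
    assert(dCFL >= 0.0 && dCFL <= 1.0);
    const std::size_t n = G.size();
    std::vector<double> GNew(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        const double dLeft = G[(i + n - 1) % n];  // periodic left neighbour
        GNew[i] = G[i] - dCFL * (G[i] - dLeft);
    }
    return GNew;
}

// Indicatrix obtained from the level set function by the Heaviside
// function: 1 where G > 0 (gas), 0 otherwise (liquid).
std::vector<int> Indicatrix(const std::vector<double>& G)
{
    std::vector<int> Chi(G.size());
    for (std::size_t i = 0; i < G.size(); ++i) Chi[i] = (G[i] > 0.0) ? 1 : 0;
    return Chi;
}
```

With dCFL = 1 the profile is shifted by exactly one cell per step, so a bubble occupying cells 1 and 2 occupies cells 2 and 3 afterwards. The velocity enters every step, which is the dependence the variational approach of the paper aims to remove.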


Our goal is to derive a variational principle for $\chi$. In other words we strive
to derive a Helmholtz free energy depending on $\chi$. A similar effort
within the field models is reported in [?]. The self-contained evolution equation
for the Indicatrix, i.e., one independent of the subsequent velocity and pressure
fields, is then obtained as the Euler-Lagrange equation of the Lagrangian of
the system. The key difference between the approach leading to (6) and our
theory is that we look at the evolution of the implicit (free) surfaces as the
Monge-Ampère transport problem.

2 Generalized Least Action Principle

The Lagrangian of our system includes the kinetic, potential, and surface
energies. We assume that the density of the liquid and the density of the gas are
constant. Thus

$$ \rho_0(x) \stackrel{\mathrm{def}}{=} \rho(x, 0) = \rho(X(x, t), t). \tag{7} $$

We assume that the density $\rho$ is given by

$$ \rho(X(x, t), t) = \rho_G \, \chi(X(x, t), t) + \rho_L \left( 1 - \chi(X(x, t), t) \right), \quad x \in \Omega,\ t > 0, \tag{8} $$
where $\rho_G$, $\rho_L$ are the density of the gas and liquid, respectively. We consider
a gas-liquid system with total energy of a Mumford-Shah type [9], given by

$$ \mathcal{L}(X, t) \stackrel{\mathrm{def}}{=} \frac{1}{2} \int_{\Omega} \rho_0(x) \left( \frac{d}{dt} X(x, t) \right)^{2} dx + E(X, t), $$

where

$$ E(X, t) \stackrel{\mathrm{def}}{=} \int_{\Omega} \rho_0(x) \, g \cdot X(x, t) \, dx + \sigma \, \|\nabla\chi(\cdot, t)\|(\Omega), \quad \text{and} \quad
   \chi(X(x, t), t) = \chi(x, 0). \tag{9} $$
The surface energy, $\sigma \, \|\nabla\chi(\cdot, t)\|(\Omega)$, i.e., the total variation of $\nabla\chi$, represents
the perimeters of the subdomains occupied by the gas multiplied by a surface
tension coefficient $\sigma$, which is a positive measured quantity. The vector $g$
represents the gravitation force field.
We apply a generalization of the Principle of Stationary Action [8] to
the Lagrangian (9). The equilibrium dynamics obtained from this principle
are given by the Euler equations with an added dissipative term. Moreover,
the pressure drop across the gas-liquid interfaces satisfies the Laplace-Young
equation. To account for the dissipative effects, we define the action as

$$ A(X, t_1, t_2) \stackrel{\mathrm{def}}{=} \int_{t_1}^{t_2} e^{\lambda (t - t_0)} \, \mathcal{L}(X, t) \, dt, \tag{10} $$

where $\lambda > 0$ represents the Rayleigh friction dissipation coefficient, and
$t_0 \ge 0$ is a given initial time. Then
(11)

represents a functional on the configuration space $\mathcal{M}$,¹ given by

$$ \mathcal{M} \stackrel{\mathrm{def}}{=} \{ X : \Omega \to \Omega \mid X \text{ is one-to-one and onto, volume preserving,
   separately } C^1\text{-maps in both the gas and the liquid} \}. \tag{12} $$

We require the action to be stationary on the flow maps $X(\cdot, t) \in \mathcal{M}$ with
respect to variations which are compatible with the incompressibility and the
immiscibility constraints. Therefore we consider a family of maps $X_\tau$
preserving the Lebesgue measure such that $X_0 = X$. We set

$$ Y = \left. \frac{d}{d\tau} X_\tau \right|_{\tau = 0}. $$

¹ The configuration space is not a linear function space. Consequently, the
differentiation has to be considered in the tangent space to $\mathcal{M}$ where summation is
allowed.

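Hence, $Y \in T_{X(\cdot,t)}\mathcal{M}$. The tangent space to $\mathcal{M}$ at the point $X(\cdot,t)$ is given by

\[ T_{X(\cdot,t)}\mathcal{M} = \{ Z \circ X(\cdot,t) \mid Z \in T_{\mathrm{Id}}\mathcal{M},\ X(\cdot,t) \in \mathcal{M} \}, \tag{13} \]

where $X(\cdot,t) : x \mapsto X(x,t)$, and

\[ T_{\mathrm{Id}}\mathcal{M} = \{ Z : \Omega \to \mathbb{R}^3 \mid \operatorname{div} Z(x) = 0 \text{ for } x \in \Omega_G, \Omega_L;\ Z\cdot n = 0 \text{ on } \partial\Omega;\ [Z\cdot n] = 0 \text{ on } \partial\Omega_L \cap \partial\Omega_G \}. \tag{14} \]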
The jump $[Z\cdot n]$ is the usual difference of the values from the different sides of the gas-liquid interfaces. The structure of the linear space $T_{\mathrm{Id}}\mathcal{M}$, i.e., the divergence-free condition, is, by the Liouville theorem, equivalent to the preservation of the Lebesgue measure. Moreover, the variations must satisfy

\[ Y(\cdot, t_1) = Y(\cdot, t_2) = 0. \tag{15} \]

We implement the Principle of Stationary Action by requiring the Gâteaux derivative of the action to vanish on variations $Y \in T_{X(\cdot,t)}\mathcal{M}$ satisfying (15), i.e.,

for all $Y(\cdot,t) \in T_{X(\cdot,t)}\mathcal{M}$ satisfying (15). (16)


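We find that (16) translates into the following problem:
Find $X(\cdot,t) \in \mathcal{M}$, $\frac{d}{dt}X(\cdot,t) \in L^2\bigl([0,T], (T_{X(\cdot,t)}\mathcal{M})^*\bigr)$, such that

\[
\begin{aligned}
&\int_0^T\!\!\int_\Omega \rho(X(x,t),t)\Bigl(\frac{d}{dt}X(x,t)\cdot\frac{d}{dt}Y(x,t) - \lambda\,\frac{d}{dt}X(x,t)\cdot Y(x,t)\Bigr)\,dx\,dt \\
&\quad + \int_0^T\!\!\int_\Omega \rho(X(x,t),t)\,g\cdot Y(x,t)\,dx\,dt \\
&\quad + \alpha\int_0^T\!\!\int_{\partial\Omega_G(0)} H(X(s,t))\,Y(s,t)\cdot n(X(s,t))\,dS\,dt = 0,
\end{aligned}
\tag{17}
\]

for all $Y$, $\frac{d}{dt}Y \in L^2\bigl([0,T], T_{X(\cdot,t)}\mathcal{M}\bigr)$, $Y(\cdot,0) = Y(\cdot,T) = 0$.
Here, $H$ denotes the mean curvature. The space $(T_{X(\cdot,t)}\mathcal{M})^*$ denotes the dual to $T_{X(\cdot,t)}\mathcal{M}$. The duality is given by the Riemannian metric induced by the kinetic energy. Consequently, we can identify the dual space with itself.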

3 Evolution of free surfaces and Monge-Ampère Transport Problem

As the first step towards the sought-after variational principle, we discretize in time the Lagrangian $\mathcal{E}$ by the implicit Euler scheme to proceed from a given state $X^k$ to $X^{k+1}$. The dissipative effects are included in the definition of the generalized action, but they are not included in the Lagrangian itself. To overcome this difficulty, we rescale the time coordinate. Namely, let $\tilde X$ be given by

\[ \tilde X(\cdot,\tau) \overset{\text{def}}{=} X(\cdot,t), \quad\text{where}\quad \tau \overset{\text{def}}{=} \frac{e^{\lambda(t-t_0)} - 1}{\lambda}. \tag{18} \]

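Then, using the substitution $t \to \tau$,

\[
\int_{t_1}^{t_2} \frac12\, e^{\lambda(t-t_0)} \int_\Omega \rho_0(x)\Bigl(\frac{d}{dt}X(x,t)\Bigr)^2 dx\,dt
= \int_{\tau(t_1)}^{\tau(t_2)} \frac12 \int_\Omega \rho_0(x)\Bigl(\frac{d}{d\tau}\tilde X(x,\tau)\Bigr)^2 dx\,(\lambda\tau+1)^2\,d\tau.
\tag{19}
\]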
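The change of variables behind (19) can be spelled out in a few lines. The sketch below assumes that the rescaling (18) reads $\tau = (e^{\lambda(t-t_0)}-1)/\lambda$, which is the form consistent with the factor $(\lambda\tau+1)^2$ in (19); it is a reading of the formulas, not a quotation of the authors' derivation.

```latex
\begin{align*}
\lambda\tau + 1 &= e^{\lambda(t-t_0)}, \qquad
\frac{d\tau}{dt} = e^{\lambda(t-t_0)} = \lambda\tau + 1, \\
\frac{d}{dt}X(x,t) &= \frac{d\tau}{dt}\,\frac{d}{d\tau}\tilde X(x,\tau)
  = (\lambda\tau+1)\,\frac{d}{d\tau}\tilde X(x,\tau), \\
e^{\lambda(t-t_0)}\Bigl(\frac{d}{dt}X\Bigr)^2 dt
 &= (\lambda\tau+1)\,(\lambda\tau+1)^2\Bigl(\frac{d}{d\tau}\tilde X\Bigr)^2
    \frac{d\tau}{\lambda\tau+1}
  = (\lambda\tau+1)^2\Bigl(\frac{d}{d\tau}\tilde X\Bigr)^2 d\tau .
\end{align*}
```

The exponential weight of the action is thus absorbed into the polynomial factor $(\lambda\tau+1)^2$, which is what allows the dissipation to be carried by the rescaled kinetic term.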
Similarly,

(20)

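Let $\tau_k = k\,\Delta\tau$, and let $\tilde X^k(\cdot) \overset{\text{def}}{=} \tilde X(\cdot, k\,\Delta\tau)$. In accordance with the PSA, we assume that $X^{k+1}$, $X^{k-1}$ are given states. We are looking for $X^k$ which is a solution of the minimization problem

\[
\inf\Bigl\{ (\lambda\tau_k+1)^2\,\frac12\int_\Omega \rho_0(x)\,F(\tilde X)\,dx + (\Delta\tau)^2\,E(\tilde X) \Bigm| \tilde X \in \mathcal{M} \Bigr\}, \quad\text{where}\quad
F(X) \overset{\text{def}}{=} |X(x) - X^{k+1}(x)|^2 + |X(x) - X^{k-1}(x)|^2.
\tag{21}
\]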

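Since

\[ \Delta\tau = \frac{e^{\lambda(t_{k+1}-t_0)} - 1}{\lambda}, \]

we have for $\lambda \ll 1$

(22)

Hence, going back to the original time coordinate and using the above approximation, we obtain

\[ X^k \overset{\text{def}}{=} \operatorname*{Arg\,inf}_{X \in \mathcal{M}}\ \frac12\int_\Omega \rho_0(x)\,F(X)\,dx + (\Delta t_k)^2\,E(X). \tag{23} \]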
n

Remark 3.1 (Weak form (17) and the variational principle (23)). The variational formulation (23) is obtained from the PSA applied to the time interval $(t_{k-1}, t_{k+1})$ by using the trapezoidal rule to provide an integration rule in time. Computing the Gâteaux derivative of the functional appearing in (23) and using the discrete integration by parts in time, we recover the weak formulation (17) discretized implicitly in time. □
Remark 3.2 (Time step $\Delta t_k$). It follows from the above calculations that if $\lambda \ll 1$ then, from (18) and (22),

□
We first state the following result before we proceed to make the link between the minimizer of the variational problem (24) below and the flow maps.
Theorem 3.3. Let us assume that $\Omega$ is bounded with piecewise smooth boundary, and that $\chi^{k+1}, \chi^{k-1} \in \mathcal{X}$ are given. Then there exists a unique minimizer $\chi^k$ of the following variational problem:

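We use the following definitions:

\[
\begin{aligned}
\mathcal{D}_W(\chi, \chi^{k+1}, \chi^{k-1}) &\overset{\text{def}}{=} \tfrac12\rho_G \operatorname{dist}_W(\chi, \chi^{k+1})^2 + \tfrac12\rho_L \operatorname{dist}_W(1-\chi, 1-\chi^{k+1})^2 \\
&\quad + \tfrac12\rho_G \operatorname{dist}_W(\chi, \chi^{k-1})^2 + \tfrac12\rho_L \operatorname{dist}_W(1-\chi, 1-\chi^{k-1})^2, \\
E_E(\chi) &\overset{\text{def}}{=} \int_\Omega \rho_0(x)\, g\cdot x\,dx + \alpha\,\|\nabla\chi\|(\Omega), \\
\rho(x) &\overset{\text{def}}{=} \rho_G\,\chi(x) + \rho_L\,(1-\chi(x)), \\
\mathcal{X} &\overset{\text{def}}{=} \Bigl\{ \chi \in BV(\Omega, \{0,1\}) \Bigm| \exists\, X \in \mathcal{N}(\chi^0, \chi) : \int_\omega \chi(x)\,dx = \int_{X^{-1}(\omega)} \chi^0(x)\,dx,\ \forall \text{ open } \omega \subset \Omega \Bigr\}.
\end{aligned}
\]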

Proof. The proof follows from the strict convexity of the Wasserstein distance on the relaxed space $\mathcal{X}_{\text{relaxed}}$, which is defined as $\mathcal{X}$ but with $BV(\Omega, [0,1])$ as the base space. The existence of the unique minimizer of (24) is then obtained by a density argument, which follows from the compatibility of the Wasserstein metric with convergence in the $L^p$-spaces. The proof can be found in [7]. □

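Remark 3.4. We recall that the Monge-Ampère problem is formulated as follows

where

\[
\begin{aligned}
\mathcal{N}(\chi^0, \chi^1) &\overset{\text{def}}{=} \{ \nabla\varphi \in \mathcal{L}(\chi^0, \chi^1) \mid \varphi \text{ Lipschitz continuous and convex},\ \nabla\varphi \text{ invertible for a.a. } x \in \Omega \}, \\
\mathcal{L}(\chi^0, \chi^1) &\overset{\text{def}}{=} \Bigl\{ M \in \mathcal{J}(\chi^0, \chi^1) \Bigm| \int_{M^{-1}(\omega)} \chi^0(x)\,dx = \int_\omega \chi^1(x)\,dx \text{ for all Borel subsets } \omega \text{ of } \Omega \Bigr\}, \\
\mathcal{J}(\chi^0, \chi^1) &\overset{\text{def}}{=} \{ M : \{\chi^0 > 0\} \subseteq \Omega \to \{\chi^1 > 0\} \subseteq \Omega \mid M \text{ measurable} \}.
\end{aligned}
\tag{26}
\]
□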
The existence result of Theorem 3.3 serves as a basis for showing the solvability of the variational problem (23). In particular, it follows from [7] and [12] that the minimizer has the form

(27)

where $\varphi^{k-1,k}$ is the unique solution of the Monge-Ampère problem (25). Moreover, the link mentioned above is provided by

$x \in \Omega$. (28)

4 Dynamical System

In order to obtain a dynamical system for the evolution of the Indicatrix, we perform two steps. First, we compute the Gâteaux derivative of (24) and then we, formally, take a limit as $\Delta t \to 0+$. Obtaining the Gâteaux derivative is a delicate task, see [7] for complete details.
Let $X^{k,k+1} \in \mathcal{M}$ be the optimal flow map of $\chi^k$ to $\chi^{k+1}$. Similarly, let $X_\tau \in \mathcal{M}$ be a family of optimal maps bringing $\chi^k$ onto $\chi_\tau$ such that $X_\tau|_{\tau=0} = X^{k,k+1}$. We expand $X_\tau$ around the point $\tau = 0$. Up to the first order in $\tau$, we have

(29)

Since $X_\tau \in \mathcal{M}$ we have



(30)

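Consequently, in view of (13),

\[ \operatorname{div} \frac{d}{d\tau} X_\tau\bigl(X^{k,k+1}(x)\bigr)\Big|_{\tau=0} = 0 \quad\text{for a.a. } x \in \Omega. \tag{31} \]

Hence, we consider all possible variations of the form

\[ \frac{d}{d\tau} X_\tau\bigl(X^{k,k+1}(x)\bigr)\Big|_{\tau=0} = (\nabla f \circ h)(x), \tag{32} \]

where $h \in \mathcal{M}^k$, $f \in W^{1,2}(\Omega, \mathbb{R}^1)$, and $\operatorname{div}(\nabla f(h(x))) = 0$ for a.a. $x \in \Omega$.

It follows from [7] that for any $f \in W^{1,2}(\Omega, \mathbb{R}^1)$ there exists $h \in \mathcal{M}^k$ such that the divergence-free requirement in (32) is satisfied. Thus, we require $\chi^k$ to solve

(33)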

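The fundamental step is to show that within $O(\Delta t)$ we have

\[
\begin{aligned}
&\frac{d}{d\tau}\,\frac12 \operatorname{dist}_W(\chi_\tau, \chi^{k+1})^2\Big|_{\tau=0} + \frac{d}{d\tau}\,\frac12 \operatorname{dist}_W(\chi_\tau, \chi^{k-1})^2\Big|_{\tau=0} \\
&\quad \approx \int_\Omega \bigl(\chi^{k+1}(x) - 2\chi^k(x) + \chi^{k-1}(x)\bigr) f(x)\,dx \\
&\qquad - \int_\Omega (x - h(x))\cdot\nabla f(h(x))\,\chi^k(x)\,dx - \int_\Omega \bigl(f(h(x)) - f(x)\bigr)\chi^k(x)\,dx
\end{aligned}
\tag{34}
\]

for all $f \in W^{1,2}(\Omega, \mathbb{R}^1)$, $h \in \mathcal{M}^k$.

The last two integrals vanish due to the transport equation. After some calculations, suitable approximations, and the limit procedures $\varepsilon \to 0+$, $\Delta t \to 0+$ (again see [7] for details), we find that the dynamical system is given by a degenerate equation

\[ \rho_G\bigl(\partial_{tt}\chi(x,t) - \lambda\,\partial_t\chi(x,t)\bigr) = \operatorname{div}\bigl(\chi(x,t)\,((\rho_G - \rho_L)\,g + \alpha\nabla H(x,t))\bigr), \quad x \in S(\nabla\chi),\ t > 0. \tag{35} \]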

Remark 4.1. The sequence of solutions obtained by solving the discretized version of (35) is a sequence of approximate minimizers of (24). Hence, this sequence is written with respect to a moving frame given, implicitly, by the optimal flow maps. The connection between the successive solutions is given by

$x \in \Omega$, (36)

where $\nabla\psi^{k,k+1}$ is the solution of the Monge-Ampère transport problem. This relation has three implications. First, it shows that $\chi^{k+1}$ is the Eulerian representer of $\chi^k$. The previous time level solution, $\chi^k$, is written with respect to the Lagrangian frame of reference. Secondly, it follows from (35) that we do not need to convert the two systems. Instead, we can use $\chi^{k+1}$ directly in the fixed, Eulerian frame of reference.
The last implication of (36) concerns the use of the solutions of (35) with other differential equations, say incompressible Euler equations, semidiscretized in time. The purpose of computing $s^k$ is both the tracking of the phase boundaries and the computation of the evolving density. We have in the Lagrangian frame of reference

\[ \rho_{\text{Lagrange}}(x, k\,\Delta t) = \rho_G\,\chi^k(x) + \rho_L\,(1 - \chi^k(x)). \tag{37} \]

Since $\rho_{\text{Lagrange}}(x, k\,\Delta t) = \rho_{\text{Euler}}(\nabla\psi^{k,k+1}(x), k\,\Delta t)$, it follows from (36) that the appropriate form of the density passed to the Euler equations written in the fixed frame of reference, in the time-semidiscrete form, is the following

\[ \rho_{\text{Euler}}(x, k\,\Delta t) = \rho_G\,\chi^{k+1}(x) + \rho_L\,(1 - \chi^{k+1}(x)). \tag{38} \]

In other words, the tracking has to be computed one $\Delta t$ ahead of the solution of the Euler equations. □

5 Appendix: Wasserstein Distance

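The Wasserstein distance, denoted $\operatorname{dist}_W(\cdot,\cdot)$, is defined by, [13], [2],

\[ \operatorname{dist}_W(s_0, s_1)^2 \overset{\text{def}}{=} \inf_{\substack{\mu\circ\pi_x^{-1} = s_0 \\ \mu\circ\pi_y^{-1} = s_1}} \int_{\Omega\times\Omega} |x - y|^2\,d\mu(x,y). \tag{39} \]

In words, the infimum in (39) is taken over bistochastic measures with marginals $s_0$ and $s_1$, respectively. This means

\[
\begin{aligned}
\iint_{\Omega\times\Omega} \varphi(x)\,d\mu(x,y) &= \int_\Omega \varphi(x)\,s_0(x)\,dx, \quad\text{for all } \varphi \in C(\Omega), \\
\iint_{\Omega\times\Omega} \varphi(y)\,d\mu(x,y) &= \int_\Omega \varphi(y)\,s_1(y)\,dy, \quad\text{for all } \varphi \in C(\Omega).
\end{aligned}
\tag{40}
\]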
We assume that the functions $s_0$ and $s_1$ are bounded, non-negative, measurable functions with compact support. It follows from [Section 3, [2]] that the infimum in (39) is attained and, cf. [Proposition 3.1, [2]], that the unique optimal measure $\mu^*$ has the form

\[ \mu^*(x, y) = \delta(M^*(y) - x)\,s_0(x)\,dx. \tag{41} \]

It can be shown, [Proposition 3.1, [2]], that the representing map is almost everywhere equal to the gradient of a convex Lipschitz continuous scalar function which has an almost everywhere invertible gradient. Substituting the measure $\mu^*$ back into (39), we obtain an alternative formulation of the Wasserstein distance more closely related to the Monge-Kantorovich problem ([10], [4], [1], [5], [3], [6], [14]). Namely, if the marginals have the same mass, i.e., $\int_\Omega s_0(x)\,dx = \int_\Omega s_1(x)\,dx$, it follows from the above discussion that there exists $\varphi^*$, convex and Lipschitz continuous, such that $\nabla\varphi^* \in \mathcal{L}(s_0, s_1)$ and invertible, solving

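\[
\inf\Bigl\{ \int_\Omega s_0(x)\,|M(x) - x|^2\,dx \Bigm| M \in \mathcal{L}(s_0, s_1),\ M \text{ invertible} \Bigr\}, \quad\text{where}
\]
\[
\begin{aligned}
\mathcal{J}(s_0, s_1) &\overset{\text{def}}{=} \{ M : \{s_0 > 0\} \subseteq \Omega \to \{s_1 > 0\} \subseteq \Omega \mid M \text{ measurable} \}, \\
\mathcal{L}(s_0, s_1) &\overset{\text{def}}{=} \Bigl\{ M \in \mathcal{J}(s_0, s_1) \Bigm| \int_{M^{-1}(\omega)} s_0(x)\,dx = \int_\omega s_1(x)\,dx \text{ for all Borel subsets } \omega \text{ of } \Omega \Bigr\}.
\end{aligned}
\tag{42}
\]

Thus

\[ \operatorname{dist}_W(s_0, s_1)^2 = \int_\Omega s_0(x)\,|\nabla\varphi^*(x) - x|^2\,dx. \tag{43} \]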
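For intuition, the Monge form of the distance can be checked on a toy case. The following is a minimal sketch, assuming two discrete measures on the real line with the same number of equally weighted atoms (a simplification that is not part of the paper): there the optimal map is the monotone rearrangement, the one-dimensional analogue of the gradient of a convex function. The function names are hypothetical.

```python
from itertools import permutations


def w2_squared_sorted(xs, ys):
    """Squared Wasserstein distance between two empirical measures on the line.

    Each measure puts mass 1/n on each of its n atoms; the monotone (sorted)
    matching is used as the transport map.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same number of atoms")
    n = len(xs)
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / n


def w2_squared_brute_force(xs, ys):
    """Same quantity by minimising over all one-to-one maps (tiny n only)."""
    n = len(xs)
    return min(
        sum((a - b) ** 2 for a, b in zip(xs, perm)) / n
        for perm in permutations(ys)
    )


xs = [0.3, -1.0, 2.5, 0.9]
ys = [1.7, 0.2, -0.4, 3.1]
print(w2_squared_sorted(xs, ys), w2_squared_brute_force(xs, ys))
```

The brute-force search plays the role of the infimum over transport maps, while the sorted matching plays the role of the optimal map; the two values agree up to rounding.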
We note that the requirement of the transport of $s_0$ onto $s_1$ by a volume preserving map can be written in the form

\[ \int_\Omega s_1(x)\,\varphi(x)\,dx = \int_\Omega s_0(x)\,\varphi(M(x))\,dx, \quad\text{for all } \varphi \in C_0(\Omega), \tag{44} \]

which is to say that

\[ s_1(M(x)) = s_0(x), \quad x \in \Omega, \tag{45} \]

for $M \in \mathcal{M}$. □

6 Acknowledgement
The authors would like to thank Professor Olivier Besson for his hospitality and support while we worked on the presented material in the Institut de Mathématique, Université de Neuchâtel. We are also pleased to acknowledge the generous support for this work provided by the Alcan Technology Center. Petr Kloucek was supported in part by the grant NSF DMS-0107539, by the grant from TRW Foundation and by the grant NASA SECTP. Michel Romerio was supported in part by Contract No. 74837-001-0349 from the Regents of University of California (Los Alamos National Laboratory) to William Marsh Rice University. Jennifer Wightman was supported in part by the grant NASA SECTP.

References

1. J.-D. Benamou and Y. Brenier. A numerical method for the optimal time-continuous mass transport problem and related problems, pages 1-11. Contemporary Mathematics: Monge-Ampère Equation: Application to Geometry and Optimization. American Mathematical Society, first edition, 1999. L. A. Caffarelli and M. Milman, eds.
2. Y. Brenier. Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math., 44:375-417, 1991.
3. L. A. Caffarelli. Boundary regularity of maps with convex potentials. Ann. of Math., 2(3):453-496, 1996.
4. B. Dacorogna and J. Moser. On a partial differential equation involving the Jacobian determinant. Ann. Inst. H. Poincaré Anal. Non Linéaire, 7(1):1-26, 1990.
5. L. C. Evans. Partial differential equations and Monge-Kantorovich mass transfer. Lecture Notes, pages 286-294, 1998.
6. W. Gangbo and R. J. McCann. The geometry of optimal transport. Acta Math., (2):113-161, 1996.
7. P. Kloucek, M. V. Romerio, and J. L. Wightman. The variational formulation of tracking free surfaces in liquids. SIAM J. Appl. Math., 2003. Submitted.
8. C. Marchioro and M. Pulvirenti. Mathematical Theory of Incompressible Nonviscous Fluids. Applied Mathematical Sciences. Springer-Verlag, 1994.
9. J.-M. Morel and S. Solimini. Variational Methods in Image Segmentation, volume 14 of Progress in Nonlinear Differential Equations and Their Applications. Birkhäuser, Boston, 1st edition, 1995.
10. J. Moser. On the volume elements on a manifold. Trans. Amer. Math. Soc., 120:286-294, 1965.
11. S. Osher and R. Fedkiw. Level Set Methods and Dynamic Implicit Surfaces, volume 153 of Applied Mathematical Sciences. Springer-Verlag, New York, 1st edition, 2003.
12. F. Otto. Evolution of microstructure in unstable porous media flow: a relaxation approach. Comm. Pure Appl. Math., 2002. To appear.
13. S. T. Rachev. Probability metrics and the stability of stochastical models. Wiley, New York, 1st edition, 1991.
14. S. T. Rachev and L. Rüschendorf. Mass transportation problem, volume I, II of Probability and its Application. Springer-Verlag, New York, 1998.
Discrete Maximum Principles in Finite Element Modelling

Sergey Korotov¹ and Michal Křížek²

¹ Department of Mathematical Information Technology, P. O. Box 35, University of Jyväskylä, FIN-40014, Finland, [email protected]
² Mathematical Institute, Academy of Sciences, Žitná 25, CZ-115 67 Prague 1, Czech Republic, [email protected]

Dedicated to Prof. Miloslav Feistauer on his 60th birthday.

Summary. Nonobtuse tetrahedral partitions and linear finite elements guarantee the validity of a discrete analogue of the maximum principle for a wide class of parabolic and elliptic problems in the three-dimensional space. In this paper we propose global and local refinement techniques which produce nonobtuse face-to-face tetrahedral partitions of a polyhedral domain.

1 Introduction

The maximum principle represents one of the most characteristic features of solutions of second order elliptic or parabolic problems. In this paper we survey some of our recent results concerning the discrete maximum principle for three-dimensional nonlinear elliptic boundary-value problems solved by linear tetrahedral finite elements. In particular, we introduce sufficient geometric conditions on tetrahedral meshes that guarantee the validity of the discrete maximum principle, and present algorithms for global and local refinements of meshes preserving these geometrical properties.
Linear tetrahedral finite elements are commonly used for solving second order boundary value problems, since they do not require a high regularity of the solution. The structure and properties of the associated stiffness matrix essentially depend on the dihedral angles between faces of tetrahedral elements. To see this fact, let us consider an arbitrary tetrahedron $ABCD$. Let $p$ and $q$ be two linear functions such that

\[ p(A) = 1, \quad p(B) = p(C) = p(D) = 0, \]
\[ q(B) = 1, \quad q(A) = q(C) = q(D) = 0. \]

Then a straightforward calculation leads to the following formula (see [11, p. 63] for the proof)

\[ \nabla p \cdot \nabla q = -\frac{\operatorname{meas}_2 ACD \ \operatorname{meas}_2 BCD}{9\,(\operatorname{meas}_3 ABCD)^2}\,\cos\alpha, \tag{1.1} \]

where $\alpha$ is the angle between the faces $ACD$ and $BCD$ (see Fig. 1) and the symbol $\operatorname{meas}_d$ stands for $d$-dimensional measure. Scalar product (1.1) is, therefore, independent of all the other 5 dihedral angles of the tetrahedron $ABCD$. If $\alpha > \pi/2$ then the scalar product (1.1) is obviously positive. Hence, each obtuse dihedral angle of the tetrahedron $ABCD$ gives a positive contribution to the corresponding off-diagonal entry of the element stiffness matrix, when solving a boundary value problem with the Laplace operator by the finite element method.
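Formula (1.1) is easy to check numerically. The following is a minimal sketch in plain Python, with a hypothetical set of vertex coordinates and helper names chosen only for illustration: it builds the gradients of the two linear functions directly and compares their scalar product with the right-hand side of (1.1).

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def hat_gradient(v, f1, f2, f3):
    """Gradient of the linear function equal to 1 at v and 0 at f1, f2, f3."""
    n = cross(sub(f2, f1), sub(f3, f1))  # normal to the face opposite v
    s = dot(n, sub(v, f1))               # scaling so that the value at v is 1
    return tuple(c / s for c in n)


def area(a, b, c):
    return 0.5 * norm(cross(sub(b, a), sub(c, a)))


def volume(a, b, c, d):
    return abs(dot(cross(sub(b, a), sub(c, a)), sub(d, a))) / 6.0


def dihedral_cos(A, B, C, D):
    """Cosine of the dihedral angle at edge CD between faces ACD and BCD."""
    e = sub(D, C)

    def perp(w):  # component of w perpendicular to the edge CD
        t = dot(w, e) / dot(e, e)
        return tuple(wi - t * ei for wi, ei in zip(w, e))

    a, b = perp(sub(A, C)), perp(sub(B, C))
    return dot(a, b) / (norm(a) * norm(b))


# Hypothetical, non-degenerate tetrahedron used only for this check.
A, B, C, D = (0, 0, 0), (1, 0, 0), (0.2, 1.3, 0), (0.4, 0.3, 1.1)
lhs = dot(hat_gradient(A, B, C, D), hat_gradient(B, A, C, D))
rhs = -area(A, C, D) * area(B, C, D) / (9 * volume(A, B, C, D) ** 2) * dihedral_cos(A, B, C, D)
print(lhs, rhs)
```

Moving the vertex $D$ so that the angle at the edge $CD$ becomes obtuse makes both sides positive, which is the situation the text identifies as harmful for the sign pattern of the stiffness matrix.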

Fig. 1. A tetrahedron whose two faces include the angle $\alpha$

Note that the same is also true (see [11]) for a wider class of nonlinear elliptic problems of the form

\[ -\nabla\cdot(\lambda(x, u, \nabla u)\nabla u) = f(x) \quad\text{in } \Omega, \tag{1.2} \]
\[ u = 0 \quad\text{on } \partial\Omega, \tag{1.3} \]

where $\lambda$ is a positive smooth function and $\Omega$ is a bounded polyhedral domain with Lipschitz boundary $\partial\Omega$. Equation (1.2) describes, for instance, a stationary nonlinear heat conduction or magnetic potential in ferromagnetic media (cf. [13]).
According to [5, p. 206], problem (1.2)-(1.3) satisfies the maximum principle, i.e., if $f \le 0$ then the maximum of $u$ over $\overline{\Omega}$ is attained on the boundary $\partial\Omega$ (see also [14]). It is thus natural to look for a class of finite elements such that the same implication is satisfied. Note that, e.g., for bilinear rectangular elements the discrete maximum principle can be violated (see [1, p. 254]). The same is true also for trilinear block elements (a simple example is given in [12, p. 562]). However, if we decompose these elements into nonobtuse triangles or tetrahedra keeping the number of degrees of freedom, the discrete maximum principle will hold.

2 Nonobtuse tetrahedralizations
A tetrahedron is said to be nonobtuse if all six dihedral angles between its faces are less than or equal to $\pi/2$. By a tetrahedralization we shall mean a face-to-face partition of $\Omega$ into tetrahedra in the standard sense (see [3]). A tetrahedralization is said to be nonobtuse if it contains only nonobtuse tetrahedra.
By [3, p. 150], linear triangular nonobtuse elements guarantee the validity of the discrete maximum principle in the plane. This result is generalized to three-dimensional space in [11], namely, linear tetrahedral elements applied to problem (1.2)-(1.3) on nonobtuse partitions yield irreducibly diagonally dominant stiffness matrices (all of whose off-diagonal entries are nonpositive). It is well known (see [15, p. 85]) that such matrices are monotone. Nonobtuse tetrahedral partitions thus guarantee the validity of the discrete maximum principle for problem (1.2)-(1.3) solved by linear elements, i.e., we have $u_h \le 0$ provided $f \le 0$, where $u_h$ is the continuous and piecewise linear Galerkin approximation of the solution of (1.2)-(1.3). In other words, $u_h$ attains its maximum on the boundary $\partial\Omega$, if the homogeneous Dirichlet boundary conditions are considered. Note that nonobtuse tetrahedralizations enable us to prove the $L^\infty$-convergence of finite element approximations to the weak solution (see, e.g., [4]).
According to [7], an arbitrary polyhedron can be decomposed into nonobtuse tetrahedra. In order to reduce the discretization error, a given partition is refined locally or globally. This is why the issue of preserving nonobtuseness arises during the refinement process.
In [6] we give a global refinement procedure yielding nonobtuse tetrahedra over the whole domain. This procedure is briefly described in Section 3. However, such a technique requires a large amount of computer memory to store the associated stiffness matrix. Therefore, in Section 4 we introduce a local refinement procedure yielding nonobtuse partitions that refine only near a given vertex, where a singularity of the exact solution may appear (see, e.g., [2], [8], [9]).

3 Global refinement techniques


Definition 3.1. A tetrahedron is said to be a path tetrahedron if it has three
mutually perpendicular edges which do not pass through the same vertex.
The reason for the name of the above tetrahedron is that its three perpen-
dicular edges form a "path" (see Fig. 2).

Proposition 3.2. Any path tetrahedron is nonobtuse.


For the proof see [6, p. 728-729].
In Fig. 2 we observe a typical shape of a path tetrahedron (all its right
angles, solid and dihedral, are indicated there).
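Proposition 3.2 can be illustrated on a concrete example. The following is a minimal sketch, assuming hypothetical vertex coordinates and helper names: it computes all six interior dihedral angles of a tetrahedron and tests the nonobtuseness condition, once for a path tetrahedron and once for a flat tetrahedron that violates it.

```python
import math
from itertools import combinations


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def dihedral_angles(verts):
    """Return the six interior dihedral angles (in radians) of a tetrahedron."""
    angles = []
    for i, j in combinations(range(4), 2):
        k, l = [m for m in range(4) if m not in (i, j)]
        e = sub(verts[j], verts[i])  # the edge shared by the two faces

        def perp(w):  # component of w perpendicular to the edge
            t = dot(w, e) / dot(e, e)
            return tuple(wi - t * ei for wi, ei in zip(w, e))

        a = perp(sub(verts[k], verts[i]))
        b = perp(sub(verts[l], verts[i]))
        c = dot(a, b) / (norm(a) * norm(b))
        angles.append(math.acos(max(-1.0, min(1.0, c))))
    return angles


def is_nonobtuse(verts, tol=1e-12):
    """True if every dihedral angle is at most pi/2 (up to a small tolerance)."""
    return all(a <= math.pi / 2 + tol for a in dihedral_angles(verts))


# Path tetrahedron: edges AB, BC, CD are mutually perpendicular.
path_tet = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
# A flat tetrahedron with an obtuse dihedral angle at the edge BC.
flat_tet = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0.9, 0.9, 0.1)]

print([round(math.degrees(a), 1) for a in dihedral_angles(path_tet)])
print(is_nonobtuse(path_tet), is_nonobtuse(flat_tet))
```

The tolerance in `is_nonobtuse` is a design choice: right angles computed in floating point may exceed $\pi/2$ by rounding error, and they must still count as nonobtuse.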

Fig. 2. A path tetrahedron

Theorem 3.3. Let $T$ be an arbitrary tetrahedron such that its circumcentre belongs to $T$, and let all faces of $T$ be nonobtuse triangles. Then there exists a family of tetrahedral partitions of $T$ containing only path tetrahedra.
This theorem is proved in [6]. Note that each path tetrahedron satisfies the assumptions of Theorem 3.3, since its faces are right triangles and its circumcentre is the midpoint of the longest edge. In Fig. 3 we see a partition of a tetrahedron $T$ that satisfies the assumptions of Theorem 3.3 into path tetrahedra. They are defined in the following way. First we divide each face $F$ of $T$ into 6 or 4 right subtriangles by connecting the circumcentre of $F$ with 3 vertices and 3 midpoints of sides of $F$. The common vertex of these subtriangles is the circumcentre of $F$. We call this kind of plane refinement 2d yellow (see [6]). Denoting the circumcentre of $T$ by $G$, we can define the path subtetrahedra as the convex hull of $G$ and particular right subtriangles on the surface of $T$ (compare with Fig. 3). We call such a kind of three-dimensional refinement 3d yellow.

The advantage of the above approach is that a common face F of any two
adjacent tetrahedra (satisfying the assumptions of Theorem 3.3) in a given
tetrahedralization of a polyhedral domain is divided in a unique way. This
enables us to develop global refinement techniques yielding only nonobtuse
subtetrahedra (see [6] for details).

4 Local refinement techniques

The main idea of a local refinement technique producing only nonobtuse partitions is presented in the following theorem.

Fig. 3. Nonobtuse partition of a tetrahedron into path subtetrahedra

Theorem 4.1. Let $ABCD$ be a path tetrahedron whose edges $AB$, $BC$, and $CD$ are mutually perpendicular. Then there exists an infinite family of nonobtuse tetrahedralizations of $ABCD$ into path tetrahedra that locally refine $ABCD$ in a neighbourhood of the vertex $A$.
For a detailed constructive proof see [2] or [9]. Its main idea is sketched in Fig. 4. Using several orthogonal projections, we first subdivide the tetrahedron $ABCD$ into five nonobtuse tetrahedra. Then we show that the path tetrahedron $ATSQ$ from Fig. 4 is similar to the original tetrahedron $ABCD$. The subtetrahedron $ATSQ$ can now be decomposed into 5 subtetrahedra in the same way as $ABCD$, and thus we get further refinement near the point $A$. In this manner, we obtain recursively the required infinite family of nonobtuse tetrahedralizations.
An algorithm for a local refinement of a cube producing only nonobtuse tetrahedra is presented in [8]. If several cubes meet at one point, then we can apply this algorithm to each of them so that the whole partition remains face-to-face. For instance, in Fig. 5 we see such a local refinement of a polyhedral domain $\Omega = (-1,1)^3 \setminus [0,1)^3$, which is the union of 7 cubes. Each cube is first divided in a standard way into 6 path tetrahedra having a common vertex in the reentrant corner of $\Omega$. To each of the $7 \times 6 = 42$ tetrahedra we apply the algorithm given by Theorem 4.1 such that the partition of each path tetrahedron is just the mirror image of the partition of any adjacent tetrahedron having a common face. Note that the concave (reentrant) corner in Fig. 5 is called the Fichera corner or the Fichera vertex.
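The division of a cube into 6 path tetrahedra can be made concrete. The following is a minimal sketch, assuming that the "standard way" is the subdivision obtained by walking from one corner of the unit cube to the opposite corner along the three axes in every possible order (often called the Kuhn subdivision); this identification and the function names are assumptions made for illustration, not statements from the paper.

```python
from itertools import permutations


def kuhn_tetrahedra():
    """Six tetrahedra of the unit cube, all sharing the vertex (0, 0, 0)."""
    tets = []
    for perm in permutations(range(3)):
        v = [0, 0, 0]
        verts = [tuple(v)]
        for axis in perm:        # step once along each axis, in the given order
            v[axis] += 1
            verts.append(tuple(v))
        tets.append(verts)
    return tets


def is_path_tetrahedron(verts):
    """Check that the consecutive edges v0v1, v1v2, v2v3 are mutually perpendicular."""
    edges = [tuple(b - a for a, b in zip(verts[i], verts[i + 1])) for i in range(3)]
    return all(
        sum(x * y for x, y in zip(edges[i], edges[j])) == 0
        for i in range(3) for j in range(i + 1, 3)
    )


def six_volume(verts):
    """Determinant of the edge vectors from v0, i.e. six times the signed volume."""
    a, b, c = (tuple(q - p for p, q in zip(verts[0], verts[i])) for i in (1, 2, 3))
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


tets = kuhn_tetrahedra()
for t in tets:
    print(t, is_path_tetrahedron(t), six_volume(t))
```

Each tetrahedron has volume 1/6, so the six of them fill the cube, and every one of them contains the corner (0, 0, 0), which plays the role of the reentrant corner in the construction above.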
Fig. 4. Partition of a path tetrahedron into 5 path subtetrahedra

Fig. 5. Local refinement technique producing nonobtuse tetrahedra near the Fichera
corner

5 Concluding remarks

Nonobtuseness of all dihedral angles of all tetrahedra in the partition represents only a sufficient condition to guarantee the discrete maximum principle for linear elements. In [10] we present a weakened condition for the shape of tetrahedra, which enables us to use tetrahedra some of whose dihedral angles are slightly bigger than $\pi/2$. This condition is also only sufficient to get a monotone stiffness matrix and thus also the validity of the discrete maximum principle.

Acknowledgement
This paper was supported by Grant InBCT of TEKES, Finland and Grant
No. A 1019201 of the Academy of Sciences of the Czech Republic.

References

1. Axelsson, O., Barker, V.A. (1984): Finite Element Solution of Boundary Value Problems. Theory and Computation. Academic Press, New York
2. Beilina, L., Korotov, S., Křížek, M. (2003): Application of the local nonobtuse tetrahedral refinement techniques near Fichera-like corners. Preprint 2003-02, Chalmers Finite Element Center, Göteborg, 1-16
3. Ciarlet, P.G. (1991): Basic error estimates for elliptic problems. In: Ciarlet, P.G., Lions, J.L. (eds) Handbook of Numer. Anal., vol. II. North-Holland, Amsterdam
4. Feistauer, M., Felcman, J., Rokyta, M., Vlasek, Z. (1992): Finite-element solution of flow problems with trailing conditions. J. Comput. Appl. Math., 44, 131-165
5. Gilbarg, D., Trudinger, N.S. (1977): Elliptic Partial Differential Equations of Second Order. Springer-Verlag, Berlin
6. Korotov, S., Křížek, M. (2001): Acute type refinements of tetrahedral partitions of polyhedral domains. SIAM J. Numer. Anal. 39, 724-733
7. Korotov, S., Křížek, M. (2002): Dissection of an arbitrary polyhedron into nonobtuse tetrahedra. Preprint nr. B 3/2002 of Dept. Math. Inform. Technol., Univ. of Jyväskylä, 1-6
8. Korotov, S., Křížek, M. (2003): Local nonobtuse tetrahedral refinements of a cube. Appl. Math. Lett., 16, 1101-1104
9. Korotov, S., Křížek, M. (2003): Global and local refinement techniques yielding nonobtuse tetrahedral partitions (accepted by Comput. Math. Appl.), 1-10
10. Korotov, S., Křížek, M., Neittaanmäki, P. (2001): Weakened acute type condition for tetrahedral triangulations and the discrete maximum principle. Math. Comp. 70, 107-119
11. Křížek, M., Lin Qun (1995): On diagonal dominance of stiffness matrices in 3D. East-West J. Numer. Math. 3, 59-69
12. Křížek, M., Liu, L. (2003): On the maximum and comparison principles for a steady-state nonlinear heat conduction problem. Z. Angew. Math. Mech. 83, 559-563
13. Křížek, M., Neittaanmäki, P. (1990): Finite Element Approximation of Variational Problems and Applications. Pitman Monographs and Surveys in Pure and Applied Mathematics vol. 50, Longman Scientific & Technical
14. Protter, M.H., Weinberger, H.F. (1967): Maximum Principles in Differential Equations. Prentice-Hall, New Jersey
15. Varga, R.S. (1962): Matrix Iterative Analysis. Prentice-Hall, New Jersey
A Posteriori Error Estimation in Terms of Linear Functionals for Boundary Value Problems of Elliptic Type

Sergey Korotov¹, Pekka Neittaanmäki¹ and Sergey Repin²

¹ Department of Mathematical Information Technology, University of Jyväskylä, P.O. Box 35, FIN-40014 Jyväskylä, Finland
korotov, [email protected]
² V.A. Steklov Institute of Mathematics in St.-Petersburg, St.-Petersburg, 191011, Fontanka 27, Russia
[email protected]

Summary. The paper deals with a posteriori error estimation in terms of special problem-oriented quantities, represented as linear functionals that control the behavior of a solution in certain subdomains, along some lines, or at especially interesting points. The method of estimating such quantities is based on the analysis of the adjoint boundary-value problems, whose right-hand sides are formed by the considered linear functionals. In this way, we propose a new effective approach based on two principles: (a) the original and adjoint problems are solved on non-coinciding meshes, and (b) the term representing the product of gradients of errors of the primal and adjoint problems is estimated by using the "gradient averaging" technique. The model problem of elliptic type is analysed and the results of numerical tests are presented.

1 Introduction

A posteriori error estimates play an important role in modern numerical analysis. For finite element methods, they are usually obtained either by estimating a weak norm of the residual (see, e.g., [1], [2], [3], [4], [5], [12]) or by using special post-processing procedures (see, e.g., [12], [13]). For Galerkin approximations of linear elliptic problems, they estimate the error in the global (energy) norm and also provide error indicators that are further used in various mesh adaptive procedures. Global error estimates give a general idea of the quality of an approximate solution and a stopping criterion. However, for engineering purposes, such information is often not sufficient. In many cases, analysts are mainly interested not in the value of the total error, but in errors over certain subdomains, lines, or at special points. A possible way of estimating such errors is to introduce a linear functional $\ell$ associated with a "quantity of interest" and to obtain an estimate for the value of $\langle \ell, u - v \rangle$, where $u$ is the exact solution and $v$ is the approximate one.

Known methods (see, e.g., [1], [6], [11]) find estimates of $\langle \ell, u - u_h \rangle$ for a Galerkin approximation $u_h$ by employing an additional (adjoint) problem, whose right-hand side is formed by the functional $\ell$. If the Galerkin approximation of the adjoint problem is computed on the same mesh as $u_h$, then the functional $\langle \ell, u - u_h \rangle$ is expressed via a certain integral functional, which can be estimated by using, e.g., the "equilibrated residual method" (see, e.g., [1], [11]).
In the present work, we propose a new way of estimating "quantities of interest". It is based on two principles: (a) the original and adjoint problems are solved on non-coinciding meshes, and (b) the term representing the product of errors arising in the primal and adjoint problems is estimated by the gradient recovery technique widely used in various applied problems (see [7], [10], [12], [13]). This distinguishes our approach from others, where it is usually assumed that the Galerkin approximations of the primal and adjoint problems are computed in the same finite dimensional subspaces.
The effectivity of the method suggested in this paper increases strongly when one is interested not in a single solution of the primal problem for concrete data, but analyzes a series of approximate solutions for a certain set of boundary conditions and various right-hand sides (such a situation is typical in engineering design when it is necessary to model the behavior of a construction for various working regimes). In this case, the adjoint problem must be solved only once for each "quantity of interest", and its solution can be further used in testing the accuracy of approximate solutions of various primal problems.

2 General scheme

Introduce Hilbert spaces $H$ and $Y$ with scalar products and norms

and the Banach space $V$, which is continuously embedded into $H$, with the norm denoted by $\|\cdot\|_V$. Let $\Lambda \in \mathcal{L}(V, Y)$, $\mathcal{A} \in \mathcal{L}(Y, Y)$, and

\[ c_1\|y\|_Y^2 \le (\mathcal{A}y, y)_Y \le c_2\|y\|_Y^2 \quad \forall y \in Y, \qquad c_3\|w\|_V \le \|\Lambda w\|_Y \quad \forall w \in V_0, \tag{2.1} \]

where $V_0$ is a subspace of $V$ and $c_1$, $c_2$, $c_3$ are positive constants. Given $f \in V_0^*$, consider the following problem:
Primal Problem (P): Find $u \in V_0$ such that

\[ (\mathcal{A}\Lambda u, \Lambda w)_Y = \langle f, w \rangle \quad \forall w \in V_0, \tag{2.2} \]

where $\langle \cdot, \cdot \rangle$ denotes the duality pairing of the spaces $V_0$ and $V_0^*$.
Let $\ell$ be another element of $V_0^*$, which forms the "quantity of interest" $\langle \ell, u - \tilde u \rangle$, for an arbitrary element $\tilde u \in V_0$ viewed as an approximation
of $u$. In order to estimate the above defined quantity, we introduce another problem.
Adjoint Problem ($P_a$): Find $v \in V_0$ such that

\[ (\mathcal{A}^*\Lambda v, \Lambda w)_Y = \langle \ell, w \rangle \quad \forall w \in V_0, \tag{2.3} \]

where $\mathcal{A}^*$ is the operator adjoint to $\mathcal{A}$, i.e., $(\mathcal{A}y, z)_Y = (y, \mathcal{A}^*z)_Y$ $\forall y, z \in Y$.
In the frequently encountered case of selfadjoint operators, the left-hand sides of (2.2) and (2.3) coincide and both problems are associated with a functional of the type

\[ J(w) = \frac12 (\mathcal{A}\Lambda w, \Lambda w)_Y + \langle \mu, w \rangle, \]

which is known to have a unique minimizer on $V_0$ for any $\mu \in V_0^*$.

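Proposition 1. Let $u$ and $v$ be solutions of problems (P) and (P_a), respectively. Then for any $\tilde u, \tilde v \in V_0$

$$\langle \ell, u - \tilde u\rangle =: E(\tilde u,\tilde v) = E_0(\tilde u,\tilde v) + E_1(\tilde u,\tilde v), \tag{2.4}$$

where
$$E_0(\tilde u,\tilde v) = \langle f, \tilde v\rangle - (A\Lambda\tilde u, \Lambda\tilde v)_Y, \tag{2.5}$$
and
$$E_1(\tilde u,\tilde v) = (A\Lambda(u - \tilde u), \Lambda(v - \tilde v))_Y. \tag{2.6}$$
The proof can be found in [8].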
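The identity (2.4) can be checked numerically in a finite-dimensional stand-in. The sketch below is only an illustration under stated assumptions: $V_0 = \mathbb{R}^2$, $Y = \mathbb{R}^3$, and the matrices `L` (playing the role of $\Lambda$) and `A`, the data `f`, `ell` and the trial vectors are all made up for the purpose; none of them come from the paper.

```python
# Finite-dimensional stand-in: V0 = R^2, Y = R^3 (all data below are illustrative).
L = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                 # plays the role of Lambda
A = [[2.0, 1.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]  # positive definite, not symmetric

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    return [[dot(row, col) for col in zip(*Q)] for row in P]

def solve2(M, b):
    """Cramer's rule for a 2x2 system M x = b."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

K = matmul(transpose(L), matmul(A, L))   # (A L u, L w) = w^T K u
f, ell = [1.0, 2.0], [3.0, -1.0]
u = solve2(K, f)                          # primal problem, cf. (2.2)
v = solve2(transpose(K), ell)             # adjoint problem, cf. (2.3): uses A*

def error_split(u_t, v_t):
    """Return (<l, u - u_t>, E0, E1) for arbitrary approximations u_t, v_t."""
    E0 = dot(f, v_t) - dot(matvec(A, matvec(L, u_t)), matvec(L, v_t))
    du = [a - b for a, b in zip(u, u_t)]
    dv = [a - b for a, b in zip(v, v_t)]
    E1 = dot(matvec(A, matvec(L, du)), matvec(L, dv))
    return dot(ell, du), E0, E1

goal, E0, E1 = error_split([0.3, -0.2], [1.0, 0.5])
print(goal, E0 + E1)   # the two numbers agree up to rounding
```

Passing the exact adjoint solution `v` as the second argument makes `E1` vanish, so that `E0` alone reproduces the quantity of interest.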
The term $E_0(\tilde u,\tilde v)$ is explicitly computable, whereas the term $E_1(\tilde u,\tilde v)$ contains the unknown solutions of Problems (P) and (P_a). Evidently, the term $E_0(\tilde u,\tilde v)$ dominates if $\tilde v$ is "sufficiently close" to the exact solution $v$ of Problem (P_a). Indeed, if $\tilde v \to v$ in $V$, then $\tilde v \to v$ in $H$ and $\Lambda(v - \tilde v) \to 0$ in $Y$, so that
$$E_1(\tilde u,\tilde v) \to 0 \quad\text{and}\quad E_0(\tilde u,\tilde v) \to \langle \ell, u - \tilde u\rangle,$$
i.e., $E_0(\tilde u,\tilde v)$ contains the major part of the quantity of interest.
Let $V_h$ and $V_\tau$ be two finite-dimensional subspaces of $V_0$, and let $\tilde u = u_h$, $\tilde v = v_\tau$, where $u_h$ and $v_\tau$ are solutions of the problems

$$(A\Lambda u_h, \Lambda w_h)_Y = \langle f, w_h\rangle \quad \forall w_h \in V_h, \tag{2.7}$$

$$(A^*\Lambda v_\tau, \Lambda w_\tau)_Y = \langle \ell, w_\tau\rangle \quad \forall w_\tau \in V_\tau. \tag{2.8}$$
In the particular case $V_h \equiv V_\tau$, the term $E_0(u_h,v_\tau) = 0$ due to the orthogonality condition in problem (2.7). Thus, if the meshes in problems (2.7) and (2.8) coincide, then the estimate has only one term, containing the product of the (unknown) energy errors. In contrast, the use of non-coinciding meshes leads to another estimate that has two terms.
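The vanishing of $E_0$ on coinciding meshes takes one line. Written out, assuming only that $V_\tau = V_h$, so that $v_\tau$ is an admissible test function in (2.7):

```latex
E_0(u_h, v_\tau)
  = \langle f, v_\tau \rangle - (A\Lambda u_h, \Lambda v_\tau)_Y
  = 0
  \quad\text{(take } w_h = v_\tau \in V_h \text{ in (2.7))},
\qquad\text{hence}\qquad
\langle \ell, u - u_h \rangle
  = E_1(u_h, v_\tau)
  = (A\Lambda(u - u_h), \Lambda(v - v_\tau))_Y .
```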
590 S. Korotov et al.

Finding a sharp approximation of the adjoint problem may require high computational costs. A more economical way is to use an approximate solution of the adjoint problem of approximately the same quality as the approximate solution of the primal one, and to recover the unknown functions $\Lambda u$ and $\Lambda v$ by some post-processing technique. Let $G_h$ and $G_\tau$ be certain averaging operators defined on $V_h$ and $V_\tau$, respectively. We replace $E(u_h,v_\tau)$ by the directly computable functional
$$\widetilde E(u_h,v_\tau) = E_0(u_h,v_\tau) + \widetilde E_1(u_h,v_\tau), \tag{2.9}$$
where
$$\widetilde E_1(u_h,v_\tau) = \big(A(\Lambda u_h - G_h(\Lambda u_h)),\ \Lambda v_\tau - G_\tau(\Lambda v_\tau)\big)_Y. \tag{2.10}$$
If the operators $G_h$ and $G_\tau$ provide proper recovery of $\Lambda u$ and $\Lambda v$, then it is natural to expect that the difference between $E_1(u_h,v_\tau)$ and $\widetilde E_1(u_h,v_\tau)$ consists of higher order terms and, thus, the latter quantity can be successfully used instead of $E_1$. This observation is justified theoretically and numerically in what follows for a model problem of elliptic type.

3 Model elliptic type problem


3.1 Formulation of the problem

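We define the operator $\Lambda$ as $\nabla := \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}\right)$ and set

$$Y = L^2(\Omega,\mathbb{R}^2), \quad H = L^2(\Omega), \quad V = W_2^1(\Omega), \quad V_0^* = H^{-1}(\Omega).$$

Consider the following problem: find $u$ satisfying the system
$$-\operatorname{div}(A\nabla u) = f \quad\text{in } \Omega, \tag{3.1}$$
$$u = 0 \quad\text{on } \partial\Omega. \tag{3.2}$$
In the above, $\Omega$ is a bounded and connected domain in $\mathbb{R}^2$ with a Lipschitz continuous boundary $\partial\Omega$, the symbol $\nu$ denotes the unit outward normal to the boundary, and the matrix of coefficients $A = \{a_{ij}(x)\}_{i,j=1}^{2}$ is symmetric and meets the conditions
$$a_{ij}(x) \in L^\infty(\Omega), \qquad A(x)\xi\cdot\xi \ge c\,\|\xi\|^2 \quad \forall \xi \in \mathbb{R}^2,\ \forall x \in \Omega, \tag{3.3}$$
where the dot denotes the scalar product in $\mathbb{R}^2$. Also, it is assumed that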
(3.4)
Hereafter, various constants, independent of $h$ and $\tau$, are denoted by one letter $C$.
Now, Problem (P) consists of finding $u \in V_0$ such that

$$\int_\Omega A\nabla u\cdot\nabla w\,dx = \int_\Omega f w\,dx \quad \forall w \in V_0, \tag{3.5}$$

where $V_0 = \mathring{W}_2^1(\Omega)$. Let $\tilde u \in V_0$ be an approximation of $u$. Assume that we are interested in the value of

$$\langle \ell, u - \tilde u\rangle = \int_\Omega \varphi\,(u - \tilde u)\,dx, \tag{3.6}$$

where $\operatorname{supp}\varphi \subset \omega \subseteq \Omega$. Estimates of such quantities for one (or several) a priori given functions $\varphi$ give information about the behaviour of $u - \tilde u$ in $\omega$.
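In this case, Adjoint Problem (P_a) consists of finding $v \in V_0$ such that

$$\int_\Omega A\nabla v\cdot\nabla w\,dx = \langle \ell, w\rangle \quad \forall w \in V_0. \tag{3.7}$$

Suppose that $\tilde v \in V_0$ is an approximation of $v$. Then, for problems (3.5) and (3.7), the relation (2.4) reads as follows:
$$\langle \ell, u - \tilde u\rangle = E_0(\tilde u,\tilde v) + E_1(\tilde u,\tilde v), \tag{3.8}$$
where
$$E_0(\tilde u,\tilde v) = \int_\Omega f\tilde v\,dx - \int_\Omega A\nabla\tilde v\cdot\nabla\tilde u\,dx, \tag{3.9}$$

$$E_1(\tilde u,\tilde v) = \int_\Omega A\nabla(v - \tilde v)\cdot\nabla(u - \tilde u)\,dx. \tag{3.10}$$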
Let $V_h$ and $V_\tau$ be two finite-dimensional subspaces in $V_0$, constructed by the Courant type finite element discretization. The respective Galerkin approximations $u_h$ and $v_\tau$ are formed by piecewise-affine continuous functions and defined on the domains

$$\Omega_h = \bigcup_{T^h \in \mathcal{T}_h} T^h, \qquad \Omega_\tau = \bigcup_{T^\tau \in \mathcal{T}_\tau} T^\tau, \tag{3.11}$$

where the respective triangulations are denoted by $\mathcal{T}_h$ and $\mathcal{T}_\tau$, and their elements are denoted by $T^h$ and $T^\tau$, respectively. Consider the corresponding finite dimensional problems (we assume for a moment that $\Omega_h = \Omega_\tau = \Omega$):
Problem (P^h): Find $u_h \in V_h$ such that

$$\int_\Omega A\nabla u_h\cdot\nabla w_h\,dx = \int_\Omega f w_h\,dx \quad \forall w_h \in V_h, \tag{3.12}$$

Problem (P_a^τ): Find $v_\tau \in V_\tau$ such that

$$\int_\Omega A\nabla v_\tau\cdot\nabla w_\tau\,dx = \langle \ell, w_\tau\rangle \quad \forall w_\tau \in V_\tau. \tag{3.13}$$

Set $\tilde u = u_h$ and $\tilde v = v_\tau$. Then

$$\langle \ell, u - u_h\rangle =: E(u_h,v_\tau) = E_0(u_h,v_\tau) + E_1(u_h,v_\tau).$$


On $\mathcal{T}_h$, we define the gradient averaging operator $G_h : L^\infty(\Omega_h,\mathbb{R}^2) \to W_2^1(\Omega_h,\mathbb{R}^2)$ as the operator that defines a vector-valued piecewise linear function by setting each nodal value to the mean value of $\nabla u_h$ on all elements incident to the corresponding nodal point. The averaging operator $G_\tau : L^\infty(\Omega_\tau,\mathbb{R}^2) \to W_2^1(\Omega_\tau,\mathbb{R}^2)$ is defined in the same manner. Nowadays, averaging operators of this type are widely used in mathematical modelling (see, e.g. [1], [4], [7], [10], [12]).
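The nodal averaging described above can be sketched in a few lines. The code assumes a uniform mesh of the unit square with every cell cut along the same diagonal and an unweighted arithmetic mean over incident elements. The function names, the mesh layout and the omission of any special treatment near the boundary are choices made for this illustration, not details taken from the paper or from [7].

```python
def build_mesh(N):
    """Unit square split into 2*N*N right triangles (each cell cut along one diagonal)."""
    nodes = [(i / N, j / N) for j in range(N + 1) for i in range(N + 1)]
    tris = []
    for j in range(N):
        for i in range(N):
            a = j * (N + 1) + i                    # lower-left corner of the cell
            b, c, d = a + 1, a + N + 2, a + N + 1  # lower-right, upper-right, upper-left
            tris += [(a, b, c), (a, c, d)]
    return nodes, tris

def element_gradient(p, q, r, up, uq, ur):
    """Constant gradient of the affine interpolant on the triangle (p, q, r)."""
    dx1, dy1 = q[0] - p[0], q[1] - p[1]
    dx2, dy2 = r[0] - p[0], r[1] - p[1]
    det = dx1 * dy2 - dx2 * dy1
    gx = ((uq - up) * dy2 - (ur - up) * dy1) / det
    gy = (dx1 * (ur - up) - dx2 * (uq - up)) / det
    return gx, gy

def averaged_gradient(nodes, tris, u):
    """Nodal values of the averaged gradient: mean of element gradients over incident elements."""
    sums = [[0.0, 0.0, 0] for _ in nodes]
    for tri in tris:
        g = element_gradient(*(nodes[k] for k in tri), *(u[k] for k in tri))
        for k in tri:
            sums[k][0] += g[0]
            sums[k][1] += g[1]
            sums[k][2] += 1
    return [(sx / n, sy / n) for sx, sy, n in sums]

nodes, tris = build_mesh(4)
u_lin = [2.0 * x + 3.0 * y for x, y in nodes]   # affine data: recovery is exact everywhere
u_quad = [x * x for x, y in nodes]              # quadratic data: exact at interior nodes here
G_lin = averaged_gradient(nodes, tris, u_lin)
G_quad = averaged_gradient(nodes, tris, u_quad)
print(G_lin[0], G_quad[12])                     # node 12 is the centre (0.5, 0.5)
```

On this particular uniform mesh, the averaged gradient of the interpolant of $x^2$ coincides with the exact gradient at interior nodes, which is the kind of behaviour the superconvergence results rely on.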
Averaging near the boundary ($\Omega_h \ne \Omega_\tau \ne \Omega$) requires a more complicated analysis. This question has been investigated in the literature (see, e.g. [7], where concrete forms of the averaging operators near the boundary are presented). In our further considerations, we use the results of that paper and, therefore, impose the same conditions on the problem data and the structure of meshes. Namely, we assume that $\partial\Omega$ belongs to the class $C^3$, the coefficients $a_{ij}$ are smooth and the sets $\Omega_h$ and $\Omega_\tau$, defined in (3.11), are such that

(3.14)

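Further, we assume that the functions $u_h$ and $v_\tau$ as well as the averaged gradients $G_h(\nabla u_h)$ and $G_\tau(\nabla v_\tau)$ are extended by zero on $\Omega_{0h} := \Omega\setminus\Omega_h$ and $\Omega_{0\tau} := \Omega\setminus\Omega_\tau$.
For any pair $(w_h,w_\tau) \in V_h\times V_\tau$, we define the following functional:

$$\widetilde E(w_h,w_\tau) = E_0(w_h,w_\tau) + \widetilde E_1(w_h,w_\tau), \tag{3.15}$$

where

$$\widetilde E_1(w_h,w_\tau) = \int_{\Omega_h} A\big(\nabla w_h - G_h(\nabla w_h)\big)\cdot\big(\nabla w_\tau - G_\tau(\nabla w_\tau)\big)\,dx. \tag{3.16}$$

The functional $\widetilde E(u_h,v_\tau)$ is directly computable once the approximations $u_h$ and $v_\tau$ are defined. Our aim is to show that $\widetilde E(u_h,v_\tau)$ is a simple and effective estimator of the quantity $\langle \ell, u - u_h\rangle$.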
We can easily show that
(3.17)

where $|u|_{2,2,\Omega}$ denotes the $L^2$-norm of the second derivatives of $u$, which exist due to the conditions imposed on Problem (P), see below.
The estimate (3.17) is sharp, in the sense that it gives the correct asymptotic order of the term $E_0$, whereas the terms $E_1(u_h,v_\tau)$ and $\widetilde E_1(u_h,v_\tau)$ are asymptotically smaller. Indeed,

(3.18)

and

$$|\widetilde E_1(u_h,v_\tau)| = \Big|\int_{\Omega_h} A\big(\nabla u_h - G_h(\nabla u_h)\big)\cdot\big(\nabla v_\tau - G_\tau(\nabla v_\tau)\big)\,dx\Big| \le C\big(\|\nabla u_h - \nabla u\|_{2,\Omega} + \|\nabla u - G_h(\nabla u_h)\|_{2,\Omega_h}\big)\times\big(\|\nabla v_\tau - \nabla v\|_{2,\Omega} + \|\nabla v - G_\tau(\nabla v_\tau)\|_{2,\Omega_\tau}\big).$$
Therefore, if the superconvergence of the averaged gradients takes place, then the first terms in the round brackets dominate and, therefore, $\widetilde E_1(u_h,v_\tau)$ has the same asymptotic order $h\tau$.
From the above, we see that asymptotically $E_0(u_h,v_\tau)$ contains the main part of the quantity $\langle \ell, u - u_h\rangle$, except in the case $V_h \equiv V_\tau$, when this term vanishes.
It remains to compare $E_1(u_h,v_\tau)$ and $\widetilde E_1(u_h,v_\tau)$. We do this under the additional assumptions used in [7]. Namely, it is assumed that $f$ and $\ell$ belong to $W_2^2(\Omega)$, so that $u, v \in W_2^3(\Omega)$, and that the triangulations $\mathcal{T}_h$ and $\mathcal{T}_\tau$ are composed of uniform elements, for which the superconvergence estimates

$$\|\nabla u - G_h(\nabla u_h)\|_{2,\Omega_h} \le C h^{3/2}\big(\|u\|_{3,2,\Omega} + \|f\|_{2,2,\Omega}\big), \tag{3.19}$$

$$\|\nabla v - G_\tau(\nabla v_\tau)\|_{2,\Omega_\tau} \le C \tau^{3/2}\big(\|v\|_{3,2,\Omega} + \|\ell\|_{2,2,\Omega}\big) \tag{3.20}$$

hold. Also, we use the superconvergence of $\Pi_h u$ and $\Pi_\tau v$, where $\Pi_h$ and $\Pi_\tau$ are the nodal interpolation operators, which has been established (see [7]) under the same additional assumptions. In our case, the estimates have the form

$$\|\nabla u_h - \nabla(\Pi_h u)\|_{2,\Omega_h} \le C h^{3/2}\big(\|u\|_{3,2,\Omega} + \|f\|_{2,2,\Omega}\big), \tag{3.21}$$

$$\|\nabla v_\tau - \nabla(\Pi_\tau v)\|_{2,\Omega_\tau} \le C \tau^{3/2}\big(\|v\|_{3,2,\Omega} + \|\ell\|_{2,2,\Omega}\big). \tag{3.22}$$

Proposition 2. Let $\partial\Omega \in C^3$, $\mathcal{T}_h$ and $\mathcal{T}_\tau$ be two families of triangulations with the above described properties, and $u, v \in W_2^3(\Omega)$. Then, for sufficiently small $h$ and $\tau$ ($\tau < h$), we have

where $m$ is any positive integer greater than 2, $\mu(h,\tau)$ contains higher order terms, and the constant $C$ does not depend on $h$ and $\tau$.

The proof can be found in [8].


Let us summarize the above analysis of the behavior of $E_0(u_h,v_\tau)$, $E_1(u_h,v_\tau)$, and $\widetilde E_1(u_h,v_\tau)$ with respect to sufficiently small $h$ and $\tau$ ($h > \tau$):
- the explicitly computable term $E_0(u_h,v_\tau)$ is the major one;
- the terms $E_1(u_h,v_\tau)$ and $\widetilde E_1(u_h,v_\tau)$ are subsidiary ones;
- the difference between $E_1(u_h,v_\tau)$ and $\widetilde E_1(u_h,v_\tau)$ has a higher asymptotic order, provided that the solutions of the problems (P) and (P_a) are regular enough to guarantee the superconvergence of the Galerkin approximations.
The above observations suggest the following numerical strategy:
a) Define $V_\tau$ taking into account the nature of the functional $\ell$ (e.g., by putting extra trial functions in a subdomain associated with it), and calculate $v_\tau$.
b) Define $V_h$ and calculate $u_h$.
c) Calculate $E_0(u_h,v_\tau)$ directly and use post-processed values of $\nabla u_h$ and $\nabla v_\tau$ to estimate $E_1(u_h,v_\tau)$, replacing the gradients by the averaged gradients.
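Steps a) to c) can be traced on a one-dimensional analogue. The sketch below is an illustration under assumptions that are not in the paper: the model problem is $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$ (so $u = x(1-x)/2$), the functional is $\ell(w) = \int_\omega w\,dx$ with $\omega$ aligned with primal mesh nodes, both meshes are uniform and nested, and the averaged derivative at a boundary node is simply the slope of the single adjacent element. All function names are hypothetical.

```python
def solve_p1(n, load):
    """P1 Galerkin solution of -w'' = g on (0,1), w(0)=w(1)=0, on n uniform elements.
    load[i-1] is the integral of g*phi_i for interior node i; returns n+1 nodal values."""
    h, m = 1.0 / n, n - 1
    diag, rhs, off = [2.0 / h] * m, list(load), -1.0 / h
    for i in range(1, m):                      # Thomas algorithm: forward elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    x = [0.0] * m
    x[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):             # back substitution
        x[i] = (rhs[i] - off * x[i + 1]) / diag[i]
    return [0.0] + x + [0.0]

def slopes(w):
    n = len(w) - 1
    return [(w[i + 1] - w[i]) * n for i in range(n)]

def averaged(s):
    """Nodal values of the averaged derivative (mean over the elements sharing a node)."""
    return [s[0]] + [0.5 * (s[i - 1] + s[i]) for i in range(1, len(s))] + [s[-1]]

def pw_linear(nodal, x):
    n = len(nodal) - 1
    i = min(int(x * n), n - 1)
    t = x * n - i
    return (1.0 - t) * nodal[i] + t * nodal[i + 1]

def estimate(n_h, k, ia, ib):
    """Primal mesh: n_h elements; adjoint mesh: k*n_h elements; omega = (ia/n_h, ib/n_h)."""
    n_t = k * n_h
    h, tau = 1.0 / n_h, 1.0 / n_t
    # a) adjoint solution v_tau for l(w) = integral of w over omega
    load = [0.0] * (n_t + 1)
    for e in range(ia * k, ib * k):
        load[e] += 0.5 * tau
        load[e + 1] += 0.5 * tau
    v = solve_p1(n_t, load[1:-1])
    # b) primal solution u_h for f = 1
    u = solve_p1(n_h, [h] * (n_h - 1))
    su, sv = slopes(u), slopes(v)
    gu, gv = averaged(su), averaged(sv)
    # c) E0 = <f, v_tau> - (u_h', v_tau'); both integrals are exact for P1 data
    E0 = tau * sum(v) - sum(su[e // k] * sv[e] * tau for e in range(n_t))
    E1t = 0.0
    for e in range(n_t):                       # Simpson's rule, exact for quadratics
        pts = (e * tau, (e + 0.5) * tau, (e + 1) * tau)
        g = [(su[e // k] - pw_linear(gu, x)) * (sv[e] - pw_linear(gv, x)) for x in pts]
        E1t += tau / 6.0 * (g[0] + 4.0 * g[1] + g[2])
    # reference value: integral over omega of u - u_h with u = x(1-x)/2
    ex = lambda x: 0.5 * x * (1.0 - x)
    exact = 0.0
    for e in range(ia, ib):
        x0, x1 = e * h, (e + 1) * h
        xm = 0.5 * (x0 + x1)
        d = (ex(x0) - u[e], ex(xm) - 0.5 * (u[e] + u[e + 1]), ex(x1) - u[e + 1])
        exact += h / 6.0 * (d[0] + 4.0 * d[1] + d[2])
    return E0, E1t, exact

E0, E1t, exact = estimate(n_h=8, k=4, ia=2, ib=4)
print(E0, E1t, E0 + E1t, exact)
```

In this configuration `E0` already carries most of the error, and `E0 + E1t` matches the reference value up to rounding. That exact match is a peculiarity of the one-dimensional toy problem, where P1 Galerkin solutions are nodally exact, and should not be expected in two dimensions. Calling `estimate(8, 1, 2, 4)` (coinciding meshes) returns `E0` equal to zero up to rounding, in line with the remark on $V_h \equiv V_\tau$.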

3.2 Numerical experiments

In this subsection, we present numerical results for the model problem (more numerical tests on the subject can be found in [8]).

Test Problem: Find $u$ such that

$$-\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} = f(x,y) \quad\text{in } \Omega = (0,1)\times(0,1),$$

$$u = 0 \quad\text{on } \partial\Omega,$$
where the exact solution is the infinitely smooth "hat-function":

exp (16 - 1-(4~-3)2 ) . exp (16 - 1 d y6 3)2) ,


u(x, y) = { if (x, y) E (0.5,1.0) x (0.5,1.0),
0, otherwise,
As a quantity of interest, we take

$$\langle \ell, u - u_h\rangle = \int_\omega (u - u_h)\,dx, \quad\text{where } \omega = (0.125, 0.250)\times(0.125, 0.250),$$

and define the effectivity index Jeff = I~~~:"'~~~I'


The results of computations are presented in Table 1 below, where the symbol N × N (N = 16, 32, 64, 128) means that the corresponding test is performed on a triangular mesh of the square (0, 1) × (0, 1) consisting of 2 × N × N right triangles. The results demonstrate a good performance of the estimator.

Table 1. The results of performance of the estimator E.

Primal     Adjoint      E_0     <ℓ, u - u_h>     I_eff
16 × 16    32 × 32      -0.00000501  -0.00000062  -0.00000439  -0.00000495  0.89
16 × 16    64 × 64      -0.00000496  -0.00000017  -0.00000479  -0.00000495  0.97
16 × 16    128 × 128    -0.00000495  -0.00000005  -0.00000490  -0.00000495  0.99
A Posteriori Error Estimation in Terms of Linear Functionals 595

4 Final comments

1) The computations made in item a) can be reused for estimating the errors of an approximate solution obtained on another mesh and for different right-hand side functions f.
2) The approach is valid for other boundary conditions [8], is suitable for estimating local integral norms [8], and can be applied to problems in linear elasticity theory [9].

References
1. Ainsworth, M., Oden, J. T. (2000): A posteriori error estimation in finite element analysis. John Wiley & Sons, Inc.
2. Babuška, I., Rheinboldt, W. C. (1978): Error estimates for adaptive finite element computations. SIAM J. Numer. Anal. 15, 736-754.
3. Babuška, I., Rheinboldt, W. C. (1978): A posteriori error estimates for the finite element method. Internat. J. Numer. Methods Engrg. 12, 1597-1615.
4. Babuška, I., Strouboulis, T. (2001): The Finite Element Method and its Reliability. Oxford University Press Inc., New York.
5. Bangerth, W., Rannacher, R. (2003): Adaptive finite element methods for differential equations. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel.
6. Becker, R., Rannacher, R. (1996): A feed-back approach to error control in finite element methods: Basic approach and examples. East-West J. Numer. Math. 4, 237-264.
7. Hlaváček, I., Křížek, M. (1987): On a superconvergent finite element scheme for elliptic systems. Apl. Mat. 32, 131-154.
8. Korotov, S., Neittaanmäki, P., Repin, S. (2003): A posteriori error estimation of goal-oriented quantities by the superconvergence patch recovery. J. Numer. Math. 11, 33-59.
9. Korotov, S., Neittaanmäki, P., Repin, S. (2004): A posteriori error estimation in terms of linear functionals for problems in the elasticity theory. Russian J. Numer. Anal. Math. Modelling (to appear).
10. Křížek, M., Neittaanmäki, P. (1984): Superconvergence phenomenon in the finite element method arising from averaging gradients. Numer. Math. 45, 105-116.
11. Oden, J. T., Prudhomme, S. (2001): Goal-oriented error estimation and adaptivity for the finite element method. Comput. Math. Appl. 41, 735-756.
12. Verfürth, R. (1996): A review of a posteriori error estimation and adaptive mesh-refinement techniques. Wiley-Teubner.
13. Zienkiewicz, O. C., Zhu, J. Z. (1987): A simple error estimator and adaptive procedure for practical engineering analysis. Internat. J. Numer. Methods Engrg. 24, 337-357.
Numerical Solution of Flow In Backward Facing
Step

Karel Kozel¹, Petr Louda² and Petr Sváček¹

¹ Department of Technical Mathematics, Karlovo nám. 13, Faculty of Mechanical Engineering, CTU, Prague [email protected]
² Institute of Thermomechanics AS CR, Czech Academy of Science, Dolejškova 5, Prague

Summary. The work deals with numerical testing of two different numerical methods, based on finite volumes (FV) and finite elements (FE), for different Reynolds numbers. The finite volume method is based on an upwind scheme (third order) for convective terms and a central second order scheme for dissipative terms. The finite element method consists of stabilization of the weak formulation for higher Reynolds numbers with the help of a streamline-upwind (Petrov-Galerkin) modification.
The authors compare both numerical results with experiment for laminar flow at Re ∈ (100, 700), where a steady solution exists, using the length of the separation domain on the lower wall as well as on the upper wall for Re ≥ 400.

1 Introduction

The work deals with the numerical solution of 2D and 3D flow over a backward facing step (BFS). The mathematical model is the system of Navier-Stokes or Reynolds averaged Navier-Stokes (RANS) equations with two-equation turbulence models. Concerning BFS flow in 2D, we focus on the transitional regimes from laminar to turbulent flow. The start of transition is investigated using the model of laminar flow, which allows us to make even small differences between different numerical schemes visible. The end of transition is investigated using the RANS model. A comparison with measurement data is made. The laminar model is extended to 3D.
We are also interested in a comparison of results obtained with finite volume and finite element schemes. We discuss the stabilization of finite elements with the help of an advanced SUPG method in order to obtain a robust solver for unsteady incompressible laminar flows.

2 Mathematical model

The equations used to solve laminar flow of an incompressible viscous fluid are the Navier-Stokes equations, which in 2D and Cartesian coordinates $(x, y)$ have the form

$$R W_t + F_x + G_y = \frac{\tilde R}{Re}\,(W_{xx} + W_{yy}), \tag{1}$$

where
$$W = \mathrm{col}\|p, u, v\|, \quad R = \tilde R = \mathrm{diag}\|0, 1, 1\|,$$
$$F = u f + \mathrm{col}\|0, p, 0\|, \quad G = v f + \mathrm{col}\|0, 0, p\|, \quad f = \mathrm{col}\|1, u, v\|, \tag{2}$$
$Re$ is the Reynolds number, $u$, $v$ are the components of the velocity vector, $p$ is the static pressure divided by density, and $\mathrm{col}\|\cdot\|$ denotes a column vector.
In the case of turbulent flow, we solve the Reynolds averaged Navier-Stokes (RANS) equations to obtain the mean flow field. The RANS equations have the form of (1) with $(u, v)$, $p$ being mean values of velocity and pressure and the right hand side replaced by

(3)
with
(4)
where $\tau_{ij}$ is the Reynolds stress tensor. It is approximated using the low-Re two-equation turbulence models mentioned in Section 5.
The formulation of laminar and turbulent 2D backward-facing step flow is completed by the boundary conditions: fully developed channel flow at the inlet, zero velocity on the walls, and zero streamwise derivative of velocity at the outlet.

3 Finite volume method


The system of Navier-Stokes or RANS equations is solved by means of the artificial compressibility method, which replaces the matrix $R$ in (1) by a regular matrix. We have used
$$R = \mathrm{diag}\|1/u_{max}, 1, 1\|, \tag{5}$$
where $u_{max}$ is the maximum velocity in the domain. The method is thus applicable to the solution of steady flows only.
Eq. (1) with (5) is solved using a cell-centered finite volume method. The mean value of $W$ for the $i,j$ finite volume (cell) $D_{i,j}$, $W_{i,j}$, will satisfy a numerical approximation of
$$R\,\frac{\partial W_{i,j}}{\partial t}\,|D_{i,j}| + \oint_{\partial D_{i,j}} F\,dy - G\,dx = \frac{\tilde R}{Re}\oint_{\partial D_{i,j}} W_x\,dy - W_y\,dx, \tag{6}$$

where $|D_{i,j}|$ is the area of the cell. The finite volumes compose a structured (multi-block) grid consisting of quadrilaterals. The grid is orthogonal and, for turbulent cases, refined along walls.

3.1 Spatial discretization

The integrals in (6) are approximated using the mid-point rule. Cell face values of $F$, $G$, $W_x$, $W_y$ thus need to be defined. The face between cells $D_{i,j}$ and $D_{i+1,j}$ is denoted by $i+1/2, j$; analogously, the face between cells $D_{i,j}$ and $D_{i,j+1}$ is denoted by $i, j+1/2$.
The discretization of convective and pressure terms consists of defining cell face velocities $u_{i+1/2,j}$, $v_{i,j+1/2}$ and pressure in the central way,

$$u_{i+1/2,j} = \tfrac12(u_{i,j} + u_{i+1,j}), \qquad p_{i+1/2,j} = \tfrac12(p_{i,j} + p_{i+1,j}),$$
$$v_{i,j+1/2} = \tfrac12(v_{i,j} + v_{i,j+1}), \qquad p_{i,j+1/2} = \tfrac12(p_{i,j} + p_{i,j+1}),$$
and an upwind biased "Monotone Upstream-centered Schemes for Conservation Laws" (MUSCL) [12] interpolation in the direction of grid lines, here the line $j = \mathrm{const}$, for the cell face momentum $f$ in (2):

$$f_{i+1/2,j} = f_i + \tfrac14(1+\kappa)(f_{i+1} - f_i) + \tfrac14(1-\kappa)(f_i - f_{i-1}), \qquad u_{i+1/2} > 0,$$
$$f_{i+1/2,j} = f_{i+1} - \tfrac14(1+\kappa)(f_{i+1} - f_i) - \tfrac14(1-\kappa)(f_{i+2} - f_{i+1}), \qquad u_{i+1/2} \le 0, \tag{7}$$
where the constant index $j$ has been omitted on the r.h.s. and $f_{i,j+1/2}$ is obtained similarly in the $j$-direction depending on $v_{i,j+1/2}$. We have used $\kappa = 1/3$, i.e. an up to third order accurate upwind scheme. The same form is used for turbulent flows as well, although the formal accuracy is degraded by the non-regularity of the grid. On the other hand, the grid refinement is needed inside shear layers, where the diffusive term is dominant. There was no need to use a limiter in Eq. (7) for the momentum equations. In the case of the turbulence model equations, a limiter is sometimes necessary to achieve better convergence in the non-turbulent regions. We have used a minmod one.
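The one-dimensional κ-interpolation to a face can be sketched as follows. The function name and argument order are hypothetical, the limiter is left out, and the branch for non-positive face velocity is written as the mirror image of the branch for positive face velocity, which is the standard form of the scheme.

```python
def muscl_face(f_im1, f_i, f_ip1, f_ip2, u_face, kappa=1.0 / 3.0):
    """Upwind-biased kappa-interpolation of f to the face i+1/2 (no limiter applied)."""
    if u_face > 0:
        # stencil biased towards cell i (upwind side for positive face velocity)
        return f_i + 0.25 * (1 + kappa) * (f_ip1 - f_i) + 0.25 * (1 - kappa) * (f_i - f_im1)
    # mirror image: stencil biased towards cell i+1
    return f_ip1 - 0.25 * (1 + kappa) * (f_ip1 - f_i) - 0.25 * (1 - kappa) * (f_ip2 - f_ip1)

# Linear data are reproduced exactly at the face for either flow direction.
left = muscl_face(0.0, 1.0, 2.0, 3.0, u_face=+1.0)
right = muscl_face(0.0, 1.0, 2.0, 3.0, u_face=-1.0)
# With kappa = 1 the interpolation reduces to the central average of cells i and i+1.
central = muscl_face(0.0, 1.0, 2.0, 5.0, u_face=+1.0, kappa=1.0)
print(left, right, central)
```

For quadratic cell data $f_i = i^2$ and $\kappa = 1/3$, both branches give the same face value $i^2 + i + 1/6$, so the two one-sided stencils are consistent with each other.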
The approximation of the cell face velocity derivatives $\tilde R W_x$, $\tilde R W_y$ needed in the diffusive terms is central. It uses quadrilateral dual finite volumes constructed over each face of a primary volume: the vertices are located at the ends of the primary face and in the centres of the adjacent primary volumes. The mid-point rule quadrature formula is again used, with the face value of velocity defined as the average of the values in the vertices of the dual cell [11].

3.2 Pressure stabilization

The finite volume is common for all unknowns. In order to avoid pressure-velocity decoupling in this collocated arrangement, a pressure diffusion term in the form of the Laplacian of pressure is added to the r.h.s. of the continuity equation. The magnitude of the pressure diffusion is adjusted by the local resolution of physical diffusion. The term is also essential for the robustness of the method [11].
The effect resembles the one in the pressure stabilized Petrov-Galerkin method; however, in the present formulation the pressure diffusion does not vanish completely in the steady state.

3.3 Discretization in time

The system (1) with $R$ given in (5) is integrated in time by means of explicit methods (multi-stage Runge-Kutta methods and the MacCormack method) or by the implicit backward Euler method. In the implicit method, the system of algebraic equations is linearized by the Newton method, which results in a block nine-diagonal system. The system is solved iteratively using a line Gauss-Seidel method, applying direct block tridiagonal system inversion on the lines. For turbulent flows, solely the implicit method was used.

4 Finite Element Method


To clarify the finite element approach, we rewrite the system of the incompressible Navier-Stokes equations in nondimensional form
$$\frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p = 0 \quad\text{in } \Omega\times(0,T), \tag{8}$$
$$\nabla\cdot u = 0 \quad\text{in } \Omega\times(0,T).$$

This system is equipped with suitable boundary conditions of either Dirichlet type (inlet and walls) or do-nothing type (outlet). The initial condition for the velocity components has to be added. The time derivative of the velocity components vanishes for a stationary solution and thus can be omitted. On the other hand, we use the time stepping scheme for finding the stationary solution. Nevertheless, a precise time stepping scheme is needed in the case of nonstationary fluid flow. We now proceed with the time and space discretization.
For the time discretization we use a second order implicit scheme, where

$$\frac{\partial u}{\partial t} \approx \frac{3u^{n+1} - 4u^n + u^{n-1}}{2\tau}.$$
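This leads to the solution of one nonlinear system in each time step. We use a second order linearization of the nonlinear convective term, $(u^{n+1}\cdot\nabla)u^{n+1} \approx ((2u^n - u^{n-1})\cdot\nabla)u^{n+1}$, which leads to the following semi-implicit scheme:

$$\frac{3u^{n+1} - 4u^n + u^{n-1}}{2\tau} - \nu\Delta u^{n+1} + \big((2u^n - u^{n-1})\cdot\nabla\big)u^{n+1} + \nabla p^{n+1} = 0, \qquad \nabla\cdot u^{n+1} = 0, \tag{9}$$

equipped with appropriate boundary and initial conditions.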
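The combination of the second order backward difference with an extrapolated transport value can be tried out on a scalar model equation. The sketch below is an analogy only: the ODE $u' + u\,u = g(t)$, the manufactured solution $u(t) = \cos t$ and the function name are assumptions made for the illustration, not part of the flow solver.

```python
import math

def bdf2_semi_implicit(tau, T=1.0):
    """Integrate u' + u*u = g(t), u(0) = 1, where g is chosen so that u(t) = cos(t).
    The nonlinear term is linearized as w * u_next with w = 2*u_curr - u_prev."""
    g = lambda t: -math.sin(t) + math.cos(t) ** 2
    n_steps = round(T / tau)
    u_prev, u_curr = 1.0, math.cos(tau)          # exact values at t_0 and t_1
    for n in range(1, n_steps):
        w = 2.0 * u_curr - u_prev                # extrapolated "transport" value
        t_next = (n + 1) * tau
        # (3*u_next - 4*u_curr + u_prev) / (2*tau) + w * u_next = g(t_next)
        u_next = (4.0 * u_curr - u_prev + 2.0 * tau * g(t_next)) / (3.0 + 2.0 * tau * w)
        u_prev, u_curr = u_curr, u_next
    return u_curr

err = [abs(bdf2_semi_implicit(tau) - math.cos(1.0)) for tau in (0.01, 0.005)]
ratio = err[0] / err[1]
print(ratio)   # close to 4 when the scheme is second order accurate
```

Each step requires only the solution of a linear equation, which is the point of the semi-implicit treatment; the error ratio under step halving indicates that second order accuracy is retained.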


Next, the problem (9) is reformulated in a weak sense, which is suitable for the solution with the aid of the finite element method. Defining the velocity space $X = (H^1(\Omega))^2$ and the pressure space $M = L_0^2(\Omega)$, it is easy to see that the solution $U = (u, p)$ of problem (9) satisfies

$$a(U, V) = f(V) \quad \forall V = (v, q) \in (X, M), \tag{10}$$

where

$$a(U, V) = \frac{3}{2\tau}(u, v) + \nu(\nabla u, \nabla v) + \big(((2u^n - u^{n-1})\cdot\nabla)u, v\big) - (p, \nabla\cdot v) + (\nabla\cdot u, q),$$
$$f(V) = \frac{1}{2\tau}\big(4u^n - u^{n-1}, v\big), \tag{11}$$

and by $(\cdot,\cdot)$ we denote the scalar product in the space $L^2(\Omega)$. Moreover, we require that $u$ satisfies the Dirichlet boundary conditions. The couple $(u, p)$ represents the solution on time level $n+1$, i.e. $u^{n+1} := u$ and $p^{n+1} := p$.
Further, the use of the Galerkin FEM restricts the weak formulation from the couple of spaces $(X, M)$ to the approximate spaces $(X_h, M_h)$, i.e. find $U_h \in (X_h, M_h)$ such that

$$a(U_h, V_h) = f(V_h) \quad \forall V_h \in (X_h, M_h). \tag{12}$$

The couple $(X_h, M_h)$ of finite element spaces should satisfy the BB condition, which guarantees the stability of the scheme: there exists a constant $c > 0$ such that
$$\sup_{w \in X_h} \frac{(p, \nabla\cdot w)}{\|w\|_{1,2,\Omega}} \ge c\,\|p\| \quad \forall p \in M_h. \tag{13}$$

We use the Taylor-Hood family of finite elements, i.e. the $P^{k+1}/P^k$ approximation.


The discretization (10) of the convection part may lead to 2nd order accuracy, but the approximate solution may suffer from spurious oscillations for high Reynolds numbers. In order to avoid this drawback, we apply the stabilization via the streamline-diffusion/Petrov-Galerkin technique (see, e.g., [5], [2]). In the stabilized problem we consider some additional terms defined by

$$L_{h,n}(U, V) = \sum_K \delta_K \Big(\frac{3}{2\tau}u - \nu\Delta u + (w\cdot\nabla)u + \nabla p,\ (w\cdot\nabla)v\Big)_K,$$

$$F_{h,n}(V) = \sum_K \delta_K \Big(\frac{1}{2\tau}\big(4u^n - u^{n-1}\big),\ (w\cdot\nabla)v\Big)_K, \tag{14}$$

where the function $w$ stands for $w = 2u^n - u^{n-1}$ and by $(\cdot,\cdot)_K$ we denote the scalar product in the space $L^2(K)$. The parameter $\delta_K$ is a function of the local (element) Reynolds number $Re^w$ based on the transport velocity $w$:

where (15)

and we set ~(ReW) = R~!:l' The parameter 0* is an additional free parameter.


The resulting stabilized system then reads

$$a(U_h, V_h) + L_{h,n}(U_h, V_h) + \sum_{K \in \mathcal{T}_h} \tau_K\,(\nabla\cdot u_h, \nabla\cdot v_h)_K = f(V_h) + F_{h,n}(V_h), \tag{16}$$

where we used the additional grad-div stabilization; for details see, e.g., [2].
The space-time discretization leads to the solution of the following system
of equations repeatedly for each time step

(17)
where $u \in \mathbb{R}^{n_h}$ and $p \in \mathbb{R}^{m_h}$ are vectors whose components represent the degrees of freedom defining the velocity $u$ and the pressure $p$, respectively, $S$ is a nonsingular $n_h \times n_h$ matrix and $B$ is an $n_h \times m_h$ matrix. In practical computations we solve this system with the help of iterative methods or, for a smaller system, with the help of advanced direct methods, see, e.g., [6]. UMFPACK is a set of routines, written in ANSI/ISO C, for solving unsymmetric sparse linear systems $Ax = b$ using the Unsymmetric MultiFrontal method.

5 Results
The Reynolds number is defined using 2/3 of the maximum velocity in the inlet (i.e. the bulk velocity in the laminar case) and the step height.
Figure 1 shows the distance of the reattachment point on the lower wall from the step. The grid has regular spacing in both directions equal to 0.05. The result of the finite volume method on a coarse grid with double spacing is also shown.
At higher Reynolds numbers, the laminar flow separates on the upper wall as well. The results are shown in Fig. 2. For higher Reynolds numbers, this separation is predicted in better agreement with measurement than the primary one in Fig. 1.
For Re > 6600, the flow is fully turbulent and two dimensional in the mean. The primary separation length becomes independent of Re and, according to [7], equals 8 step heights. Our numerical results give considerably higher values: 8.83 using the SST (Shear Stress Transport) turbulence model [9] and 9.05 using the sst mC k-ε (modified Chien k-ε with shear stress transport) model [10]. Both turbulence models are low-Re two-equation ones; the first uses the k-ω formulation, the second k-ε. Several profiles of the streamwise component of velocity for these models are compared in the upper part of Fig. 3. The lower part of the same figure shows the k-variable of the turbulence models, which could be interpreted as the kinetic energy of velocity fluctuations. Also the structure of the primary separation depends strongly on the turbulence model, as shown in Fig. 4. However, the strong dependence on the numerical method (and grid) should be mentioned as well: e.g., the small corner vortex for the SST model disappears if the limiter in the ω-equation is used.

Acknowledgements

The work has been partially supported by the Research Plan of MSMT No.
210000003 and by grants No. 101/03/0018 and No. 101/02/0684 of GA CR.

Fig. 1. Prediction of primary separation zone (horizontal axis: Re, from 100 to 800; curves: measurement, FEM fine, AUSM coarse, AUSM fine)

Fig. 2. Prediction of secondary separation zone (horizontal axis: Re, from 500 to 850; curves: measurement x2, measurement x3, AUSM x2 fine, AUSM x3 fine, FEM x2, FEM x3)

Fig. 3. BFS flow with SST and sst mC k-ε model, Re = 6667 (panels: u-velocity, turbulence kinetic energy)

Fig. 4. Detail of streamlines for BFS flow, SST (left) and sst mC k-ε model (right), Re = 6667

References
1. Feistauer M. (1993): Mathematical Methods in Fluid Dynamics. Longman Scientific & Technical, Harlow.
2. Gelhard T., Lube G., Olshanskii M. A.: Stabilized Finite Element Schemes with LBB-stable elements for incompressible flows (preprint).
3. Sváček P., Feistauer M. (2003): Numerical Simulations of Laminar Viscous Incompressible Flow Over a Profile, Proc. Topical Problems of Fluid Mechanics, IT CAS, Prague.
4. Dolejší V. (2001): Anisotropic Mesh Adaptation Technique for Viscous Flow Simulation, East-West Journal of Numerical Mathematics, 9(1):1-24.
5. Turek S. (1998): Efficient Solvers for Incompressible Flow Problems, Springer.
6. Davis T. A., Duff I. S. (1999): A combined unifrontal/multifrontal method for unsymmetric sparse matrices, ACM Transactions on Mathematical Software, 25, no. 1, pp. 1-19.
7. Armaly B. F. et al. (1983): Experimental and Theoretical Investigation of Backward-facing Step Flow, J. Fluid Mech., 127, pp. 473-496.
8. Kozel K., Louda P., Příhoda J. (2001): Numerical Solution of 2D and 3D Incompressible Viscous Flow Problems, Internal Flows Vol. 2, (eds) P. Doerffer, 5th ISAIF Symposium, pp. 847-854.
9. Menter F. R. (1994): Two-Equation Eddy-Viscosity Turbulence Models for Engineering Applications, AIAA Journal, 32, No. 8, pp. 1598-1605.
10. Kozel K., Louda P., Příhoda J. (2003): Numerical Solution of Turbulent Backward Facing Step and Impinging Jet Flows, Modelling Fluid Flow '03, vol. I, (eds) T. Lajos, J. Vad, Budapest, pp. 694-701.
11. Louda P. (2002): Numerical Solution of 2D and 3D Turbulent Impinging Jet Flow, Ph.D. thesis, CTU Prague (in Czech).
12. Yee H. C. (1989): A Class of High-Resolution Explicit and Implicit Shock-capturing Methods, NASA Technical Memorandum 101088.
Periodicity Properties of Solutions
to a Hysteresis Model in Micromagnetics

Martin Kružík

Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Pod vodárenskou věží 4, CZ-182 08 Praha 8, Czech Republic.
[email protected]

Summary. We show the existence of a solution with a periodic integral average of the magnetization. This property is generic for both uniaxial and cubic ferromagnets. Our proof mainly uses properties of time-discrete approximations and a fixed point theorem.

1 Introduction

In this paper we show the existence of a solution to a hysteresis model of bulk ferromagnets established in [12] having time-periodic spatial averages of the magnetization. This holds under general assumptions for uniaxial as well as for cubic ferromagnets. The model is based on Brown's theory of micromagnetics [1], which is here enriched by a suitable rate-independent dissipation mechanism. The basic assumption is that the transformation of the magnetization from one pole to another requires a certain amount of energy. This energy is related to the coercive force $H_c$. The rate-independence allows for the description of pure hysteresis losses [5] and is well accepted for a fairly wide range of frequencies of external magnetic fields. There are also other attempts in the literature to build phenomenological rate-independent dissipation mechanisms into the models; see e.g. [13]. We assume sufficiently slow processes so that the released heat can be carried away (hence the process is isothermal). The model fully relies on energy principles and is based on two main requirements, namely, stability (1) and the energy inequality (2). Roughly speaking, we say that $q = q(t)$ is a solution process if

$$\forall \tilde q:\quad J(t, q(t)) \le J(t, \tilde q) + D(q(t), \tilde q) \quad\text{and} \tag{1}$$

$$J(t, q(t)) + \mathrm{Var}(D, q; s, t) \le J(s, q(s)) + \int_s^t \partial_t J(\theta, q(\theta))\,d\theta, \tag{2}$$

where $J$ is Gibbs' stored energy of the system, "Var" stands for the total variation, $D$ is a dissipation functional ensuring a rate-independent response and $s \le t \in [0, T]$, where $[0, T]$ is the process time interval. The formulation of rate-independent evolutionary processes in continuum mechanics by means of (1) and (2) appeared in [10]. Its application to problems of rate-independent hysteresis in micromagnetics appeared in [12]. In particular, the authors of [12] proved the existence of a solution to a model capturing a virgin magnetization process.
Not much is known about properties of solutions to (1) and (2). Mielke and Theil [9, Th. 7.1] showed the uniqueness of the solution if $J$ is smooth and uniformly convex in $q$. However, these assumptions do not hold in our application. As our functional $J$ comes from the convexification of a function with multiple minima, it is not strictly convex and it is affine along the easy axis. Nevertheless, analyzing time-discrete problems corresponding to (1) and (2), we can show periodic behavior of the average magnetization. The main tool is the Tychonoff fixed point theorem and uniqueness of the average magnetization in the time-discrete case.
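A scalar toy problem gives a feel for time-discrete approximations of this kind. Everything in the sketch below is an assumption made for illustration and is not taken from the model of [12]: the state `q` is a single number, the stored energy is $J(t,q) = \tfrac12 q^2 - h(t)q$ with a sinusoidal field $h$, the dissipation is $D(q_1,q_2) = H_c|q_2 - q_1|$, and each time step minimizes $q \mapsto J(t_k,q) + D(q_{k-1},q)$, for which the minimizer is an explicit clamp.

```python
import math

H_C = 0.3   # illustrative dissipation constant (plays the role of a coercive force)

def J(q, h):
    return 0.5 * q * q - h * q          # toy convex stored energy with external field h

def D(q_old, q_new):
    return H_C * abs(q_new - q_old)     # rate-independent dissipation distance

def incremental_step(q_old, h):
    """Minimizer of q -> J(q, h) + D(q_old, q); for this toy energy it is a clamp."""
    return max(h - H_C, min(h + H_C, q_old))

def run(periods=3, steps_per_period=200):
    q, path = 0.0, []
    for k in range(1, periods * steps_per_period + 1):
        h = math.sin(2.0 * math.pi * k / steps_per_period)   # time-periodic loading
        q = incremental_step(q, h)
        path.append((h, q))
    return path

path = run()
print(path[299], path[499])   # same phase in the second and third period
```

Each discrete state satisfies the discrete counterpart of the stability condition (1), and after the first loading cycle the response of this toy model repeats with the period of the field, tracing a hysteresis loop in the (h, q) plane.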

2 Model
2.1 Stored energy and its relaxation

The theory of rigid ferromagnetic bodies [1] assumes that a magnetization


m : fl --) JRn, describing the state of a body fl c JRn, n = 2,3, is subjected
to the Heisenberg- Weiss constraint, i.e. has a given (in general, temperature
dependent) magnitude

I m(x) I = Ms for almost all x E fl ,

where Ms > 0 is the saturation magnetization, considered here as a constant


(since temperature is considered constant, too).
In the no-exchange formulation, which is valid for large bodies [4], the
Helmholtz free energy of a rigid ferromagnetic body $\Omega \subset \mathbb{R}^n$ consists of two
parts. The first part is the anisotropy energy $\int_\Omega \varphi(m(x))\,dx$ related to
crystallographic properties of the ferromagnet. A typical
$\varphi : S := \{s \in \mathbb{R}^n;\ |s| = M_s\} \to \mathbb{R}$ is a nonnegative function vanishing
only at a few isolated points on $S$ determining directions of easy magnetization,
e.g. at two points for uniaxial materials or at six (or eight) for cubic ones.
The second part of the Helmholtz energy, $\frac12 \int_{\mathbb{R}^n} |\nabla u_m(x)|^2\,dx$, is the
energy of the demagnetizing field $\nabla u_m$ self-induced by the magnetization $m$;
its potential $u_m$ is governed (after neglect of many terms in the full Maxwell
system) by

\[ \operatorname{div}(-\nabla u_m + m\chi_\Omega) = 0 \quad \text{in } \mathbb{R}^n, \tag{3} \]

where $\chi_\Omega : \mathbb{R}^n \to \{0, 1\}$ is the characteristic function of $\Omega$. The
demagnetizing-field energy thus penalizes non-divergence-free magnetization
vectors. As is standard, we understand (3) in the weak sense, i.e.
$u_m \in H^1(\mathbb{R}^n)$ is called a weak solution to (3) if the integral identity
$\int_{\mathbb{R}^n} (m\chi_\Omega - \nabla u_m(x)) \cdot \nabla v(x)\,dx = 0$ holds for all $v \in H^1(\mathbb{R}^n)$,
where $H^1(\mathbb{R}^n) \equiv W^{1,2}(\mathbb{R}^n)$ denotes the Sobolev space of functions from
$L^2(\mathbb{R}^n)$ with all first derivatives (in the distributional sense) also in
$L^2(\mathbb{R}^n)$. Altogether, the Helmholtz energy $E(m)$ has the form

\[ E(m) = \int_\Omega \varphi(m(x))\,dx + \frac12 \int_{\mathbb{R}^n} |\nabla u_m(x)|^2\,dx. \tag{4} \]

If the ferromagnetic specimen is exposed to some external magnetic field
$h = h(x)$, the so-called Zeeman's energy of interactions between this field
and magnetization vectors equals $H(m) := \int_\Omega h(x) \cdot m(x)\,dx$. Finally, the
following variational principle governs equilibrium configurations:

\[ \left\{ \begin{array}{ll}
\text{minimize} & G(m) := E(m) - H(m) \\
 & \quad = \int_\Omega \big(\varphi(m(x)) - h(x) \cdot m(x)\big)\,dx + \frac12 \int_{\mathbb{R}^n} |\nabla u_m(x)|^2\,dx, \\
\text{subject to} & \text{(3)},\ (m, u_m) \in A \times H^1(\mathbb{R}^n),
\end{array} \right. \tag{5} \]

where the introduced notation $G$ stands for Gibbs' energy and $A$ is the set of
admissible magnetizations

\[ A := \{ m \in L^\infty(\Omega; \mathbb{R}^n);\ |m(x)| = M_s \text{ for almost all } x \in \Omega \}. \]


As $A$ is not convex, we cannot rely on direct methods in proving the existence
of a solution. In fact, the solution to (5) need not exist in $A \times H^1(\mathbb{R}^n)$.
Due to the nonconvexity of $A$, weak limits of minimizing sequences of (5) do not
necessarily live in $A \times H^1(\mathbb{R}^n)$.
It is, therefore, natural to look for an extension (= relaxation) of our problem
in which we would properly describe the behavior of (5) along minimizing
sequences. It is well-known [4] that such a relaxation can be achieved by
extending the Helmholtz energy by continuity on the convex set of the so-called
Young measures [14]

\[ E(\nu) = \int_\Omega \varphi \bullet \nu\,dx + \frac12 \int_{\mathbb{R}^n} |\nabla u_{(\mathrm{id} \bullet \nu)}(x)|^2\,dx, \tag{6} \]

where $[v \bullet \nu](x) := \int_{\mathbb{R}^n} v(s)\,\nu_x(ds)$ and $\mathrm{id} : \mathbb{R}^n \to \mathbb{R}^n$ is the identity.
The set of Young measures $\mathcal{Y}(\Omega; S) \subset L_w^\infty(\Omega; \mathrm{rca}(S)) \cong L^1(\Omega; C(S))^*$ is
the set of all weakly measurable essentially bounded mappings
$x \mapsto \nu_x : \Omega \to \mathrm{rca}(S) \cong C(S)^*$ such that $\nu_x$ is a probability Radon measure
supported on the sphere $S$ for a.a. $x \in \Omega$; the adjective "weakly measurable"
means that $v \bullet \nu$ is Lebesgue measurable for any $v \in C(S)$. A natural embedding
of a magnetization $m \in L^\infty(\Omega; \mathbb{R}^n)$, $|m(x)| = M_s$, to $\mathcal{Y}(\Omega; S)$ is $\nu = i(m)$
defined by $\nu_x = \delta_{m(x)}$ with $\delta_s$ denoting the Dirac measure at $s \in S$. We say
that a sequence $\{\nu^k\}_{k \in \mathbb{N}} \subset \mathcal{Y}(\Omega; S)$ converges weakly* to $\nu$ if
$\lim_{k \to \infty} \langle \nu^k, f \rangle = \langle \nu, f \rangle$ for any $f \in L^1(\Omega; C(S))$ or, equally, for any
$f = g \otimes v$ with $g \in L^1(\Omega)$ and $v \in C(S)$, where the tensorial notation means
naturally $[g \otimes v](x, s) = g(x)v(s)$. From the last fact, we can also say that
$\nu^k \to \nu$ weakly* if and only if w*-$\lim_{k \to \infty} v \bullet \nu^k = v \bullet \nu$ for all $v \in C(S)$,
where the weak*-limit is understood in $L^\infty(\Omega)$. Considering the weak* topology
on $L_w^\infty(\Omega; \mathrm{rca}(S))$ makes $\mathcal{Y}(\Omega; S)$ a convex, metrizable compact set containing
densely the set of admissible magnetizations $A$ if embedded via $i$.
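As shown in [4, 11], a correct relaxation (= natural extension) of (5) is

\[ \left\{ \begin{array}{ll}
\text{minimize} & G(\nu) := E(\nu) - H(\mathrm{id} \bullet \nu), \\
\text{subject to} & \text{(3) with } m = \mathrm{id} \bullet \nu,\ (\nu, u) \in \mathcal{Y}(\Omega; S) \times H^1(\mathbb{R}^n).
\end{array} \right. \tag{7} \]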

The model (7) represents a so-called mesoscopic level model, because a
minimizing Young measure $\nu$ records some, but not full, information about spatial
oscillations of a minimizing sequence of (5) around each "macroscopic" point
$x$ through volume fractions described by the probability distribution $\nu_x$. This
information makes it possible to describe the effective magnetic properties by
means of the first moment, the "macroscopic" magnetization $m = \mathrm{id} \bullet \nu$, and
moreover seems sufficient for designing a dissipative mechanism in good
agreement with experiments, which is exploited below.

2.2 Rate-independent dissipation

For usual loading regimes and magnetically hard materials, one must consider
certain dissipation. Our simplified standpoint is that the amount of dissipated
energy within the phase transformation from one pole to the other can be
described by a single, phenomenologically given number (of the dimension
J/m$^3$ = Pa) depending on the coercive force $H_c$. Hence, we need to identify the
particular poles according to the magnetization vector. Inspired by [8, 10] and
considering $L$ poles ($L = 2$ for uniaxial magnets or 6 or 8 for cubic magnets),
we define a continuous mapping $\mathcal{L} : S \to \Delta_L$ where
$\Delta_L := \{\xi \in \mathbb{R}^L;\ \xi_i \ge 0,\ i = 1, \dots, L,\ \sum_{i=1}^L \xi_i = 1\}$. In other words,
$\{\mathcal{L}_1, \dots, \mathcal{L}_L\}$ forms a partition of unity on $S$ such that $\mathcal{L}_i(s)$ is equal to 1
if $s$ is in the $i$-th pole, i.e. $s \in S$ is in a neighborhood of the $i$-th
easy-magnetization direction. Of course, $\mathcal{L}(m)$ in the (relative) interior of
$\Delta_L$ indicates $m$ in the region where no definite pole is specified. Hence $\mathcal{L}$
plays the role of what is often called an order parameter.
In terms of the mesoscopic microstructure described by the Young measure
$\nu$, the "mesoscopic" order parameter is naturally defined as

\[ \lambda = \lambda_\nu := \mathcal{L} \bullet \nu \tag{8} \]

where $[\mathcal{L} \bullet \nu](x) := \int_S \mathcal{L}(s)\,\nu_x(ds)$. Thus $\lambda$ is just a continuous extension of
the mapping $m \mapsto \mathcal{L}(m)$, i.e. if $\{m_k\}$ converges to $\nu$ weakly* in
$L_w^\infty(\Omega; \mathrm{rca}(S))$, then $\mathcal{L}(m_k) \rightharpoonup \lambda_\nu$ weakly* in $L^\infty(\Omega; \mathbb{R}^L)$.
To describe phenomenologically the dissipative energetics, one must prescribe
a (pseudo)potential of dissipative forces $\varrho$ as a function of the rate of
$\lambda$. For rate-independent processes, this potential must be convex and
homogeneous of degree 1. Considering a (not necessarily Euclidean) norm $|\cdot|_L$
on $\mathbb{R}^L$, one can postulate $\varrho(\dot\lambda) = H_c |\dot\lambda|_L$, where the constant $H_c > 0$ is the
so-called coercive force. The energy needed to transform the $i$-th pole to the
$j$-th pole is then $H_c |e_i - e_j|_L$ with $e_i$ the unit vector with 1 at the $i$-th
position. The state of the specimen $\Omega$ (at a given time $t$) will be described
by the couple $q = q(t) \equiv (\nu, \lambda) = (\{\nu_{x,t}\}_{x \in \Omega}, \lambda(\cdot, t))$. Let us denote by $\mathcal{Q}$ the
convex set of admissible configurations:

\[ \mathcal{Q} := \{ q = (\nu, \lambda) \in \mathcal{Y}(\Omega; S) \times L^\infty(\Omega; \mathbb{R}^L);\ \lambda(x) \in \Delta_L,\ \lambda_\nu = \lambda \text{ for a.a. } x \in \Omega \}. \tag{9} \]

The total dissipation of the process between states ql, q2 E Q is defined as


[6,12]

For the analysis below, we will need to consider rather a certain regulariza-
tion of the stored energy E which would control spatial smoothness of A. For
this, we will augment E by a higher-order term

where $H^\alpha(\Omega) \equiv W^{\alpha,2}(\Omega)$ denotes the usual Sobolev-Slobodetskii space and
where we assume

\[ \alpha, \rho > 0 \text{ fixed.} \tag{12} \]

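From now on, we will work with this regularized relaxed stored energy $E_\rho$
rather than $E$.
Let us abbreviate the Gibbs energy by

\[ \mathcal{G}(t, q) := E_\rho(q) - \langle \mathcal{H}(t), q \rangle, \tag{13} \]

where

\[ \langle \mathcal{H}(t), q \rangle = [H(t)](\mathrm{id} \bullet \nu) = \langle \nu, h(\cdot, \cdot) \otimes \mathrm{id} \rangle. \tag{14} \]

Let us agree to identify quite naturally the mapping $t \mapsto \nu(t) = \{[\nu(t)]_x\}_{x \in \Omega}$
with a Young measure $(x, t) \mapsto \nu_{x,t}$.

Definition 1. We say that a process $q = q(t)$ is stable if

\[ \forall \tilde q \in \mathcal{Q}:\quad \mathcal{G}(t, q(t)) \le \mathcal{G}(t, \tilde q) + \mathcal{D}(q(t), \tilde q) \tag{15} \]

for all $t \in [0, T]$.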


Definition 2. We say that the process $q = q(t)$ satisfies the energy inequality
if for all $s, t \in [0, T]$, $s \le t$,

\[ \underbrace{\mathcal{G}(t, q(t))}_{\text{effective Gibbs' energy at time } t}
 + \underbrace{\mathrm{Var}(\mathcal{D}, q; s, t)}_{\text{dissipated energy}}
 \le \underbrace{\mathcal{G}(s, q(s))}_{\text{Gibbs' energy at time } s}
 \underbrace{{} - \int_s^t \Big\langle \frac{\partial \mathcal{H}}{\partial \theta}, q(\theta) \Big\rangle\,d\theta}_{\text{reduced work of external field}}, \tag{16} \]

where the total variation over the time interval $[s, t]$ is defined standardly,
without using explicitly any time derivative, as

\[ \mathrm{Var}(\mathcal{D}, q; s, t) := \sup \sum_{i=1}^{J} \mathcal{D}(q(t_{i-1}), q(t_i)), \tag{17} \]

where the supremum is taken over all $J \in \mathbb{N}$ and over all partitions of $[s, t]$ in
the form $s = t_0 < t_1 < \dots < t_{J-1} < t_J = t$.

In what follows "BV" stands for a space of maps with bounded variations.

Definition 3. The process $q = q(t)$, $q \equiv (\nu, \lambda)$, will be considered as a solution
if $\nu \in \mathcal{Y}(\Omega \times [0, T]; S)$, $\lambda \in BV([0, T]; L^1(\Omega; \mathbb{R}^L))$ and $q(t) \in \mathcal{Q}$ for all
$t \in [0, T]$, and it is stable in the sense (15) for all $t \in [0, T]$ and satisfies
the energy inequality (16) for a.a. $s, t \in [0, T]$, $s \le t$.

The existence of a response $q$ with the above mentioned properties was
shown even in a more general case in [12] by a semi-discretization in time,
using the implicit Euler scheme. For simplicity, let us consider an equidistant
partition of the time interval $[0, T]$ with a time step $\tau > 0$, assuming $T/\tau$ an
integer. Even more, we consider a sequence of $\tau$'s converging to zero and such
that $\tau_i/\tau_{i+1}$ is an integer, i.e. each next partition is a refinement of the
preceding one.
Then we put $q_\tau^0 = q_0$, a given initial condition, and, for $k = 1, \dots, T/\tau$, we
define $q_\tau^k$ recursively as a solution of the minimization problem

\[ \left\{ \begin{array}{ll}
\text{Minimize} & I(q) := \mathcal{G}(k\tau, q) + \mathcal{D}(q_\tau^{k-1}, q) \\
\text{subject to} & q \equiv (\nu, \lambda) \in \mathcal{Q},
\end{array} \right. \tag{18} \]

where $\mathcal{Q}$ is from (9), $\mathcal{G}$ is from (13), and $\mathcal{D}$ from (10). If a solution (i.e. a
global minimizer) to (18) is not unique, we just take an arbitrary one for $q_\tau^k$.
Then we define the piecewise constant interpolation
$q_\tau \in L^\infty(0, T; L_w^\infty(\Omega; \mathrm{rca}(S)) \times L^\infty(\Omega; \mathbb{R}^L))$ so that $q_\tau|_{((k-1)\tau, k\tau]} = q_\tau^k$ for
$k = 1, \dots, T/\tau$, while for $t = 0$ we put $q_\tau(0) = q_0$. Besides, assuming
\[ h \in W^{1,1}(0, iT; L^1(\Omega; \mathbb{R}^n)), \quad h(\cdot, t + jT) = h(\cdot, t), \quad
 i \in \mathbb{N},\ 1 \le j \le i - 1,\ t \in [0, T], \tag{19} \]

we have certainly $\mathcal{H} \in C(0, T; L^1(\Omega; C(S)))$ and we can define the piecewise
constant approximation of $\mathcal{H}$, denoted by $\mathcal{H}_\tau$, by $\mathcal{H}_\tau(t) = \mathcal{H}(k\tau)$ for
$t \in ((k-1)\tau, k\tau]$ and by $\mathcal{H}_\tau(t) = \mathcal{H}(0)$ for $t = 0$. Besides, we will still need
the piecewise affine interpolation, denoted by $\mathcal{H}_\tau^{\mathrm{aff}}$, i.e. $\mathcal{H}_\tau^{\mathrm{aff}}$ is affine in
time if restricted on the interval $[(k-1)\tau, k\tau]$ for $k = 1, \dots, T/\tau$ and
$\mathcal{H}_\tau^{\mathrm{aff}}(k\tau) = \mathcal{H}(k\tau)$ for $k = 0, \dots, T/\tau$. Also, we will naturally assume that the
initial condition $q_0$ is admissible and even stable:

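\[ q_0 \in \mathcal{Q} \quad \text{and} \quad \forall \tilde q \in \mathcal{Q}:\ \mathcal{G}(0, q_0) \le \mathcal{G}(0, \tilde q) + \mathcal{D}(q_0, \tilde q); \tag{20} \]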

note that it implies, in particular, that Ep(qo) < +00. The scheme (18) together
with a suitable spatial discretization is also suitable for a numerical solution
to our problem.
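
The recursion (18) can be illustrated on a scalar toy problem. The sketch below is not the micromagnetic model: it assumes a single real state variable, a quadratic stored energy 0.5*a*q**2, a Zeeman-type term -h(t)*q and the dissipation distance h_c*|q - q_prev|. All names (energy, incremental_step, implicit_euler, a, h_c) are hypothetical. Under these assumptions each incremental minimization has a closed-form solution, namely the projection of the previous state onto the interval [(h - h_c)/a, (h + h_c)/a].

```python
import math


def energy(q, h, a=1.0):
    """Toy Gibbs energy: quadratic stored energy minus the work of the field h."""
    return 0.5 * a * q * q - h * q


def incremental_step(q_prev, h, a=1.0, h_c=0.5):
    """Global minimizer of energy(q, h) + h_c * |q - q_prev| over real q.

    The cost is strictly convex; its minimizer is q_prev clipped to the
    interval [(h - h_c) / a, (h + h_c) / a].
    """
    lo, hi = (h - h_c) / a, (h + h_c) / a
    return min(max(q_prev, lo), hi)


def implicit_euler(q0, load, tau, n_steps, a=1.0, h_c=0.5):
    """Time-discrete states q^0, ..., q^n, one minimization per time level k*tau."""
    path = [q0]
    for k in range(1, n_steps + 1):
        path.append(incremental_step(path[-1], load(k * tau), a, h_c))
    return path


# Three periods (T = 1) of a sinusoidal load with time step tau = 0.01.
path = implicit_euler(0.0, lambda t: math.sin(2.0 * math.pi * t), 0.01, 300)
```

In this toy setting the discrete states trace a hysteresis loop that repeats after the first load cycle, which is the scalar counterpart of the periodicity studied below.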

Let us define a sufficiently large set $\mathcal{P}$ where the values of all the processes
$q_\tau(\cdot)$ will certainly live; here it is natural to put

(21)

the constant $C_1$ can be now considered arbitrary but sufficiently large, and
will be fixed later, see (22). We will endow $\mathcal{P}$ with the (weak* $\times$ weak)-topology
of $L_w^\infty(\Omega; \mathrm{rca}(S)) \times H^\alpha(\Omega; \mathbb{R}^L)$. Clearly, the set $\mathcal{P}$ is compact.
The following can be found in [6] or in a slightly different form in [12].

Proposition 1. Let (12), (19) and (20) hold. Let $q_\tau = (\nu_\tau, \lambda_\tau)$ be a solution
constructed recursively from solutions to (18) at the prescribed time
increments. The following a-priori estimates hold:

\[ \|\lambda_\tau\|_{L^\infty(0,T; H^\alpha(\Omega;\mathbb{R}^L) \cap L^\infty(\Omega;\mathbb{R}^L)) \cap BV([0,T]; L^1(\Omega;\mathbb{R}^L))} \le C_1, \tag{22} \]
\[ \|\nu_\tau\|_{L^\infty(0,T; L_w^\infty(\Omega; \mathrm{rca}(S)))} \le C_2, \tag{23} \]
\[ \|\mathfrak{G}_\tau\|_{BV([0,T])} \le C_3, \tag{24} \]

where $\mathfrak{G}_\tau(t) := E_\rho(q_\tau(t)) - \langle \mathcal{H}_\tau(t), q_\tau(t) \rangle$ denotes Gibbs' energy of the
approximate trajectory.

Proposition 1 shows that solutions to (18) live in $\mathcal{P}$ if $C_1$ is large enough.


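Let us denote

\[ P := \{ (\beta, \lambda) \in \mathbb{R}^n \times H^\alpha(\Omega; \mathbb{R}^L);\ \exists (\nu, \lambda) \in \mathcal{P} \text{ such that } \beta = \bar\nu \}, \tag{25} \]

where

\[ \bar\nu := \frac{1}{\operatorname{meas} \Omega} \int_\Omega \int_S s\,\nu_x(ds)\,dx = \frac{1}{\operatorname{meas} \Omega} \int_\Omega m(x)\,dx, \tag{26} \]

i.e., $\bar\nu$ is the average of the macroscopic magnetization over the specimen $\Omega$,
in particular $|\bar\nu| \le M_s$. We call $\bar\nu$ the $\Omega \times S$ average of $\nu$. Endowing $P$ with
the product of the Euclidean topology of $\mathbb{R}^n$ and the weak topology of
$H^\alpha(\Omega; \mathbb{R}^L)$ we see that $P$ is convex, closed and bounded in the Banach space
$\mathbb{R}^n \times H^\alpha(\Omega; \mathbb{R}^L)$ with the norm
$\|(\beta, \lambda)\|_{\mathbb{R}^n \times H^\alpha(\Omega; \mathbb{R}^L)} = |\beta| + \|\lambda\|_{H^\alpha(\Omega; \mathbb{R}^L)}$. We define

\[ \Theta : \mathcal{P} \to P :\quad \Theta(\nu, \lambda) = (\bar\nu, \lambda). \tag{27} \]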

Lemma 1. Let (12) hold. Let $q_1 = (\nu_1, \lambda_1)$, $q_2 = (\nu_2, \lambda_2)$ be two solutions of
(18) in $\mathcal{Q}$ for some fixed $1 \le k \le T/\tau$. Then $\Theta(q_1) = \Theta(q_2)$.
Proof. The assertion $\lambda_1 = \lambda_2$ follows from the convexity of $I$ from (18) in $q$
and the strict convexity of $I$ in $\lambda$. Note also that $I$ is strictly convex in
$\nabla u_{(\mathrm{id} \bullet \nu)}$, i.e., in the magnetostatic field. Denoting by $m_1$ and $m_2$ the
macroscopic magnetization vectors corresponding to $q_1$ and $q_2$, respectively,
the weak formulation of (3) gives
$\int_{\mathbb{R}^n} (m_1(x) - m_2(x))\chi_\Omega(x) \cdot \nabla v(x)\,dx = 0$ for any $v \in H^1(\mathbb{R}^n)$. As for any
$w \in \mathbb{R}^n$ one can find $v \in H^1(\mathbb{R}^n)$ such that $\nabla v = w$ in $\Omega$, we have
$\int_\Omega m_1(x)\,dx = \int_\Omega m_2(x)\,dx$ and the result follows. $\Box$

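Let $Z : P \to P$ be defined as follows. For any $(\beta_0, \lambda_0) \in P$ we take
$q_0 = (\nu_0, \lambda_0) \in \mathcal{P}$ such that $\beta_0 = \bar\nu_0$ and solve (18) for all $k = 1, \dots, T/\tau$.
Then we calculate $\beta_\tau^{T/\tau} = \bar\nu_\tau^{T/\tau}$ and we set $Z(\beta_0, \lambda_0) := (\beta_\tau^{T/\tau}, \lambda_\tau^{T/\tau})$, where
$q_\tau^{T/\tau} = (\nu_\tau^{T/\tau}, \lambda_\tau^{T/\tau})$. Note that the image of $(\beta_0, \lambda_0)$ does not depend on the
particular $(\nu_0, \lambda_0) \in \mathcal{P}$ having $\bar\nu_0 = \beta_0$ because solutions in (18) depend only
on $\lambda_0$ in the initial condition.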

Lemma 2. The mapping $Z : P \to \mathbb{R}^n \times H^\alpha(\Omega; \mathbb{R}^L)$ is weakly sequentially
continuous in $\mathbb{R}^n \times H^\alpha(\Omega; \mathbb{R}^L)$.

Sketch of proof. We can suppose that $T/\tau = 1$. Otherwise we write $Z$
as a composition of analogously defined mappings from the $(k-1)$th step
to the $k$th step. Having $(\beta_j, \lambda_j) \to (\beta_0, \lambda_0)$ in $P$ for $j \to \infty$ we denote
$F_j(q) := \mathcal{G}(\tau, q) + \mathcal{D}(q, q_j)$ and $F_0(q) := \mathcal{G}(\tau, q) + \mathcal{D}(q, q_0)$, $q_j = (\nu_j, \lambda_j)$, $j \ge 0$.
We have $\lim_{j \to \infty} F_j = F_0$ uniformly in $\mathcal{P}$ and, as the $F_j$ are sequentially lower
semicontinuous on $\mathcal{P}$, by [2] any sequence of minimizers of $\{F_j\}_{j \ge 1}$ contains a
subsequence converging to a minimizer of $F_0$. As the $\Omega \times S$ average of
$\nu$-components of minimizers of $F_0$ is given uniquely, the whole sequence of
$\Omega \times S$ averages of $\nu$-components of minimizers of $F_j$ converges to the $\Omega \times S$
average of the $\nu$-component of minimizers of $F_0$. Altogether
$\lim_{j \to \infty} Z(\beta_j, \lambda_j) = Z(\beta_0, \lambda_0)$. $\Box$
Lemma 3. There is $(\beta, \lambda) \in P$ such that $Z(\beta, \lambda) = (\beta, \lambda)$.

Proof. $P$ is a closed, convex and bounded subset of the reflexive Banach
space $\mathbb{R}^n \times H^\alpha(\Omega; \mathbb{R}^L)$. As $Z$ is weakly sequentially continuous in $P$, the
assertion follows from the Tychonoff fixed point theorem; cf. [3]. $\Box$
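
The role of the period map can be sketched on a scalar toy model. The code below assumes a single real state, a quadratic energy and a sinusoidal load of period T split into n_steps time increments; step, period_map and fixed_point are hypothetical names. The fixed point is found here by simple iteration, which happens to work for this toy model; the argument in the text is non-constructive and does not rely on such an iteration.

```python
import math


def step(q_prev, h, a=1.0, h_c=0.5):
    """One incremental minimization of 0.5*a*q**2 - h*q + h_c*|q - q_prev|."""
    return min(max(q_prev, (h - h_c) / a), (h + h_c) / a)


def period_map(q0, n_steps=100, amplitude=1.0):
    """Toy analogue of Z: the state reached after one full period of the load."""
    q = q0
    for k in range(1, n_steps + 1):
        q = step(q, amplitude * math.sin(2.0 * math.pi * k / n_steps))
    return q


def fixed_point(q0, max_iter=50, tol=1e-12):
    """Iterate the period map until the state at t = 0 and t = T agree."""
    q = q0
    for _ in range(max_iter):
        q_next = period_map(q)
        if abs(q_next - q) <= tol:
            return q_next
        q = q_next
    raise RuntimeError("period map iteration did not settle")


q_periodic = fixed_point(0.3)
```

Starting the time stepping from q_periodic yields a discrete process with the same state at the beginning and at the end of the period.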

The following theorem establishes the existence of a solution to our problem
which has the same average magnetization at the times $t = 0$ and $t = T$.

Theorem 1. Let (12) and (19) be valid. Then there are a process
$q = (\nu, \lambda) \in \mathcal{Y}(\Omega \times [0, T]; S) \times BV([0, T]; L^1(\Omega; \mathbb{R}^L))$, $\Theta(q(0)) = \Theta(q(T))$, and
a net $\{q_{\tau_\xi}\}_{\xi \in \Xi}$, $\Theta(q_{\tau_\xi}(0)) = \Theta(q_{\tau_\xi}(T))$, such that:
(i) $\lim_{\xi \in \Xi} \lambda_{\tau_\xi}(t) = \lambda(t)$ strongly in $L^2(\Omega; \mathbb{R}^L)$ for all $t \in [0, T]$,
(ii) $\lim_{\xi \in \Xi} \nu_{\tau_\xi}(t) = \nu(t)$ weakly* in $L_w^\infty(\Omega; \mathrm{rca}(S))$ for all $t \in [0, T]$,
(iii) $\lim_{\xi \in \Xi} \mathcal{G}_{\tau_\xi}(t, q_{\tau_\xi}(t)) = \mathcal{G}(t, q(t))$ for all $t \in [0, T]$.
Moreover, $\lambda = \lambda_\nu$ a.e. on $\Omega$ for every $t \in [0, T]$ and $q$ thus obtained is a
solution process according to Definition 3 and $\Theta(q(0)) = \Theta(q(T))$.
Sketch of proof. The proof is similar to the one of [7, Th. 3.4] or
[6, Prop. 3.13]. The point (i) relies on the weak* compactness of the set
$L_w^\infty(\Omega; \mathrm{rca}(S))^{[0,T]}$ due to Tychonoff's theorem on compactness of product
topologies. $\Box$

Theorem 2. Let (12) and (19) hold. Then there is a solution process
$q = (\nu, \lambda) \in \mathcal{Y}(\Omega \times [0, iT]; S) \times BV([0, iT]; L^1(\Omega; \mathbb{R}^L))$ such that
$\Theta(q(t + jT)) = \Theta(q(t))$ for any $t \in [0, T]$ and any $1 \le j \le i - 1$. In particular,
$\int_\Omega m(x, t)\,dx = \int_\Omega m(x, t + jT)\,dx$.
Proof. The process $q$ can be constructed by the $T$-periodic extension of the
process whose existence was established in Theorem 1. Clearly, Definitions 1
and 2, extended to $[0, iT]$, hold for such a process. $\Box$

Acknowledgment. This work was partly supported by the GA AV ČR
grant No. A 107 5005.

References
1. Brown, W.F. Jr. (1966): Magnetostatic principles in ferromagnetism. Springer,
New York
2. Dal Maso, G. (1993): An introduction to Γ-convergence. Birkhäuser, Boston
3. Deimling, K. (1985): Nonlinear Functional Analysis. Springer, Berlin
4. DeSimone, A. (1993): Energy minimizers for large ferromagnetic bodies. Arch.
Rat. Mech. Anal., 125, 99-143
5. Hubert, A., Schäfer, R. (1998): Magnetic Domains. Springer, Berlin
6. Kružík, M. (2003): Periodic solutions to a hysteresis model in micromagnetics.
In preparation
7. Mielke, A., Roubíček, T. (2003): A rate-independent model for inelastic behavior
of shape-memory alloys. Multiscale modeling and simulations, to appear
8. Mielke, A., Theil, F. (1999): Mathematical model for rate-independent phase
transformations. In: Alber, H.-D., Balean, R., Farwig, R. (eds) Models of Cont.
Mechanics in Analysis and Engineering. Shaker-Verlag, Aachen
9. Mielke, A., Theil, F. (2003): On rate-independent hysteresis models. Nonlin. Diff.
Eq. Appl., to appear
10. Mielke, A., Theil, F., Levitas, V. (2002): A variational formulation of rate-
independent phase transformations using extremum principle. Arch. Rat. Mech.
Anal., 162, 137-177
11. Pedregal, P. (1997): Parametrized Measures and Variational Principles.
Birkhäuser, Basel
12. Roubíček, T., Kružík, M. (2002): Mesoscopic model for ferromagnets with
isotropic hardening. Zeitschrift f. Angew. Math. Phys., to appear
13. Visintin, A. (1997): Modified Landau-Lifshitz equation for ferromagnetism.
Physica B, 233, 365-369
14. Young, L.C. (1937): Generalized curves and existence of an attained absolute
minimum in the calculus of variations. Comptes Rendus de la Societe et des
Lettres de Varsovie, Classe III, 30, 212-234
Mixed Finite Element Method on Polygonal
and Polyhedral Meshes

Yuri Kuznetsov 1 and Sergey Repin 2

1 Department of Mathematics, University of Houston, Houston, TX 77204


[email protected]
2 St. Petersburg Department of V.A. Steklov Institute of Mathematics of the
Russian Academy of Sciences, St. Petersburg 191011, Russia [email protected]

Summary. A new mixed finite element method for the diffusion equations on gen-
eral polygonal and polyhedral meshes is presented. The basis vector functions in
macro cells are designed by solving the local mixed finite element problems with the
lowest order Raviart-Thomas elements. Numerical results for the Poisson equation
on distorted prismatic meshes are given.

1 Introduction

This paper is a natural generalization of the authors' recent results [3] to the
general diffusion equation with mixed type boundary conditions. The paper is
organized as follows. In Section 2 we give the formulation of the problem and
describe the meshes to be used. Section 3 is the most important in the paper.
We describe an approach to designing macroelements in the space of fluxes by
solving mixed finite element problems with the lowest order Raviart-Thomas
elements [1], [4] on macrocells. A convergence result is given in Section 4.
Finally, in Section 5 we present and discuss numerical results for a particular
3D test problem.

2 Problem formulation

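We consider the diffusion problem in the form of the system of first order
differential equations

\[ \begin{aligned}
K^{-1} \vec u + \operatorname{grad} p &= 0, \\
\operatorname{div} \vec u + c\,p &= f
\end{aligned} \tag{1} \]

in a bounded connected polygonal (polyhedral) domain $\Omega$ in $\mathbb{R}^d$, $d = 2$ ($d = 3$),
with homogeneous boundary conditions

\[ \begin{aligned}
p &= 0 && \text{on } \Gamma_D, \\
\vec u \cdot \vec n &= 0 && \text{on } \Gamma_N.
\end{aligned} \tag{2} \]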
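Here $\Gamma_D$ and $\Gamma_N$ are the Dirichlet and the Neumann parts of the boundary
$\partial\Omega$, $\vec n$ is the outward unit normal to $\partial\Omega$, $K = K(x)$ is the diffusion tensor,
$K = K^T > 0$, $c = c(x)$ is a nonnegative function, and $f \in L_2(\Omega)$. We assume
that $\Gamma_D$ is a closed subset of $\partial\Omega$ consisting of a finite number of segments
(polygons) in the case $d = 2$ ($d = 3$).
The weak formulation of (1), (2) reads as follows: find
$\vec u \in V \equiv \{\vec v : \vec v \in H_{\mathrm{div}}(\Omega),\ \vec v \cdot \vec n = 0 \text{ on } \Gamma_N\}$, $p \in Q \equiv L_2(\Omega)$
such that

\[ \begin{aligned}
\int_\Omega (K^{-1} \vec u) \cdot \vec v\,dx - \int_\Omega p\,(\nabla \cdot \vec v)\,dx &= 0, \\
\int_\Omega (\nabla \cdot \vec u)\,q\,dx + \int_\Omega c\,p\,q\,dx &= \int_\Omega f q\,dx
\end{aligned} \tag{3} \]

for all $(\vec v, q) \in V \times Q$.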
Let $\Omega_h$ be a partitioning of $\Omega$ into $m$ nonoverlapping polygonal (polyhedral)
macro cells $E_k$:

\[ \Omega_h = \bigcup_{k=1}^{m} E_k, \tag{4} \]

and $V_h$ and $Q_h$ be finite element subspaces of $V$ and $Q$, respectively. In this
paper we assume that the interface $\Gamma_{kl} = \partial E_k \cap \partial E_l$ between macro cells $E_k$
and $E_l$ is either a point, or a segment, or a simply connected polygon in the
case $d = 3$ (see, for instance, Fig. 1). We also assume that for any function
$p_h \in Q_h$ its trace on $E_k$ is a constant, and for any vector-function $\vec v \in V_h$ its
trace on $E_k$ is a piecewise affine vector-function. Moreover, we assume that for
any vector-function $\vec v_h \in V_h$ its normal component $\vec v_h \cdot \vec n_{kl}$ on $\Gamma_{kl}$ is a constant,
where $\vec n_{kl}$ is the unit normal vector to $\Gamma_{kl}$ directed from $E_k$ to $E_l$, $k > l$. An
example of a polygonal mesh $\Omega_h$ is given in Fig. 1. The arrows on the interfaces
show the normal component of the fluxes.
In the case of a triangular ($d = 2$) or a tetrahedral ($d = 3$) partitioning
of $\Omega$, $V_h$ is the lowest order Raviart-Thomas space $RT_0(\Omega_h)$ subject to the
boundary condition on $\Gamma_N$.
The mixed finite element approximation to (1), (2) reads as follows: find
$(\vec u_h, p_h) \in V_h \times Q_h$ such that

\[ \begin{aligned}
\int_\Omega (K^{-1} \vec u_h) \cdot \vec v\,dx - \int_\Omega p_h\,(\nabla \cdot \vec v)\,dx &= 0, \\
\int_\Omega (\nabla \cdot \vec u_h)\,q\,dx + \int_\Omega c\,p_h\,q\,dx &= \int_\Omega f q\,dx
\end{aligned} \tag{5} \]

for all $(\vec v, q) \in V_h \times Q_h$. The finite element problem results in the system of
linear algebraic equations

Fig. 1. Example of a polygonal mesh

(6)

with a saddle point matrix

\[ A = \begin{pmatrix} M & B^T \\ B & -\Sigma \end{pmatrix}, \tag{7} \]

where $M$ is a symmetric positive definite matrix and $\Sigma$ is a symmetric positive
definite (or semidefinite) matrix.
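
To make the saddle point structure concrete, here is a minimal one-dimensional sketch. It assumes the model problem u + p' = 0, u' + c p = f on (0, 1) with p(0) = p(1) = 0 and K = 1, continuous piecewise linear fluxes (the 1D analogue of the lowest order Raviart-Thomas space) and piecewise constant pressures. The names solve_dense and mixed_rt0_1d are hypothetical, the right hand side uses midpoint quadrature, and the dense elimination is only meant for tiny systems. With B taken as minus the divergence matrix, the assembled matrix has the block form [[M, B^T], [B, -Sigma]].

```python
import math


def solve_dense(matrix, rhs):
    """Gaussian elimination with partial pivoting (adequate for tiny systems)."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= factor * a[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / a[r][r]
    return x


def mixed_rt0_1d(n, f, c=0.0):
    """Assemble and solve [[M, B^T], [B, -Sigma]] (u, p) = (0, -F) on n cells."""
    h = 1.0 / n
    n_u = n + 1                      # nodal flux unknowns
    size = n_u + n                   # plus one pressure unknown per cell
    A = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    for e in range(n):
        # Local mass matrix of the linear flux basis: h/6 * [[2, 1], [1, 2]].
        for i, j, v in ((0, 0, 2.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 2.0)):
            A[e + i][e + j] += v * h / 6.0
        # B = -(divergence matrix); the divergence entries are -1 (left), +1 (right).
        for j, d in ((e, -1.0), (e + 1, 1.0)):
            A[n_u + e][j] += -d
            A[j][n_u + e] += -d
        A[n_u + e][n_u + e] -= c * h                 # block -Sigma
        rhs[n_u + e] = -f((e + 0.5) * h) * h         # block -F, midpoint rule
    sol = solve_dense(A, rhs)
    return sol[:n_u], sol[n_u:]


flux, pressure = mixed_rt0_1d(16, lambda x: math.pi ** 2 * math.sin(math.pi * x))
```

For this right hand side the exact pressure is sin(pi x), so the computed cell values can be compared with the exact values at the cell midpoints.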

3 New finite element space $V_h$

Let $E$ be a macro cell in $\Omega_h$ and $\partial E$ be a union of $s$ segments, $d = 2$ (polygons,
$d = 3$), $\Gamma_1, \dots, \Gamma_s$. For the moment, we assume that any two adjacent segments
(polygons) $\Gamma_i$ and $\Gamma_j$, $i \ne j$, do not belong to the same line (plane). We recall
that adjacent interfaces of a macro cell $E_k$ with neighboring macro cells $E_i$ and
$E_j$ may belong to the same line, $d = 2$ (plane, $d = 3$). A possible situation is
shown in Fig. 2.

Fig. 2. A macro cell $E$ and adjacent macro cells $E_k$, $E_i$, and $E_j$

Let $E_h = \bigcup_{i=1}^{t} e_i$ be a conformal partitioning of $E$ into triangles $e_i$, $d = 2$
(tetrahedra, $d = 3$), and $RT_0(E_h)$ be the lowest order Raviart-Thomas finite
element space. Here $t$ is the total number of cells. We consider the mixed finite
element approximation to system (1) with right hand side $f \equiv \mathrm{const}_E$ in $E$,
$c \equiv 0$ in $E$, and with boundary conditions $\vec u \cdot \vec n_E = \delta_{1,j}$ on $\Gamma_j$, $j = 1, \dots, s$, where
$\vec n_E$ is the outward unit normal to $\partial E$: find $\vec u_{E,h} \in RT_0(E_h)$, $\vec u_{E,h} \cdot \vec n_E = \delta_{1,j}$
on $\Gamma_j$, $j = 1, \dots, s$,
$p_{E,h} \in Q_{E,h} \equiv \{q : q \equiv \mathrm{const} \text{ on } e_i,\ i = 1, \dots, t\}$
such that

\[ \begin{aligned}
\int_E (K^{-1} \vec u_{E,h}) \cdot \vec v\,dx - \int_E p_{E,h}\,(\nabla \cdot \vec v)\,dx &= 0, \\
\int_E (\nabla \cdot \vec u_{E,h})\,q\,dx &= \mathrm{const}_E \int_E q\,dx
\end{aligned} \tag{8} \]

for all $\vec v \in RT_0(E_h)$, $\vec v \cdot \vec n_E = 0$ on $\partial E$, $q \in Q_{E,h}$. Here $\delta_{i,j}$ denotes the
Kronecker delta. The value of $\mathrm{const}_E$ in (8) is subject to the equation

\[ \int_E \nabla \cdot \vec u_{E,h}\,dx \equiv \int_{\partial E} (\vec u_{E,h} \cdot \vec n_E)\,dl = \mathrm{const}_E \cdot |E| \]

which gives

\[ \mathrm{const}_E = \frac{|\Gamma_1|}{|E|}, \tag{9} \]

where

\[ |\Gamma_1| = \int_{\Gamma_1} dl \quad \text{and} \quad |E| = \int_E dx. \]

Under condition (9) problem (8) has the unique solution $\vec w_1 = \vec u_{E,h}$.
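
The compatibility condition above fixes the constant divergence of each basis function by geometry alone: the integral of the normal flux over the boundary of E must equal the constant times |E|. A minimal two-dimensional sketch follows, assuming a polygonal macro cell given by counterclockwise vertices whose edge i joins vertex i to vertex i + 1; polygon_area, edge_length and divergence_constants are hypothetical names.

```python
import math


def polygon_area(vertices):
    """Area |E| of a simple polygon by the shoelace formula."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def edge_length(vertices, i):
    """Length of the edge joining vertex i to vertex i + 1 (cyclically)."""
    x0, y0 = vertices[i]
    x1, y1 = vertices[(i + 1) % len(vertices)]
    return math.hypot(x1 - x0, y1 - y0)


def divergence_constants(vertices):
    """Constant divergence |Gamma_i| / |E| for unit normal flux through edge i."""
    area = polygon_area(vertices)
    return [edge_length(vertices, i) / area for i in range(len(vertices))]


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
constants = divergence_constants(unit_square)
```

By linearity, a combination of such basis functions with coefficients alpha_i has the constant divergence sum(alpha_i * constants[i]) on the macro cell; for the normal fluxes of a constant vector field this sum vanishes.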
Repeating the same procedure on the same mesh $E_h$ with the boundary
conditions $\vec u_{E,h} \cdot \vec n_E = \delta_{i,j}$ on $\Gamma_j$, $1 \le j \le s$, $i = 2, \dots, s$, we get the
vector-functions $\vec w_1, \vec w_2, \dots, \vec w_s$ which satisfy the conditions

\[ \vec w_i \cdot \vec n_E = \delta_{i,j} \quad \text{on } \Gamma_j, \quad i, j = 1, \dots, s. \]

Consequently, a vector function

\[ \vec w = \sum_{i=1}^{s} \alpha_i \vec w_i \tag{10} \]

satisfies the boundary conditions

\[ \vec w \cdot \vec n_E = \alpha_i \quad \text{on } \Gamma_i, \quad i = 1, \dots, s. \tag{11} \]

The extension of the procedure to the case when adjacent $\Gamma_i$ and $\Gamma_j$ may
belong to the same line, $d = 2$, or the same plane, $d = 3$, is straightforward.
To complete the description of the procedure for designing the vector-functions
$\vec w_i$, $i = 1, \dots, s$, we consider a situation where $\Gamma_1$ is partitioned into
two segments, $d = 2$ (simply connected polygons, $d = 3$), $\Gamma_{1,D}$ and $\Gamma_{1,N}$. We
assume that any cell $e_i$ from $E_h$ adjacent to the boundary $\partial E$ satisfies the
condition that any of its edges, $d = 2$ (faces, $d = 3$), belongs either to the
interior of $E$, or to $\Gamma_D$, or to $\Gamma_N$. Then, the requested vector function $\vec w_1$ is
the solution of the problem:
find $\vec w_1 \in RT_0(E_h)$, $\vec w_1 \cdot \vec n_E = 1$ on $\tilde\Gamma_1 \equiv \Gamma_1 \setminus \Gamma_N$, $\vec w_1 \cdot \vec n_E = 0$ on $\partial E \setminus \tilde\Gamma_1$,
$p_{E,h} \in Q_{E,h}$ such that the equations (8) are satisfied for any $\vec v \in RT_0(E_h)$,
$\vec v \cdot \vec n_E = 0$ on $\partial E$, $q \in Q_{E,h}$. The constant on the right hand side of (8) is
defined by the formula

\[ \mathrm{const}_E = \frac{|\tilde\Gamma_1|}{|E|}, \tag{12} \]

where

\[ |\tilde\Gamma_1| = \int_{\tilde\Gamma_1} dl. \]

The extension of the procedure to other $\Gamma_j$, $2 \le j \le s$, is again straightforward.
The vector-function $\vec w$ in (10) with the new vector-functions $\vec w_i$ satisfies the
conditions

\[ \begin{aligned}
\vec w \cdot \vec n_E &= \alpha_i && \text{on } \Gamma_i, \quad i = 1, \dots, s, \\
\vec w \cdot \vec n_E &= 0 && \text{on } \partial E \cap \Gamma_N.
\end{aligned} \tag{13} \]

We denote the designed space of finite element vector-functions $\vec w(x)$ for
a macroelement $E_k$ in $\Omega_h$ by $V_{k,h}$. The global space $V_h \subset H_{\mathrm{div}}(\Omega)$ for the
mesh is defined by

(14)

4 Convergence

To prove the convergence result for the mixed finite element problem (5) with
the designed space $V_h$ we assume that

- all triangles (tetrahedra) $e_i^{(k)}$, $i = 1, \dots, t_k$, in the partitionings of $E_k$,
  $k = 1, \dots, m$, are regular shaped [1],
- the number of triangles (tetrahedra) $t_k$ in the partitionings of $E_k$ is
  bounded by a constant independent of $k$, i.e.,

\[ \max_{1 \le k \le m} t_k \le \mathrm{const}. \tag{15} \]

Then, the convergence result follows from [3] after some minor modifications.

Theorem 1. Under the assumptions made, the finite element solution $(\vec u_h, p_h)$
of problem (5) converges to the solution $(\vec u, p)$ of problem (3) in
$H_{\mathrm{div}}(\Omega) \times L_2(\Omega)$, i.e.

\[ \|\vec u_h - \vec u\|_{H_{\mathrm{div}}(\Omega)} \to 0, \qquad \|p_h - p\|_{L_2(\Omega)} \to 0 \quad \text{as } h \to 0, \tag{16} \]

where

\[ h = \max_{1 \le k \le m} \operatorname{diam} E_k. \]

5 Numerical results
To illustrate the proposed method by numerical results we consider the
Dirichlet boundary value problem for the Laplace equation:

\[ \begin{aligned}
\Delta p &= 0 && \text{in } \Omega, \\
p &= g && \text{on } \partial\Omega,
\end{aligned} \tag{17} \]

where $\Omega$ is a prism in $\mathbb{R}^3$ with the rectangular faces orthogonal to the
$(x, y)$-plane and the triangular faces parallel to the $(x, y)$-plane, and the
function $g$ is the trace on $\partial\Omega$ of the test harmonic function
$p = (x^2 - z^2) + x(y^2 - z^2)$.
Let $\Omega_h$ be a distorted prismatic mesh which is obtained from the reference
rectangular prismatic mesh by perturbation of vertices along the vertical mesh
lines. An example of such a distorted macro cell is given in Fig. 3. The choice of
this mesh is relevant to geophysical applications (basin modeling and reservoir
simulation) where distorted prismatic meshes are used for discretizations of
diffusion type equations in layered formations.
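
As a quick sanity check of the test solution, the sketch below evaluates a seven-point discrete Laplacian of p = (x^2 - z^2) + x(y^2 - z^2). The function names and the step size are assumptions made for illustration. Since p is a cubic polynomial, the second-order central differences are exact up to rounding, so the result should vanish at every point.

```python
def p(x, y, z):
    """Test harmonic function from the numerical example."""
    return (x * x - z * z) + x * (y * y - z * z)


def discrete_laplacian(f, x, y, z, h=0.1):
    """Seven-point central difference approximation of the Laplacian of f."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / (h * h)


residual = discrete_laplacian(p, 0.3, -0.7, 1.2)
```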
Fig. 3. Reference (dashed lines) and distorted (solid lines) prismatic macro cell

[Plot: pressures, $p_0(x, y, z) = (x^2 - z^2) + x(y^2 - z^2)$, $\alpha = 0.3$; horizontal axis: mesh step size $h$, logarithmic scales]

Fig. 4. Error for pressures in C-norm

[Plot: normal component of fluxes, $p_0(x, y, z) = (x^2 - z^2) + x(y^2 - z^2)$, $\alpha = 0.3$; horizontal axis: mesh step size $h$, logarithmic scales]

Fig. 5. Error for normal component of fluxes in C-norm


Four discretization methods have been utilized; the numerical results are
shown in Fig. 4 and Fig. 5. The first method is the nonconforming mixed finite
element method with the lowest order Raviart-Thomas elements. In fact, the
finite elements for the space of fluxes are nonconforming only on triangular
interfaces between distorted prisms, where the continuity conditions for the
normal component of fluxes are imposed at the center of mass of the triangles.
On the quadrilateral interfaces, which are orthogonal to the $(x, y)$-coordinate
plane, the normal components of the fluxes are constant.
The second method, which is called "Cubature" in the figures, is a
modification of the mimetic discretization [5] adapted to prismatic meshes.
The third method is called "Interpolation". This is a conforming finite element
method based on a modification of the Piola transformation adapted to
prismatic meshes.
Finally, the method proposed in this paper is called "Const Div". The
numerical results are given for the error functions in the discrete maximum
norms. For the solution function p (called "pressure") we measure the errors
for the mean values over the macrocells, and for the fluxes we measure the
mean values of the normal components over the interfaces.
The numerical results show that all the methods provide the same order
of accuracy of approximation to the solution of the differential problem (17).
We recall that the method proposed in this paper can be used on arbitrary
polygonal and polyhedral meshes.

6 Acknowledgments
The work was partly supported by a grant from Los Alamos Computational
Sciences Institute (LACSI). The authors are grateful to O. Boyarkin for pro-
viding us with the results of numerical experiments.

References
1. Brezzi, F., Fortin, M. (1991): Mixed and hybrid finite element methods. Springer,
Berlin
2. Kuznetsov, Yu. (2003): Spectrally equivalent preconditioners for mixed hybrid
discretizations of diffusion equations on distorted meshes. J. Numer. Math., v. 11,
No. 1, 61-74
3. Kuznetsov, Yu., Repin, S. (2003): New mixed finite element method on polygonal
and polyhedral meshes. Russian J. Numer. Anal. and Math. Modelling, v. 18,
No. 3, 261-278
4. Roberts, J.E., Thomas, J.-M. (1991): Mixed and hybrid methods. In: Ciarlet, P.,
Lions, J.-L. (eds) Handbook of Numerical Analysis II. Finite element methods,
523-639. Elsevier/North Holland, Amsterdam
5. Shashkov, M., Steinberg, S. (1996): Solving diffusion equations with rough
coefficients in rough grids. J. Comp. Phys., 129, 383-405
Semi-discrete Schemes for Hamilton-Jacobi
Equations on Unstructured Grids

Doron Levy 1 and Suhas Nayak 2

1 Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA


[email protected]. edu
2 Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA
[email protected]

Summary. We present a new semi-discrete central scheme for approximating solu-


tions of Hamilton-Jacobi equations on unstructured meshes. This scheme extends the
numerical Hamiltonians of Kurganov et al. to unstructured grids. Similarly to the
previous works on structured grids, a semi-discrete formulation of central schemes is
made possible due to estimates of the local speeds of propagation. The consistency
of the method is obtained following Abgrall's calculations for the consistency of an
upwind Lax-Friedrichs scheme on unstructured grids. We conclude with comments
on high-order reconstructions.

1 Introduction

We present a new central-upwind scheme for approximating solutions of
Hamilton-Jacobi (HJ) equations on unstructured grids. These equations are
of the form
\[ \phi_t + H(\nabla\phi) = 0, \tag{1} \]
where $\phi = \phi(x, t)$, and with a Hamiltonian, $H$, that depends on $\nabla\phi$ and
possibly on $x$ and $t$. Since solutions of (1) may develop discontinuous derivatives
even when the initial data is smooth, it is generally required to interpret the
solution of (1) in a suitable weak sense. The corresponding theory, in the form
of "viscosity solutions" has been significantly developed over the past two
decades and we refer to [5, 6, 14, 15].
The increased understanding of the nature of solutions to HJ equations
has turned numerical methods for HJ equations into an active
research area. Converging first-order methods were introduced by Souganidis
[19]. The order of accuracy of the methods was increased through an
essentially non-oscillatory (ENO) reconstruction in the upwind schemes of Osher,
Sethian and Shu [17, 18]. Weighted essentially non-oscillatory (WENO)
reconstructions, which were first introduced for hyperbolic conservation laws [8, 16],
were then used by Jiang and Peng [7] to increase even further the accuracy of
the numerical approximations using a compact reconstruction. Extensions of
the first-order and ENO upwind methods to unstructured grids were done by
Abgrall [1]. The numerical fluxes of Abgrall were combined with WENO
reconstructions on triangular meshes by Hu and Shu in [20]. Another finite-volume
scheme on unstructured grids was proposed by Kossioris et al. in [9].
While upwind schemes require solving Riemann problems (or at least ap-
proximating their solutions) on the interfaces between two cells, Godunov-type
central schemes utilize evolution points that are located away from the discon-
tinuous interfaces in order to avoid Riemann solvers altogether. Fully-discrete
central schemes for HJ equations were first introduced by Lin and Tadmor
in [12, 13], and further improved by Bryson and Levy in [2]. Fully-discrete
schemes of order greater than two were derived in [3] using central-WENO
reconstructions. Semi-discrete formulations of central schemes enjoy reduced
numerical dissipation while keeping track of the local speeds of propagation
of information that is propagating from the discontinuous interfaces. Second-
order semi-discrete central schemes for HJ equations were derived by Kurganov
and Tadmor in [11]. A more accurate estimate of the local speed of propagation
was then utilized to reduce the numerical dissipation in [10]. The numerical flux
of Kurganov, Noelle and Petrova, was combined with the WENO reconstruc-
tions of Jiang and Peng to obtain fifth-order, semi-discrete central schemes for
multi-dimensional HJ equations in [4].
In this paper we extend our previous works on semi-discrete central schemes
for multi-dimensional HJ equations to unstructured grids. Our derivation results
in a new numerical flux that takes into account information regarding
the local speeds of propagation of information from the discontinuous inter-
faces. Similarly to any other central scheme, the evolution points are taken
away from the interfaces in order to avoid Riemann solvers. This numerical
Hamiltonian should be viewed as a central version of the numerical Hamilto-
nian of Abgrall [1]. It can also be viewed as a generalization of the numerical
Hamiltonians of Kurganov et al. [10, 11] that were derived for Cartesian grids.
When we assume that the "unstructured" grid is Cartesian (which still is an
admissible grid in our formulation), the scheme we obtain is different from the
schemes obtained in [11, 10]. This difference results from a different averaging
procedure we use with the values obtained at the different evolution points,
and is reflected by a different coefficient in front of the dissipative term.

2 A Semi-discrete scheme for HJ equations


We consider two-dimensional Hamilton-Jacobi equations of the form

$\phi_t + H(\nabla\phi) = 0$, (1)
augmented with initial values $\phi(x, t = 0) = \phi_0(x)$. We assume that $\mathcal{T}$ is a
given triangulation of $\Omega$. The grid points are denoted by $x_\alpha$. For every grid
point there are $m_\alpha$ angular sectors $T_l^\alpha$ that are ordered counterclockwise. For
simplicity we will drop the $\alpha$ index from the triangles, and use the notation
Semi-discrete Schemes for Hamilton-Jacobi Equations 625

Fig. 1. Grid point $x_\alpha$ and its angular sectors. Each angular sector $l$ has an associated
angle $\theta_l$

$T_l = T_l^\alpha$ whenever possible. Each node of our triangulation may be visualized
as in Figure 1.
The semi-discrete scheme will be constructed as a limit of a fully-discrete
scheme as the time-step tends to zero. We therefore assume a time step, $\Delta t$,
and use the standard notation $t^n = n\Delta t$. Let $\varphi_\alpha^n$ denote the approximate value
of $\phi(x_\alpha, t^n)$. We assume that for each time $t^n$, $\varphi_\alpha^n$ is given at each node $x_\alpha$
of the unstructured grid. We also assume that the values $\varphi_\alpha^n$ can be used to
reconstruct a continuous piecewise-polynomial interpolant, i.e., a polynomial
in each sector. We will comment below on a method for obtaining such a
reconstruction. In either case, the numerical flux we develop is independent of
the reconstruction step. We denote this reconstruction by $\tilde\varphi_\alpha$ and denote the
approximation of $\nabla\varphi$ in each sector by $\nabla\tilde\varphi_{\alpha,l}$.
The second ingredient we need is an estimation of the maximal speeds of
propagation at the interfaces of each angular sector (in a direction that is
perpendicular to the interface). For any given angular sector, $T_l$, the
counter-clockwise speed of propagation is denoted by $a_l^+$ and the speed of propagation
on the other interface is $a_l^-$. These speeds can be estimated by:
$a_l^+ = \max\left\{ \max_{T_l}\{\nabla H(\nabla\tilde\varphi_{\alpha,l}) \cdot n_{l-1,l}\},\ \max_{T_{l-1}}\{\nabla H(\nabla\tilde\varphi_{\alpha,l-1}) \cdot n_{l-1,l}\} \right\}$,
$a_l^- = \max\left\{ \max_{T_l}\{\nabla H(\nabla\tilde\varphi_{\alpha,l}) \cdot n_{l+1,l}\},\ \max_{T_{l+1}}\{\nabla H(\nabla\tilde\varphi_{\alpha,l+1}) \cdot n_{l+1,l}\} \right\}$, (2)
where $n_{j,l}$ is the normal vector on the interface between sectors $T_j$ and $T_l$
pointing into $T_l$.
We can now determine in every sector $T_l$ around $x_\alpha$ an evolution point $x_\alpha^l$
that is located away from the interfaces (see Figure 2). This will be done using
the distances that are defined through the local speeds of propagation $a_l^\pm$.
The distance of the evolution point $x_\alpha^l$ from $x_\alpha$ is denoted by $d_l$. Clearly,
$d_l$ depends on the local speeds of propagation $a_l^\pm$ and on the angle $\theta_l$ and is
given by
(3)

Fig. 2. Evolution point, $x_\alpha^l$, derived from the maximal local speeds of propagation
into $T_l$, $a_l^+$ and $a_l^-$

We then define $d_l'$ to be $d_l/\Delta t$, a quantity that does not explicitly depend on
$\Delta t$.
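
The geometry behind the evolution point can be sketched in a few lines. Since the explicit formula (3) is not reproduced in this text, the closed form below is an assumption: it places the evolution point at perpendicular distances $a_l^+$ and $a_l^-$ (per unit time) from the two interfaces of a sector with opening angle $\theta_l$, which is consistent with the splitting (12) and with the equal-speed case in Remark 1. The function names are hypothetical.

```python
import math

def evolution_distance(a_plus, a_minus, theta):
    """Scaled distance d' = d / dt of the evolution point from the node.

    Assumption (not taken from the paper): the point lies at perpendicular
    distances a_plus and a_minus from the two interfaces of the sector.
    """
    if not 0.0 < theta < math.pi:
        raise ValueError("sector angle must lie in (0, pi)")
    numerator = math.sqrt(
        a_plus**2 + a_minus**2 + 2.0 * a_plus * a_minus * math.cos(theta)
    )
    return numerator / math.sin(theta)

def split_angles(a_plus, a_minus, theta):
    """Angles theta^+ and theta^- = arcsin(a^+- / d'), valid when both are acute."""
    d_prime = evolution_distance(a_plus, a_minus, theta)
    return math.asin(a_plus / d_prime), math.asin(a_minus / d_prime)
```

For equal speeds the distance collapses to $a/\sin(\theta_l/2)$, and for acute splittings the two angles add up to the sector angle.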
The interpolant $\tilde\varphi(x, t^n)$ is evolved to the next time step $t^{n+1}$ at the points
$x_\alpha^l$, which are located away from the propagating discontinuities (assuming
that the time step $\Delta t$ is sufficiently small). From (1), this is given to first
order in time by the Taylor expansion
$\varphi(x_\alpha^l, t^{n+1}) = \tilde\varphi(x_\alpha^l, t^n) - \Delta t\, H(\nabla\tilde\varphi(x_\alpha^l, t^n)) + O(\Delta t^2)$, (4)
where the approximation of the gradient $\nabla\varphi$ at $x_\alpha^l$, $\nabla\tilde\varphi(x_\alpha^l, t^n)$, is obtained
from the reconstruction.
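The next step is to combine the values of $\varphi$ at the different evolution
points around $x_\alpha$, $x_\alpha^l$, into one value $\varphi_\alpha^{n+1}$. This is done by writing a convex
combination with weights $s_l \ge 0$ that are yet to be determined:
$\varphi_\alpha^{n+1} = \frac{\sum_{l=1}^{m_\alpha} s_l\, \varphi(x_\alpha^l, t^{n+1})}{\sum_{l=1}^{m_\alpha} s_l}$. (5)
Using (4), we may express (5) as
$\varphi_\alpha^{n+1} = \frac{\sum_{l=1}^{m_\alpha} s_l \left[ \tilde\varphi(x_\alpha^l, t^n) - \Delta t\, H(\nabla\tilde\varphi(x_\alpha^l, t^n)) \right]}{\sum_{l=1}^{m_\alpha} s_l}$. (6)
If we now define $p_l$ to be the unit vector in the direction of $x_\alpha^l$ from $x_\alpha$, we
can use a Taylor expansion in space
$\tilde\varphi(x_\alpha^l, t^n) = \tilde\varphi(x_\alpha, t^n) + d_l\, p_l \cdot \nabla\tilde\varphi(x_\alpha^l, t^n) + O(\Delta t^2)$.
Here by $\nabla\tilde\varphi(x_\alpha^l, t^n)$ we refer to the value of the gradient at $x_\alpha$ that is associated
with the reconstruction in sector $T_l$ at $x_\alpha^l$. We may therefore rewrite (6) as the
fully discrete scheme
$\varphi_\alpha^{n+1} = \varphi_\alpha^n + \frac{\Delta t}{\sum_{l=1}^{m_\alpha} s_l} \sum_{l=1}^{m_\alpha} s_l \left[ d_l'\, p_l \cdot \nabla\tilde\varphi(x_\alpha^l, t^n) - H(\nabla\tilde\varphi(x_\alpha^l, t^n)) \right]$. (7)
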
In the limit $\Delta t \to 0$, (7) becomes a semi-discrete scheme:

$\frac{d}{dt}\varphi_\alpha(t) = \frac{1}{\sum_{l=1}^{m_\alpha} s_l} \sum_{l=1}^{m_\alpha} s_l \left[ d_l'\, p_l \cdot \nabla\tilde\varphi_\alpha^l(t) - H(\nabla\tilde\varphi_\alpha^l(t)) \right]$, (8)

where for each $l$, $\nabla\tilde\varphi_\alpha^l(t)$ denotes $\lim_{\Delta t \to 0} \nabla\tilde\varphi(x_\alpha^l, t^n)$. All that remains is to
determine the coefficients $s_l$ in (8). These coefficients will be obtained through
a consistency condition. The consistency of the scheme implies that if the
value of the gradient is identical in every sector that surrounds $x_\alpha$ then the
numerical Hamiltonian should reduce to the differential Hamiltonian. Hence
we seek coefficients $s_l$ such that
$\sum_{l=1}^{m_\alpha} s_l\, d_l'\, p_l = 0$. (9)

Such coefficients indeed exist, and we can use, e.g., the results of Abgrall in [1]
to find them. The observation that was made there was that if $\mu_{l+1/2}$ denotes
a unit vector in a direction that is aligned with the interface between the
sectors $T_l$ and $T_{l+1}$, and if $\theta_l < \pi$ (which is the case with a triangulation, e.g.),
then
$\sum_{l=1}^{m_\alpha} \gamma_{l+\frac{1}{2}}\, \mu_{l+\frac{1}{2}} = 0$, (10)
provided that
$\gamma_{l+\frac{1}{2}} = \varepsilon \left[ \tan\left(\frac{\theta_l}{2}\right) + \tan\left(\frac{\theta_{l+1}}{2}\right) \right]$, (11)
for any $\varepsilon > 0$. In order to incorporate (10)-(11) into our framework, we split
each angle $\theta_l$ into two parts $\theta_l^\pm$ that are defined as
$\theta_l^\pm = \arcsin \frac{a_l^\pm}{d_l'}$, (12)
(see Figure 3).

Fig. 3. The angles around $x_\alpha$

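The consistency condition (9) is then satisfied if the weights $s_l$ are defined
as
$s_l = \frac{\varepsilon}{d_l'} \left[ \tan\left(\frac{\theta_l^+ + \theta_{l-1}^-}{2}\right) + \tan\left(\frac{\theta_l^- + \theta_{l+1}^+}{2}\right) \right]$, (13)

where $\theta_l^\pm$ are given by (12). In our case, the coefficient $\varepsilon$ will anyhow cancel
out in the semi-discrete formulation (8), so it can be omitted from (13).
To summarize, if we define

$\beta_l = \left[ \tan\left(\frac{\theta_l^+ + \theta_{l-1}^-}{2}\right) + \tan\left(\frac{\theta_l^- + \theta_{l+1}^+}{2}\right) \right]$,

then the semi-discrete scheme is given by

$\frac{d}{dt}\varphi_\alpha(t) = \frac{1}{\sum_{l=1}^{m_\alpha} \beta_l / d_l'} \sum_{l=1}^{m_\alpha} \beta_l \left[ p_l \cdot \nabla\tilde\varphi_\alpha^l(t) - \frac{1}{d_l'} H(\nabla\tilde\varphi_\alpha^l(t)) \right]$. (14)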
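
The role of the tangent weights can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the split angles $\theta_l^\pm$ as given, assumes that sector $l$ is entered through the interface shared with sector $l-1$ (so the angle between $p_l$ and $p_{l+1}$ is $\theta_l^- + \theta_{l+1}^+$), and substitutes $s_l = \beta_l/d_l'$ into (8). All function names are hypothetical.

```python
import math

def beta_weights(theta_plus, theta_minus):
    """beta_l = tan((th_l^+ + th_{l-1}^-)/2) + tan((th_l^- + th_{l+1}^+)/2)."""
    m = len(theta_plus)
    return [
        math.tan(0.5 * (theta_plus[l] + theta_minus[l - 1]))
        + math.tan(0.5 * (theta_minus[l] + theta_plus[(l + 1) % m]))
        for l in range(m)
    ]

def evolution_directions(theta_plus, theta_minus):
    """Unit vectors p_l, obtained by walking counterclockwise around the node."""
    m = len(theta_plus)
    angle = theta_plus[0]  # p_0 sits theta_0^+ past the first interface
    dirs = []
    for l in range(m):
        dirs.append((math.cos(angle), math.sin(angle)))
        angle += theta_minus[l] + theta_plus[(l + 1) % m]
    return dirs

def semi_discrete_rhs(betas, dirs, d_primes, grads, hamiltonian):
    """Right-hand side of (8) with weights s_l = beta_l / d'_l."""
    num = 0.0
    den = 0.0
    for beta, (px, py), d, (gx, gy) in zip(betas, dirs, d_primes, grads):
        num += beta * (px * gx + py * gy - hamiltonian(gx, gy) / d)
        den += beta / d
    return num / den
```

With these weights the vector sum of $\beta_l p_l$ vanishes, so a gradient that is identical in all sectors makes the right-hand side collapse to $-H$, which is the consistency requirement (9).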

Remarks.
1. A simple version of the scheme can be obtained by assuming that all the
speeds of propagation are identical. In this case, the local speeds are
replaced by their maximum, i.e., $a = \max_l \{a_l^+, a_l^-\}$. In this case
$d_l' = \frac{a}{\sin(\theta_l/2)}$,
and the semi-discrete scheme (14) can be written in the simpler form

$\frac{d}{dt}\varphi_\alpha(t) = \frac{a}{\sum_{l=1}^{m_\alpha} \beta_l \sin\frac{\theta_l}{2}} \sum_{l=1}^{m_\alpha} \beta_l \left[ p_l \cdot \nabla\tilde\varphi_\alpha^l(t) - \frac{\sin\frac{\theta_l}{2}}{a} H(\nabla\tilde\varphi_\alpha^l(t)) \right]$. (15)
If, in addition to the velocities being identical, the angles are also assumed
to be identical, i.e. $\theta = \theta_l$, $\forall l$, then (15) can be further simplified into

$\frac{d}{dt}\varphi_\alpha(t) = \frac{1}{m_\alpha} \sum_{l=1}^{m_\alpha} \left[ \frac{a}{\sin\frac{\theta}{2}}\, p_l \cdot \nabla\tilde\varphi_\alpha^l(t) - H(\nabla\tilde\varphi_\alpha^l(t)) \right]$. (16)

2. In [1] Abgrall derived a Lax-Friedrichs-type scheme on triangular meshes.
In our notation, his scheme is of the form

$\frac{d}{dt}\varphi_\alpha(t) = \frac{a}{\pi} \sum_{l=1}^{m_\alpha} \beta_{l+\frac{1}{2}}\, p_{l+\frac{1}{2}} \cdot \frac{\nabla\tilde\varphi_\alpha^l(t) + \nabla\tilde\varphi_\alpha^{l+1}(t)}{2} - H\left( \frac{\sum_{l=1}^{m_\alpha} \theta_l \nabla\tilde\varphi_\alpha^l}{2\pi} \right)$. (17)
Here $p_{l+1/2}$ is the unit vector in the direction of the interface between the
sectors $T_l$ and $T_{l+1}$, and $\beta_{l+1/2} = \tan(\theta_l/2) + \tan(\theta_{l+1}/2)$. The derivation of
(17) involved evolution points that were located on the interfaces between
the sectors. This resulted in the form of the dissipative term in (17) that
contains averages of gradients in adjacent sectors. Also, the scheme (17)
involves a Hamiltonian that is evaluated at the average of the derivatives
that are computed in different sectors (with weights that are proportional
to the angles). This term was postulated to be in this form, and could have
taken different forms. In our case (14), this term is replaced by an average
over the Hamiltonian that is evaluated in different sectors. In our case, the
form of this term is dictated by the derivation of the scheme.
3. We would like to emphasize that the scheme (14) does not require Riemann
solvers. This is due to the scheme's derivation, which places the evolution
points away from the boundaries of the angular sectors around every point
$x_\alpha$.
4. If the grid is a Cartesian grid with equal spacing in the x- and y-directions,
the number of angular sectors at each point is $m_\alpha = 4$, and $\sin(\theta/2) =
\sin(\pi/4) = \sqrt{2}/2$. In the simple case where all velocities are taken to be
identical in both directions, the scheme (16) becomes
$\frac{d}{dt}\varphi_\alpha(t) = \frac{a}{2}\left( \varphi_x^+ - \varphi_x^- + \varphi_y^+ - \varphi_y^- \right) - \frac{1}{4}\left[ H(\varphi_x^+, \varphi_y^+) + H(\varphi_x^-, \varphi_y^+) + H(\varphi_x^+, \varphi_y^-) + H(\varphi_x^-, \varphi_y^-) \right]$, (18)
with the obvious notations, e.g. $H(\varphi_x^+, \varphi_y^+)$ is the Hamiltonian evaluated
at the gradient at $x_\alpha$ that is taken from the first quadrant. The scheme
(18) is identical to the consistent and monotone semi-discrete scheme that
was derived for Cartesian grids in [4, 11].
5. The order of accuracy of the scheme (14) is determined by the order of ac-
curacy of the reconstruction and the order of the ODE solver. Other than
that, the formulation (14) is independent of the order of accuracy of the
scheme. The scheme is a Godunov-type scheme with a global reconstruc-
tion that is evolved in time in evolution points that are located away from
the interfaces. In practice, the final semi-discrete scheme (14) requires only
the values of the gradient at the grid points Xa that are computed in the
different angular sectors around Xa. Hence, all that one needs from the
reconstruction is the values of these gradients. It is therefore possible to
use the same reconstructions that were developed for upwind schemes for
HJ equations on triangular meshes. For example, a high-order (third- or
fourth-order) weighted essentially non-oscillatory (WENO) reconstruction
for HJ equations on triangular grids was derived by Zhang and Shu [20].
It can be incorporated as it is into the present framework.
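
The Cartesian reduction described in Remark 4 can be verified with a short script. This is a minimal sketch under stated assumptions: with equal speeds the evolution directions are taken along the sector bisectors, the four quadrant gradients are built from hypothetical one-sided differences `xp`, `xm`, `yp`, `ym`, and the function names are illustrative only.

```python
import math

def rhs_equal_angles(a, grads, hamiltonian):
    """Right-hand side of (16): m equal sectors, grads[l] is the gradient in sector l."""
    m = len(grads)
    theta = 2.0 * math.pi / m
    total = 0.0
    for l, (gx, gy) in enumerate(grads):
        phi = (l + 0.5) * theta  # bisector of sector l (assumed evolution direction)
        px, py = math.cos(phi), math.sin(phi)
        total += a / math.sin(0.5 * theta) * (px * gx + py * gy) - hamiltonian(gx, gy)
    return total / m

def rhs_cartesian(a, xp, xm, yp, ym, hamiltonian):
    """Right-hand side of (18) written with one-sided derivatives."""
    dissipation = 0.5 * a * (xp - xm + yp - ym)
    average = 0.25 * (
        hamiltonian(xp, yp) + hamiltonian(xm, yp)
        + hamiltonian(xp, ym) + hamiltonian(xm, ym)
    )
    return dissipation - average
```

Feeding the quadrant gradients in counterclockwise order, `[(xp, yp), (xm, yp), (xm, ym), (xp, ym)]`, into `rhs_equal_angles` gives the same value as `rhs_cartesian`, including the coefficient $a/2$ in front of the dissipative term.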

3 Conclusion

We have derived a new semi-discrete central scheme for HJ equations on
unstructured grids. This scheme is a generalization of the semi-discrete central
schemes on Cartesian grids [4, 10, 11]. It is a Godunov-type scheme where
a global reconstruction is evolved in time and then projected back to the grid
points. Since the evolution is performed away from the interfaces of the
angular sectors, there is no need to use exact or approximate Riemann solvers. The
formal accuracy of the scheme depends on the accuracy of the reconstruction
and the order of the ODE solver being used.

References
1. Abgrall, R. (1996): Numerical discretization of the first-order Hamilton-Jacobi
equation on triangular meshes. Commun. Pure Appl. Math., 49, 1339-1373.
2. Bryson, S., Levy, D.: Central schemes for multi-dimensional Hamilton-Jacobi
equations. SIAM J. Sci. Comput. (to appear).
3. Bryson, S., Levy, D. (2003): High-Order Central WENO Schemes for Multi-
dimensional Hamilton-Jacobi Equations. SIAM J. Numer. Anal., 41, 1339-1369.
4. Bryson, S., Levy, D. (2003): High-order semi-discrete central-upwind schemes for
multi-dimensional Hamilton-Jacobi equations. J. Comput. Phys., 189, 63-87.
5. Crandall, M.G., Ishii, H., Lions, P.-L. (1992): User's guide to viscosity solutions
of second order partial differential equations. Bull. Amer. Math. Soc., 27, 1-67.
6. Crandall, M.G., Lions, P.-L. (1983): Viscosity solutions of Hamilton-Jacobi equa-
tions. Trans. Amer. Math. Soc., 277, 1-42.
7. Jiang, G.-S., Peng, D. (2000): Weighted ENO schemes for Hamilton-Jacobi equa-
tions. SIAM J. Sci. Comp., 21, 2126-2143.
8. Jiang G.-S., Shu C.-W. (1996): Efficient implementation of weighted ENO
schemes. J. Comput. Phys., 126, 202-228.
9. Kossioris G., Makridakis Ch., Souganidis P.E. (1999): Finite volume schemes for
Hamilton-Jacobi equations. Numer. Math., 83, 427-442.
10. Kurganov, A., Noelle S., Petrova G. (2001): Semi-discrete central-upwind
schemes for hyperbolic conservation laws and Hamilton-Jacobi equations. SIAM
J. Sci. Comp., 23, 707-740.
11. Kurganov, A., Tadmor, E. (2000): New high-resolution semi-discrete central
schemes for Hamilton-Jacobi equations. J. Comput. Phys., 160, 720-742.
12. Lin, C.-T., Tadmor, E. (2000): L1-stability and error estimates for approximate
Hamilton-Jacobi solutions. Numer. Math., 87, 701-735.
13. Lin, C.-T., Tadmor, E. (2000): High-resolution non-oscillatory central schemes
for approximate Hamilton-Jacobi equations. SIAM J. Sci. Comp., 21, 2163-2186.
14. Lions, P.L. (1982): Generalized solutions of Hamilton-Jacobi equations. Pitman,
London
15. Lions, P.L., Souganidis, P.E. (1995): Convergence of MUSCL and filtered schemes
for scalar conservation laws and Hamilton-Jacobi equations. Numer. Math., 69,
441-470.
16. Liu, X.-D., Osher, S., Chan, T. (1994): Weighted essentially non-oscillatory
schemes. J. Comput. Phys., 115, 200-212.
17. Osher, S., Sethian, J. (1988): Fronts propagating with curvature dependent
speed: algorithms based on Hamilton-Jacobi formulations. J. Comput. Phys., 79,
12-49.
18. Osher, S., Shu, C.-W. (1991): High-order essentially nonoscillatory schemes for
Hamilton-Jacobi equations. SIAM J. Numer. Anal., 28, 907-922.
19. Souganidis, P.E. (1985): Approximation schemes for viscosity solutions of
Hamilton-Jacobi equations. J. Diff. Equations, 59, 1-43.
20. Zhang, Y.-T., Shu, C.-W. (2002): High-order WENO schemes for Hamilton-Jacobi
equations on triangular meshes. SIAM J. Sci. Comput., 24, 1005-1030.
Numerical Simulation of Dislocation Dynamics

Vojtech Minarik 1, Jan Kratochvil 2, Karol Mikula 3 and Michal Benes 1

1 Dept. of Mathematics, Faculty of Nuclear Sciences and Physical Engineering,
Czech Technical University, Trojanova 13, 12000 Praha 2, Czech Republic
2 Dept. of Physics, Faculty of Civil Engineering, Czech Technical University,
Thakurova 7, 16629 Praha 6, Czech Republic
3 Dept. of Mathematics, Slovak University of Technology, Radlinskeho 11, 81368
Bratislava, Slovakia

Summary. The aim of this contribution is to present the current state of our re-
search in the field of numerical simulation of dislocations moving in crystalline materi-
als. The simulation is based on recent theory treating dislocation curves and dipolar
loops interacting by means of forces of elastic nature and hindered by the lattice
friction. The motion and interaction of one dislocation curve and one dipolar loop
placed in 3D space is considered. Equations of motion for a parametrically described
dislocation curve are discretized by the flowing finite volume method in space. The
interaction force is computed for each dipolar loop and along the discretized curve.
The resulting system of ordinary differential equations is solved by a higher order
time solver.

Physical background Plastic deformation of crystalline solids is a result of
the motion of dislocations. The theory of dislocations is described in a number
of textbooks, e.g. [8]. Here we recall only the basic mobile properties of
dislocations and the nature of their mutual interactions.
A dislocation is a line defect of the crystal lattice. Along the dislocation line
the regular crystallographic arrangement of atoms is disturbed. The dislocation
line is represented by a closed curve or a curve ending at the surface of the
crystal. At low homologous temperatures the dislocations can move only along
crystallographic planes (the slip planes) with the highest density of atoms. The
motion results in mutual slipping of the neighboring parts of the crystal along
the slip planes. The slip displacement carried by a single dislocation, called
Burgers vector, is equal to one of the vectors connecting the neighboring atoms.
The displacement field of atoms from their regular crystallographic posi-
tions around a dislocation line can be treated (except the close vicinity of the
line) as elastic stress and strain fields. On the other hand, a stress field exerts
a force on a dislocation. The combination of these two effects causes the elastic
interaction among dislocations.
One of the most distinguished features of plastic deformation at the mi-
croscale is a great overproduction of dislocations during a deformation process.
Only a small fraction of generated dislocations is needed to carry plastic
deformation, the rest is stored in the crystal. The deformed crystals supersaturated
with dislocations tend to decrease the internal energy by mutual screening
of their elastic fields. If dislocations possess a sufficient maneuverability pro-
vided by easy cross-slip (solids with wavy slip) the leading mechanism is an
individual screening. The dislocations are stored in the form of dipoles which
are transformed to prismatic dislocation dipolar loops of the prevailing edge
character or such loops are directly formed (the experimental evidence is sum-
marized in [9]).
The glide dislocations and the dislocation loops have much different char-
acteristic length scales and mobile properties:
- While the segments of glide dislocations extend over distances of microme-
ters, the size of the prismatic dipolar loops is of the order of 10 nm.
- The glide dislocations are moved by the shear stress resolved in the slip
plane, while the loops are drifted by stress gradients and/or swept by the
glide dislocations. Being prismatic, the loops can move only along the direction
parallel to their Burgers vector.
- During deformation, the glide dislocations become curved. The local curvature
of the glide dislocations seems to be one of the leading factors controlling
the patterning [11, 12]. The loops can be approximately treated as rigid
objects.
Due to the above mentioned complexity the formation of dislocation struc-
tures as a consequence of the interactions among dislocations is still an open
problem. In this paper we will be concerned with a particular case: a disloca-
tion curve interacting with a dipolar loop.

Dislocation Curve and Dipolar Loop We consider a plane dislocation


segment with fixed ends; the segment represented by a plane curve can bow in
a slip plane which is identified with the xz-plane of the coordinate system, i.e.
y = O. If the dislocation segment approaches a loop, the curve can pass by or
the curve and the loop start to move together or the curve is stopped by the
loop [13].
As the loop is allowed to move along the direction parallel to its Burgers
vector b only (see Fig. 1a), just the force component in that direction causes
the loop motion. Additionally, the lattice friction acts against the motion. The
detailed condition for the dislocation curve and the loop is specified in Sect ..
The position of the loop is represented by 3 coordinates of its center. There
are two types of dipolar loops: a vacancy dipolar loop and an interstitial dipolar
loop; each type in two stable configurations [10] (see Fig. 2).
In vacancy loops a strip of atoms in regular crystallographic positions is
missing. On the other hand, in interstitial loops an extra strip of atoms is
added. For that reason the vacancy and interstitial loops produce different
stress fields. The Burgers vectors of vacancy and interstitial loops have opposite
directions. We incorporate this fact into our model by using a negative value for
the x-axis component of the Burgers vector of an interstitial loop.

Fig. 1. Dislocation dynamics problem geometry: (a) Dislocation curve and dipolar
loop interaction; (b) Dipolar loop geometry

We denote the dipolar loop types and stable configurations by $V_1$, $V_2$ for
the vacancy loop, and $I_1$, $I_2$ for the interstitial loop, respectively (see Fig. 2).
In the mathematical model we represent the dipolar loop as a small rectangle
with two longer sides parallel to the z-axis and two shorter sides parallel either
to [1,1,0] or [1,-1,0] depending on the type of loop. The dimensions of the dipolar
loop are $2l$ and $2\sqrt{2}h$, respectively, see Fig. 1b.

Fig. 2. Types and stable configurations of a dipolar loop. Longer sides of dipolar
loop are parallel to the z-axis and lie in different layers of the atomic lattice.

Stress Field of Dipolar Loop Each type of dipolar loop produces a stress
field. The formula for this field will be needed in the numerical simulation of
the dislocation dynamics. In this work we use the stress field $\sigma_{ij}$ presented by
Kroupa et al. [6, 7] (using Einstein's summation convention):

In (1) we introduce the following symbols ($i, j, k, n \in \{1, 2, 3\}$):

$\sigma_{ij}$                                      $\sigma_{ij} = \sigma_{ij}(x, y, z)$, components of the stress field tensor,
                                                   depending on the position in space
$\mu$                                              shear modulus
$\nu$                                              Poisson's ratio
$A$                                                area of the dipolar loop, with $dA = 2h\sqrt{2}\,dl$
$b_i, b_j, b_k$                                    components of the Burgers vector
$\varrho_i, \varrho_j, \varrho_k, \varrho_n$       components of the relative position vector, $\varrho_1 = x$, $\varrho_2 = y$, $\varrho_3 = z$
$\varrho$                                          relative distance from the dipolar loop, $\varrho = \sqrt{\varrho_1^2 + \varrho_2^2 + \varrho_3^2}$
$\nu_i, \nu_j, \nu_k, \nu_n$                       components of the dipolar loop normal vector
$\delta_{ij}$                                      Kronecker symbol
The normal unit vector $\boldsymbol{\nu}$ is chosen to be $\frac{1}{\sqrt{2}}[1, 1, 0]$ for dipolar loops of type
$V_1$ and $I_1$, and $\boldsymbol{\nu} = \frac{1}{\sqrt{2}}[1, -1, 0]$ for dipolar loops of type $V_2$ and $I_2$.

Dipolar Loop and Dislocation Curve Interaction The interaction force
per unit length of dislocation line is given by the Peach-Koehler equation,
which written for the i-th component reads:

$f_i = \varepsilon_{ijk}\, \sigma_{jm}\, b_m\, s_k$, (2)
where we denote:
$f_i$                 i-th component of the interaction force per the unit length of
                      the dislocation line
$\varepsilon_{ijk}$   Levi-Civita symbol
$\sigma_{jm}$         components of the stress field tensor at the dislocation position
$b_m$                 components of the Burgers vector
$s_k$                 components of unit vector $s$ which has the direction of the
                      dislocation line
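
To make the index notation concrete, the following sketch evaluates the Peach-Koehler force in its standard form $f = (\sigma b) \times s$, i.e. $f_i = \varepsilon_{ijk}\sigma_{jm}b_m s_k$ with summation over repeated indices. The sign convention is the usual textbook one and is an assumption here; the function names are hypothetical.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def peach_koehler(sigma, b, s):
    """Force per unit length f_i = eps_ijk * sigma_jm * b_m * s_k."""
    # Contract the stress tensor with the Burgers vector first.
    sigma_b = [sum(sigma[j][m] * b[m] for m in range(3)) for j in range(3)]
    force = [0.0, 0.0, 0.0]
    for i in range(3):
        for j in range(3):
            for k in range(3):
                eps = levi_civita(i, j, k)
                if eps:
                    force[i] += eps * sigma_b[j] * s[k]
    return force
```

For a Burgers vector $[b, 0, 0]$ and a tangent $[s_x, 0, s_z]$ in the xz-plane, only $\sigma_{xy}$ enters the in-plane components: the sketch returns $f_x = \sigma_{xy} b s_z$ and $f_z = -\sigma_{xy} b s_x$.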

Mathematical Model The dynamics of the system of a dislocation curve
and a dipolar loop is governed by a system of two equations describing their
motion. The motion law for the dislocation curve is represented by the
well-known mean curvature flow equation (see e.g. [4, 1, 3])

$B v = \kappa + F$ (3)

where $v$ is the normal velocity of the evolving curve, $\kappa$ its curvature, $B$ (drag
coefficient) is a constant given by the material and $F$ represents the external driving
force.
The moving dislocation curve $\Gamma_t$ can be parameterized by a smooth time
dependent vector function $X : I \times S \to \mathbb{R}^2$, i.e., at any time $t$ it is given as the
$\mathrm{Image}(X(\cdot, t)) = \{X(u, t), u \in S\}$ where $S$ is a fixed parametrization interval
and $I$ is a time interval. For a smooth curve $|X_u| > 0$ and for unit arc-length
parametrization $s$, $ds = |X_u|\,du$. Then $X_s$ and $X_s^\perp$ represent unit tangent and
normal vectors, respectively. Using Frenet's formulae, the evolution equation
(3) can be rewritten to the form of intrinsic diffusion equation [4, 1, 3, 5]
$B X_t = X_{ss} + F X_s^\perp$ (4)
for the position vector $X$. If we denote by $X^x(t, s)$ and $X^z(t, s)$ the components
of the dislocation curve position vector in the xz-plane, then $X_s^\perp = [X_s^z, -X_s^x]$.
The equation (4) will be solved numerically to model a complicated dislocation
curve dynamics.
Now we must explain exactly what is covered by the term $F$ in (4). We
know that $F$ incorporates the interaction between the dislocation curve and
the dipolar loop. To get detailed knowledge of $F$, we must go back to the
Peach-Koehler equation (2). Assuming the dislocation curve can move only
in the xz-plane and the dipolar loop can glide along the x-axis, we need to
put $f_x$ and $f_z$ from the Peach-Koehler equation into the governing equations.
Denoting the Burgers vectors of the dislocation curve and the dipolar loop
$\mathbf{b}_{curve} = [b_{curve}, 0, 0]$ and $\mathbf{b} = [b, 0, 0]$, respectively, we get

(5)
where $\sigma_{xy} = \sigma_{12}$, and $s_x$, $s_z$ are the components of the dislocation curve's
tangential vector, which can be also written as
(6)
Thus, the term $F = \sigma_{ext} b_{curve} + \sigma_{xy} b_{curve}$ covers the stress of the dipolar
loop exerted on the dislocation curve, as well as any external stresses to which
the material may be exposed. To be more precise, in order to obtain the force
vector acting at a given position of the dislocation curve, one needs to multiply
$F$ and the dislocation curve's normal vector $X_s^\perp$ at that position. We also
explicitly write the dependence of $F$ on the dislocation curve $\Gamma_t$ because the
curve position is required for the evaluation of its relative position to the
dipolar loop. The obtained relative position is then used in the evaluation
of $\sigma_{xy}$.
The stress $\sigma_{xy}$ is given by (1), but we can simplify this formula for our
specific situation: the Burgers vector has only one non-zero component and the
dipolar loop is a rectangle which has one of the two possible configurations.
Under the assumption that the dipolar loop parameter $h$ is very small, we can
use Taylor expansion in (1) and integrate it to obtain an algebraic formula for
the stress:

where

With the upper sign in (7) we get the stress formula for the dipolar loops $V_1$
and $I_1$, while with the lower sign we get the formula for $V_2$ and $I_2$ dipolar
loops.
In order to obtain the equation governing the motion of the dipolar loop we
have to sum all the stress contributions of the dislocation curve. It is enough
to consider only the contributions in the x-axis direction because it is the only
direction the dipolar loop is allowed to glide in.
The stress contribution of the dislocation curve can be obtained according
to the action-reaction principle by simply reversing the sign of $f_x$ in (5) and
integrating along the dislocation curve:

$F_x^c = \int_{\Gamma_t} \sigma_{xy}\, b_{curve}\, n_x\, ds$, (9)

where $n_x$ is the x-axis component of the dislocation curve element normal
vector. Note it can be replaced using the derivatives of $X$ with respect to the
parametrization $s$ since it holds $n_x = X_s^z$. Besides $F_x^c$ there is one other kind
of force, the friction force $F_0$, which is a constant given by the material
and which acts against the gliding of the dipolar loop. Putting all the above
information together, we come to the equation governing the gliding of the
dipolar loop:
$\frac{dx}{dt} = \frac{1}{BP} F_{x,total}(\Gamma_t, x(t))$, (10)

where $x(t)$ is the x-axis position of the dipolar loop, $P = 4(l + \sqrt{2}h)$ is the
perimeter of the dipolar loop, and

$F_{x,total}(\Gamma_t, x(t)) = \begin{cases} F_x^c - F_0 & \text{if } F_x^c > F_0, \\ 0 & \text{if } |F_x^c| \le F_0, \\ F_x^c + F_0 & \text{if } F_x^c < -F_0. \end{cases}$ (11)
The complete dislocation dynamics problem for one dislocation curve and one
dipolar loop then follows when we put (4) and (10) together with initial and
boundary conditions.
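
The friction law (11) and the glide equation (10) translate directly into code. The sketch below is illustrative only; the function and parameter names are hypothetical, and the force exerted by the curve is assumed to be supplied by the caller.

```python
import math

def total_glide_force(f_curve, f_friction):
    """Friction-limited force (11): the loop stays pinned while |f_curve| <= f_friction."""
    if f_curve > f_friction:
        return f_curve - f_friction
    if f_curve < -f_friction:
        return f_curve + f_friction
    return 0.0

def loop_velocity(f_curve, f_friction, drag, l, h):
    """Glide velocity dx/dt from (10), with perimeter P = 4 (l + sqrt(2) h)."""
    perimeter = 4.0 * (l + math.sqrt(2.0) * h)
    return total_glide_force(f_curve, f_friction) / (drag * perimeter)
```

The dead zone around zero force is what allows a loop to remain at rest until the dislocation curve comes close enough for its stress field to overcome the lattice friction.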

Numerical Scheme For discretization of the problem described earlier, we
use the flowing finite volume method [5] in space and the method of lines [2]
in time. The discrete solution is represented by a moving polygon given, at any
time $t$, by plane points $X_i$, $i = 0, \ldots, M$. The values $X_0$ and $X_M$ are prescribed
in case of fixed ends of the curve. The segments $[X_{i-1}, X_i]$ are called flowing
finite volumes. We also construct dual volumes $V_i = [X_{i-\frac{1}{2}}, X_i] \cup [X_i, X_{i+\frac{1}{2}}]$,
$i = 1, \ldots, M-1$, where $X_{i-\frac{1}{2}} = \frac{X_{i-1} + X_i}{2}$ (see Fig. 3).

Fig. 3. Piecewise linear approximation of the dislocation curve

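Integrating evolution equation (4) in dual volume $V_i$ we obtain

$\int_{V_i} B \frac{\partial X}{\partial t}\, ds = \int_{V_i} \frac{\partial^2 X}{\partial s^2}\, ds + \int_{V_i} F \left( \frac{\partial X}{\partial s} \right)^{\perp} ds$. (12)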

(13)

where
(14)
and $F_i$ is a constant approximation of $F$ in dual volume $V_i$, $F_i = \sigma_{xy}(R_i)\, b_{curve}$,
where $R_i = X_i - [x(t), y, z]$ is the relative positional vector of $X_i$ and the
dipolar loop center. If we replace the terms on the right-hand side by finite
differences and averaged values, respectively, we end up with the system of
ordinary differential equations

$B \frac{dX_i}{dt} = \frac{2}{d_i + d_{i+1}} \left( \frac{X_{i+1} - X_i}{d_{i+1}} - \frac{X_i - X_{i-1}}{d_i} \right) + \frac{2}{d_i + d_{i+1}}\, F_i\, \frac{X_{i+1}^\perp - X_{i-1}^\perp}{2}$, (15)

$i = 1, \ldots, M-1$.
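
A minimal sketch of the right-hand side of the semi-discrete system (15) follows. It assumes that $d_i = |X_i - X_{i-1}|$ is the length of the i-th flowing finite volume (the defining formula is not reproduced in this text), that each point is stored as an (x, z) pair, and that the perpendicular of $(a, b)$ is $(b, -a)$ as in (4). The names `curve_rhs`, `points`, `forces` and `drag` are hypothetical.

```python
import math

def perp(v):
    """Rotate an (x, z) vector: (a, b) -> (b, -a)."""
    return (v[1], -v[0])

def curve_rhs(points, forces, drag):
    """Velocities dX_i/dt from (15); the end nodes X_0 and X_M stay fixed."""
    m = len(points) - 1
    velocities = [(0.0, 0.0)] * (m + 1)
    for i in range(1, m):
        prev_pt, pt, next_pt = points[i - 1], points[i], points[i + 1]
        d_i = math.dist(pt, prev_pt)       # assumed definition of d_i
        d_next = math.dist(next_pt, pt)    # assumed definition of d_{i+1}
        scale = 2.0 / (d_i + d_next)
        normal = perp((next_pt[0] - prev_pt[0], next_pt[1] - prev_pt[1]))
        velocity = []
        for c in range(2):
            curvature_term = (next_pt[c] - pt[c]) / d_next - (pt[c] - prev_pt[c]) / d_i
            force_term = forces[i] * normal[c] / 2.0
            velocity.append(scale * (curvature_term + force_term) / drag)
        velocities[i] = tuple(velocity)
    return velocities
```

The returned list can be handed to any ODE integrator. On a straight, uniformly spaced polygon the curvature term vanishes and every interior node moves with speed $F_i/B$ along the normal.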

In discretization of the governing equation (9) for the dipolar loop we sum
contributions of every curve segment to obtain
$F_x^c = \sum_{i=0}^{M-1} \sigma_{xy}\left( X^x_{i+\frac{1}{2}} - x(t),\ -y,\ X^z_{i+\frac{1}{2}} - z \right) b_{curve} \left( X^z_{i+1} - X^z_i \right)$, (16)

where $[x(t), y, z]$ is the center of the dipolar loop at time $t$. Next, we use formula
(11) applied to discrete $F_x^c$ defined above and get

(17)

The complete discrete problem consists of (15) and (17) with accompanying
initial and boundary conditions.
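
The segment-wise sum (16) can be sketched as follows. The stress formula for the dipolar loop is not reproduced in this text, so `sigma_xy` is a hypothetical callable supplied by the user that takes the relative position $(r_x, r_y, r_z)$ and returns the stress; the other names are illustrative as well.

```python
def curve_force_on_loop(points, loop_center, b_curve, sigma_xy):
    """Discrete force of the curve on the loop, following the sum (16).

    points      -- polygon nodes as (x, z) pairs lying in the slip plane y = 0
    loop_center -- (x, y, z) position of the dipolar loop center
    sigma_xy    -- hypothetical stress function sigma_xy(rx, ry, rz)
    """
    x_loop, y_loop, z_loop = loop_center
    total = 0.0
    for (x0, z0), (x1, z1) in zip(points, points[1:]):
        # The stress is evaluated at the segment midpoint X_{i+1/2}.
        x_mid = 0.5 * (x0 + x1)
        z_mid = 0.5 * (z0 + z1)
        stress = sigma_xy(x_mid - x_loop, -y_loop, z_mid - z_loop)
        # n_x ds is approximated by the z-increment of the segment.
        total += stress * b_curve * (z1 - z0)
    return total
```

For a constant stress the sum telescopes to the stress times $b_{curve}$ times the total z-extent of the polygon, which gives a quick sanity check of an implementation.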

Results of Numerical Experiments We made several numerical simulations
in which we used different settings. For the basic physical parameters we
used values which were experimentally measured for nickel crystals at room
temperature [10]: average length and width of the dipolar loop $l = 60$ nm,
$h = 4$ nm, Burgers vector $b = 0.26$ nm, shear modulus $\mu = 80$ GPa, and
Poisson's ratio $\nu = 0.33$. When not specified otherwise, we used drag coefficient
$B = 10^{-5}$ Pa s.
In the simulations we were changing not only the type and initial position
of the dipolar loop, but also the initial shape of the dislocation curve and the
value of friction force. We observed the following facts (not all of them can be
demonstrated here):
- For the dislocation curve with fixed ends the curvature acts against the ex-
ternal stress. Therefore, there exists some equilibrium shape the dislocation
curve tends to.
- When no external stress is applied, the dislocation curve of any initial shape
tends to the straight line (potential energy minimization). When a dipolar
loop is added, an oscillating motion (Fig. 4) of the dipolar loop as well as the
dislocation curve can occur.
- The direction in which the dipolar loop leaves the dislocation curve's
interaction region depends on the type of the dipolar loop. Simply, $V_2$ shifts to
the left whereas $V_1$ shifts to the right.
Acknowledgements The first author was partly supported by the NSF grant
NSFOO-138 Award 0113555, the second author was partly supported by the
project of the Grant Agency of the Czech Republic No. 201/01/0676, the
third author was partly supported by the project KONTAKT No.ME654 and
Grant Agency of the Czech Republic No.106/03/0826 and the fourth author
was partly supported by the grant VEGA 1/0313/03.

References
1. Angenent, S.B., Gurtin, M.E. (1989): Multiphase Thermomechanics with an
Interfacial Structure 2. Evolution of an Isothermal Interface. Arch. Rat. Mech.
Anal., 108, 323-391
2. Benes, M., Mikula, K. (1998): Simulation of Anisotropic Motion by Mean Curvature
- Comparison of Phase Field and Sharp Interface Approaches. Acta Math.
Univ. Comenianae, Vol. LXVII, 1, 17-42
3. Dziuk, G. (1994): Convergence of a Semi Discrete Scheme for the Curve
Shortening Flow. Mathematical Models and Methods in Applied Sciences, 4, 589-606
4. Gage, M., Hamilton, R.S. (1986): The Heat Equation Shrinking Convex Plane
Curves. J. Diff. Geom., 23, 69-96

[Figure 4: six panels at time levels T = 15.0200, 25.0220, 35.0280, 45.0480, 55.0680 and 63.0840; vertical axis from 0 to 200.]

Fig. 4. Dipolar loop oscillations. A dipolar loop of type V1 starts to glide to the
left of the dislocation curve (time levels T = 15.02, T = 25.022, T = 35.028). Then it
reverses as the attractive force of the dislocation curve gains control over the
system for some time (T = 45.048, T = 55.068). A second reversal occurs before
T = 63.084.

5. Mikula, K., Sevcovic, D. (2001): Evolution of Plane Curves Driven by a Nonlinear
Function of Curvature and Anisotropy. SIAM J. Appl. Math., Vol. 61, No. 5,
1473-1501
6. Kroupa, F. (1965): Long-Range Elastic Field of Semi-Infinite Dislocation Dipole
and of Dislocation Jog. phys. stat. sol., Vol. 9, 27-32
7. Verecky, S., Kratochvil, J., Kroupa, F. (2002): The Stress Field of Rectangular
Prismatic Dislocation Loops. phys. stat. sol. (a), Vol. 191, 418-426
8. Hirth, J.P., Lothe, J. (1982): Theory of Dislocations. John Wiley
9. Kratochvil, J., Saxlova, M. (1993): Dislocation Pattern Formation and Strain
Hardening in Solids. Physica Scripta, T49, 399-404
10. Tippelt, B., Bretschneider, J., Hähner, P. (1997): The Dislocation Microstructure
of Cyclically Deformed Nickel Single Crystals at Different Temperatures. phys.
stat. sol. (a), 163, 11-26
11. Saxlova, M., Kratochvil, J., Zatloukal, J. (1997): The Model of Formation and
Disintegration of Vein Dislocation Structure. Materials Science and Engineering,
A234-236, 205-208
12. Kratochvil, J. (2001): Self-organization Model of Localization of Cyclic Strain
into PSBs and Formation of Dislocation Wall Structure. Materials Sci. Eng. A,
A309-310, 331-335
13. Kratochvil, J., Kroupa, F., Kubin, L.P. (1999): The Sweeping of Dipolar Loops
in Cyclic Deformation: Kinetic Diagrams. In: Proceedings of the 20th Risoe Int.

[Figure 5: six panels at times T = 0.0000, 0.2000, 0.6005, 1.0025, 1.4045 and 1.8065; vertical axis from -20 to 80.]

Fig. 5. Dipolar loop swept by the curve; on the other hand, the curve is distorted
by the stress field of the loop. The following values were used in this test:
$\mu = 80$ GPa, $\nu = 0.33$, $B = 10^{-4}$ Pa s, $b = 0.707$ nm, $l = 35$ nm,
$h = \sqrt{2}$ nm, Fa = 4 MPa m, applied stress $\sigma_a = -1.155$ MPa. The initial
position of the dipolar loop was [0, 40, -30]. The subsequent
stages are shown for increasing time T.

Symposium on Material Science: Deformation-Induced Microstructures: Analysis
and Relation to Properties. Risoe National Laboratory, Roskilde, Denmark, 387-392
Implicit FEM-FCT algorithm for compressible
flows

Matthias Möller¹, Dmitri Kuzmin and Stefan Turek

Institute of Applied Mathematics (LSIII), University of Dortmund, Vogelpothsweg
87, D-44227 Dortmund, Germany, ¹ matthias. [email protected]

Summary. The flux-corrected transport (FCT) methodology is generalized to implicit
finite element schemes and applied to the Euler equations of gas dynamics.
The underlying low-order scheme is constructed by applying scalar artificial viscosity
proportional to the spectral radius of the cumulative Roe matrix. All conservative
matrix manipulations are performed edge-by-edge, which leads to an efficient
algorithm for the matrix assembly. The outer defect correction loop is equipped with
a block-diagonal preconditioner so as to decouple the discretized Euler equations
and solve all equations individually. As an alternative, a strongly coupled solution
strategy is investigated in the context of stationary problems which call for large
time steps.

1 Introduction

The concepts of flux-corrected transport can be traced back to the celebrated
SHASTA scheme proposed by Boris and Book [1] in the early 1970s. Later,
their algorithm was superseded by Zalesak's multidimensional limiter [11] and
carried over to finite elements by Löhner and his coworkers [7].
In recent publications [3], [4], [5] we presented a generalization of this approach
to implicit finite element discretizations. A notable benefit of the new
FEM-FCT formulation was the representation of anti-/diffusive terms as sums
of skew-symmetric internodal fluxes. Moreover, an iterative limiting strategy
was introduced which prevents the limiter from getting overly diffusive for
large time steps. In this paper, we concentrate on flux correction for the Euler
equations of gas dynamics and discuss various algorithmic aspects pertinent
to the treatment of nonlinear hyperbolic systems.

2 FEM-FCT for scalar equations

As a model problem, consider the generic conservation law
$\frac{\partial u}{\partial t} + \nabla\cdot(\mathbf{v}u) = 0$,
where $\mathbf{v} = \mathbf{v}(\mathbf{x}, t)$ is a nonuniform velocity field. Let us employ the standard
Galerkin FEM for the discretization in space and interpolate the convective
fluxes in much the same way as the sought solution [2]. After mass lumping,
we get an ODE system given by

$$M_L\frac{du}{dt} = Ku \quad\Longleftrightarrow\quad m_i\frac{du_i}{dt} = \sum_{j\ne i} k_{ij}(u_j - u_i) + \delta_i u_i, \qquad \delta_i = \sum_j k_{ij}, \tag{1}$$

where $M_L = \mathrm{diag}\{m_i\}$ denotes the 'lumped' mass matrix and $K = \{k_{ij}\}$
stands for the discrete transport operator. The second expression in (1) corresponds
to a single node $i$, where the first term on the right-hand side is engendered
by the incompressible part of the transport operator. Provided that
$k_{ij} \ge 0$, $\forall j \ne i$, a system of this form is called local extremum diminishing (LED) in
the absence of the term $\delta_i u_i$, which vanishes for divergence-free velocity fields
and is responsible for the physical growth of local extrema otherwise.
One major ingredient of any FCT algorithm is the nonoscillatory positivity-preserving
low-order scheme, which can be constructed by applying artificial
diffusion $D = \{d_{ij}\}$ so as to render all off-diagonal entries of the linear operator
$L = K + D$ nonnegative [3]. Hence, the optimal diffusion coefficients are given
by

$$d_{ij} = \max\{0, -k_{ij}, -k_{ji}\} = d_{ji}, \qquad d_{ii} = -\sum_{j\ne i} d_{ij}. \tag{2}$$
Since discrete diffusion operators have zero row/column sums [3], the diffusive
term can be decomposed into skew-symmetric fluxes $(Du)_i = \sum_{j\ne i} f_{ij}$, where
$f_{ij} = d_{ij}(u_j - u_i)$. Thus, the modifications in (2) are conservative. Starting
with the Galerkin operator $L := K$, the low-order operator can be constructed
by applying artificial diffusion edge-by-edge for each pair of nodes $i$ and $j$
whose basis functions have overlapping support

$$\begin{aligned} l_{ii} &:= l_{ii} - d_{ij}, & l_{ij} &:= l_{ij} + d_{ij},\\ l_{ji} &:= l_{ji} + d_{ij}, & l_{jj} &:= l_{jj} - d_{ij}. \end{aligned} \tag{3}$$
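The edge-by-edge update (3) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `discrete_upwinding`, the dense-matrix storage and the toy 4-node periodic operator are not from the paper, and the coefficient $d_{ij} = \max\{0, -k_{ij}, -k_{ji}\}$ is the standard choice from [3].

```python
import numpy as np

def discrete_upwinding(K):
    """Return (L, D) with L = K + D such that all off-diagonal entries of L are >= 0.

    Hypothetical helper: dense storage is used for brevity; a real code would
    loop over the edges of the sparsity graph only.
    """
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    L = K.copy()
    D = np.zeros_like(K)
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = max(0.0, -K[i, j], -K[j, i])  # assumed optimal coefficient
            if d_ij == 0.0:
                continue
            for mat in (L, D):  # apply update (3) to L and record it in D
                mat[i, i] -= d_ij
                mat[i, j] += d_ij
                mat[j, i] += d_ij
                mat[j, j] -= d_ij
    return L, D

# Toy data: central-difference transport operator for v = 1, h = 1, periodic grid
K = 0.5 * np.array([[0, -1, 0, 1],
                    [1, 0, -1, 0],
                    [0, 1, 0, -1],
                    [-1, 0, 1, 0]], dtype=float)
L, D = discrete_upwinding(K)
print(L)  # first-order upwind operator: du_i/dt = u_{i-1} - u_i
```

For this toy operator the result is the classical upwind difference, which is why the technique is called discrete upwinding.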

The discrete upwinding technique presented above carries over to multidimensions
and yields the least diffusive linear LED scheme. However, linear
monotonicity-preserving methods can be at most first-order accurate. As a consequence,
compensating antidiffusion, which constitutes the difference between
the discretizations of high and low order,

$$F_i = \sum_{j\ne i} f_{ij}, \tag{4}$$

is to be constructed in a nonlinear way so as to remove the excessive artificial
diffusion.
After an implicit time discretization we obtain a nonlinear algebraic system

$0 < \theta \le 1$ (5)

which can be solved via the fixed-point defect correction scheme

$$u^{(m+1)} = u^{(m)} + A^{-1} r^{(m)}, \qquad m = 0, 1, 2, \ldots \tag{6}$$

Here, $A$ is a 'suitable' preconditioner and $r^{(m)} = b^{(m+1)} - Au^{(m)}$ is the defect
vector for the $m$-th iteration cycle. The latter incorporates the constant right-hand
side stemming from the low-order scheme plus compensating antidiffusion

$$b_i^{(m+1)} = b_i^n + \sum_{j\ne i} \alpha_{ij} f_{ij}^{(m)}. \tag{7}$$

Varying the correction factors $\alpha_{ij}$ between zero and unity, one can blend the
high-order method with the concomitant low-order one. The latter should be
used in the vicinity of steep gradients where spurious oscillations are likely
to arise. The construction of the solution-dependent correction factors and
of the fully discretized antidiffusive fluxes $f_{ij}^{(m)}$ is elucidated in [5]. As an
alternative to (7), an iterative limiting strategy was proposed for implicit FEM-FCT
schemes operated at large time steps. Roughly speaking, the amount
of previously accepted antidiffusion is taken into account so that only the
rejected portion of the antidiffusive flux needs to be limited at subsequent
defect correction steps.
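The outer iteration described above can be illustrated by a generic defect correction loop. The sketch below is not the authors' implementation: the function `defect_correction`, the callable `residual` and the toy matrices `A` and `N` are hypothetical, and the flux limiter is replaced by a fixed small perturbation so that the example stays self-contained.

```python
import numpy as np

def defect_correction(residual, A, u0, tol=1e-12, max_iter=100):
    """Iterate u <- u + A^{-1} r(u) until the defect r(u) is small.

    residual : callable returning the defect vector for the current iterate
    A        : preconditioner (here a small dense matrix, assumed nonsingular)
    """
    u = np.array(u0, dtype=float)
    for m in range(max_iter):
        r = residual(u)
        if np.linalg.norm(r) < tol:
            return u, m
        u = u + np.linalg.solve(A, r)  # 'inversion' of the preconditioner
    return u, max_iter

# Toy problem: solve (A + N) u = b using only solves with A
A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
N = 0.1 * np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 1.0], [0.0, -1.0, 0.0]])
b = np.array([1.0, 2.0, 3.0])
u, iters = defect_correction(lambda w: b - (A + N) @ w, A, np.zeros(3))
print(iters, u)
```

The loop converges here because the preconditioner is close to the full operator; in the FEM-FCT setting the defect additionally depends on the limited antidiffusive fluxes.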

3 Euler equations

Compressible flows are governed by the Euler equations which represent a sys-
tem of conservation laws for the mass, momentum and energy of an inviscid
fluid. These hyperbolic PDEs are typically written in divergence form

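$$\frac{\partial U}{\partial t} + \nabla\cdot\mathbf{F} = 0, \quad\text{where}\quad \nabla\cdot\mathbf{F} = \sum_{d=1}^{3} \frac{\partial F^d}{\partial x_d}. \tag{8}$$

The vector of conservative variables $U$ and the triple of fluxes $\mathbf{F} = (F^1, F^2, F^3)$
for each direction of the Cartesian coordinate system are defined as follows

$$U = \begin{bmatrix} \rho \\ \rho\mathbf{v} \\ \rho E \end{bmatrix}, \qquad \mathbf{F} = \begin{bmatrix} \rho\mathbf{v} \\ \rho\mathbf{v}\otimes\mathbf{v} + pI \\ \rho H\mathbf{v} \end{bmatrix}. \tag{9}$$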

Here, $\rho$, $\mathbf{v}$, $p$, $E$ and $H = E + p/\rho$ stand for the density, velocity, pressure,
total energy per unit mass and stagnation enthalpy, respectively. This system
is completed by an equation of state $p = (\gamma - 1)\rho(E - |\mathbf{v}|^2/2)$, where $\gamma = c_p/c_v$
denotes the ratio of specific heats for a polytropic gas ($\gamma = 1.4$ for air).
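As a small worked example of these definitions, the following sketch recovers the primitive quantities from the conservative variables and evaluates one flux component. The function names and the storage layout $U = (\rho, \rho v_1, \rho v_2, \rho v_3, \rho E)$ are assumptions made for illustration.

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heats for air

def primitive_from_conservative(U, gamma=GAMMA):
    """Return (rho, v, p, E, H) for U = (rho, rho*v1, rho*v2, rho*v3, rho*E)."""
    U = np.asarray(U, dtype=float)
    rho = U[0]
    v = U[1:4] / rho
    E = U[4] / rho
    p = (gamma - 1.0) * rho * (E - 0.5 * np.dot(v, v))  # equation of state
    H = E + p / rho                                      # stagnation enthalpy
    return rho, v, p, E, H

def euler_flux(U, d, gamma=GAMMA):
    """Flux F^d in coordinate direction d (0, 1 or 2)."""
    rho, v, p, E, H = primitive_from_conservative(U, gamma)
    flux = np.empty(5)
    flux[0] = rho * v[d]
    flux[1:4] = rho * v[d] * v
    flux[1 + d] += p
    flux[4] = rho * H * v[d]
    return flux

# State with rho = 1, v = (1, 0, 0), p = 1, hence E = p/((gamma-1) rho) + |v|^2/2 = 3
U = np.array([1.0, 1.0, 0.0, 0.0, 3.0])
rho, v, p, E, H = primitive_from_conservative(U)
F1 = euler_flux(U, 0)
print(p, H, F1)
```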
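By application of the chain rule, the Euler equations can be written in
an equivalent quasi-linear formulation in terms of the Jacobian matrices
$\mathbf{A} = (A^1, A^2, A^3)$

$$\frac{\partial U}{\partial t} + \mathbf{A}\cdot\nabla U = 0, \quad\text{where}\quad \mathbf{A}\cdot\nabla U = \sum_{d=1}^{3} A^d \frac{\partial U}{\partial x_d}, \quad A^d = \frac{\partial F^d}{\partial U}. \tag{10}$$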
4 Galerkin matrix assembly


In what follows, an efficient edge-based assembly technique for the standard
Galerkin discretization of the Euler equations is presented. Let us start with
the divergence form (8) and interpolate the fluxes using the group finite element
formulation [2], which yields an ODE system similar to (1)

$$M_C \frac{dU}{dt} = KU. \tag{11}$$

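Here, $M_C$ denotes the block-diagonal consistent mass matrix for the coupled
system and $K$ is a discrete counterpart of the operator $-\mathbf{A}\cdot\nabla$ for the quasi-linear
formulation (10). Let the entries of the mass matrix and the vector of
coefficients coming from the discretization of space derivatives be defined as
follows

$$m_{ij} = \int_\Omega \varphi_i \varphi_j \, d\mathbf{x}, \qquad \mathbf{c}_{ij} = \int_\Omega \varphi_i \nabla\varphi_j \, d\mathbf{x}. \tag{12}$$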

As long as the mesh is fixed, the coefficients $\mathbf{c}_{ij}$ remain constant and thus the
operator $K$ can be assembled efficiently without resorting to a costly numerical
integration.
Recall that basis functions sum to unity, so that the sum of their derivatives
vanishes. Hence, the coefficients $\mathbf{c}_{ij}$ satisfy $\mathbf{c}_{ii} = -\sum_{j\ne i} \mathbf{c}_{ij}$ and the right-hand
side of the five coupled equations for node $i$ is given by

$$(KU)_i = -\sum_j \mathbf{c}_{ij}\cdot\mathbf{F}_j = -\sum_{j\ne i} \mathbf{c}_{ij}\cdot(\mathbf{F}_j - \mathbf{F}_i). \tag{13}$$

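In his pioneering work on approximate Riemann solvers [9], Roe showed that
the differences between the components of $\mathbf{F}$ and $U$ are related by
$\mathbf{F}_j - \mathbf{F}_i = \hat{\mathbf{A}}_{ij}(U_j - U_i)$, where the triple of matrices
$\hat{\mathbf{A}}_{ij} = (\hat A^1_{ij}, \hat A^2_{ij}, \hat A^3_{ij})$ corresponds
to the Jacobian tensor $\mathbf{A}$ evaluated for the special set of density-averaged
variables

$$\hat\rho_{ij} = \sqrt{\rho_i\rho_j}, \qquad \hat{\mathbf{v}}_{ij} = \frac{\sqrt{\rho_i}\,\mathbf{v}_i + \sqrt{\rho_j}\,\mathbf{v}_j}{\sqrt{\rho_i} + \sqrt{\rho_j}}, \qquad \hat H_{ij} = \frac{\sqrt{\rho_i}\,H_i + \sqrt{\rho_j}\,H_j}{\sqrt{\rho_i} + \sqrt{\rho_j}}. \tag{14}$$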
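Roe's relation can be checked numerically. The sketch below restricts itself to the one-dimensional Euler equations with $U = (\rho, \rho u, \rho E)$, which is a simplification of the three-dimensional setting of the paper; the function names and the two sample states are hypothetical. The Jacobian is written in terms of the velocity and the stagnation enthalpy, so that it can be evaluated at the density-averaged state.

```python
import numpy as np

GAMMA = 1.4

def flux_1d(U):
    """1D Euler flux for U = (rho, rho*u, rho*E)."""
    rho, m, Et = U
    u = m / rho
    p = (GAMMA - 1.0) * (Et - 0.5 * m * u)
    return np.array([m, m * u + p, (Et + p) * u])

def roe_matrix_1d(Ui, Uj):
    """Flux Jacobian evaluated at the density-averaged velocity and enthalpy."""
    def sqrt_rho_u_H(U):
        rho, m, Et = U
        u = m / rho
        p = (GAMMA - 1.0) * (Et - 0.5 * m * u)
        return np.sqrt(rho), u, (Et + p) / rho
    si, ui, Hi = sqrt_rho_u_H(Ui)
    sj, uj, Hj = sqrt_rho_u_H(Uj)
    u = (si * ui + sj * uj) / (si + sj)
    H = (si * Hi + sj * Hj) / (si + sj)
    g1 = GAMMA - 1.0
    return np.array([
        [0.0, 1.0, 0.0],
        [0.5 * (GAMMA - 3.0) * u * u, (3.0 - GAMMA) * u, g1],
        [u * (0.5 * g1 * u * u - H), H - g1 * u * u, GAMMA * u],
    ])

Ui = np.array([1.0, 0.5, 2.5])      # sample left state (assumed)
Uj = np.array([0.125, -0.1, 0.3])   # sample right state (assumed)
lhs = flux_1d(Uj) - flux_1d(Ui)
rhs = roe_matrix_1d(Ui, Uj) @ (Uj - Ui)
print(np.max(np.abs(lhs - rhs)))
```

The flux difference is reproduced up to rounding errors, which is the property that makes the conservative form (13) expressible through matrix-vector products.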

This enables us to express the nodal value $(KU)_i$ in terms of the conservative
variables

$$(KU)_i = -\sum_{j\ne i} \mathbf{c}_{ij}\cdot\hat{\mathbf{A}}_{ij}(U_j - U_i). \tag{15}$$

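The dot product can be interpreted as a 'projection' of the triple $\hat{\mathbf{A}}_{ij}$ onto the
numerical edge $ij$. For our purposes, it is expedient to introduce the splitting
$\mathbf{c}_{ij}\cdot\hat{\mathbf{A}}_{ij} = -(A_{ij} + B_{ij})$, where the two components of the cumulative Roe
matrix are defined by [5]

$$A_{ij} = \mathbf{a}_{ij}\cdot\hat{\mathbf{A}}_{ij}, \qquad \mathbf{a}_{ij} = -\frac{\mathbf{c}_{ij} - \mathbf{c}_{ji}}{2}, \tag{16}$$

$$B_{ij} = \mathbf{b}_{ij}\cdot\hat{\mathbf{A}}_{ij}, \qquad \mathbf{b}_{ij} = -\frac{\mathbf{c}_{ij} + \mathbf{c}_{ji}}{2}. \tag{17}$$

A similar decomposition can be performed for the contribution of the edge $ij$
to $(KU)_j$

$$-\mathbf{c}_{ji}\cdot\hat{\mathbf{A}}_{ij}(U_i - U_j) = (A_{ij} - B_{ij})(U_j - U_i), \quad\text{where}\quad \mathbf{c}_{ji}\cdot\hat{\mathbf{A}}_{ij} = A_{ij} - B_{ij}. \tag{18}$$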

Integration by parts reveals that in the interior of the domain $\mathbf{a}_{ij} = -\mathbf{c}_{ij}$ while
$\mathbf{b}_{ij} = 0$ [5], so that only the skew-symmetric part $A_{ij}$ needs to be evaluated
for interior edges. The symmetric part $B_{ij}$ only applies to the cumulative Roe
matrices for boundary edges. According to (15)-(18), the contribution of the
edge $ij$ to the term $KU$ reads

$$(A_{ij} + B_{ij})(U_j - U_i) \longrightarrow (KU)_i, \tag{19}$$

$$(A_{ij} - B_{ij})(U_j - U_i) \longrightarrow (KU)_j. \tag{20}$$

Together with the fact that the coefficients Cij remain constant and thus can
be assembled and stored once and for all during the initialization process,
(19)-(20) suggest an efficient edge-based algorithm for the matrix assembly.
The underlying data structure can be generated from the sparsity pattern of
the finite element matrix and contains entries for all pairs of nodes whose
basis functions have overlapping supports [5]. In contrast to the scalar case
this connectivity exists not only between basis functions for different nodes
but also between those for different variables. Hence, each coefficient of the
discrete operator is given by a square matrix of dimension equal to the number
of variables.
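It can be readily inferred from (19)-(20) that the contribution of the numerical
edge $ij$ to the global matrix $K \in \mathbb{R}^{5N\times 5N}$ is given by

$$K_{ii} = -(A_{ij} + B_{ij}), \quad K_{ij} = A_{ij} + B_{ij}, \quad K_{ji} = -(A_{ij} - B_{ij}), \quad K_{jj} = A_{ij} - B_{ij}. \tag{21}$$

These local Jacobians are evaluated edge-by-edge and their entries $K^{kl}_{ij}$, $k, l =
1, \ldots, 5$ are scattered to the corresponding positions in the 25 blocks
$K_{kl} \in \mathbb{R}^{N\times N}$ [5].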
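The scatter operation implied by (19)-(20) can be sketched as follows. Everything here is illustrative: the function names, the dense global matrix, the node-major ordering of the unknowns (the paper stores 25 blocks of size $N \times N$ instead) and the random edge matrices standing in for the cumulative Roe matrices are assumptions.

```python
import numpy as np

def assemble_edge_based(n_nodes, edges, A_blocks, B_blocks):
    """Assemble the global operator from per-edge blocks A_ij and B_ij."""
    nvar = A_blocks[0].shape[0]
    K = np.zeros((n_nodes * nvar, n_nodes * nvar))
    for (i, j), A, B in zip(edges, A_blocks, B_blocks):
        si = slice(i * nvar, (i + 1) * nvar)
        sj = slice(j * nvar, (j + 1) * nvar)
        K[si, si] -= A + B   # (A + B)(U_j - U_i) goes to row block i
        K[si, sj] += A + B
        K[sj, si] -= A - B   # (A - B)(U_j - U_i) goes to row block j
        K[sj, sj] += A - B
    return K

def apply_edge_based(edges, A_blocks, B_blocks, U):
    """Matrix-free evaluation of KU; U has shape (n_nodes, nvar)."""
    KU = np.zeros_like(U)
    for (i, j), A, B in zip(edges, A_blocks, B_blocks):
        dU = U[j] - U[i]
        KU[i] += (A + B) @ dU
        KU[j] += (A - B) @ dU
    return KU

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (0, 2)]          # toy connectivity
A_blocks = [rng.standard_normal((5, 5)) for _ in edges]
B_blocks = [rng.standard_normal((5, 5)) for _ in edges]
U = rng.standard_normal((3, 5))
K = assemble_edge_based(3, edges, A_blocks, B_blocks)
KU_matrix = K @ U.ravel()
KU_edges = apply_edge_based(edges, A_blocks, B_blocks, U).ravel()
print(np.max(np.abs(KU_matrix - KU_edges)))
```

Both routes give the same product, so a code may either store the blocks or recompute the edge contributions on the fly.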

5 Artificial viscosity

To a large extent, the ability of a FEM-FCT algorithm to withstand the formation
of wiggles depends on the quality of the underlying low-order method. For
scalar transport equations we derived the least diffusive positivity-preserving
scheme by elimination of negative off-diagonal entries from the discrete transport
operator.
In [5] the LED principle was generalized to hyperbolic systems by rendering
all off-diagonal matrix blocks positive semi-definite. If we perform mass
lumping and replace the high-order operator $K$ in (11) by the low-order one,
we obtain the ODE system

$$M_L \frac{dU}{dt} = LU. \tag{22}$$

The low-order operator $L$ is constructed in much the same way as (21) by
applying tensorial artificial viscosity $D_{ij} \in \mathbb{R}^{5\times 5}$ to the Roe matrices. The
global matrix assembly can be adopted from the previous section. The missing
symmetric boundary part $B_{ij}$ of the cumulative Roe matrix is incorporated
into the raw antidiffusive fluxes

$$F_{ij} = -\left(M_{ij}\frac{d}{dt} + D_{ij} - B_{ij}\right)(U_j - U_i), \tag{23}$$

where $M_{ij} = m_{ij} I$ denotes the local diagonal mass matrix.


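The hyperbolicity of the Euler equations implies that any linear combination
of the three Jacobian matrices is diagonalizable with real eigenvalues,
such that the cumulative Jacobian matrix admits the following factorization

$$A_{ij} = |\mathbf{a}_{ij}|\, R_{ij} \Lambda_{ij} R_{ij}^{-1}. \tag{24}$$

Let us 'project' the density-averaged velocity $\hat{\mathbf{v}}_{ij}$ onto the numerical edge $ij$
and define the local speed of sound as follows

$$v_{ij} = \frac{\mathbf{a}_{ij}\cdot\hat{\mathbf{v}}_{ij}}{|\mathbf{a}_{ij}|}, \qquad c_{ij} = \sqrt{(\gamma - 1)\left(\hat H_{ij} - \frac{|\hat{\mathbf{v}}_{ij}|^2}{2}\right)}. \tag{25}$$

Here, $|\mathbf{a}_{ij}|$ denotes the Euclidean norm of the coefficient vector $\mathbf{a}_{ij}$. As a consequence,
the diagonal matrix of eigenvalues can be readily computed as

$$\Lambda_{ij} = \mathrm{diag}\{v_{ij} - c_{ij},\; v_{ij},\; v_{ij},\; v_{ij},\; v_{ij} + c_{ij}\}. \tag{26}$$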
In [5] we gave a detailed description of how to derive a generalization of Roe's
approximate Riemann solver from (24) by elimination of negative eigenvalues.
A much cheaper alternative is to add scalar dissipation proportional to
the spectral radius of the Roe matrix, $d_{ij} = |\mathbf{a}_{ij}|(|v_{ij}| + c_{ij})$ [5]. The resulting
artificial viscosity operator $D_{ij} = d_{ij} I$, which in fact is the same for all
components, needs to be applied only to the five diagonal blocks of the finite
element matrix. Numerical examples demonstrate that in the framework of
flux correction the final solution even benefits from this slightly overdiffusive
low-order scheme because of an improvement in the phase accuracy [4], so that
the application of a costly Riemann solver does not pay off.
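The scalar dissipation coefficient is cheap to evaluate once the density-averaged state of an edge is known. The sketch below assumes that $v_{ij}$ is the density-averaged velocity projected onto the direction of $\mathbf{a}_{ij}$ and that the speed of sound follows from the averaged enthalpy; the function name and the sample numbers are hypothetical.

```python
import numpy as np

def scalar_dissipation(a_ij, v_hat, H_hat, gamma=1.4):
    """Return d_ij = |a_ij| (|v_ij| + c_ij) for one edge.

    a_ij  : coefficient vector of the edge
    v_hat : density-averaged velocity vector
    H_hat : density-averaged stagnation enthalpy
    """
    a_ij = np.asarray(a_ij, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    a_norm = np.linalg.norm(a_ij)
    v_ij = np.dot(a_ij, v_hat) / a_norm                  # projected velocity
    c_ij = np.sqrt((gamma - 1.0) * (H_hat - 0.5 * np.dot(v_hat, v_hat)))
    return a_norm * (abs(v_ij) + c_ij)

# Sample edge: |a| = 5, projected velocity 0.6, speed of sound 1
d = scalar_dissipation([3.0, 4.0, 0.0], [1.0, 0.0, 0.0], 3.0)
print(d)  # approximately 8.0
```

Since $D_{ij} = d_{ij} I$, this single number is added to the diagonal of each of the five diagonal blocks for the edge.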

6 Defect correction
After an implicit time discretization we obtain a nonlinear algebraic system
similar to (5) which can also be solved by the defect correction scheme

$$u^{(m+1)} = u^{(m)} + [A^{(m)}]^{-1} R^{(m)}, \qquad m = 0, 1, 2, \ldots \tag{27}$$

In a practical implementation the 'inversion' of $A$ is performed by applying
some inner iteration to solve the linear subproblem for the solution increment
(an improvement of the residual by 1-2 digits suffices) and update the last
iterate thereafter

$$A^{(m)} \Delta u^{(m)} = R^{(m)}, \qquad u^{(m+1)} = u^{(m)} + \Delta u^{(m)}. \tag{28}$$

The matrix on the left-hand side of this linear system can be replaced by
a (block-diagonal) preconditioner $C^{(m)}$, so as to decouple the discretized Euler
equations [5]. As an alternative we can apply the (preconditioned) BiCGSTAB
method directly to the coupled system (28).
Let us split the low-order operator into its diagonal, subdiagonal and superdiagonal
parts, $A^{(m)} = W^{(m)} + J^{(m)} + E^{(m)}$. In [5], a block-Jacobi preconditioner
$C^{(m)} = J^{(m)}$ was suggested for the defect correction scheme, so that
only the five diagonal blocks need to be assembled and stored

$$J^{(m)} = \mathrm{diag}\{J_k^{(m)}\} \quad\text{and}\quad J_k^{(m)} = M_L - \theta\Delta t\, L_{kk}^{(m)}. \tag{29}$$
As a consequence, the linear system (28) resolves into a sequence of scalar
subproblems which can be solved separately or at best in parallel. For this
purpose an iterative method, e.g. (preconditioned) BiCGSTAB or geometric
multigrid, is applied to

$$J_k^{(m)} \Delta u_k^{(m)} = r_k^{(m)}, \qquad k = 1, \ldots, 5, \tag{30}$$

$$u_k^{(m+1)} = u_k^{(m)} + \Delta u_k^{(m)}, \qquad u_k^{(0)} = u_k^n.$$
However, this segregated solution approach is unsuitable for larger time steps,
since severe convergence problems of the outer iteration can be observed.
To exploit the full potential of the iterative limiter and the unconditional
positivity of fully implicit time-stepping, we have been testing some coupled
solution strategies for (28) by means of a preconditioned BiCGSTAB method.
In this case, additional blocks of the low-order operator may need to be assembled
and stored, or there should be another (direct) way to carry out the
matrix-vector multiplications for updating the residual.
Let the preconditioner for the BiCGSTAB solver be given by $W^{(m)} + J^{(m)}$,
where

$$W_{ki}^{(m)} = M_L - \theta\Delta t\, L_{ki}^{(m)}, \ \forall i < k, \quad\text{and}\quad W_{ki}^{(m)} = 0, \ \forall i \ge k, \tag{31}$$

$$E_{ki}^{(m)} = M_L - \theta\Delta t\, L_{ki}^{(m)}, \ \forall i > k, \quad\text{and}\quad E_{ki}^{(m)} = 0, \ \forall i \le k. \tag{32}$$

This corresponds to the block-Gauss-Seidel scheme

$$J^{(m)} \Delta u^{(m+1)} = R^{(m)} - W^{(m)} \Delta u^{(m+1)} - E^{(m)} \Delta u^{(m)}. \tag{33}$$

Swapping the sub- and superdiagonal block-matrices in the equation above
gives another variation of this algorithm. The alternating application of both
subdiagonal and superdiagonal block-matrices results in a symmetric block-Gauss-Seidel
approach. Recall that $J^{(m)}$ in equation (33) is a block-diagonal
matrix for which each block corresponds to a scalar problem similar to (30).
They can be solved in much the same way as for the segregated solution
approach. The design of 'optimal' preconditioners is a nontrivial task which
constitutes an important field for future research.
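A block-Gauss-Seidel sweep of the form (33) can be sketched as below. The example is a toy under stated assumptions: the function name is hypothetical, the unknowns are ordered variable by variable (5 blocks of size N), the diagonal blocks are solved directly instead of by BiCGSTAB or multigrid, and a random symmetric, strictly diagonally dominant matrix stands in for the low-order operator.

```python
import numpy as np

def block_gauss_seidel(A, r, nvar, sweeps=50):
    """Approximately solve A du = r by sweeps over the nvar diagonal blocks."""
    n = A.shape[0] // nvar
    du = np.zeros(A.shape[0])

    def blk(k):
        return slice(k * n, (k + 1) * n)

    for _ in range(sweeps):
        for k in range(nvar):
            rhs = r[blk(k)].copy()
            for l in range(nvar):
                if l != k:
                    # l < k uses already updated values, l > k the old ones
                    rhs -= A[blk(k), blk(l)] @ du[blk(l)]
            du[blk(k)] = np.linalg.solve(A[blk(k), blk(k)], rhs)
    return du

rng = np.random.default_rng(0)
nvar, N = 5, 4
R = rng.random((nvar * N, nvar * N))
A = 0.1 * (R + R.T) + 10.0 * np.eye(nvar * N)  # toy operator, diagonally dominant
r = rng.random(nvar * N)
du = block_gauss_seidel(A, r, nvar)
print(np.linalg.norm(A @ du - r))
```

Used as a preconditioner, one would typically apply a single sweep per Krylov iteration rather than iterate to convergence as done here.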

7 Numerical examples
Let us illustrate the potential of the implicit FEM-FCT algorithm by considering
a steady two-dimensional supersonic flow over a wedge. Here, the
free-stream Mach number is $M = 2.5$ and the deflection angle is $\theta = 15°$. The
results presented at the top of Figure 1 are computed on a mesh of $128 \times 128$
bilinear elements by the low-order scheme (left) and the implicit FEM-FCT
algorithm (right), respectively. It can be readily seen that the shock wave is
unacceptably smeared by the low-order method. Nevertheless, both the upstream
and downstream Mach numbers are predicted correctly. The iterative
flux limiter resolves the shock very precisely within as few as 3-4 elements.
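For reference, the exact shock angle and downstream Mach number of this test case follow from the classical oblique shock relations for a perfect gas. The sketch below is not part of the paper: it solves the $\theta$-$\beta$-$M$ relation for the weak shock by bisection, and the function names as well as the upper bracket of 60 degrees (adequate for $M = 2.5$) are assumptions.

```python
import math

def deflection(beta, M, gamma=1.4):
    """Flow deflection angle produced by an oblique shock of angle beta."""
    num = 2.0 / math.tan(beta) * (M * M * math.sin(beta) ** 2 - 1.0)
    den = M * M * (gamma + math.cos(2.0 * beta)) + 2.0
    return math.atan(num / den)

def weak_shock_angle(M, theta, gamma=1.4):
    """Bisection between the Mach angle and 60 degrees (assumed bracket)."""
    lo, hi = math.asin(1.0 / M), math.radians(60.0)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if deflection(mid, M, gamma) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def downstream_mach(M, beta, theta, gamma=1.4):
    """Mach number behind the shock from the normal shock relation."""
    mn1 = M * math.sin(beta)
    mn2 = math.sqrt((1.0 + 0.5 * (gamma - 1.0) * mn1 ** 2)
                    / (gamma * mn1 ** 2 - 0.5 * (gamma - 1.0)))
    return mn2 / math.sin(beta - theta)

theta = math.radians(15.0)
beta = weak_shock_angle(2.5, theta)
M2 = downstream_mach(2.5, beta, theta)
print(math.degrees(beta), M2)
```

For $M = 2.5$ and $\theta = 15°$ this gives a shock angle of roughly 36.9 degrees and a downstream Mach number of roughly 1.87, against which a computed solution can be compared.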

Fig. 1. Compression corner. Oblique shock at $M_1 = 2.5$, $\theta = 15°$.

To demonstrate the potential ability of our discrete FEM-FCT approach
to deal with unstructured meshes, we constructed an adaptive coarse grid with
the grid points clustered in the vicinity of the shock wave, see Figure 1 (bottom
left). After two steps of global refinement this gives the computational mesh
of approximately 10,000 vertices. The resulting numerical solution (bottom
right) exhibits superb accuracy and remains absolutely free of oscillations.
Numerical results for a variety of standard gas dynamic test cases encom-
passing both transient and stationary flows are presented in [5]. Moreover, an
in-depth investigation of scalar problems can be found in the same publication.

8 Conclusions
To our knowledge, most of the finite element schemes for solving the Euler
equations on unstructured grids are explicit and, consequently, subject to
a restrictive CFL condition. In this paper, an implicit high-resolution finite
element scheme for hyperbolic systems was presented making use of the flux-
corrected transport paradigm. The underlying low-order operator was con-
structed by applying scalar artificial viscosity proportional to the spectral ra-
dius of the cumulative Roe matrix for each edge of the sparsity graph. An
efficient edge-based approach to matrix assembly was proposed. The design
of suitable preconditioners for both segregated and coupled solution proce-
dures was addressed. The performance of the new FEM-FCT algorithm was
illustrated for a steady supersonic flow without and with adaptive mesh re-
finement. The development of robust and efficient iterative solvers for implicit
FEM-FCT schemes including FAS-FMG multigrid [6], [8] and an analog of the
local MPSC smoother for the incompressible Navier-Stokes equations [10] will
be addressed in forthcoming publications.

References
1. J. P. Boris and D. L. Book, Flux-corrected transport. I. SHASTA, A fluid trans-
port algorithm that works. J. Comput. Phys. 11 (1973) 38-69.
2. C. A. J. Fletcher, The group finite element formulation. Comput. Methods Appl.
Mech. Engrg. 37 (1983) 225-243.
3. D. Kuzmin and S. Turek, Flux correction tools for finite elements. J. Comput.
Phys. 175 (2002) 525-558.
4. D. Kuzmin, M. Möller and S. Turek, Multidimensional FEM-FCT schemes for
arbitrary time-stepping. Int. J. Numer. Meth. Fluids 42 (2003) 265-295.
5. D. Kuzmin, M. Möller and S. Turek, High-resolution FEM-FCT schemes for
multidimensional conservation laws. Technical report No. 231, University of
Dortmund, 2003, Submitted to: Comput. Methods Appl. Mech. Engrg.
6. P. W. Hemker and B. Koren, Defect correction and nonlinear multigrid for steady
Euler equations. In: W.G. Habashi and M.M. Hafez (ed.). Computational fluid
dynamics techniques. London: Gordon and Breach Publishers, 1995, 699-718.
7. R. Löhner, K. Morgan, J. Peraire and M. Vahdati, Finite element flux-corrected
transport (FEM-FCT) for the Euler and Navier-Stokes equations. Int. J. Numer.
Meth. Fluids 7 (1987) 1093-1109.
8. J.F. Lynn, Multigrid Solution of the Euler Equations with Local Preconditioning.
PhD thesis, University of Michigan, 1995.
9. P. L. Roe, Approximate Riemann solvers, parameter vectors and difference
schemes. J. Comput. Phys. 43 (1981) 357-372.
10. S. Turek, Efficient Solvers for Incompressible Flow Problems: An Algorithmic
and Computational Approach, LNCSE 2, Springer, 1999.
11. S. T. Zalesak, Fully multidimensional flux-corrected transport algorithms for flu-
ids. J. Comput. Phys. 31 (1979) 335-362.
A Singular Limit Method for the Stefan
Problems

Hideki Murakawa¹ and Tatsuyuki Nakaki²

¹ Graduate School of Mathematics, Kyushu University, Japan
[email protected]
² Faculty of Mathematics, Kyushu University, Japan [email protected]

Summary. This paper proposes an approximation scheme to a classical one-phase
Stefan problem. The scheme is constructed to represent a singular limit of a certain
reaction-diffusion system approximating the Stefan problem. To this end, a
time-discrete operator-splitting methodology is used. Numerical experiments demonstrate
that the scheme would be useful in practical computations.

1 Introduction

A classical one-phase Stefan problem is one of the mathematical models which
describe the melting of a body of ice maintained at temperature 0 °C. One of
the attractive and important subjects of research on this problem is the analysis
of the interface between ice and water. The interface is a hypersurface, and the
topological structure of the ice and water regions may change. This fact induces
numerical difficulties, and many numerical schemes have been proposed to track the
interfaces [1, 5, 8, 12, 13].
The aim of this paper is to propose an approximation scheme to the following
one-phase Stefan problem, which is formulated by Eymard et al. [2, 4, 6]:

$$\text{(SP)}\quad \begin{cases} w_t = \Delta(w^+) & \text{in } Q_T,\\ w^+ = A & \text{on } \partial\Omega\times(0,T),\\ w(x,0) = w_0(x) & \text{for } x\in\Omega, \end{cases}$$

where $\Omega$ is a bounded domain in $\mathbb{R}^d$ ($d\in\mathbb{N}$) with smooth boundary $\partial\Omega$,
$T$ is a positive number, $Q_T = \Omega\times(0,T]$, $A$ and $w_0$ are given functions and
$a^\pm = \max\{\pm a, 0\}$. The temperature of the water is indicated by $u := w^+$, and
$v := w^-$ is the function such that the support of $v$ coincides with the mushy
region, which has been studied by Bertsch et al. [2]. The ice-water interface
is given by $\Gamma(t) := \partial\Omega^+(t)\cap\Omega$, where $\Omega^+(t) = \{x\in\Omega \mid u(x,t) > 0\}$.
The idea of our scheme is to apply a time-discrete operator-splitting methodology to
a reaction-diffusion system given by Eymard et al. We also introduce
a singular limit solution to construct an approximation scheme to Problem
(SP). The advantages of our numerical method based on this approximation scheme
are:

1. It has low computational cost,
2. It may be used on arbitrary geometries of $\Omega$,
3. There are no artificial parameters,
4. We can track the interface even in high-dimensional problems,
5. Topological changes and complicated interfacial shapes can be handled
easily.
In this paper, the convergence of the scheme is shown by numerical experiments
in the one-dimensional case. Quite recently, we proposed a similar approximation
scheme for moving boundary problems (including classical Stefan
problems) and proved the convergence of the scheme [10]. We remark that
our approximation scheme or numerical method is similar to the diffusion-generated
approach for the mean curvature flow [3, 9] or the threshold competition
dynamics method [7].
At the end of this section, we explain the initial function $w_0$ in detail. The
classical one-phase Stefan problem is often described by

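$$\begin{cases} u_t = \Delta u & \text{in } \bigcup_{0<t\le T} \Omega(t)\times\{t\},\\[2pt] \lambda V_\nu = -\dfrac{\partial u}{\partial\nu} & \text{on } \bigcup_{0<t\le T} \Gamma(t)\times\{t\},\\[2pt] u = 0 & \text{on } \bigcup_{0<t\le T} \Gamma(t)\times\{t\},\\ u = A & \text{on } \partial\Omega\times(0,T],\\ u(x,0) = u_0(x) & \text{for } x\in\Omega_0,\\ \Omega(0) = \Omega_0, \end{cases} \tag{1}$$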

where $\Omega(t)\subset\Omega$ is an unknown domain which describes the water region,
$\Gamma(t) = \partial\Omega(t)\cap\Omega$, $\lambda$ is the latent heat, $V_\nu$ is the normal speed of the interface
$\Gamma(t)$, $\nu$ stands for the exterior unit normal vector to $\Gamma(t)$, $u_0$ is a positive
function representing the initial heat distribution, and $\Omega_0$ is the initial water region.
We note that Problem (SP) is equivalent to Problem (1) if the interface $\Gamma(t)$
is a smooth surface such that $\Gamma(t)\subset\Omega$ for all $t\in(0,T]$ and varies smoothly
with $t$, the function $w^+(t)$ is smooth up to $\Gamma(t)$ for $t\in(0,T)$, and

$$w_0(x) = \begin{cases} u_0(x) & \text{if } x\in\Omega_0,\\ -\lambda & \text{otherwise} \end{cases} \tag{2}$$

(see [4]). Our numerical simulations are made by using (2).

2 Our scheme

First of all, we introduce our approximation scheme to the solution of Problem
(SP).
Approximation Scheme 1. Let $N$ be a positive integer and $0 = t_0 < t_1 <
\cdots < t_m < \cdots < t_N = T$. An approximation $w^m$ to $w(t_m)$ is defined by

$$w^m := K(t_m)\circ K(t_{m-1})\circ\cdots\circ K(t_1)\,w_0,$$

where $K(t)$ is an operator with domain $L^\infty(\Omega)$ defined by

$$K(t)z := \mathcal{H}_m(t)z^+ - z^- \quad\text{for } t_m < t\le t_{m+1}\ (m = 0,1,2,\ldots,N-1),$$

and $\mathcal{H}_m(t)z$ denotes the solution of the following heat equation

$$\begin{cases} u_t = \Delta u & \text{in } \Omega\times(t_m, t_{m+1}],\\ u = A & \text{on } \partial\Omega\times(t_m, t_{m+1}],\\ u(x,t_m) = z(x) & \text{for } x\in\Omega. \end{cases} \tag{3}$$
Most of the computational cost of Approximation Scheme 1 is spent on the
heat equation (3), and the first advantage '1. Our numerical method has low
computational cost' stated in Section 1 follows. If we use a finite element
method or a finite volume method to solve (3), one can say '2. Our numerical
method may be used on arbitrary geometries of $\Omega$'. Furthermore, we have '3.
There are no artificial parameters', as shown above.
In the following we explain the basic idea of our scheme (see also [11]).
Eymard et al. [4] introduce the following reaction-diffusion system.

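$$(\mathrm{RD}_k)\quad \begin{cases} U_t = \Delta U - kUV & \text{in } Q_T,\\ V_t = -kUV & \text{in } Q_T,\\ U = A & \text{on } \partial\Omega\times(0,T],\\ U(x,0) = u_0(x) := w_0^+(x) & \text{for } x\in\Omega,\\ V(x,0) = v_0(x) := w_0^-(x) & \text{for } x\in\Omega, \end{cases}$$

where $k$ is a positive parameter. They show the relationship between Problems
(RD$_k$) and (SP) in the following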

Proposition 1 (Eymard, Hilhorst, van der Hout and Peletier [4]). Let
us assume

$$\begin{cases} A\in W_2^{2,1}(Q_T)\cap C(Q_T),\\ u_0(x) = A(x,0) & \text{for } x\in\Omega,\\ 0\le v_0\le K & \text{in } \Omega, \end{cases}$$

for some positive constant $K$. Then (RD$_k$) has a unique weak solution
$(U^{(k)}, V^{(k)})\in W_2^{2,1}(Q_T)\times C^{0,1}([0,T]; L^\infty(\Omega))$ for every $k > 0$, and

$$U^{(k)}\to w^+, \qquad V^{(k)}\to w^- \qquad\text{as } k\to\infty,$$

strongly in $L^2(Q_T)$, where $w$ is a weak solution of Problem (SP).

To construct an approximate solution of Problem (RD$_k$) for $k = \infty$ (singular
limit solution), we apply a standard operator-splitting methodology to
Problem (RD$_k$), that is, we split Problem (RD$_k$) into diffusion and reaction
parts as follows:
We construct families of functions $\{U^m\}_{m=0}^N$ and $\{V^m\}_{m=0}^N$ by defining
Step 1: Let $U^0(x) = u_0(x)$ and $V^0(x) = v_0(x)$.
Step 2: For $m = 0,1,\ldots,N-1$,
Step 2-1 (diffusion part): Let $\tilde U^m(\cdot,t_{m+1}) = \mathcal{H}_m(t_{m+1})U^m$.
Step 2-2 (reaction part): Solve the following initial value problem:

$$\begin{cases} \hat U^m_t = -k\hat U^m\hat V^m & \text{in } \Omega\times(t_m,t_{m+1}],\\ \hat V^m_t = -k\hat U^m\hat V^m & \text{in } \Omega\times(t_m,t_{m+1}],\\ \hat U^m(x,t_m) = \tilde U^m(x,t_{m+1}) & \text{for } x\in\Omega,\\ \hat V^m(x,t_m) = V^m(x) & \text{for } x\in\Omega. \end{cases} \tag{4}$$

Step 2-3: Put

$$\begin{cases} U^{m+1}(x) = \hat U^m(x,t_{m+1}),\\ V^{m+1}(x) = \hat V^m(x,t_{m+1}). \end{cases}$$
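We are now in a position to construct an approximation scheme to Problem
(SP). Let $\theta = k(t - t_m)$. Then (4) becomes

$$\begin{cases} \hat U^m_\theta = -\hat U^m\hat V^m & \text{in } \Omega\times(0, k(t_{m+1}-t_m)],\\ \hat V^m_\theta = -\hat U^m\hat V^m & \text{in } \Omega\times(0, k(t_{m+1}-t_m)]. \end{cases}$$

Passing to the limit as $k\to\infty$, that is $\theta\to\infty$, we obtain

$$U^{m+1} = \big(\tilde U^m(\cdot,t_{m+1}) - V^m\big)^+, \qquad V^{m+1} = \big(\tilde U^m(\cdot,t_{m+1}) - V^m\big)^-.$$

Here we use the facts that $(\hat U^m - \hat V^m)_\theta = 0$, $\tilde U^m(x,t_{m+1})\ge 0$ and $V^m(x)\ge 0$.
From the above formal calculations, we obtain Approximation Scheme 1.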
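The formal limit used above can be checked pointwise on the reaction ODE. The following sketch integrates $U_\theta = -UV$, $V_\theta = -UV$ with a classical Runge-Kutta scheme and compares the result for large $\theta$ with the positive and negative parts of $U(0) - V(0)$. The function names, the step size and the sample initial values are assumptions chosen for illustration.

```python
import numpy as np

def reaction_limit(U0, V0):
    """Singular limit of the reaction step: positive and negative parts of U0 - V0."""
    d = U0 - V0
    return max(d, 0.0), max(-d, 0.0)

def integrate_reaction(U0, V0, theta_end=60.0, h=0.01):
    """Integrate U' = -UV, V' = -UV up to theta_end with classical RK4."""
    def rhs(y):
        uv = y[0] * y[1]
        return np.array([-uv, -uv])
    y = np.array([U0, V0], dtype=float)
    for _ in range(int(round(theta_end / h))):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y = y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y

water = integrate_reaction(1.5, 1.0)  # more heat than latent heat: ice melts
ice = integrate_reaction(0.3, 1.0)    # not enough heat: ice remains
print(water, reaction_limit(1.5, 1.0))
print(ice, reaction_limit(0.3, 1.0))
```

The difference $U - V$ is conserved while the product $UV$ decays to zero, which is exactly the thresholding step of Approximation Scheme 1.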

3 Numerical results

In this section, some numerical results obtained by our scheme are shown. We
deal with a one-dimensional case with a known exact solution and a three-dimensional
case. In the former, we show the accuracy of our numerical interfaces
by comparing them with the exact ones. In the latter, it is demonstrated that the
moving boundaries can be tracked even in high-dimensional problems.

3.1 One-dimensional results

Let $s(t)$ be the position of the interface at time $t$. We deal with a simple
one-dimensional problem with $s(0) = 0$ and $u(0,t) = C$ for $t\ge 0$, where $C$ is
a positive constant (see [14]). In this case, it is already known that

$$s(t) = 2\alpha\sqrt{t}, \tag{5}$$

where $\alpha$ is a constant satisfying

$$2\alpha e^{\alpha^2}\int_0^\alpha e^{-z^2}\,dz = \frac{C}{\lambda}. \tag{6}$$

Let us compare our numerical interfaces with (5) when $\alpha = \lambda = 1$, $\Omega = (0,2)$
and $T = 1$. We compute a numerical approximation to the integral in (6) using
Simpson's rule and employ 4.0601569 as an approximation to $C$. Fixed
uniform grids in space and time are adopted, that is, the spatial mesh size
is $\delta x = 2/M$ and the time mesh size is $\delta t = 1/N$, where $M$ is a positive
integer. Let $u_i^n$ and $v_i^n$ ($0\le i\le M$ and $0\le n\le N$) be the approximations
to $u(i\,\delta x, n\,\delta t)$ and $v(i\,\delta x, n\,\delta t)$, respectively, which implies that $w_i^n := u_i^n - v_i^n$
is the approximation to the weak solution $w(i\,\delta x, n\,\delta t)$ of Problem (SP). The
initial data are given by
$u_0^0 = C$, $v_0^0 = 0$,
$u_i^0 = 0$, $v_i^0 = 1$ $(0 < i\le M)$.
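The value of $C$ quoted above can be reproduced with a few lines of code. The sketch evaluates the left-hand side of (6) with composite Simpson's rule for $\alpha = \lambda = 1$; the function names and the number of subintervals are assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def boundary_value(alpha, lam):
    """Boundary temperature C from relation (6) for given alpha and latent heat."""
    integral = simpson(lambda z: math.exp(-z * z), 0.0, alpha)
    return 2.0 * alpha * lam * math.exp(alpha ** 2) * integral

C = boundary_value(1.0, 1.0)
print(C)  # approximately 4.0601569
```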
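We employ the implicit finite difference technique to obtain the numerical
solution of the heat equation. Then our numerical method reduces to iterating the
following two steps.
Step 1: Define $\tilde u_i$ by

$$\frac{\tilde u_i - u_i^n}{\delta t} = \frac{\tilde u_{i+1} - 2\tilde u_i + \tilde u_{i-1}}{\delta x^2} \qquad (0 < i\le M),$$

Step 2: Compute $u_i^{n+1}$ and $v_i^{n+1}$ $(0\le i\le M)$ by

$$u_i^{n+1} = (\tilde u_i - v_i^n)^+, \qquad v_i^{n+1} = (\tilde u_i - v_i^n)^-.$$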

[Figures 1 and 2: interface position curves for the exact solution and for the meshes
$\Delta x = 10^{-2}$, $\Delta t = 10^{-3}$; $\Delta x = 10^{-2}/2$, $\Delta t = 10^{-3}/4$;
$\Delta x = 10^{-2}/4$, $\Delta t = 10^{-3}/16$; $\Delta x = 10^{-2}/8$, $\Delta t = 10^{-3}/64$.]

Fig. 1. Numerical interfaces for some meshes

Fig. 2. Close-up of the numerical interfaces in Fig. 1

We show the exact and numerical interfaces in Fig. 1, where we employ
the isosurface of level $-\lambda/2$ of $\{w_i^n\}_{0 \le i \le M,\ 0 \le n \le N}$ as the numerical interface.
656 H. Murakawa, T. Nakaki

Fig. 2 is a close-up of Fig. 1. One can say that the numerical interface converges
to the exact one.
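
The value of C quoted in this subsection can be checked with a few lines of Python. The following is a minimal sketch, assuming that relation (6) reads $2\alpha e^{\alpha^2}\int_0^\alpha e^{-z^2}dz = C/\lambda$; the function names and the number of Simpson panels are illustrative choices and are not taken from the paper.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # interior nodes alternate between weight 4 (odd k) and 2 (even k)
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def boundary_value(alpha, lam, n=1000):
    """Boundary value C for which s(t) = 2*alpha*sqrt(t), using the assumed form of (6)."""
    integral = simpson(lambda z: math.exp(-z * z), 0.0, alpha, n)
    return lam * 2.0 * alpha * math.exp(alpha ** 2) * integral

def exact_interface(alpha, t):
    """Exact interface position (5)."""
    return 2.0 * alpha * math.sqrt(t)

C = boundary_value(alpha=1.0, lam=1.0)
print(f"C = {C:.7f}")
print(f"s(1) = {exact_interface(1.0, 1.0)}")
```

With $\alpha = \lambda = 1$ the exact interface reaches x = 2 at t = 1, which is why the domain (0, 2) and T = 1 fit together in this test.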

3.2 Three-dimensional simulation

Our numerical method can easily be adapted to multi-dimensional problems. In
this subsection we compute a three-dimensional case with the computational
domain $\Omega = (0,1)^3$. Fig. 3 shows our numerical simulation, which demon-
strates how heat fluxes melt ice. In this simulation, we employ the explicit
finite difference technique to obtain the numerical solution of the heat equa-
tion because of our computational environment. The spatial mesh sizes are
$\delta x = \delta y = \delta z = 1/100$ and the time step is $\delta t = 3 \times 10^{-6}$. The boundary
condition is $u|_{\partial\Omega} = 1$. The initial data are shown at the top left of Fig. 3,
where $(u_0(x, y, z), v_0(x, y, z)) = (1, 0)$ holds in the liquid phase (black regions)
and $(u_0(x, y, z), v_0(x, y, z)) = (0, 1)$ holds in the solid phase (white regions).
We observe that an interface with complex behavior can be computed,
that is, one can say '4. We can track the interface even in high-dimensional
problems' and '5. Topological changes and complicated interfacial shapes can
be handled easily'.

References
1. Beckett, G., Mackenzie, J. A., Robertson, M. L. (2001): A moving mesh finite
element method for the solution of two-dimensional Stefan Problems. J. Comp.
Phys., 168, 500-518
2. Bertsch, M., de Mottoni, P., Peletier, L.A. (1984): Degenerate diffusion and the
Stefan problem. Nonlinear Analysis TMA, 8, No.11, 1311-1336
3. Evans, L.C. (1993): Convergence of an algorithm for mean curvature motion.
Indiana Univ. Math. J., 42, No.2, 533-550
4. Eymard, R., Hilhorst, D., van der Hout, R., Peletier, L.A. (2000): A reaction-
diffusion system approximation of a one-phase Stefan problem. In: Menaldi, J.L.,
Rofman, E., Sulem, A. (ed) Optimal control and partial differential equations.
156-170
5. Grossmann, C., Noack, A. (2002): Smoothing and Rothe's method for Stefan
problems in enthalpy form. J. Comp. Appl. Math., 138, 347-366
6. Hilhorst, D., van der Hout, R., Peletier, L.A. (1996): The fast reaction limit for
a reaction-diffusion system. J. Math. Anal. Appl., 199, 349-373
7. Ikota, R., Mimura, M., Nakaki, T.: A methodology for numerical simulations to
a singular limit. preprint
8. Kawarada, H. (1989): Free boundary problem: Theory and numerical method
(in Japanese). University of Tokyo Press
9. Merriman, B., Bence, J.K., Osher, S.J. (1994): Motion of multiple junctions: A
level set approach. J. Comp. Phys., 112, 334-363
10. Murakawa, H. (2004): A numerical method for Stefan type moving boundary
problems (in Japanese). MA Thesis, Kyushu University, Japan
A Singular Limit Method for the Stefan Problems 657

Fig. 3. Initial data and numerical solutions in a three-dimensional case. Time runs
from left to right and from top to bottom.

11. Murakawa, H., Nakaki, T. (2003): A singular limit approach to moving boundary
problems and its applications. Theoretical and Applied Mechanics Japan, 52,
255-260
12. Nogi, T. (1974): A difference scheme for solving the Stefan problem. Publ. Res.
Inst. Math. Sci. Kyoto Univ. series A, 9, 543-505
13. Verdi, C., (1994): Numerical aspects of parabolic free boundary and hysteresis
problems. Lecture Notes in Mathematics, 1584, 213-284
14. Yamaguchi, M., Nogi, T. (1977): Stefan problem (in Japanese). Sangyotosho
Higher-Order Split-Step Schemes for the
Generalized Nonlinear Schrödinger Equation

Gulcin M. Muslu 1 and Husnu A. Erbay 2

1 Istanbul Technical University, Department of Mathematics, Maslak 34469,
Istanbul, Turkey [email protected]
2 Istanbul Technical University, Department of Mathematics, Maslak 34469,
Istanbul, Turkey [email protected]

Summary. The generalized nonlinear Schrödinger (GNLS) equation is solved numerically
by a split-step Fourier method. The first, second and fourth-order versions
of the method are presented. A classical problem concerning the motion of a sin-
gle solitary wave is used to compare the first, second and fourth-order schemes in
terms of the accuracy and the computational cost. This numerical experiment shows
that the split-step Fourier method provides highly accurate solutions for the GNLS
equation. Furthermore, two test problems concerning the interaction of two solitary
waves and an exact solution which blows up in finite time are investigated by using
the fourth-order split-step scheme and particular attention is paid to the conserved
quantities as an indicator of the accuracy.

1 Introduction
The generalized nonlinear Schrödinger (GNLS) equation is a nonlinear partial
differential equation given by

$$i w_t + w_{xx} + q_1 |w|^2 w + q_2 |w|^4 w + i q_3 (|w|^2)_x w + i q_4 |w|^2 w_x = 0, \qquad (1)$$

where $i = \sqrt{-1}$, w is a complex valued function of the spatial coordinate x
and the time t, the parameters $q_1$, $q_2$, $q_3$ and $q_4$ are real constants and the
subscripts t and x denote differentiation with respect to time and space, re-
spectively. Compared to the usual nonlinear Schrödinger equation with a cu-
bic nonlinearity, the GNLS equation possesses both cubic and quintic nonlin-
earities and nonlinear terms that contain derivatives. It has been derived as
a model equation governing the modulation of a quasi-monochromatic wave
train in a weakly nonlinear, dispersive medium.
Some properties of the GNLS equation are summarized here. Assume that
w and all its derivatives converge to zero sufficiently rapidly as $x \to \pm\infty$.
Solutions of the GNLS equation subject to these boundary conditions are
known to satisfy some conservation laws [1,2]. According to these conservation
laws, the conserved quantities

(2)

where the symbol * denotes complex conjugation, remain constant in time.
Note that $I_1$ represents the theoretical $L_2$ norm of the system. Although the
GNLS equation is generally known as a nonintegrable equation in the sense of
the inverse scattering method, certain cases of the GNLS equation are com-
pletely integrable and such equations possess soliton solutions and an infinite
number of conservation laws. For certain values of the coefficients and certain
initial conditions, solutions of the general equation (1) experience finite-time
blow up [1]. Only a few analytical solutions corresponding to some special
cases of the GNLS equation are available [1, 2]. Therefore, numerical studies
are essential to develop an understanding of the phenomena related to the
GNLS equation.
One of the numerical methods employed for nonlinear dispersive wave equa-
tions is the split-step method proposed by Tappert [3]. The basic idea in the
split-step method is to decompose the original problem into subproblems which
are simpler than the original problem and then to compose the approximate
solution of the original problem by using the exact or approximate solutions
of the subproblems in a given sequential order. For nonlinear dispersive wave
equations which are derived by balancing the effects of dispersion and non-
linearity, such as the GNLS equation that we will be solving, an appropriate
approach is to split the original problem into linear and nonlinear subproblems
which take into account purely dispersive and purely nonlinear effects, respec-
tively. While various numerical methods have been employed for the numerical
solutions of the cubic NLS equation in which the split-step method profits from
the existence of a simple analytical solution for the nonlinear subproblem, less
attention has been paid to the numerical solution of the GNLS equation of
which the cubic NLS equation is a special case. A first-order split-step method
was suggested by Pathria and Morris for the GNLS equation in [2].
The main purpose of this study is to introduce higher-order split-step
Fourier schemes for the GNLS equation and to compare these schemes from
a computational efficiency viewpoint. To this end, the initial and boundary-
value problem is decomposed into linear and nonlinear subproblems. A Fourier
method is employed for the spatial discretizations of both linear and nonlinear
subproblems. While the linear subproblem is treated exactly, a fourth-order
Runge-Kutta scheme is used for the time integration of the nonlinear sub-
problem. Three different numerical schemes which are basically the first-order,
second-order and fourth-order versions of the present split-step Fourier method
are proposed. For an application of the present split-step Fourier method to
the complex modified Korteweg-de Vries equation, we refer the reader to [4].

2 The Numerical Method


2.1 Review of the split-step method

It is best to present the split-step method as applied to a general evolution
equation in the form
$$w_t = (L + N)w, \qquad (5)$$
where L and N are linear and nonlinear operators, respectively, and L and N
do not commute with each other. For instance, we have

$$Lw = i w_{xx}, \qquad Nw = i\left(q_1 |w|^2 + q_2 |w|^4\right) w - q_3 (|w|^2)_x w - q_4 |w|^2 w_x$$

for the GNLS equation. If, for the moment, L and N are assumed to be t
independent, a formally exact solution of equation (5) is given by

$$w(x, t + \Delta t) = \exp[\Delta t (L + N)]\, w(x, t), \qquad (6)$$

where $\Delta t$ is the time step between the initial and final times. The linear equa-
tion $w_t = Lw$ and the nonlinear equation $w_t = Nw$ have known exact solutions

$$w(x, t + \Delta t) = \exp(\Delta t L)\, w(x, t) \qquad (7)$$

and
$$w(x, t + \Delta t) = \exp(\Delta t N)\, w(x, t), \qquad (8)$$
respectively. The main idea in the split-step method is to approximate the
exact solution of equation (5) by solving the purely linear and purely nonlinear
equations in a given sequential order, in which the solution of one subproblem
is employed as an initial condition for the next subproblem. This may be
realized by replacing the exponential operator $\exp[\Delta t (L + N)]$ in equation (6)
by a solution operator $\varphi_n(\Delta t)$ which includes an appropriate combination of
products of the exponential operators $\exp(\Delta t L)$ and $\exp(\Delta t N)$. This produces
a splitting error due to the noncommutativity of L and N, and at this stage the
celebrated Baker-Campbell-Hausdorff (BCH) formula is very useful for reducing
the splitting error noticeably. In what follows, we study the first-, second-
and fourth-order versions of the method. According to the BCH formula, the
first-order approximation of the exponential operator in equation (6) is given
by
$$\varphi_1(\Delta t) = \exp(\Delta t L) \exp(\Delta t N). \qquad (9)$$
Therefore, for the first-order version of the split-step method, the advancement
in time is carried out in two steps. In the first step, a so-called intermediate
solution is computed by advancing the solution according to the purely non-
linear equation. In the second step, the solution is advanced according to the
linear dispersive equation in which the intermediate solution is used as an
initial condition. In the second-order version of the method, the exponential
operator in equation (6) is approximated by

which is symmetric, that is, $\varphi_2(\Delta t)\varphi_2(-\Delta t) = 1$. A fourth-order splitting is
given in the form

$$\varphi_4(\Delta t) = \varphi_2(\omega \Delta t)\, \varphi_2[(1 - 2\omega)\Delta t]\, \varphi_2(\omega \Delta t), \qquad (11)$$

where $\omega = (2 + 2^{1/3} + 2^{-1/3})/3$. Note that the number of products of expo-
nential operators increases with the order of decay of the splitting error.
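
The orders of the three splittings can be observed on a toy problem. The sketch below is an illustration under stated assumptions, not the authors' code: L and N are replaced by two non-commuting 2x2 skew-Hermitian matrices (both linear), the second-order operator is taken as the symmetric product exp(Δt N/2) exp(Δt L) exp(Δt N/2), which is one common choice that may differ from the paper's formula (10), and the fourth-order operator is assumed to be the composition of three second-order steps with the weight ω given above. All names are hypothetical.

```python
import numpy as np

def expm_herm(H, t):
    """Return exp(i*t*H) for a Hermitian matrix H via its eigendecomposition."""
    lam, V = np.linalg.eigh(H)
    return (V * np.exp(1j * t * lam)) @ V.conj().T

# Two non-commuting Hermitian matrices; i*A and i*B stand in for L and N.
A = np.array([[1.0, 0.0], [0.0, -1.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

def phi1(dt):
    """First-order splitting, as in (9)."""
    return expm_herm(A, dt) @ expm_herm(B, dt)

def phi2(dt):
    """Symmetric second-order splitting (assumed ordering of the factors)."""
    return expm_herm(B, dt / 2) @ expm_herm(A, dt) @ expm_herm(B, dt / 2)

OMEGA = (2 + 2 ** (1 / 3) + 2 ** (-1 / 3)) / 3

def phi4(dt):
    """Fourth-order composition of three second-order steps."""
    return phi2(OMEGA * dt) @ phi2((1 - 2 * OMEGA) * dt) @ phi2(OMEGA * dt)

def global_error(phi, dt, T=1.0):
    """Spectral-norm error of the split propagator at time T."""
    steps = int(round(T / dt))
    approx = np.linalg.matrix_power(phi(dt), steps)
    return np.linalg.norm(approx - expm_herm(A + B, T), 2)

def observed_order(phi, dt=1 / 64):
    """Estimate the convergence order by halving the time step."""
    return np.log2(global_error(phi, dt) / global_error(phi, dt / 2))

orders = {name: observed_order(phi)
          for name, phi in (("SS1", phi1), ("SS2", phi2), ("SS4", phi4))}
for name, p in orders.items():
    print(f"{name}: observed order {p:.2f}")
```

The symmetry property quoted above can be checked in the same setting: `phi2(dt) @ phi2(-dt)` is the identity up to rounding.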

2.2 Space discretization

Application of the numerical method requires truncation of the infinite interval
to a finite interval [a, b]. We assume that w(x, t) satisfies the periodic boundary
condition $w(a, t) = w(b, t)$ for $t \in [0, T]$. If the spatial period is, for conve-
nience, normalized to $[0, 2\pi]$ using the transformation $X = 2\pi(x - a)/(b - a)$,
the GNLS equation becomes

where
(13)

The interval $[0, 2\pi]$ is divided into N equal subintervals with grid spacing
$\Delta X = 2\pi/N$ where the integer N is even. The spatial grid points are given
by $X_j = 2\pi j/N$, $j = 0, 1, 2, \ldots, N$. The approximate solution to $w(X_j, t)$ is
denoted by $W_j(t)$. The discrete Fourier transform of the sequence $\{W_j\}$ is
defined as

$$\hat{W}_k = \mathcal{F}_k[W_j] = \frac{1}{N} \sum_{j=0}^{N-1} W_j \exp(-ikX_j), \qquad -\frac{N}{2} \le k \le \frac{N}{2} - 1. \qquad (14)$$

The inversion formula for the discrete Fourier transform (14) is

$$W_j = \mathcal{F}_j^{-1}[\hat{W}_k] = \sum_{k=-N/2}^{N/2-1} \hat{W}_k \exp(ikX_j), \qquad j = 0, 1, 2, \ldots, N-1. \qquad (15)$$

Here $\mathcal{F}$ denotes the discrete Fourier transform and $\mathcal{F}^{-1}$ its inverse. These
transforms can be realized efficiently via a fast Fourier transform (FFT) algo-
rithm. For the FFT algorithm used here, the integer N must have only prime
factors 2 and 3. In both the linear and nonlinear subproblems, we approximate
spatial derivatives using discrete Fourier transforms.
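
As an illustration of how a spatial derivative is approximated with the transform pair (14)-(15), the following sketch differentiates a smooth periodic function on $[0, 2\pi)$. It uses NumPy's FFT, which places the factor 1/N in the inverse transform rather than in the forward one as in (14); the round trip is unaffected. The helper name and the test function are illustrative assumptions.

```python
import numpy as np

def spectral_derivative(W, period=2 * np.pi, order=1):
    """Differentiate periodic samples W by multiplying by (ik)^order in Fourier space."""
    N = W.size
    # Integer wavenumbers 0, 1, ..., N/2-1, -N/2, ..., -1, rescaled to the period.
    k = np.fft.fftfreq(N, d=1.0 / N) * (2 * np.pi / period)
    factor = (1j * k) ** order
    if order % 2 == 1 and N % 2 == 0:
        # The unmatched mode k = -N/2 carries no derivative information
        # for odd orders, so it is dropped.
        factor[N // 2] = 0.0
    return np.fft.ifft(factor * np.fft.fft(W))

N = 96                      # 96 = 2^5 * 3, only prime factors 2 and 3
X = 2 * np.pi * np.arange(N) / N
W = np.exp(np.sin(X))       # smooth 2*pi-periodic test function
dW = spectral_derivative(W)
err = np.max(np.abs(dW - np.cos(X) * W))
print(f"max error of the spectral derivative: {err:.2e}")
```

For smooth periodic data the error decays faster than any power of 1/N, which is the reason a moderate N suffices in the experiments.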

2.3 Time integration

We consider a split-step method for the GNLS equation, in which the linear
equation
$$w_t - i p\, w_{XX} = 0 \qquad (16)$$
and the nonlinear equation

are solved in a given sequential order corresponding to one of the splitting for-
mulas (9)-(11). The linear equation (16) can be solved by means of the discrete
Fourier transform and the advancements in time are performed according to

$$W_j^{m+1} = \mathcal{F}_j^{-1}\left[\exp(-i p k^2 \Delta t)\, \mathcal{F}_k[W_j^m]\right]. \qquad (18)$$

Here $\Delta t$ is the time step and $W_j^m$ denotes the approximation to $w(X_j, m\Delta t)$. The
spatial discretization of the nonlinear equation (17) by a Fourier pseudospectral
method can be written as

$$\frac{dW_j}{dt} = i\left(q_1 |W_j|^2 W_j + q_2 |W_j|^4 W_j\right) - \mathcal{F}_j^{-1}\left[ik q_3 \mathcal{F}_k[|W_j|^2]\right] W_j - \mathcal{F}_j^{-1}\left[ik q_4 \mathcal{F}_k[W_j]\right] |W_j|^2, \qquad j = 0, 1, 2, \ldots, N-1. \qquad (19)$$

For the time integration of this equation, instead of using an approximate ana-
lytical technique [2], we adopt a different approach and employ a fourth-
order Runge-Kutta method. The total error involved in integrating from
time t to time $t + \Delta t$ is then the sum of the splitting error and the temporal
discretization error of the nonlinear equation (17).
The first-order split-step Fourier method for the GNLS equation can be
summarized as follows: Given the data $W_j$ at any time step $t = t_m$, first
advance the solution according to the nonlinear part, namely solve equation
(19) using the fourth-order Runge-Kutta method for time integration. This
becomes the initial data for the linear problem which is solved by the discrete
Fourier transform as indicated by equation (18). The extension of the first-
order split-step scheme based on equation (9) to the second-order and fourth-
order split-step schemes based on equations (10) and (11), respectively, is
straightforward.
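
The first-order procedure summarized above can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the authors' program: it works directly in the physical coordinate x (so the wavenumbers are scaled by 2π/(b − a) and the coefficient of the second derivative is taken as 1 instead of p), it assumes the nonlinear right-hand side has the form of (19), and it is tested on the cubic case $q_1 = 2$, $q_2 = q_3 = q_4 = 0$, for which $w = \mathrm{sech}(x)\,e^{it}$ is a known exact solution. All function names are hypothetical.

```python
import numpy as np

def wavenumbers(N, a, b):
    """Angular wavenumbers of an N-point periodic grid on [a, b)."""
    return 2 * np.pi * np.fft.fftfreq(N, d=(b - a) / N)

def nonlinear_rhs(W, k, q):
    """Right-hand side of the nonlinear subproblem, in the assumed form of (19)."""
    q1, q2, q3, q4 = q
    rho = np.abs(W) ** 2
    drho = np.fft.ifft(1j * k * np.fft.fft(rho))   # (|w|^2)_x
    dW = np.fft.ifft(1j * k * np.fft.fft(W))       # w_x
    return 1j * (q1 * rho + q2 * rho ** 2) * W - q3 * drho * W - q4 * dW * rho

def nonlinear_step(W, dt, k, q):
    """One classical fourth-order Runge-Kutta step for the nonlinear subproblem."""
    k1 = nonlinear_rhs(W, k, q)
    k2 = nonlinear_rhs(W + 0.5 * dt * k1, k, q)
    k3 = nonlinear_rhs(W + 0.5 * dt * k2, k, q)
    k4 = nonlinear_rhs(W + dt * k3, k, q)
    return W + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def linear_step(W, dt, k):
    """Exact solution of w_t = i w_xx over dt, applied in Fourier space."""
    return np.fft.ifft(np.exp(-1j * k ** 2 * dt) * np.fft.fft(W))

def first_order_step(W, dt, k, q):
    """Nonlinear step followed by linear step, as in the splitting (9)."""
    return linear_step(nonlinear_step(W, dt, k, q), dt, k)

# Test case (an assumption for illustration): cubic NLS soliton sech(x) exp(it).
a, b, N, dt, steps = -20.0, 20.0, 256, 1e-3, 500
x = a + (b - a) * np.arange(N) / N
k = wavenumbers(N, a, b)
q = (2.0, 0.0, 0.0, 0.0)
W = (1.0 / np.cosh(x)).astype(complex)
dx = (b - a) / N
mass0 = dx * np.sum(np.abs(W) ** 2)
for _ in range(steps):
    W = first_order_step(W, dt, k, q)
t_final = steps * dt
max_err = np.max(np.abs(W - np.exp(1j * t_final) / np.cosh(x)))
mass_err = abs(dx * np.sum(np.abs(W) ** 2) - mass0) / mass0
print(f"max error at t = {t_final}: {max_err:.2e}, relative mass error: {mass_err:.2e}")
```

Because the linear step is exact and unitary, the only time-stepping restriction comes from the Runge-Kutta integration of the nonlinear part; higher-order versions reuse `linear_step` and `nonlinear_step` with the sub-step lengths dictated by the chosen splitting formula.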

3 Numerical Experiments

To gain insight into the performance of the suggested split-step schemes we per-
form the following three numerical experiments. The conservation properties
of the split-step schemes are examined by calculating discrete analogues of the
conserved quantities $I_1$, $I_2$ and $I_3$. The relative errors in discrete approxima-
tions to the conservation integrals (2), (3) and (4) are denoted by $\delta_1$, $\delta_2$ and $\delta_3$,
respectively, and they are defined by $\delta_1 = |I_1 - I_{10}|/|I_{10}|$, $\delta_2 = |I_2 - I_{20}|/|I_{20}|$
and $\delta_3 = |I_3 - I_{30}|/|I_{30}|$, where $I_1, I_2, I_3$ and $I_{10}, I_{20}, I_{30}$ represent the calcu-
lated values of the conserved quantities $I_1, I_2, I_3$ at times t and t = 0, respec-
tively.

3.1 Solitary wave solution


The purpose of the present numerical experiment is to verify numerically that
the proposed split-step schemes exhibit the expected first-order, second-order
and fourth-order convergence in time. The GNLS equation has a travelling
solitary wave solution [1, 2, 5], which has the form

$$w(x, t) = \left[\frac{4}{4 + 3\sinh^2(x - 2t - 15)}\right]^{1/2} \exp[i\phi(x, t)], \qquad (20)$$

$$\phi(x, t) = 2\tanh^{-1}\left[\tfrac{1}{2}\tanh(x - 2t - 15)\right] + x - 15 \qquad (21)$$

for the choice of coefficients $q_1 = 1/2$, $q_2 = -7/4$, $q_3 = -1$, $q_4 = -2$. This
solution represents a solitary wave initially at x = 15 moving to the right with
velocity 2. The problem is first solved on the space interval $5 \le x \le 35$, as
in [1, 2], for times up to t = 3. We present in Figure 1(a) the $L_\infty$-errors of
the first-order, second-order and fourth-order split-step schemes as a function
of N for the final time t = 3 on a $\log_{10}$-$\log_{10}$ scale. We use the relation
$\Delta t = \nu(\Delta x)^2$ to determine the value of $\Delta t$ for a given $\Delta x$ (= (b - a)/N), where
the value of $\nu$ is fixed at $\nu = 0.1$. We observe that the $L_\infty$-errors decrease
with increasing N until the boundaries start exerting their influence. The
nondecreasing error behavior for the second-order and fourth-order schemes
after the value of N = 96 is due to the limited space interval $5 \le x \le 35$.
To show that this behavior can be eliminated by balancing the error due
to boundary effects with the error due to internal resolution, we repeat the
experiment of Figure 1(a) for the space interval $-20 \le x \le 60$. The results
are presented in Figure 1(b), again on a $\log_{10}$-$\log_{10}$ scale. This time the
effect of the boundaries disappears and the $L_\infty$-errors continue to decrease with
increasing N.
To test whether the split-step schemes exhibit the expected convergence
rates in time we perform some numerical experiments for various values of
time step $\Delta t$ and a fixed value of N. In these experiments we take N = 512
to keep spatial accuracy high. The results are shown in Table 1. We present
the $L_\infty$-errors for the terminating time t = 3. The convergence rates agree
well with the expected rates for the first-order, second-order and fourth-order
split-step schemes. The orders of decay of the $L_\infty$-errors are the ones of the
splitting formulae employed for the temporal integration.
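
The "Order" columns of Table 1 follow from consecutive error pairs through the usual estimate $\log(e_{i-1}/e_i)/\log(\Delta t_{i-1}/\Delta t_i)$. The short sketch below recomputes them from the tabulated $L_\infty$-errors; it assumes this standard formula is the one behind the table, and it reads the extraction-damaged entry for the second-order scheme at Δt = 0.005 as 1.991E-5.

```python
import math

# Time steps and L-infinity errors transcribed from Table 1.
dts = [0.0500, 0.0100, 0.0050, 0.0030, 0.0010, 0.0005]
errors = {
    "first":  [1.604e-2, 3.109e-3, 1.552e-3, 9.306e-4, 3.100e-4, 1.549e-4],
    # 1.991e-5 is an assumed reading of a garbled table entry
    "second": [2.079e-3, 7.975e-5, 1.991e-5, 7.167e-6, 7.962e-7, 1.990e-7],
    "fourth": [4.425e-4, 1.098e-6, 7.179e-8, 9.436e-9, 1.176e-10, 7.637e-12],
}

def observed_orders(dts, errs):
    """Observed convergence order between each pair of consecutive time steps."""
    return [
        math.log(errs[i - 1] / errs[i]) / math.log(dts[i - 1] / dts[i])
        for i in range(1, len(dts))
    ]

orders = {name: observed_orders(dts, errs) for name, errs in errors.items()}
for name, vals in orders.items():
    print(name, " ".join(f"{v:.3f}" for v in vals))
```

The recomputed values agree with the tabulated orders to within the rounding of the four-digit error entries.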
Fig. 1. The $L_\infty$-errors at t = 3 as a function of the number of spatial grid points
for the first-order (SS1), second-order (SS2) and fourth-order (SS4) split-step Fourier
schemes. (a) The space interval: $5 \le x \le 35$, (b) The space interval: $-20 \le x \le 60$

Table 1. Comparison of the convergence rates in time for the first-order, second-
order and fourth-order split-step Fourier schemes in the case of a single solitary wave
($-20 \le x \le 60$).

          First-order          Second-order         Fourth-order
Δt        L∞         Order     L∞         Order     L∞          Order
0.0500    1.604E-2   -         2.079E-3   -         4.425E-4    -
0.0100    3.109E-3   1.019     7.975E-5   2.026     1.098E-6    3.727
0.0050    1.552E-3   1.002     1.991E-5   2.002     7.179E-8    3.935
0.0030    9.306E-4   1.001     7.167E-6   2.000     9.436E-9    3.972
0.0010    3.100E-4   1.001     7.962E-7   2.000     1.176E-10   3.991
0.0005    1.549E-4   1.001     1.990E-7   2.000     7.637E-12   3.945

To compare the proposed split-step Fourier schemes in terms of compu-
tational efficiency, we fix $\Delta t$ and $\Delta x$ and measure the computing times, the
$L_2$-error, the $L_\infty$-error and the conservation errors $\delta_1$, $\delta_2$ and $\delta_3$ at the termi-
nating time t = 3. The trapezoidal rule is used for the numerical quadrature of
the integrals. The results are presented in Table 2. Note that the computing
times in Table 2 are normalized so that the computing time of the first-order
split-step scheme is one unit. The results show that each of the conserved quan-
tities is very well preserved by the split-step schemes. Furthermore, we observe
that the computing time increases with the increasing order of the split-step
method. We conclude that the fourth-order split-step scheme is computation-
ally more efficient than the first-order and second-order schemes.

Table 2. Comparison of the $L_\infty$-error, the $L_2$-error, the conservation errors $\delta_1$,
$\delta_2$ and $\delta_3$ and the computing times for the first-order, second- and fourth-order
split-step Fourier schemes (N = 512, $-20 \le x \le 60$, $\Delta t = 0.61038 \times 10^{-3}$).

Method         L∞          L2          δ1          δ2          δ3          Normalized cpu
First-order    1.892E-04   1.234E-04   1.691E-13   2.243E-07   9.984E-09   1.0
Second-order   2.966E-07   1.134E-07   2.198E-13   1.865E-13   1.381E-13   1.7
Fourth-order   1.716E-11   5.156E-12   5.913E-13   9.309E-14   3.985E-13   4.1

3.2 Interacting Solitons

In the second numerical experiment we study the interaction of two solitons
for the integrable case of the GNLS equation, in which the coefficients are $q_1 = 1$,
$q_2 = 1$, $q_3 = -2$, $q_4 = 0$. The initial condition is given by

$$w(x, 0) = \frac{1}{\sqrt{2}}\,\mathrm{sech}\left[\tfrac{1}{2}(x - 15)\right] \exp i\left\{\tfrac{1}{4}(x - 15) + \tanh\left[\tfrac{1}{2}(x - 15)\right]\right\} + \frac{1}{2\sqrt{2}}\,\mathrm{sech}\left[\tfrac{1}{4}(x - 35)\right] \exp i\left\{-\tfrac{1}{2}(x - 35) + \tfrac{1}{2}\tanh\left[\tfrac{1}{4}(x - 35)\right]\right\}.$$

This initial condition corresponds to two solitons, one initially located at x = 15
and moving to the right with speed 1/2 and the other initially located at x =
35 and moving to the left with speed 1. The exact values of the conserved
quantities for this problem are $I_1 = 3$, $I_2 = 3/16$, and $I_3 = 0$. The problem
is solved on the interval $-60 \le x \le 110$ for times up to t = 20 using the
fourth-order split-step scheme. The numerical results show that the solitary
waves are stable under the collision. Also, each of the conserved quantities is
very well preserved, up to $10^{-12}$ for $I_1$ and $I_2$ and up to $10^{-10}$ for $I_3$, by the
fourth-order split-step Fourier scheme. This behavior provides a valuable check
on the numerical results.

3.3 Blow-up

For certain values of the coefficients and certain initial conditions, solutions
to the GNLS equation experience finite time blow-up [1]. We now apply the
fourth-order split-step scheme to a case of the GNLS equation in which the
exact solution blows up in finite time. The initial condition is the Gaussian
function $w(x, 0) = \exp(-x^2)$ and the coefficients are $q_1 = -2$, $q_2 = 20$,
$q_3 = 0$, $q_4 = 0$. The exact values of the conserved quantities $I_1$, $I_2$ and $I_3$ are
$I_1 = \sqrt{\pi/2}$, $I_2 = \sqrt{\pi}\,(9\sqrt{2} + 9 - 20\sqrt{6})/18$ and $I_3 = 0$ for this problem.
In [1], it has been shown analytically that the exact solution w(x, t) for this
problem will blow up in finite time and furthermore, an upper bound on the
blow-up time is $t \approx 1.7$.
In the present study, the above problem is solved on the interval $-7.5 \le x \le
7.5$ for times up to t = 0.08. We present the numerical results obtained using
the fourth-order scheme in Table 3. Although a formal proof of the existence
of the blow-up is not presented here, the numerical results strongly indicate
that a blow-up is well underway by time t = 0.08. This is consistent with
the numerical results presented in [1] and [5]. The fact that the three results
about the predicted time of blow-up, which were obtained by totally different
methods, are in complete agreement makes one believe in their correctness. As
in [1] we conclude that the upper bound given in [1] is not sharp.

Table 3. Variation of discrete approximations of the conserved quantities $I_1$, $I_2$ and
$I_3$ and $|W(0, t)|$ with time for the fourth-order split-step Fourier scheme (N = 432,
$\Delta t = 0.005$).

t      I_1        I_2         I_3             |W(0,t)|
0.00   1.253314   -2.684467   -9.793286E-17   1.000000
0.01   1.253314   -2.684467   -6.651917E-13   1.007348
0.06   1.253314   -2.684467   -4.673614E-12   1.526254
0.07   1.253314   -2.684448   -4.395637E-12   2.376429
0.08   1.253352   -2.829258   -6.085108E-10   3.430374
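
The t = 0 row of Table 3 can be cross-checked independently. The sketch below evaluates discrete analogues of $I_1$ and $I_2$ for the Gaussian initial condition with the trapezoidal rule on the periodic grid. The integrand used for $I_2$, namely $|w_x|^2 - \tfrac{q_1}{2}|w|^4 - \tfrac{q_2}{3}|w|^6$, is an assumption that applies only to the case $q_3 = q_4 = 0$ considered here; it is chosen because it reproduces the closed-form value stated in this subsection. Variable names are illustrative.

```python
import numpy as np

# Parameters of the blow-up experiment.
a, b, N = -7.5, 7.5, 432
q1, q2 = -2.0, 20.0

dx = (b - a) / N
x = a + dx * np.arange(N)
W = np.exp(-x ** 2).astype(complex)

# Spectral derivative w_x on the periodic grid.
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
W_x = np.fft.ifft(1j * k * np.fft.fft(W))

rho = np.abs(W) ** 2
# On a periodic grid the trapezoidal rule reduces to dx times the plain sum.
I1 = dx * np.sum(rho)
# Assumed energy-type integrand for q3 = q4 = 0.
I2 = dx * np.sum(np.abs(W_x) ** 2 - 0.5 * q1 * rho ** 2 - q2 * rho ** 3 / 3.0)

I1_exact = np.sqrt(np.pi / 2)
I2_exact = np.sqrt(np.pi) * (9 * np.sqrt(2) + 9 - 20 * np.sqrt(6)) / 18

print(f"I1 = {I1:.6f} (closed form {I1_exact:.6f})")
print(f"I2 = {I2:.6f} (closed form {I2_exact:.6f})")
```

Monitoring the same two sums during time stepping gives the relative errors used in the experiments; the jump in $I_2$ at t = 0.08 in Table 3 is the kind of change such monitoring is meant to reveal.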

4 Conclusions

In this study we have applied the well-known split-step Fourier method to
the GNLS equation. We have presented three split-step schemes in which the
main difference among the three schemes is in the order of the splitting ap-
proximation used. The method is easy to implement on a computer and one
can easily introduce higher-order splitting formulae to greatly increase the ac-
curacy of the split-step method. The numerical experiments reported here show
that the fourth-order split-step Fourier scheme is advisable in situations where
accuracy rather than the computational cost is of prime importance.
The numerical solutions obtained by using the present numerical schemes
for the case of one solitary wave are compared with the exact solutions in
order to assess the accuracy of these schemes. In addition, the performance of
the numerical schemes has been monitored by computing both the conserved
quantities and the computational costs. We have found that the numerical re-
sults are in a good agreement with the exact solutions and the results reported
in the literature and that the schemes have remarkable conservation proper-
ties for global invariants. Moreover, the collision of two solitons and a finite
time blow-up problem are investigated numerically and particular attention is
paid to the behavior of the conserved quantities as an indicator of numerical
difficulties.
The approaches presented in previously published papers related to numer-
ical solutions of the GNLS equation have been mostly limited to the absence of
the nonlinear derivative terms [6, 7, 8]. The numerical results presented above
show that the nonlinear derivative terms do not create any special difficulties
in the split-step Fourier method.

References
1. Pathria, D., Morris, J. Ll. (1989): Exact solutions for a generalized nonlinear
Schrödinger equation, Physica Scripta 39, 673-679.
2. Pathria, D., Morris, J. Ll. (1990): Pseudo-spectral solution of nonlinear
Schrödinger equations, J. Comput. Phys. 87, 108-125.
3. Tappert, F. (1974): Numerical solutions of the Korteweg-de Vries equation and
its generalizations by the split-step Fourier method, Lect. Appl. Math. Amer.
Math. Soc. 15, 215.
4. Muslu, G. M., Erbay, H. A. (2003): A split-step Fourier method for the complex
modified Korteweg-de Vries equation, Computers Math. Applic. 45, 503-514.
5. Robinson, M. P. (1997): The solution of nonlinear Schrödinger equations using
orthogonal spline collocation, Computers Math. Applic. 33, 39-57.
6. Cloot, A., Herbst, B. M., Weideman, J. A. (1990): A numerical study of the
nonlinear Schrödinger equation involving quintic terms, J. Comput. Phys. 86,
127-146.
7. Chang, Q., Jia, E., Sun, W. (1999): Difference schemes for solving the generalized
nonlinear Schrödinger equation, J. Comput. Phys. 148, 397-415.
8. Sheng, Q., Khaliq, A. Q. M., Al-Said, E. A. (2001): Solving the generalized non-
linear Schrödinger equation via quartic spline approximation, J. Comput. Phys.
166, 400-417.
Numerical Methods and Simulation Techniques
for Flow with Shear and Pressure Dependent
Viscosity

Abderrahim Ouazzi 1 and Stefan Turek 2

1 Institute of Applied Mathematics, University of Dortmund, 44227 Dortmund,
Germany. Abdermhim. [email protected]
2 Institute of Applied Mathematics, University of Dortmund, 44227 Dortmund,
Germany. Stefan. [email protected]

Summary. In this note we present some of our recent results concerning flows with
pressure and shear dependent viscosity. From the numerical point of view several
problems arise, first from the difficulty of approximating incompressible velocity fields
and, second, from poor conditioning and possible lack of differentiability of the in-
volved nonlinear functions due to the material laws. The lack of differentiability can
be treated by regularisation. Then, Newton-like methods as linearization technique
can be applied; however the presence of the pressure in the viscosity function leads to
an additional term introducing a new non-classical linear saddle point problem. The
difficulty related to the approximation of incompressible velocity fields is treated by
applying the nonconforming Rannacher-Turek Stokes element. However, then we are
facing another problem related to the nonconforming approximation for problems
involving the symmetric part of gradient: the classical discrete 'Korn's Inequality'
is not satisfied. A new and more general approach which involves the jump across
the inter-element boundaries should be used, which requires a small modification of
the discrete bilinear form by adding an interface term, penalizing the jump of the
velocity over edges. This is achieved via a modified procedure in the derivation of
a Discontinuous Galerkin formulation. As a solver for the discrete nonlinear systems,
a Newton variant is discussed while a 'Vanka-like' smoother as defect correction in-
side of a direct multigrid approach is presented. The results of some computational
experiments for realistic flow configurations are provided, which contain a pressure
dependent viscosity, too.

1 Introduction

The flowing of powders brings a new challenging and interesting problem to
the CFD community: at very high concentrations and low rate-of-strain, grains
are in permanent contact, rolling on each other. Therefore a frictional stress
model must be taken into account. This can be done using plasticity and sim-
ilar theories in which the material behavior is assumed to be independent of
the velocity gradient or the rate-of-strain. This is in contrast to viscous New-
tonian flow where stress specifically depends on a rate-of-strain. Furthermore,
flowing powders do not exhibit viscosity and, again, this shows that a Newtonian
rheology cannot describe granular flow accurately. It is assumed that the
material is incompressible, dry, cohesionless, and perfectly rigid-plastic. Such
properties are relevant for modelling the granular flows via special models for
continuum mechanics, as for instance the Schaeffer model [9].

1.1 Equations of motion

The general equations describing the motion of incompressible powders read:

Conservation of mass: $\frac{D\rho}{Dt} = \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho u) = 0$, where $\frac{D}{Dt}$ is the material derivative
and u is the velocity vector.
Incompressible material: The bulk density, $\rho$, is a constant, so that
$\nabla\cdot u = 0$.
Equation of motion: $\rho\frac{Du}{Dt} = -\nabla\cdot T + \rho g$ with $T = S + pI$.

1.2 Constitutive equations

The constitutive equation relates the deviatoric tensor, S, to the velocity
through the second invariant of the rate of deformation,
$D_{II} = \frac{1}{2} D : D$, where the rate of deformation is given by $D = \frac{1}{2}(\nabla u + \nabla^T u)$.
Newtonian law: $S = 2\nu_0 D$
Power law: $S = 2\nu(D_{II})D$, $\nu(z) = z^{\frac{r}{2}-1}$, $r \ge 1$
Schaeffer's law: For a powder, a constitutive equation, which was first intro-
duced by Schaeffer [9], has to obey a
- yield condition; $\|S\| = \sqrt{2}\, p \sin\phi$,
- flow rule; $S = \lambda D$.
We use this correlation to obtain the constitutive equation

$$T = \sqrt{2}\, p \sin\phi\, \frac{D}{\|D\|} + pI.$$

1.3 Generalized Navier-Stokes equations

The problem can be stated in the framework of the generalized incompressible
Navier-Stokes equations:
$$\rho\frac{Du}{Dt} = -\nabla p + \nabla\cdot(\nu(p, D_{II})D) + \rho g, \qquad \nabla\cdot u = 0.$$
If we define the nonlinear pseudo viscosity $\nu(\cdot,\cdot)$ as a function of the sec-
ond invariant of the rate of deformation $D_{II}$ and the 'pressure' p, we can show
that different materials can be ranged within different viscosity laws, including
powder:
- Power law defined for $\nu(z,p) = \nu_0 z^{\frac{r}{2}-1}$
- Bingham law defined for $\nu(z,p) = \nu_0 z^{-\frac{1}{2}}$
- Schaeffer's law (including the 'pressure') defined for $\nu(z,p) = \sqrt{2}\sin\phi\; p\, z^{-\frac{1}{2}}$

2 Problem formulation

Let us consider the flow of the stationary (!) generalized Navier-Stokes problem
in (1.3) in a bounded domain $\Omega \subset \mathbb{R}^2$. If we restrict the set V of test functions
to be divergence-free and if we take the constitutive laws into account, the
above equations from (1.3) lead to:

$$\int_\Omega 2\nu(D_{II}(u),p)\,D(u) : D(v)\,dx + \int_\Omega (u\cdot\nabla u)\,v\,dx = \int_\Omega f v\,dx, \qquad \forall v \in V. \qquad (1)$$

It is straightforward to penalize the constraint div v = 0 to derive the equiva-
lent mixed formulation of (1):
Find $(u,p) \in X \times M$ (with the spaces $X = H_0^1(\Omega)$ and $M = L^2(\Omega)$) such
that:

$$\int_\Omega 2\nu(D_{II}(u),p)\,D(u) : D(v)\,dx + \int_\Omega (u\cdot\nabla u)\,v\,dx + \int_\Omega p\,\mathrm{div}\,v\,dx = \int_\Omega f v\,dx, \qquad \forall v \in X, \qquad (2)$$

$$\int_\Omega q\,\mathrm{div}\,u\,dx = 0, \qquad \forall q \in M.$$

2.1 Nonlinear solver: Newton iteration

In this approach, the nonlinearity is first handled on the continuous level. Let
$u^l$ be the initial state; the (continuous) Newton method consists of finding
$u \in V$ such that

$$\int_\Omega 2\nu(D_{II}(u^l),p^l)\,D(u) : D(v)\,dx + \int_\Omega 2\partial_1\nu(D_{II}(u^l),p^l)\,[D(u^l) : D(u)][D(u^l) : D(v)]\,dx + \int_\Omega 2\partial_2\nu(D_{II}(u^l),p^l)\,[D(u^l) : D(v)]\,p\,dx = \int_\Omega f v\,dx - \int_\Omega 2\nu(D_{II}(u^l),p^l)\,D(u^l) : D(v)\,dx, \qquad \forall v \in V, \qquad (3)$$

where $\partial_i\nu(\cdot,\cdot)$, i = 1, 2, is the partial derivative of $\nu$ with respect to the first and
second variables, respectively. To see this, set $X = D(u^l)$, $x = D(u)$, $Y = p^l$,
$y = p$, $F(x, y) = \nu(\frac{1}{2}|x|^2, y)\,x$ and $f(t) = F(X + tx, Y + ty)$, so that

$$\partial_{x_j}F_i(x,y) = \partial_1\nu(\tfrac{1}{2}|x|^2, y)\,x_j x_i + \nu(\tfrac{1}{2}|x|^2, y)\,\delta_{ij}, \qquad \partial_y F_i(x,y) = \partial_2\nu(\tfrac{1}{2}|x|^2, y)\,x_i, \qquad (4)$$

where $\delta_{ij}$ stands for the standard Kronecker symbol. Having

$$f_i'(t) = \sum_j \partial_{x_j}F_i(X + tx, Y + ty)\,x_j + \partial_y F_i(X + tx, Y + ty)\,y = \nu(\tfrac{1}{2}|X + tx|^2, Y + ty)\,x_i + \partial_1\nu(\tfrac{1}{2}|X + tx|^2, Y + ty)\,\langle X + tx, x\rangle\,(X_i + t x_i) + \partial_2\nu(\tfrac{1}{2}|X + tx|^2, Y + ty)\,y\,(X_i + t x_i), \qquad (5)$$

we let t tend to zero and obtain the Fréchet derivative:

$$\nabla\cdot\left[\,2\nu(D_{II}(u^l),p^l)\,D(u) + 2\partial_1\nu(D_{II}(u^l),p^l)\,(D(u^l) : D(u))\,D(u^l) + 2\partial_2\nu(D_{II}(u^l),p^l)\,p\,D(u^l)\,\right] \qquad (6)$$
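
The derivative formulas (4) can be checked numerically against central finite differences. The sketch below does this for a regularised Schaeffer-type viscosity $\nu(z,p) = \sqrt{2}\sin\phi\; p\,(\varepsilon^2 + z)^{-1/2}$; the particular regularisation, the parameter values and all function names are assumptions made for illustration, and the tensor D is represented by the flattened vector of its four entries so that $|x|^2 = D : D$.

```python
import numpy as np

SIN_PHI = 0.5   # sine of the friction angle (illustrative value)
EPS = 0.1       # regularisation parameter (an assumption, not from the paper)

def nu(z, p):
    """Regularised Schaeffer-type viscosity."""
    return np.sqrt(2.0) * SIN_PHI * p / np.sqrt(EPS ** 2 + z)

def d1_nu(z, p):
    """Partial derivative of nu with respect to its first argument z."""
    return -0.5 * np.sqrt(2.0) * SIN_PHI * p * (EPS ** 2 + z) ** (-1.5)

def d2_nu(z, p):
    """Partial derivative of nu with respect to the pressure p."""
    return np.sqrt(2.0) * SIN_PHI / np.sqrt(EPS ** 2 + z)

def F(x, y):
    """F(x, y) = nu(|x|^2 / 2, y) x, with x a flattened deformation tensor."""
    return nu(0.5 * np.dot(x, x), y) * x

def jacobian_F(x, y):
    """Analytic derivatives of F according to the formulas in (4)."""
    z = 0.5 * np.dot(x, x)
    dF_dx = d1_nu(z, y) * np.outer(x, x) + nu(z, y) * np.eye(x.size)
    dF_dy = d2_nu(z, y) * x
    return dF_dx, dF_dy

def fd_jacobian_F(x, y, h=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    n = x.size
    dF_dx = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        dF_dx[:, j] = (F(x + e, y) - F(x - e, y)) / (2 * h)
    dF_dy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return dF_dx, dF_dy

x = np.array([0.3, -0.7, -0.7, 0.5])   # entries D11, D12, D21, D22
y = 2.0                                # pressure
exact_dx, exact_dy = jacobian_F(x, y)
approx_dx, approx_dy = fd_jacobian_F(x, y)
err_x = np.max(np.abs(exact_dx - approx_dx))
err_y = np.max(np.abs(exact_dy - approx_dy))
print(f"max deviation: dF/dx {err_x:.2e}, dF/dy {err_y:.2e}")
```

The term `d2_nu(z, y) * x` is the contribution that produces the additional pressure coupling in the Newton system; it vanishes for viscosity laws that do not depend on the pressure.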

2.2 New linear auxiliary problem

The resulting auxiliary subproblems in each Newton step consist of finding
$(u,p) \in X \times M$ as solutions of the linear (discretized) systems

where $R_u(\cdot,\cdot)$ and $R_p(\cdot,\cdot)$ denote the corresponding nonlinear residual terms
for the momentum and continuity equations, and the operators $A(u^l,p^l)$, $B$,
$A^*(u^l,p^l)$ and $B^*(u^l,p^l)$ are defined as follows:

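$$\langle A(u^l,p^l)u, v\rangle = \int_\Omega 2\nu(D_{II}(u^l),p^l)\,D(u):D(v)\,dx \qquad (8)$$

$$\langle Bp, v\rangle = \int_\Omega p\,\nabla\cdot v\,dx \qquad (9)$$

$$\langle A^*(u^l,p^l)u, v\rangle = \int_\Omega 2\partial_1\nu(D_{II}(u^l),p^l)\,[D(u^l):D(u)]\,[D(u^l):D(v)]\,dx \qquad (10)$$

$$\langle B^*(u^l,p^l)v, p\rangle = \int_\Omega 2\partial_2\nu(D_{II}(u^l),p^l)\,[D(u^l):D(v)]\,p\,dx \qquad (11)$$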

3 Discretization

We consider a subdivision $T \in \mathcal{T}_h$ consisting of quadrilaterals in the domain
$\Omega_h \subset \mathbb{R}^2$, and we employ the rotated bilinear Rannacher-Turek element [5].
For any quadrilateral $T$, let $(\xi, \eta)$ denote a local coordinate system obtained by
joining the midpoints of the opposing faces of $T$. Then, in the nonparametric
case, we set on each element $T$

$$\tilde{Q}_1(T) := \operatorname{span}\{1, \xi, \eta, \xi^2 - \eta^2\}. \qquad (12)$$

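The degrees of freedom are determined by the nodal functionals
$\{F_\Gamma^{(a,b)}(\cdot),\ \Gamma \subset \partial\mathcal{T}_h\}$,

$$F_\Gamma^a(v) := |\Gamma|^{-1} \int_\Gamma v\,d\gamma \quad \text{or} \quad F_\Gamma^b(v) := v(m_\Gamma) \quad (m_\Gamma \text{ midpoint of edge } \Gamma) \qquad (13)$$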

such that the finite element space can be written as

$$W_h^{a,b} := \{ v \in L^2(\Omega_h),\ v|_T \in \tilde{Q}_1(T)\ \forall T \in \mathcal{T}_h,\ v \text{ continuous w.r.t. all nodal functionals } F^{a,b}_{\Gamma_{i,j}}(\cdot), \text{ and } F^{a,b}_{\Gamma_{i0}}(v) = 0\ \forall \Gamma_{i0} \}. \qquad (14)$$

Here, $\Gamma_{i,j}$ denote all inner edges shared by the two elements $i$ and $j$, while
$\Gamma_{i0}$ denote the boundary edges of $\partial\Omega_h$. In this paper, we always employ
version 'a' with the integral mean values as degrees of freedom. Then, the
corresponding discrete functions will be approximated in the spaces

Due to the nonconformity of the discrete velocities, the classical discrete
'Korn's inequality' is not satisfied, which is important for problems involving
the symmetric part of the gradient [4]. Therefore, appropriate edge-oriented
stabilization techniques (see [1, 2, 8]) have to be included which directly
treat the jump across the inter-element boundaries by adding the following
bilinear form
(16)

for all basis functions $\phi_i$ and $\phi_j$ of $W_h^{a,b}$. Taking into account an additional
relaxation parameter $s = s(\nu)$, the corresponding stiffness matrices are defined
via:

$$\langle Su, v\rangle = s \sum_{E \in \mathcal{E}_I \cup \mathcal{E}_D} \frac{1}{|E|} \int_E [u][v]\,d\sigma \qquad (17)$$

Here, the jump of a function $u$ on an edge $E$ is given by

$$[u] = \begin{cases} u^+ \cdot n^+ + u^- \cdot n^- & \text{on internal edges } \mathcal{E}_I, \\ u \cdot n & \text{on Dirichlet boundary edges } \mathcal{E}_D, \\ 0 & \text{on Neumann boundary edges } \mathcal{E}_N, \end{cases} \qquad (18)$$

where $n$ is the outward normal to the edge and $(\cdot)^+$ and $(\cdot)^-$ indicate the value
of the generic quantity $(\cdot)$ on the two elements sharing the same edge.

4 Linear solver
This section gives a brief description of the solution techniques involved
for the resulting linear systems. For the nonconforming Stokes element
$\tilde{Q}_1/Q_0$, a 'local pressure Schur complement' preconditioner (see [7]), a
generalization of so-called 'Vanka smoothers', is constructed on patches $\Omega_i$
which are ensembles of one single or several mesh cells, and this local
preconditioner is embedded as global smoother into an outer block
Jacobi/Gauss-Seidel iteration which acts directly on the coupled systems of
generalized Stokes, resp., Oseen type as described in [8]. If we denote by
$R_u$ and $R_p$ the discrete residuals for the momentum and continuity equation,
which include the complete stabilisation term due to the modified bilinear
form $S$ as described in (17), one smoothing step in defect-correction notation
can be described as

with matrix $F = A + \delta_d A^*$, where $A$, $B$, $A^*$ and $B^*$ are the discrete matrices
corresponding to the operators in (8), (9), (10) and (11). For the
preconditioning step only a part of the matrix, i.e. $F + S^*$, is involved. All
other components in the multigrid approach, that is intergrid transfer, coarse
grid correction and coarse grid solver, are the standard ones and are based on
the underlying mesh hierarchy and the properties of the nonconforming finite
elements (see [7] and [8] for the details).

5 Numerical tests

5.1 Newtonian case

In this case, the gradient and tensor formulations are equivalent. The accuracy
and efficiency of the stabilized tensor discretization are checked by
comparison with the gradient formulation (see Table 1); the tests have been
performed for the 'flow around cylinder' benchmark configuration [10]. For all
three formulations the lift and drag forces are very similar.

Table 1. Efficiency of the stabilized nonconforming FEM: lift and drag forces
(Level 5)

1/ν                  grad             tensor           stab. tensor
1          Drag      31252 × 10^-1    31221 × 10^-1    31231 × 10^-1
           Lift      30898 × 10^-3    30924 × 10^-3    30936 × 10^-3
           NL/MG     3/3              7/200            3/3
1000       Drag      55657 × 10^-4    55531 × 10^-4    55535 × 10^-4
(Re = 20)  Lift      10180 × 10^-6    10259 × 10^-6    10277 × 10^-6
           NL/MG     11/4             11/12            11/3

5.2 Effect of convection


The average number of inner multigrid sweeps (MG) per outer nonlinear sweep
(NL) increases with mesh refinement (see Table 2), due to the more dominant
influence of the kernel function in the second order differential operator.
Since, in contrast, the convection dominates with the increase of the Reynolds
number, the average number of multigrid sweeps per nonlinear sweep decreases,
as the influence of the kernel function becomes irrelevant. This may explain
why many people from the CFD community did not pay much attention to
this problem before.

Table 2. Nonlinear iterations (NL) / averaged multigrid sweeps (MG) per
nonlinear iteration for different viscosity parameters (Re numbers), various
formulations (gradient, tensor and stabilized tensor) and different mesh levels

                        1/ν = 1    1/ν = 10    1/ν = 1000
Level  Formulation      NL/MG      NL/MG       NL/MG
4      grad             3/3        4/3         11/4
       tensor           3/15       5/17        11/4
       stab. tensor     3/3        5/3         11/4
5      grad             3/3        4/3         11/3
       tensor           4/140      5/35        11/10
       stab. tensor     4/3        5/3         11/3
6      grad             3/3        4/3         11/3
       tensor           7/200      4/161       11/12
       stab. tensor     3/3        4/3         11/3

5.3 Power law case


In this case the nonlinear viscosity has the form $\nu(z) = \nu_0 z^{r/2-1}$, $z = D_{II}$,
and the gradient and tensor formulations are no longer equivalent. The quality
of the solution is checked by comparison with the well-known and stable
conforming $Q_2/P_1$ approximation; an extended description can be found in [3].
The accuracy of the nonconforming FEM is preserved by the stabilized tensor
discretization, see Table 3.

5.4 Pressure dependent viscosity


Finally, the nonlinear (pseudo) viscosity has the form $\nu(z,p) = \exp(\beta p)$, and
we list the number of resulting nonlinear iterations and the averaged number
of multigrid sweeps per nonlinear iteration for both Newton and fixpoint
methods as outer nonlinear solver. Table 4 shows that the presence of the new
linear operator $B^*$ cannot be ignored; otherwise, we destroy the efficiency of
the Newton method, which is necessary for the robust treatment of the
significant nonlinearity.
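The qualitative difference between the two outer solvers can be illustrated on a scalar toy problem. The sketch below is an assumption-laden analogue and not the authors' solver: it solves the hypothetical equation $\exp(\beta p)\,p = g$, once by freezing the coefficient $\exp(\beta p)$ at the previous iterate (fixpoint) and once by Newton's method with the full derivative, which includes the contribution of the pressure-dependent coefficient (the scalar counterpart of keeping $B^*$).

```python
import math

def solve_fixpoint(g, beta, tol=1e-10, max_iter=500):
    # Freeze the coefficient exp(beta * p) at the old iterate, then solve for p.
    p = 0.0
    for k in range(1, max_iter + 1):
        p_new = g / math.exp(beta * p)
        if abs(p_new - p) < tol:
            return p_new, k
        p = p_new
    raise RuntimeError("fixed-point iteration did not converge")

def solve_newton(g, beta, tol=1e-10, max_iter=50):
    # Newton's method for F(p) = exp(beta * p) * p - g with the full derivative
    # F'(p) = exp(beta * p) * (1 + beta * p).
    p = 0.0
    for k in range(1, max_iter + 1):
        residual = math.exp(beta * p) * p - g
        slope = math.exp(beta * p) * (1.0 + beta * p)
        p_new = p - residual / slope
        if abs(p_new - p) < tol:
            return p_new, k
        p = p_new
    raise RuntimeError("Newton iteration did not converge")

for beta in (0.1, 0.3, 0.5):
    _, n_fix = solve_fixpoint(1.0, beta)
    _, n_newton = solve_newton(1.0, beta)
    print(f"beta={beta}: fixpoint {n_fix} iterations, Newton {n_newton} iterations")
```

In this toy setting the fixpoint iteration count grows with $\beta$ while the Newton count stays small, which mirrors the trend reported for the nonlinear iterations, though the numbers themselves are not comparable to those of the flow solver.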

Table 3. Comparison of the approximation results for lift, drag and pressure
difference for two FEM approaches, the stabilized nonconforming $\tilde{Q}_1/Q_0$ and the
classical conforming $Q_2/P_1$ (see [3]).

                     Power r = 1.5                      Power r = 1.1
Level  Elements   Drag      Lift    Δp      NN/NL    Drag     Lift     Δp      NN/NL
4      Q~1/Q0     1594.20   14.25   24.56   9/2      916.02   3.7381   15.74   12/2
       Q2/P1      1635.80   14.39   25.09   8/140    953.94   3.9217   15.82   19/294
5      Q~1/Q0     1615.60   14.43   24.81   8/2      935.13   3.9954   15.82   15/3
       Q2/P1      1637.60   14.44   25.07   9/723    957.64   4.0587   15.87   18/1162
6      Q~1/Q0     1626.20   14.46   24.94   8/2      946.22   4.0592   15.85   13/5

Table 4. Corresponding results for the number of nonlinear iterations and the
averaged number of linear sweeps per nonlinear cycle, $\nu(z,p) = \exp(\beta p)$

                        Fixpoint                 Newton
Level  Formulation      β = 0.1   0.3    0.5     β = 0.1   0.3   0.5
5      stab. tensor     6/2       12/2   33/2    3/3       4/2   4/3
       gradient         6/2       11/2   34/2    3/3       4/2   4/3
6      stab. tensor     5/3       11/3   65/2    3/3       3/3   3/3
       gradient         5/3       9/3    76/2    3/3       3/3   5/3

6 Conclusion and outlook

We can conclude our present numerical analysis as follows:

- The proposed stabilization technique is stable and accurate for the used
  FEM spaces.
- The full (!) Newton method seems to be necessary for this type of nonlinear
  problem.
- The multigrid convergence behaviour for this new class of auxiliary linear
  subproblems is
  - (almost) identical for both gradient and deformation tensor formulations:
    the stabilization for nonconforming FEM works fine!
  - dependent on the involved pressure terms for both fixed point and Newton
    methods: more investigation should focus on the linear algebraic
    problem, besides the nonlinear solution procedure!

In future, we will cover a wider range of granular materials (see [6] for
a discussion):

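- General equation of motion for a powder

$$\rho \frac{Du}{Dt} = -\nabla p + \nabla\cdot\left[ \frac{q(p,\rho)}{\| D - \tfrac{1}{n}\nabla\cdot u\, I \|} \left( D - \tfrac{1}{n}\nabla\cdot u\, I \right) \right] + \rho g, \quad \text{with}$$

- Continuity equation

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho u) = 0, \quad \text{and}$$

- Normality condition

$$\nabla\cdot u = \frac{\partial q(p,\rho)}{\partial p}\, \| D - \tfrac{1}{n}\nabla\cdot u\, I \|$$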


- The yield condition $q(p,\rho)$ is given by:

Powder properties    Non-cohesive     Cohesive
Incompressible       $p \sin\phi$     $p \sin\phi + c \cos\phi$
Compressible p sin ¢ [2 - j; ] psin¢p7J1 - C (17;)2
p-p
o7J

References

1. Brenner, S. C. (2002), Korn's inequalities for piecewise $H^1$ vector fields, IMI
Research Reports, 5, 1-21
2. Hansbo, P. and Larson, M. G. (2002), Discontinuous Galerkin methods for
incompressible and nearly incompressible elasticity by Nitsche's method, Computer
Methods in Applied Mechanics and Engineering, 191(17-18), 1895-1908
3. Hron, J., Ouazzi, A. and Turek, S. (2002), A computational comparison of two
FEM solvers for nonlinear incompressible flow, proceedings of CISC2002 (to
appear)
4. Knobloch, P. (2000), On Korn's inequality for nonconforming finite elements,
Technische Mechanik, 205-214
5. Rannacher, R. and Turek, S. (1992), A simple nonconforming quadrilateral Stokes
element, Numer. Meth. Par. Diff. Eq., 8, 97-111
6. Tardos, G. I., McNamara, S. and Talu, I. (2003), Slow and intermediate flow of a
frictional bulk powder in the Couette geometry, in press, Powder Technology
7. Turek, S. (1998), Efficient solvers for incompressible flow problems: An
algorithmic and computational approach, Springer, 6, LNCSE
8. Turek, S., Ouazzi, A. and Schmachtel, R. (2002), Multigrid method for stabilized
nonconforming finite elements for incompressible flow involving the deformation
tensor formulation, JNM, 10, 235-248
9. Schaeffer, D. G. (1987), Instability in the evolution equation describing
incompressible granular flow, J. of Differential Equations, 66, 19-50
10. Schäfer, M. and Turek, S. (1996), Benchmark computations of laminar flow
around cylinder. In E.H. Hirschel, editor, Flow Simulation with High
Performance Computers II, Notes on Numerical Fluid Mechanics, 52, 547-566
Piecewise Polynomial Approximations for
Linear Volterra Integro-Differential Equations
with Nonsmooth Kernels*

Arvet Pedas

Institute of Applied Mathematics, University of Tartu, Liivi 2, 50409 Tartu,


Estonia arvet. [email protected]

Summary. A piecewise polynomial collocation method for solving linear
Volterra-Basset integro-differential equations with weakly singular or other
nonsmooth kernels is discussed. Using special graded grids, global convergence
estimates are derived. The error analysis is based on certain regularity
properties of the solution of the initial value problem.

1 Introduction

Volterra integral equations and integro-differential equations arise naturally
in many mathematical models of various physical and biological phenomena.
The study of their numerical methods has received considerable attention in
the past. The survey articles [1, 2] and the monograph [3] convey a good pic-
ture of these developments and contain an extensive bibliography. The present
paper is most closely related to the works [3, 4, 5, 6, 7, 8, 9] where a discus-
sion about the convergence of collocation methods for the numerical solution
of linear Volterra integro-differential equations with weakly singular kernels
is given. In the present paper we extend these investigations to a wider class
of equations. First we study the regularity properties of the solution (Section
2). Then we use these results in the construction and analysis of a piecewise
polynomial collocation method for solving such equations numerically (Sec-
tion 4). Using graded grids and an equivalent integral equation reformulation,
we derive global convergence estimates for the numerical solutions. Our aim
is to construct approximations which possess maximal convergence order on
the whole interval of integration. The main results of the paper extend the
corresponding results of [7, 8, 9] and are formulated in Theorems 1, 2 and 3.
* This work was supported by the Estonian Science Foundation (Research Grant
No. 5859).

2 Integro-differential equation and smoothness of the solution

Let $b \in R = (-\infty,\infty)$, $b > 0$ and set $\Delta_b = \{(t,s) \in R^2 : 0 \le t \le b,\ 0 \le s < t\}$,
$\bar{\Delta}_b = \{(t,s) \in R^2 : 0 \le s \le t \le b\}$. We consider an initial-value problem for a
linear integro-differential equation of the form

$$y'(t) = p(t)y(t) + q(t) + \int_0^t K_1(t,s)\,y(s)\,ds + \int_0^t K_2(t,s)\,y'(s)\,ds, \quad 0 \le t \le b, \qquad (1)$$

with given initial condition

$$y(0) = y_0, \quad y_0 \in R. \qquad (2)$$
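Observe that, in contrast to "standard" Volterra integro-differential equations,
the integrand $K_2(t,s)y'(s)$ in (1) depends on the derivative $y'$ instead of the
solution $y$ itself. We assume that $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p, q \in C^{m,\nu}[0,b]$,
$m \in N = \{1, 2, \dots\}$, $\nu \in R$, $\nu < 1$. Here $W^{m,\nu}(\Delta_b)$, $m \in N$, $\nu < 1$, is defined as the
set of all $m$ times continuously differentiable functions $K : \Delta_b \to R$ satisfying

$$\left| \left(\frac{\partial}{\partial t}\right)^{i} \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right)^{j} K(t,s) \right| \le c \begin{cases} 1 & \text{if } \nu + i < 0, \\ 1 + |\log(t-s)| & \text{if } \nu + i = 0, \\ (t-s)^{-\nu-i} & \text{if } \nu + i > 0, \end{cases} \qquad (3)$$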

with a constant $c = c(K)$ for all $(t,s) \in \Delta_b$ and all non-negative integers $i$ and
$j$ such that $i + j \le m$.
It follows from (3) (with $i = j = 0$, $0 \le \nu < 1$) that the kernels $K_1(t,s)$
and $K_2(t,s)$ of (1) may possess a weak singularity as $s \to t$. In case $\nu < 0$ the
kernels $K_1$ and $K_2$ are bounded on $\Delta_b$ but their derivatives may be singular
as $s \to t$. In particular, $K_1$ and $K_2$ may have the form

$$K_{\alpha,\beta}(t,s) = \kappa(t,s)\,(t-s)^{-\alpha}\,|\log(t-s)|^{\beta}, \quad 0 \le \alpha < 1,\ \beta \ge 0,$$

where $\kappa : \bar{\Delta}_b \to R$ is an $m$ times continuously differentiable function. Clearly,
$K_{\alpha,0} \in W^{m,\alpha}(\Delta_b)$, $0 < \alpha < 1$, $K_{0,1} \in W^{m,0}(\Delta_b)$ and $K_{\alpha,\beta} \in W^{m,\alpha+\varepsilon}(\Delta_b)$
for $0 \le \alpha < 1$, $\beta > 0$, with a small $\varepsilon > 0$ ($\varepsilon < 1 - \alpha$). In particular, if $K_1 = 0$
and $K_2 = K_{\alpha,0}$, $0 < \alpha < 1$, then equation (1) is of the type often
referred to as the Basset equation, which plays an important role in
the mathematical modelling of the diffusion of discrete particles in a turbulent
fluid (see, for example, [10, 11]).
The set $C^{m,\nu}[0,b]$, $m \in N$, $\nu < 1$, consists of functions¹ $y \in C[0,b]$ which
are $m$ times continuously differentiable in $(0,b]$ and such that

$$\sum_{j=1}^{m} \sup_{0 < t \le b} \left( w_{j-(1-\nu)}(t)\,|y^{(j)}(t)| \right) \le c.$$

Here

$$w_\lambda(t) = \begin{cases} 1 & \text{if } \lambda < 0, \\ (1 + |\log t|)^{-1} & \text{if } \lambda = 0, \\ t^{\lambda} & \text{if } \lambda > 0. \end{cases}$$

Equipped with the norm

$$\|y\|_{m,\nu} = \max_{0 \le t \le b} |y(t)| + \sum_{j=1}^{m} \sup_{0 < t \le b} \left( w_{j-(1-\nu)}(t)\,|y^{(j)}(t)| \right),$$

$C^{m,\nu}[0,b]$ is a Banach space. Thus, if a function $y$ belongs to $C^{m,\nu}[0,b]$, $m \in N$,
$\nu < 1$, then its derivatives can be estimated by

$$|y^{(j)}(t)| \le c \begin{cases} 1 & \text{if } j < 1 - \nu, \\ 1 + |\log t| & \text{if } j = 1 - \nu, \\ t^{1-\nu-j} & \text{if } j > 1 - \nu, \end{cases} \qquad (4)$$

where $0 < t \le b$ and $j = 0, 1, \dots, m$. Note that $C^m[0,b]$, the set of $m$ times
continuously differentiable functions $y : [0,b] \to R$, is contained in $C^{m,\nu}[0,b]$
for arbitrary $\nu < 1$.

¹ By $C[a,b]$ we denote the Banach space of continuous functions $x : [a,b] \to R$
with the norm $\|x\| = \max\{|x(t)| : a \le t \le b\}$. By $c, c_1, c_2, \dots$ we denote positive
constants, which may be different in different inequalities.
Introducing a new unknown function

$$z = y', \qquad (5)$$

and using (2), equation (1) may be rewritten as a linear Volterra integral
equation of the second kind with respect to $z$,

$$z(t) = \int_0^t K_1(t,s) \int_0^s z(\tau)\,d\tau\,ds + \int_0^t [p(t) + K_2(t,s)]\,z(s)\,ds + f(t), \quad t \in [0,b], \qquad (6)$$

which may also be expressed in the form

$$z(t) = \int_0^t K(t,s)\,z(s)\,ds + f(t), \quad t \in [0,b], \qquad (7)$$

where

$$f(t) = q(t) + y_0\,p(t) + y_0 \int_0^t K_1(t,s)\,ds, \quad t \in [0,b], \qquad (8)$$

and

$$K(t,s) = p(t) + \int_s^t K_1(t,\tau)\,d\tau + K_2(t,s), \quad (t,s) \in \Delta_b. \qquad (9)$$

We will employ (6) in the construction of numerical solutions for problem
{(1),(2)} (see Section 4). For the smoothness analysis of the solution of
{(1),(2)} it is more convenient to use (7), which we write in the form $(I - T)z = f$,
where $I$ is the identity transformation and

$$(Tz)(t) = \int_0^t K(t,s)\,z(s)\,ds, \quad t \in [0,b]. \qquad (10)$$

In the sequel, for given Banach spaces $E$ and $F$ we denote by $\mathcal{L}(E,F)$
the Banach space of linear bounded operators $A : E \to F$ with the norm
$\|A\| = \sup\{\|Az\| : z \in E,\ \|z\| \le 1\}$.
Lemma 1. Let $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p \in C^{m,\nu}[0,b]$, $m \in N$, $\nu \in R$, $\nu < 1$.
Then $T$ is linear and compact as an operator from $L^\infty(0,b)$ to $C[0,b]$.
Moreover, $T$ is compact as an operator from $C^{m,\nu}[0,b]$ to $C^{m,\nu}[0,b]$.

Proof. We present $T$ (see (9) and (10)) in the form $T = T_{01}T_{02} + T_1 + T_2$,
where the linear operators $T_{01}$, $T_{02}$, $T_1$ and $T_2$ are defined by

$$(T_{01}z)(t) = p(t)\,z(t), \qquad (T_{02}z)(t) = \int_0^t z(s)\,ds,$$

$$(T_1 z)(t) = \int_0^t L_1(t,s)\,z(s)\,ds \quad \text{with} \quad L_1(t,s) = \int_s^t K_1(t,\tau)\,d\tau,$$

$$(T_2 z)(t) = \int_0^t K_2(t,s)\,z(s)\,ds.$$

It follows from $K_1, K_2 \in W^{m,\nu}(\Delta_b)$ that $L_1(t,s)$ is bounded for $(t,s) \in \Delta_b$
and $K_2(t,s)$ is at most weakly singular: $|K_2(t,s)| \le c\,(t-s)^{-\alpha}$, $(t,s) \in \Delta_b$,
$0 < \alpha < 1$. Therefore $T_1, T_2 : L^\infty(0,b) \to C[0,b]$ are compact. Clearly,
$T_{01} \in \mathcal{L}(C[0,b], C[0,b])$ and $T_{02} : L^\infty(0,b) \to C[0,b]$ is compact. This
implies that $T_{01}T_{02} \in \mathcal{L}(L^\infty(0,b), C[0,b])$ is compact. In summary,
$T_{01}T_{02} + T_1 + T_2 = T \in \mathcal{L}(L^\infty(0,b), C[0,b])$ is compact.
Further, it follows from $K_1 \in W^{m,\nu}(\Delta_b)$ that

and hence $L_1 \in W^{m,\nu-1}(\Delta_b) \subset W^{m,\nu}(\Delta_b)$. Since $L_1, K_2 \in W^{m,\nu}(\Delta_b)$,
$T_1, T_2 \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$ are compact (see [7] for details). Since
$1 \in W^{m,\nu}(\Delta_b)$, we also deduce that $T_{02} : C^{m,\nu}[0,b] \to C^{m,\nu}[0,b]$ is
compact. If $y_1, y_2 \in C^{m,\nu}[0,b]$, $m \in N$, $\nu < 1$, then, by (4), $y_1 y_2 \in C^{m,\nu}[0,b]$
and $\|y_1 y_2\|_{m,\nu} \le c\,\|y_1\|_{m,\nu}\|y_2\|_{m,\nu}$, with a constant $c$ which is
independent of $y_1$ and $y_2$. This implies $T_{01} \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$. In summary,
$T_{01}T_{02} + T_1 + T_2 = T \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$ is compact. Lemma 1 is
proved.

The regularity of the solution of equation (1) is described in the following

Theorem 1. Let $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p, q \in C^{m,\nu}[0,b]$, $m \in N$, $\nu \in R$,
$\nu < 1$. Then equation (7) has a unique solution $z \in C^{m,\nu}[0,b]$, implying that
problem {(1), (2)} has a unique solution $y \in C^{m+1,\nu-1}[0,b]$ for every $y_0 \in R$.
Proof. It follows from $p, q \in C^{m,\nu}[0,b]$ and $K_1 \in W^{m,\nu}(\Delta_b)$ that
$f \in C^{m,\nu}[0,b]$. Indeed, $f = f_1 + f_2$ where (see (8)) $f_1(t) = q(t) + y_0 p(t)$, $t \in [0,b]$,
and $f_2(t) = y_0 \int_0^t K_1(t,s)\,ds$, $t \in [0,b]$. Clearly, $f_1 \in C^{m,\nu}[0,b]$ and $f_2 = y_0 T 1$,
with $T$ defined by (10) with the kernel $K_1$ in place of $K$. Since $1 \in C^{m,\nu}[0,b]$
and $T$ is bounded as an operator from $C^{m,\nu}[0,b]$ to $C^{m,\nu}[0,b]$ (see Lemma 1),
$f_2 \in C^{m,\nu}[0,b]$. As the homogeneous equation $z = Tz$ has only the trivial
solution $z = 0$, it follows from $f \in C^{m,\nu}[0,b]$ and Lemma 1 that $I - T$ has a
bounded inverse $(I - T)^{-1} \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$, and equation $(I - T)z = f$
has a unique solution $z = (I - T)^{-1} f \in C^{m,\nu}[0,b]$. In other words,
$y' \in C^{m,\nu}[0,b]$, implying $y \in C^{m+1,\nu-1}[0,b]$. Theorem 1 is proved.

3 Piecewise polynomial interpolation

For given $N \in N$, $r \in R$, $r \ge 1$, let $\Pi_N = \{t_0, \dots, t_N : 0 = t_0 < \dots < t_N = b\}$
be a partition (a grid) of the interval $[0,b]$ given by the grid points

$$t_j = b\,(j/N)^r, \quad j = 0, \dots, N. \qquad (11)$$

Here $r$ (also called the grading exponent) characterizes the non-uniformity of
the grid $\Pi_N$: if $r > 1$ then the grid points (11) are more densely clustered near
the left endpoint of the interval $[0,b]$. Let

$$\sigma_j = [t_{j-1}, t_j], \quad h_j = t_j - t_{j-1}, \quad j = 1, \dots, N. \qquad (12)$$
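The graded grid (11) is straightforward to generate. The short sketch below uses hypothetical helper names (`graded_grid`, `step_sizes`) and simply evaluates $t_j = b\,(j/N)^r$; it illustrates how the step sizes shrink towards $t = 0$ when $r > 1$.

```python
def graded_grid(b, N, r):
    # Grid points t_j = b * (j / N)**r, j = 0, ..., N, as in (11).
    if N < 1:
        raise ValueError("N must be a positive integer")
    if r < 1:
        raise ValueError("the grading exponent r must satisfy r >= 1")
    return [b * (j / N) ** r for j in range(N + 1)]

def step_sizes(grid):
    # Lengths t_j - t_{j-1} of the subintervals, j = 1, ..., N.
    return [grid[j] - grid[j - 1] for j in range(1, len(grid))]

uniform = graded_grid(1.0, 8, 1.0)   # r = 1: uniform grid
graded = graded_grid(1.0, 8, 3.0)    # r = 3: points cluster near t = 0
print(step_sizes(uniform)[:3])
print(step_sizes(graded)[:3])
```

For $r = 1$ all steps equal $b/N$; for $r > 1$ the first step is $b\,N^{-r}$, much smaller than the last one.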
For given integers $m \ge 0$ and $-1 \le d \le m - 1$, let $S_m^{(d)}(\Pi_N)$ be the spline
space of piecewise polynomial functions on the grid $\Pi_N$:

$$S_m^{(d)}(\Pi_N) = \{ u : u|_{\sigma_j} \in \pi_m,\ j = 1, \dots, N;\ (u|_{\sigma_j})^{(k)}(t_j) = (u|_{\sigma_{j+1}})^{(k)}(t_j),\ 0 \le k \le d;\ j = 1, \dots, N-1 \},$$

where $\pi_m$ denotes the set of polynomials of degree not exceeding $m$ and $u|_{\sigma_j}$
is the restriction of $u$ to the subinterval $\sigma_j$, $j = 1, \dots, N$. Note that elements
of $S_m^{(-1)}(\Pi_N) = \{u : u|_{\sigma_j} \in \pi_m,\ j = 1, \dots, N\}$ may have jump discontinuities
at the interior points $t_1, \dots, t_{N-1}$ of the grid $\Pi_N$.
In every subinterval $\sigma_j$ ($j = 1, \dots, N$) we introduce $m \in N$ interpolation
points

$$t_{jl} = t_{j-1} + \eta_l h_j, \quad l = 1, \dots, m \ (j = 1, \dots, N), \qquad (13)$$

where $\eta_1, \dots, \eta_m$ do not depend on $j$ and $N$ and satisfy

$$0 \le \eta_1 < \dots < \eta_m \le 1. \qquad (14)$$
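A minimal sketch of how the interpolation points can be laid out on the graded grid follows. It assumes the usual affine placement $t_{jl} = t_{j-1} + \eta_l (t_j - t_{j-1})$ of the parameters (14) in each subinterval; the function name `collocation_points` and the sample parameter values are illustrative only.

```python
def collocation_points(grid, eta):
    # Map the parameters 0 <= eta_1 < ... < eta_m <= 1 affinely into every
    # subinterval [t_{j-1}, t_j]; returns one list of m points per subinterval.
    if any(e < 0.0 or e > 1.0 for e in eta):
        raise ValueError("parameters must lie in [0, 1]")
    if any(a >= b for a, b in zip(eta, eta[1:])):
        raise ValueError("parameters must be strictly increasing")
    points = []
    for j in range(1, len(grid)):
        h_j = grid[j] - grid[j - 1]
        points.append([grid[j - 1] + e * h_j for e in eta])
    return points

b, N, r = 1.0, 4, 2.0
grid = [b * (j / N) ** r for j in range(N + 1)]   # graded grid (11)
interior = collocation_points(grid, [0.25, 0.75])  # eta_1 > 0, eta_m < 1
closed = collocation_points(grid, [0.0, 1.0])      # eta_1 = 0, eta_m = 1
print(interior[0], closed[0])
```

With $\eta_1 = 0$ and $\eta_m = 1$ neighbouring subintervals share a point at each interior grid node, which is what makes the resulting piecewise interpolant continuous.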
To a given continuous function $z : [0,b] \to R$ we assign a piecewise
polynomial interpolation function $P_N z \in S_{m-1}^{(-1)}(\Pi_N)$ which interpolates $z$ at the
points (13): $(P_N z)(t_{jl}) = z(t_{jl})$, $l = 1, \dots, m$; $j = 1, \dots, N$. Thus, $P_N z$ is
independently defined in every subinterval $\sigma_j$ ($j = 1, \dots, N$) and $(P_N z)(t)$
may be discontinuous at $t = t_j$, $j = 1, \dots, N - 1$; we may treat $P_N z$ as a
two-valued function at these points. Note that in case $\eta_1 = 0$, $\eta_m = 1$ (see (14)),
$P_N z \in C[0,b]$. We also introduce an interpolation operator $P_N$ which assigns
to every function $z \in C[0,b]$ its piecewise polynomial interpolation function
$P_N z$.
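Lemma 2. [7, 8] Let $z \in C^{m,\nu}[0,b]$, $m \in N$, $\nu \in R$, $\nu < 1$. Then

$$\sup_{t \in [0,b]} |z(t) - (P_N z)(t)| \le c\,\varepsilon_N^{(m,\nu,r)},$$

where $c$ is a constant not depending on $N$ and

$$\varepsilon_N^{(m,\nu,r)} = \begin{cases} N^{-m} & \text{for } m < 1 - \nu,\ r \ge 1; \\ N^{-m}(1 + \log N) & \text{for } m = 1 - \nu,\ r = 1; \\ N^{-m} & \text{for } m = 1 - \nu,\ r > 1; \\ N^{-r(1-\nu)} & \text{for } m > 1 - \nu,\ 1 \le r < m/(1-\nu); \\ N^{-m} & \text{for } m > 1 - \nu,\ r \ge m/(1-\nu). \end{cases} \qquad (15)$$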
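The case distinction in (15) can be written down directly as a small function, which makes it easy to see how the grading exponent $r$ influences the predicted rate. The sketch below only evaluates the formula; the names `eps_N` and `grading_threshold` are illustrative, and the borderline case $m = 1 - \nu$ is detected by an exact floating-point comparison, which is an assumption suitable only for exactly representable inputs.

```python
import math

def eps_N(N, m, nu, r):
    # Evaluate the rate function (15) for given N, m, nu < 1 and r >= 1.
    if nu >= 1 or r < 1:
        raise ValueError("need nu < 1 and r >= 1")
    threshold = 1.0 - nu
    if m < threshold:
        return N ** (-m)
    if m == threshold:  # exact comparison; fine for representable inputs
        return N ** (-m) * (1.0 + math.log(N)) if r == 1 else N ** (-m)
    if r < m / threshold:
        return N ** (-r * threshold)
    return N ** (-m)

def grading_threshold(m, nu):
    # For m > 1 - nu: smallest r for which (15) gives the full order N^{-m}.
    return m / (1.0 - nu)

# Example: m = 2, nu = 0.5. A uniform grid gives N^{-1/2}; r >= 4 gives N^{-2}.
print(eps_N(100, 2, 0.5, 1.0), eps_N(100, 2, 0.5, 4.0), grading_threshold(2, 0.5))
```

For $\nu$ close to 1 the threshold $m/(1-\nu)$ becomes large, which is why strongly graded grids are needed to reach the order $N^{-m}$ in this estimate.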
Lemma 3. Let $T : L^\infty(0,b) \to C[0,b]$ be a linear compact operator. Then
$\|T - P_N T\|_{\mathcal{L}(L^\infty(0,b), L^\infty(0,b))} \to 0$ as $N \to \infty$.
Proof. An easy observation shows that

$$\|P_N\|_{\mathcal{L}(C[0,b], L^\infty(0,b))} \le c, \quad N \in N, \qquad (16)$$

where $c$ is a constant not depending on $N$. On the basis of (16) and Lemma 2 we
obtain that $\|z - P_N z\|_{L^\infty(0,b)} \to 0$ as $N \to \infty$ for every $z \in C[0,b]$. Together
with the compactness of $T : L^\infty(0,b) \to C[0,b]$ this yields the assertion of
the Lemma.

4 Collocation method
We look for an approximation $v$ to the solution $z$ of equation (6) in the space
$S_{m-1}^{(-1)}(\Pi_N)$, determining $v = v^{(N,m,r)} \in S_{m-1}^{(-1)}(\Pi_N)$, $m \ge 1$, from the following
conditions:

$$v(t_{jl}) = f(t_{jl}) + \int_0^{t_{jl}} K_1(t_{jl},s) \int_0^s v(\tau)\,d\tau\,ds + \int_0^{t_{jl}} [p(t_{jl}) + K_2(t_{jl},s)]\,v(s)\,ds, \quad l = 1, \dots, m;\ j = 1, \dots, N, \qquad (17)$$

with $\{t_{jl}\}$ given by (13). Having determined the approximation $v$ for $z$, we can
also determine the approximation $u$ for $y$, the solution of the initial value
problem {(1),(2)}, setting (see (5))

$$u(t) = y_0 + \int_0^t v(s)\,ds, \quad t \in [0,b]. \qquad (18)$$

Note that the choice of the collocation points (13) with $\eta_1 = 0$, $\eta_m = 1$ in (14)
actually implies that the resulting collocation approximation $v$ belongs to the
smoother polynomial spline space $S_{m-1}^{(0)}(\Pi_N)$.
Theorem 2. Let $y_0 \in R$, $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p, q \in C^{m,\nu}[0,b]$, $m \in N$,
$\nu \in R$, $\nu < 1$, and assume that the collocation points (13), with the grid
points (11) and parameters (14), are used.
Then, for all sufficiently large $N$, say $N \ge N_0$, and for every choice of
parameters (14) with $\eta_1 > 0$ or $\eta_m < 1$, the equalities (18) and (17) determine
unique approximations $u \in S_m^{(0)}(\Pi_N)$ and $v \in S_{m-1}^{(-1)}(\Pi_N)$ (with $v|_{\sigma_j} = (u|_{\sigma_j})'$,
$j = 1, \dots, N$) to the solution $y$ of problem {(1), (2)} and its derivative $y'$,
respectively. If $\eta_1 = 0$, $\eta_m = 1$, then $u \in S_m^{(1)}(\Pi_N)$ and $v = u' \in S_{m-1}^{(0)}(\Pi_N)$.
For all $N \ge N_0$ the following error estimates hold:

$$\|e^{(i)}\|_\infty \le c\,\varepsilon_N^{(m,\nu,r)}, \quad i \in \{0, 1\}. \qquad (19)$$

Here $c$ is a constant not depending on $N$, $\varepsilon_N^{(m,\nu,r)}$ is given by (15) and

$$\|e^{(i)}\|_\infty = \max_{j=1,\dots,N} \left( \max_{t \in \sigma_j} |u_j^{(i)}(t) - y^{(i)}(t)| \right), \quad u_j = u|_{\sigma_j}, \quad i \in \{0, 1\}. \qquad (20)$$

Proof. As we know from Section 2, problem {(1), (2)} is equivalent to the
integral equation (7) where $z = y'$ and the forcing function $f$ and the kernel $K$
are given by (8) and (9), respectively. We rewrite (7) in the form $z = Tz + f$,
with $T$ defined by (10). We find that $f \in C^{m,\nu}[0,b] \subset L^\infty(0,b)$. It follows
from Lemma 1 that $T$ is compact as an operator from $L^\infty(0,b)$ to $L^\infty(0,b)$.
Therefore, $z = Tz + f$ has a unique solution $z \in L^\infty(0,b)$. Moreover, on the
basis of Theorem 1 we obtain that $z \in C^{m,\nu}[0,b]$.
Further, conditions (17) are equivalent to the operator equation
$v = P_N T v + P_N f$, with $P_N$ defined in Section 3. From Lemma 3 and
from the boundedness of $(I - T)^{-1}$ in $L^\infty(0,b)$ we obtain that $I - P_N T$ is
invertible in $L^\infty(0,b)$ for sufficiently large $N$, say $N \ge N_0$. Moreover, the norms
of $(I - P_N T)^{-1}$ are uniformly bounded in $N$:

$$\|(I - P_N T)^{-1}\|_{\mathcal{L}(L^\infty(0,b), L^\infty(0,b))} \le c, \quad N \ge N_0. \qquad (21)$$

Thus, for $N \ge N_0$, equation $v = P_N T v + P_N f$ provides a unique solution
$v \in S_{m-1}^{(-1)}(\Pi_N)$ ($v \in S_{m-1}^{(0)}(\Pi_N)$ if $\eta_1 = 0$, $\eta_m = 1$). For $v$ and $z$, the solutions
of equations $v = P_N T v + P_N f$ and $z = Tz + f$ respectively, we have

$$v - z = (I - P_N T)^{-1}(P_N z - z). \qquad (22)$$

Now (21) yields $\|v - z\|_{L^\infty(0,b)} \le c\,\|P_N z - z\|_{L^\infty(0,b)}$, $N \ge N_0$, with a constant
$c$ which is independent of $N$. Applying Lemma 2 we obtain the estimate (19)
with $i = 1$.
Further, due to (18) and (2), $y(t) - u(t) = \int_0^t [y'(s) - v(s)]\,ds$, $t \in [0,b]$.
Applying (19) with $i = 1$ we obtain the estimate (19) with $i = 0$. Theorem 2
is proved.
Thus, according to Theorem 2, in case $m > 1 - \nu$, the approximation order
$\|y - u\|_\infty \le c N^{-m}$ is guaranteed for $r \ge m/(1 - \nu)$. For $\nu$ close to 1, $\nu < 1$, this
condition on $r$ may be too restrictive. To obtain the order $\|y - u\|_\infty \le c N^{-m}$,
the condition on $r$ can be considerably relaxed, as shown in the following

Theorem 3. Let $y_0 \in R$, $K_1 \in W^{m,\nu}(\Delta_b)$, $K_2 \in W^{m,\nu-1}(\Delta_b)$, $p, q \in C^{m,\nu}[0,b]$,
$m \in N$, $\nu \in R$, $\nu < 1$, $m > 1 - \nu$, and assume that the collocation
points (13) with the grid points (11) and parameters (14) are used. Then, with
the notation of Theorem 2, we have the following estimates for the error $y - u$:
1) if $1 - \nu < m < 2 - \nu$ then $\|y - u\|_\infty \le c N^{-m}$ for $r \ge 1$;
2) if $m = 2 - \nu$ then

$$\|y - u\|_\infty \le c \begin{cases} N^{-m}(1 + \log N) & \text{for } r = 1, \\ N^{-m} & \text{for } r > 1; \end{cases}$$

3) if $m > 2 - \nu$ then

$$\|y - u\|_\infty \le c \begin{cases} N^{-r(2-\nu)} & \text{for } 1 \le r < m/(2-\nu), \\ N^{-m}(1 + \log N) & \text{for } r = m/(2-\nu), \\ N^{-m} & \text{for } r > m/(2-\nu). \end{cases}$$
Proof. Using the equality $(I - P_N T)^{-1} = I + (I - P_N T)^{-1} P_N T$, we rewrite
the error (22) in the form $v - z = P_N z - z + (I - P_N T)^{-1} P_N T (P_N z - z)$,
$N \ge N_0$. Due to the continuity and boundedness of $K(t,s)$ on $\Delta_b$, $T$ is bounded
as an operator from $L^1(0,b)$ to $C[0,b]$ (see (9) and (10)). Together with (5),
(16) and (21) we obtain that

$$|y(t) - u(t)| = \left| \int_0^t [z(s) - v(s)]\,ds \right| \le c \int_0^b |(P_N z)(s) - z(s)|\,ds, \quad 0 \le t \le b.$$

Since $z \in C^{m,\nu}[0,b]$, $m > 1 - \nu$, we have (see [12], p. 116, [7, 8])

$$\max_{t \in \sigma_j} |(P_N z)(t) - z(t)| \le c\,(t_j - t_{j-1})^{m}\,t_j^{1-\nu-m}, \quad j = 1, \dots, N,$$

with $\{t_j\}$ given by (11). It follows from (11) that

$$(t_j - t_{j-1})^{m+1}\,t_j^{1-\nu-m} \le c\,N^{-r(2-\nu)}\,j^{r(2-\nu)-m-1}, \quad j = 1, \dots, N.$$

Therefore, for $t \in [0,b]$, we have

$$|y(t) - u(t)| \le c \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} |(P_N z)(s) - z(s)|\,ds \le c_1 N^{-r(2-\nu)} \sum_{j=1}^{N} j^{r(2-\nu)-m-1}, \qquad (23)$$

with $c$ and $c_1$ not depending on $N$. Furthermore, for a number $\alpha \in R$ we have

$$\sum_{j=1}^{N} j^{\alpha} \le c \begin{cases} N^{\alpha+1} & \text{if } \alpha > -1, \\ 1 + |\log N| & \text{if } \alpha = -1, \\ 1 & \text{if } \alpha < -1, \end{cases} \qquad (24)$$

where $c$ is a constant which does not depend on $N$. Applying (24) with
$\alpha = r(2 - \nu) - m - 1$ to (23) it is easy to see that the statements of Theorem 3
hold.
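The last step of the proof can be checked numerically. The sketch below evaluates the quantity $N^{-r(2-\nu)} \sum_{j=1}^{N} j^{r(2-\nu)-m-1}$ (the right-hand side of (23) without its constant) for $N$ and $2N$ and reports the observed decay order; the function names and the sample values $m = 3$, $\nu = 0.5$ are illustrative choices, not taken from the paper.

```python
import math

def error_bound_sum(N, m, nu, r):
    # N^{-r(2-nu)} * sum_{j=1}^N j^{r(2-nu)-m-1}: right-hand side of (23)
    # up to the constant factor.
    alpha = r * (2.0 - nu) - m - 1.0
    return N ** (-r * (2.0 - nu)) * sum(j ** alpha for j in range(1, N + 1))

def observed_order(m, nu, r, N=2000):
    # Decay order estimated from the values at N and 2N.
    return math.log2(error_bound_sum(N, m, nu, r) / error_bound_sum(2 * N, m, nu, r))

m, nu = 3, 0.5                      # here m / (2 - nu) = 2
print(observed_order(m, nu, 1.0))   # r < 2: order close to r(2 - nu) = 1.5
print(observed_order(m, nu, 3.0))   # r > 2: order close to m = 3
```

In this example the full order $m$ is reached for $r > m/(2-\nu) = 2$, whereas the bound of Theorem 2 would require $r \ge m/(1-\nu) = 6$.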

Remarks. 1) The equalities (17) form a system of algebraic equations
whose exact form is determined by the choice of a basis in $S_{m-1}^{(-1)}(\Pi_N)$ (or in
$S_{m-1}^{(0)}(\Pi_N)$ if $\eta_1 = 0$, $\eta_m = 1$). For instance, in each subinterval $[t_{j-1}, t_j]$
($j = 1, \dots, N$) we may use the representation
$v(t_{j-1} + \tau h_j) = \sum_{l=1}^{m} c_{jl}\,L_l^{(m-1)}(\tau)$, $\tau \in [0,1]$, where $L_l^{(m-1)}(\tau)$ denotes the
$l$th Lagrange fundamental polynomial of degree $m - 1$ associated with the
parameters (14), that is
$L_l^{(m-1)}(\tau) = \prod_{i=1,\, i \ne l}^{m} (\tau - \eta_i)/(\eta_l - \eta_i)$, $\tau \in [0,1]$. The conditions (17) then
lead to a linear system of equations for the coefficients $c_{jl} = c_{jl}^{(N)}$,
$l = 1, \dots, m$; $j = 1, \dots, N$.
2) Method {(17),(18)}, where we have discretized the integral equation (6),
is equivalent to the collocation method applied directly to problem {(1),(2)}.
In the latter form the collocation method in a more particular case ($K_2 = 0$,
$K_1 \in W^{m,\nu}(\Delta_b)$, $0 < \nu < 1$, $p, q \in C^m[0,b]$, $m \in N$) has been examined in
[3, 4, 5, 6].
3) The convergence results established by Theorems 2 and 3 are derived
under the assumptions that all needed integrals in (17) can be evaluated ana-
lytically. Since this is rarely possible in concrete applications, there arises the
question how to approximate these integrals so that the resulting fully dis-
cretized collocation method converges under the same conditions and with the
same rate as it is proved for the "exact" collocation method in Theorems 2
and 3. This question will be discussed elsewhere.
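The local Lagrange representation mentioned in Remark 1 can be sketched as follows. The code is a minimal illustration with hypothetical names (`lagrange_basis`, `evaluate_piece`); indices are 0-based in the code, so `l = 0` corresponds to the first parameter $\eta_1$.

```python
def lagrange_basis(eta, l, tau):
    # l-th Lagrange fundamental polynomial (0-based index) of degree m - 1
    # associated with the parameters eta, evaluated at tau in [0, 1].
    value = 1.0
    for i, eta_i in enumerate(eta):
        if i != l:
            value *= (tau - eta_i) / (eta[l] - eta_i)
    return value

def evaluate_piece(coeffs, eta, tau):
    # Value of the local polynomial sum_l c_l * L_l(tau) on one subinterval,
    # where tau is the local coordinate (t - t_{j-1}) / h_j.
    return sum(c * lagrange_basis(eta, l, tau) for l, c in enumerate(coeffs))

eta = [0.1, 0.5, 0.9]        # illustrative parameters satisfying (14), m = 3
coeffs = [2.0, -1.0, 4.0]    # illustrative coefficients on one subinterval
print([evaluate_piece(coeffs, eta, e) for e in eta])
```

Because each fundamental polynomial equals one at its own parameter and zero at the others, the coefficients are exactly the values of the local polynomial at the collocation points, which is what makes this basis convenient for setting up the linear system.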

References
1. Brunner H. (1982): A survey of recent advances in the numerical treatment of
Volterra integral and integro-differential equations. J. Comput. Appl. Math.,
213-229.
2. Baker C. T. H. (2000): A perspective on the numerical treatment of Volterra
equations. J. Comput. Appl. Math., 125, 217-249.
3. Brunner H., van der Houwen P. J. (1986): The Numerical Solution of Volterra
Equations, CWI Monographs 3, North-Holland, Amsterdam.
4. Makroglou A. (1981): A block-by-block method for Volterra integro-differential
equations with weakly-singular kernels. Math. Comp., 37, 95-99.
5. Brunner H. (1986): Polynomial spline collocation methods for Volterra
integro-differential equations with weakly singular kernels. IMA J. Numer. Anal., 6,
221-239.
6. Tang T. (1993): A note on collocation methods for Volterra integro-differential
equations with weakly singular kernels. IMA J. Numer. Anal., 13, 93-99.
7. Brunner H., Pedas A., Vainikko G. (2001): Piecewise polynomial collocation
methods for linear Volterra integro-differential equations with weakly singular
kernels. SIAM J. Numer. Anal., 39, 957-982.
8. Brunner H., Pedas A., Vainikko G. (2001): Spline collocation method for linear
Volterra integro-differential equations with weakly singular kernels. BIT, 41, 5,
891-900.
9. Parts I., Pedas A. (2003): Spline collocation methods for weakly singular Volterra
integro-differential equations. In: Brezzi F., Buffa A., Corsaro S., Murli A. (eds)
Numerical Mathematics and Advanced Applications, Enumath 2001,
Springer-Verlag Italia, Milano, 919-928.
10. McKee S., Stokes A. (1983): Product integration methods for the nonlinear
Basset equation. SIAM J. Numer. Anal., 20, 1, 143-160.
11. Brunner H., Tang T. (1989): Polynomial spline collocation methods for the
nonlinear Basset equation. Computers Math. Applic., 18, 5, 449-457.
12. Vainikko G. (1993): Multidimensional Weakly Singular Integral Equations.
Lecture Notes in Mathematics, 1549, Springer-Verlag, Berlin, Heidelberg, New York.
On a Discontinuous Galerkin Method for
Radiation-Diffusion Problems

Ilaria Perugia 1, Dominik Schötzau 2 and James Warsa 3

1 Dipartimento di Matematica, Università di Pavia, Via Ferrata 1, 27100 Pavia,
Italy, email: [email protected].
2 Mathematics Department, University of British Columbia, Vancouver, BC V6T
1Z2, Canada, email: [email protected].
3 Transport Methods Group, Los Alamos National Laboratory, Los Alamos, NM
87545, USA, email: [email protected]. Supported by the U. S. Department of
Energy.

Summary. In this paper, we show that the discontinuous finite element method
recently developed by Warsa, Wareing and Morel for radiation-diffusion problems
belongs to a class of generalized local discontinuous Galerkin methods. We then
derive a priori error bounds for this method and numerically confirm them to be
sharp.

1 Introduction

In the recent work [7] and [6], Warsa, Wareing and Morel introduced a
discontinuous finite element method for the discretization of a radiation-diffusion
problem that is represented by a system of two coupled first order equations
for the zeroth and first angular moments of the particle distribution (the scalar
flux and current, respectively). These so-called $P_1$ equations arise out of an
angular Galerkin approximation to the Boltzmann transport equation based
upon a spherical-harmonic trial space of first order; see [3] or [4] for more
details. In the absence of time dependence (the case considered here) the
resulting problem finds the current $J = J(x)$ and the scalar flux $\Phi = \Phi(x)$
satisfying

$$\tfrac{1}{3}\nabla\Phi + \sigma_t(x)\,J = Q_1, \qquad \nabla\cdot J + \sigma_a(x)\,\Phi = Q_0, \qquad (1)$$

subject to so-called vacuum and reflecting boundary conditions, respectively,

$$\tfrac{1}{4}\Phi - \tfrac{1}{2}\,J\cdot n = 0 \ \text{ on } \Gamma_V, \qquad J\cdot n = 0 \ \text{ on } \Gamma_R. \qquad (2)$$

Here, $\Omega$ is a bounded polygonal ($d = 2$) or polyhedral ($d = 3$) domain with outward
normal unit vector $n$ on the boundary $\Gamma = \partial\Omega$, which is partitioned into
two parts $\Gamma = \Gamma_V \cup \Gamma_R$ with disjoint interiors. The right-hand sides $Q_0 \in L^2(\Omega)$
and $Q_1 \in L^2(\Omega)^d$ are the zeroth and first angular moments of an inhomogeneous
source. We assume that the material coefficients $\sigma_t$ and $\sigma_a$ belong to
$L^\infty(\Omega)$ and satisfy $\sigma_t(x) \ge \sigma_* > 0$ and $\sigma_a(x) \ge 0$ in $\Omega$ ($\sigma_a = 0$ in purely
scattering subregions). For simplicity, we further assume that $\int_{\Gamma_V} ds > 0$.
The method of Warsa, Wareing and Morel approximates both the unknowns
$J$ and $\Phi$ by piecewise linear functions. It is designed in such a way that
the radiation energy as well as the radiation momentum are conserved over
each cell, as in standard, upwind Godunov schemes. In combination with efficient
preconditioning techniques, the results in [5-7] indicate that the method
of Warsa, Wareing and Morel can be applied to a wide range of problems.
In this note, we show that the discrete formulation of the $P_1$ equations
of Warsa, Wareing and Morel belongs to the general class of mixed discontinuous
Galerkin (DG) methods analyzed by Castillo, Cockburn, Perugia and
Schötzau in [1]. This class extends and generalizes the local discontinuous
Galerkin (LDG) method proposed by Cockburn and Shu [2]. In the original
LDG approach the vector unknown can be eliminated from the equations in
a local and element-wise manner. In contrast, the method of Warsa, Wareing
and Morel belongs to the "truly" mixed variants of the LDG method described
in [1], for which such a local elimination is no longer possible. While this can be
seen as a shortcoming of truly mixed DG methods, it in fact leads to better and
nearly optimal convergence rates for the approximation of the vector variable;
see [1]. Furthermore, as described in [6], the $P_1$ equations are used to accelerate
the iterative convergence of the Boltzmann transport equation solution.
The transport equation is discretized with a spatial DG method, and in many
cases effective acceleration strongly depends on how well the vector unknown
of the acceleration equations is approximated. This application is what makes
the use of the truly mixed LDG discretization for the $P_1$ equations necessary
and motivates the study presented in this paper.
We apply the theoretical results in [1] and conclude that the method of
Warsa, Wareing and Morel is well-posed. Moreover, the following a priori error
bounds hold: for an approximation order $k \ge 0$, the method exhibits convergence
rates in the mesh size of order $k + \frac12$ in a suitable energy norm, and
of order $k + 1$ in the $L^2$-norm of the scalar flux. We present a set of numerical
convergence tests for a three-dimensional model problem on tetrahedral
meshes that verify the theoretical predictions. We point out that these
tests complement those in [1], where no numerical results were shown for truly
mixed DG methods.

2 Discontinuous Galerkin discretization


In this section, we detail the mixed discontinuous Galerkin discretization proposed
by Warsa, Wareing and Morel [6,7] and cast the method in the setting
of [1].
We consider shape regular meshes $\mathcal{T}_h$ of mesh size $h$ that partition the
domain $\Omega$ into triangles and/or parallelograms. We allow for irregular nodes,
in general, but assume that the local mesh sizes are of bounded variation. Using
the same notation as in [1], we let $\mathcal{E}_I$ be the union of all interior faces of $\mathcal{T}_h$,
$\mathcal{E}_V$ and $\mathcal{E}_R$ the union of all boundary faces of $\mathcal{T}_h$ on $\Gamma_V$ and $\Gamma_R$, respectively,
and set $\mathcal{E} = \mathcal{E}_I \cup \mathcal{E}_V \cup \mathcal{E}_R$. For piecewise smooth vector- and scalar-valued
functions $w$ and $u$, we introduce the following trace operators. Let $e \subset \mathcal{E}_I$ be
an interior face shared by two elements $K^+$ and $K^-$, and write $n^\pm$ for the
outward normal unit vectors to the boundaries $\partial K^\pm$, respectively. Denoting
by $w^\pm$ and $u^\pm$ the traces on $\partial K^\pm$ taken from $K^\pm$, respectively, we define the
jumps across $e$ by $[\![w]\!] = w^+ \cdot n^+ + w^- \cdot n^-$ and $[\![u]\!] = u^+ n^+ + u^- n^-$, and the
averages $\{\!\{w\}\!\} = (w^+ + w^-)/2$ and $\{\!\{u\}\!\} = (u^+ + u^-)/2$. On a boundary face
$e \subset \mathcal{E}_V \cup \mathcal{E}_R$, we set $[\![w]\!] = w \cdot n$, $[\![u]\!] = u\,n$, $\{\!\{w\}\!\} = w$ and $\{\!\{u\}\!\} = u$.
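The trace operators on an interior face can be illustrated with a minimal sketch. The function names and the list-based vector representation below are illustrative assumptions, not part of the method's implementation; the sketch only evaluates the jump and average definitions at a single point of a face, using that the two outward normals are opposite.

```python
def dot(a, b):
    """Euclidean inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def jump_vector(w_plus, w_minus, n_plus):
    """Jump of a vector field: w+ . n+ + w- . n-, a scalar (n- = -n+)."""
    n_minus = [-c for c in n_plus]
    return dot(w_plus, n_plus) + dot(w_minus, n_minus)

def jump_scalar(u_plus, u_minus, n_plus):
    """Jump of a scalar field: u+ n+ + u- n-, a vector (n- = -n+)."""
    return [(u_plus - u_minus) * c for c in n_plus]

def average_vector(w_plus, w_minus):
    """Average of a vector field: (w+ + w-) / 2."""
    return [(a + b) / 2.0 for a, b in zip(w_plus, w_minus)]

def average_scalar(u_plus, u_minus):
    """Average of a scalar field: (u+ + u-) / 2."""
    return (u_plus + u_minus) / 2.0
```

Both jumps are unchanged when the roles of $K^+$ and $K^-$ are swapped, which is why they are well defined on a face without fixing an orientation.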
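Let $\mathcal{T}_h$ be a triangulation of $\Omega$ and $k \ge 0$ an approximation order. We
wish to approximate $(J, \Phi)$ by a piecewise polynomial function $(J_h, \Phi_h)$; that
is, $(J_h, \Phi_h)|_K \in P_k(K)^d \times P_k(K)$ for all $K \in \mathcal{T}_h$. Here, $P_k(K)$ denotes the
set of polynomials of degree at most $k$ on $K$. This approximation is defined
by imposing that, for all elements $K \in \mathcal{T}_h$ and all test functions $(w, u) \in
P_k(K)^d \times P_k(K)$,
\[
\begin{aligned}
3\int_K \sigma_t J_h \cdot w \, dx - \int_K \Phi_h \nabla \cdot w \, dx + \int_{\partial K} \hat\Phi_h \, w \cdot n_K \, ds &= 3 \int_K Q_1 \cdot w \, dx, \\
-\int_K J_h \cdot \nabla u \, dx + \int_{\partial K} u \, \hat J_{h,K} \, ds + \int_K \sigma_a \Phi_h u \, dx &= \int_K Q_0 u \, dx.
\end{aligned}
\qquad (3)
\]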

Here, $\hat J_{h,K}$ and $\hat\Phi_h$ are the so-called numerical fluxes, which are approximations
to the traces of $J \cdot n_K$ and $\Phi$ on the element interfaces and are chosen as follows
(see [6]).
First, for an element $K^+$ and an interior face $e$ shared by $K^+$ and a neighboring
element $K^-$, we define the following inwardly and outwardly directed
discrete flows (partial currents) by
\[
j^{\rm in}_{e,K^+} = \tfrac14 \Phi_h^- - \tfrac12 J_h^- \cdot n_{K^+}, \qquad
j^{\rm out}_{e,K^+} = \tfrac14 \Phi_h^+ + \tfrac12 J_h^+ \cdot n_{K^+}.
\]
If the face $e$ of $K^+$ is contained in $\mathcal{E}_V \cup \mathcal{E}_R$, the outwardly directed flow $j^{\rm out}_{e,K^+}$
is defined as $j^{\rm out}_{e,K^+} = \tfrac14 \Phi_h^+ + \tfrac12 J_h^+ \cdot n_{K^+}$. Further, if the face $e$ of $K^+$ belongs
to $\mathcal{E}_V$, we set $j^{\rm in}_{e,K^+} = 0$, whereas $j^{\rm in}_{e,K^+}$ is not needed for $e \subset \mathcal{E}_R$.
The numerical fluxes are then taken as

with ~ = 0, if e C Ey U Ev, and ~ = 1, if e C ER.


This completes the definition of the DG method proposed in [6,7] for problem
(1)-(2) (where only the case $k = 1$ was considered). Notice that, for $e$
shared by $K^+$ and $K^-$, $\hat J_{h,K^+}|_e = -\hat J_{h,K^-}|_e$, whereas the definition of $\hat\Phi_h|_e$ does
not depend on which side of $e$ it is taken from. This is the reason for the
subscript $K$ in $\hat J_{h,K}$. The choice in (4) of the numerical fluxes can be motivated
by physical arguments corresponding to the so-called Marshak approximation;
obviously the choice is not unique, but consistency with the intended application,
in which the solution of the $P_1$ equations is embedded, requires that we make
this choice; see [6] for more details. Note that the fluxes can further be easily
adapted to take into account inhomogeneous boundary data.
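We now cast the fluxes in (4) in the setting of [1]. Denote by $\hat J_h$ a vector
field such that $\hat J_h|_e \cdot n_K = \hat J_{h,K}$ for all $e \subset \partial K$ (the definition of $\hat J_h|_e$ no
longer depends on which side it is taken from). Then we have
\[
\begin{aligned}
\hat J_h|_e &= \{\!\{J_h\}\!\} + \tfrac14 [\![\Phi_h]\!], & \hat\Phi_h|_e &= \{\!\{\Phi_h\}\!\} + [\![J_h]\!], && \text{if } e \subset \mathcal{E}_I, \\
\hat J_h|_e &= \tfrac12 J_h + \tfrac14 \Phi_h n, & \hat\Phi_h|_e &= \tfrac12 \Phi_h + J_h \cdot n, && \text{if } e \subset \mathcal{E}_V, \\
\hat J_h|_e &= 0, & \hat\Phi_h|_e &= \Phi_h + 2 J_h \cdot n, && \text{if } e \subset \mathcal{E}_R.
\end{aligned}
\]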


The form of these fluxes shows that the method in (3) belongs to the class of
mixed DG methods investigated in [1] and the theoretical results there can be
used to analyze it. In particular the formulation (3) is consistent and uniquely
solvable; see [1, Proposition 2.1].
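As a short consistency check (a sketch added for illustration, not a restatement of the flux definition (4) or of the derivation in [6]), the interior-face fluxes can be related to the partial currents $j^{\rm in}_{e,K^+} = \tfrac14\Phi_h^- - \tfrac12 J_h^-\cdot n_{K^+}$ and $j^{\rm out}_{e,K^+} = \tfrac14\Phi_h^+ + \tfrac12 J_h^+\cdot n_{K^+}$. Identifying the fluxes with the two combinations below is an assumption of this check; the algebra itself uses only $n^- = -n^+$ and the definitions of jumps and averages.

```latex
\begin{aligned}
j^{\rm out}_{e,K^+} - j^{\rm in}_{e,K^+}
  &= \tfrac14(\Phi_h^+ - \Phi_h^-) + \tfrac12 (J_h^+ + J_h^-)\cdot n_{K^+}
   = \bigl(\{\!\{J_h\}\!\} + \tfrac14 [\![\Phi_h]\!]\bigr)\cdot n_{K^+}, \\
2\bigl(j^{\rm out}_{e,K^+} + j^{\rm in}_{e,K^+}\bigr)
  &= \tfrac12(\Phi_h^+ + \Phi_h^-) + (J_h^+ - J_h^-)\cdot n_{K^+}
   = \{\!\{\Phi_h\}\!\} + [\![J_h]\!].
\end{aligned}
```

On a vacuum face, setting $j^{\rm in}_{e,K^+} = 0$ in the same two combinations gives $(\tfrac12 J_h + \tfrac14\Phi_h n)\cdot n$ and $\tfrac12\Phi_h + J_h\cdot n$, in agreement with the fluxes listed for $e \subset \mathcal{E}_V$.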

3 Error analysis

In this section, we discuss the a priori error bounds that are obtained for
the formulation (3). For a piecewise smooth function $(w, u)$, we define the
seminorm
\[
|(w,u)|_h^2 = 3\|\sigma_t^{1/2} w\|_{0,\Omega}^2 + \|[\![w]\!]\|_{0,\mathcal{E}_I \cup \mathcal{E}_V}^2 + 2\|[\![w]\!]\|_{0,\mathcal{E}_R}^2 + \|\sigma_a^{1/2} u\|_{0,\Omega}^2 + \tfrac14 \|[\![u]\!]\|_{0,\mathcal{E}_I \cup \mathcal{E}_V}^2 .
\]
Whenever $\sigma_a > 0$, $|(\cdot,\cdot)|_h$ actually defines a norm. Our main result is given
in Theorem 1, which follows directly from the analysis in [1, Theorem 2.2].
Strictly speaking, that analysis was carried out for $\sigma_a \equiv 0$ and simpler boundary
conditions, but the extension to the current situation poses no difficulties.

Theorem 1. Assume the exact solution $(J, \Phi)$ of (1)-(2) to belong to
$H^{s+1}(\Omega)^d \times H^{s+2}(\Omega)$, with $s \ge 0$. Let $(J_h, \Phi_h)$ be the DG approximation
obtained by (3), for an approximation degree $k \ge 0$. Then we have the error
bound

with a constant C > 0 independent of the mesh size h.


Furthermore, if the domain and the coefficients $\sigma_t$ and $\sigma_a$ are sufficiently regular,
we also have the $L^2$-bound

with a constant C > 0 independent of the mesh size h.


The order of convergence in the seminorm $|(\cdot,\cdot)|_h$ is
half a power of $h$ better than for the standard LDG method, due to the truly
mixed nature of the numerical fluxes; cf. [1]. Moreover, the above result also
holds true for $k = 0$, i.e., for piecewise constant approximations. For the LDG
method, no convergence has been observed either theoretically or numerically
in this case.

4 Numerical results

In this section, we present the results of a series of numerical experiments that
demonstrate the theoretical error estimates of Theorem 1.

4.1 Smooth solution

We start by testing the performance of the method for a smooth solution.
We consider the radiation-diffusion system (1) on $\Omega = (0,1)^3$, with reflecting
boundary conditions on the faces $\{x = 0\}$ and $\{x = 1\}$, and vacuum boundary
conditions on the remaining boundary faces. The material coefficients are $\sigma_t = 1$
and $\sigma_a = 10^{-4}$. The right-hand sides $Q_0$ and $Q_1$ are chosen so that the
exact solution $(J, \Phi)$ is a polynomial of order 7 and thus arbitrarily smooth.
The corresponding numerical solutions for $k = 0$ and $k = 1$ are computed on
a sequence of tetrahedral meshes $\{\mathcal{T}_i\}_{i=1}^5$ constructed by uniformly dividing
the domain into Cartesian grids with $2^i$, $i = 1, \ldots, 5$, equal intervals in each
dimension. Each cube in the grid is then subdivided into six tetrahedra of
equal volume. The mesh size $h_i$ on mesh $\mathcal{T}_i$ is therefore proportional to $2^{-i}$.
In Table 1 we report the errors and the numerical convergence rates $r_i$
in the $|\cdot|_h$-seminorm, in the $L^2$-norm of $\Phi$, as well as in the $L^2$-norm of
$J$. Clearly, we obtain convergence order $k + \frac12$ in the $|\cdot|_h$-seminorm, as well
as order $k + 1$ in the $L^2$-norm for $\Phi$. This confirms the theoretical results in
Theorem 1. Also, the $\|J - J_h\|_{0,\Omega}$ part of the $|\cdot|_h$-seminorm actually converges
more rapidly than the whole $|\cdot|_h$-seminorm, namely with the optimal rate
$k + 1$. The convergence behavior of the different jump contributions to the
$|\cdot|_h$-seminorm is shown in Figure 1. Here, we plot the different errors against
$n = 2^i$, which is proportional to $h_i^{-1}$. We use the abbreviations I, R and V
for $\mathcal{E}_I$, $\mathcal{E}_R$ and $\mathcal{E}_V$, respectively. Comparison with the line $y = n^{-k-1/2}$ clearly
shows that the jumps over interior faces converge with order $k + \frac12$. On the
other hand, comparison with the line $y = n^{-k-1}$ shows that all the boundary
jumps exhibit a better convergence of order $k + 1$. This indicates that
the errors in the interior jump contributions are the ones that render the first
estimate in Theorem 1 sharp. Finally, we note that similar behavior is observed
for solutions that are piecewise $H^2$-regular; for brevity, these numerical results have
been omitted.
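The numerical rates can be recomputed from the tabulated errors. The sketch below assumes (the text does not state the formula) that $r_i = \log_2(e_{i-1}/e_i)$, which is the natural choice when the mesh size is halved from one mesh to the next; the function name and the sample data layout are illustrative. The input values are the $k = 0$ seminorm errors from Table 1.

```python
import math

def convergence_rates(errors, refinement=2.0):
    """Observed rates r_i = log(e_{i-1} / e_i) / log(refinement)
    for errors on successively refined meshes."""
    return [math.log(prev / cur) / math.log(refinement)
            for prev, cur in zip(errors, errors[1:])]

# Seminorm errors for k = 0 on meshes i = 1, ..., 5 (Table 1).
seminorm_errors_k0 = [7.11e-2, 4.91e-2, 3.21e-2, 2.14e-2, 1.47e-2]
rates_k0 = convergence_rates(seminorm_errors_k0)
print([round(r, 2) for r in rates_k0])
```

Because the errors are printed with three significant digits, rates recomputed this way agree with the tabulated ones only up to rounding.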
Table 1. Smooth solution: errors and convergence rates.

             $|(J - J_h, \Phi - \Phi_h)|_h$    $\|\Phi - \Phi_h\|_{0,\Omega}$    $\|J - J_h\|_{0,\Omega}$
        i    error      r_i          error      r_i          error      r_i
k = 0   1    7.11e-2    -            1.45e-2    -            3.13e-2    -
        2    4.91e-2    5.34e-1      8.90e-3    7.00e-1      1.90e-2    7.18e-1
        3    3.21e-2    6.14e-1      4.71e-3    9.17e-1      1.02e-2    9.04e-1
        4    2.14e-2    5.82e-1      2.40e-3    9.73e-1      5.21e-3    9.67e-1
        5    1.47e-2    5.47e-1      1.21e-3    9.90e-1      2.63e-3    9.87e-1
k = 1   1    4.40e-2    -            7.22e-3    -            1.21e-2    -
        2    1.91e-2    1.20e+0      2.36e-3    1.61e+0      3.71e-3    1.71e+0
        3    7.41e-3    1.36e+0      6.33e-4    1.90e+0      9.78e-4    1.92e+0
        4    2.74e-3    1.43e+0      1.61e-4    1.97e+0      2.47e-4    1.99e+0
        5    9.91e-4    1.47e+0      4.06e-5    1.99e+0      6.16e-5    2.00e+0

4.2 A two-material problem

In practice the coefficient $\sigma_t$ may have strong discontinuities at the interfaces
between different materials. In these cases, where the vector-valued quantities
$\frac{1}{3\sigma_t}\nabla\Phi$ and $J$ are typically smoother than the scalar flux $\Phi$, the use of a truly
mixed method, such as the one considered in this paper, is of particular importance.
In this series of numerical experiments, we consider a problem with
material discontinuities. The radiation-diffusion equations (1) are solved with
$\sigma_a = 0$ and a discontinuous coefficient $\sigma_t$. We again set $\Omega = (0,1)^3$, and specify
vacuum boundary conditions on the faces $\{x = 0\}$ and $\{x = 1\}$, with reflecting
boundary conditions on the remaining boundary faces. We define $\sigma_t = a$ for
$0 \le x \le 0.5$ and $\sigma_t = b$ for $0.5 < x \le 1$, with positive parameters $a$ and $b$, modeling
a material discontinuity at $x = 0.5$. The right-hand sides $Q_1$ and $Q_0$ are
chosen so that the solution $(J, \Phi)$ is given as follows. Denoting by $\varphi(y, z)$ the
polynomial $\varphi(y, z) = y^2 z^2$, we take $J(x, y, z) = \bigl(\tfrac34 (a + x(b - a))\,\varphi(y, z), 0, 0\bigr)$
and
\[
\Phi(x,y,z) = \begin{cases} 3a(x - \tfrac12)\,\varphi(y,z) & \text{in } (0,0.5)\times(0,1)\times(0,1), \\ 3b(x - \tfrac12)\,\varphi(y,z) & \text{in } (0.5,1)\times(0,1)\times(0,1). \end{cases}
\]
Notice that $\Phi$ is piecewise smooth, but only belongs to $H^{3/2-\varepsilon}(\Omega)$, $\varepsilon > 0$,
whereas $\frac{1}{3\sigma_t}\nabla\Phi = (\varphi(y,z), (x - \tfrac12)\partial_y\varphi(y,z), (x - \tfrac12)\partial_z\varphi(y,z))$ is a smooth
function. We use the same sequence of tetrahedral meshes as in the numerical
experiments presented in Section 4.1.
We consider the two cases $a = 1$, $b = 0.01$ and $a = 100$, $b = 1$. Notice
that the jump in the normal derivative of $\Phi$ at the surface $x = 0.5$ is equal to
$3(a - b)y^2 z^2$. This jump is two orders of magnitude larger in the case
$a = 100$, $b = 1$ than in the case $a = 1$, $b = 0.01$. For this reason deterioration
of the convergence rates due to the lack of smoothness of the exact solution
might be more serious in the second case than in the first one.
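The properties of this manufactured solution can be checked numerically. The sketch below is an illustration under stated assumptions: the coefficient of the first component of $J$ is read as $3/4$ (the value for which the vacuum condition $\tfrac14\Phi - \tfrac12 J\cdot n = 0$ holds on both faces $x = 0$ and $x = 1$), and all function names are hypothetical.

```python
def shape(y, z):
    """phi(y, z) = y^2 z^2."""
    return y ** 2 * z ** 2

def scalar_flux(x, y, z, a, b):
    """Piecewise scalar flux: 3 * sigma_t * (x - 1/2) * phi(y, z)."""
    sigma_t = a if x <= 0.5 else b
    return 3.0 * sigma_t * (x - 0.5) * shape(y, z)

def current_x(x, y, z, a, b):
    """First component of J; the factor 3/4 is an assumed reading."""
    return 0.75 * (a + x * (b - a)) * shape(y, z)

def vacuum_residual(x, y, z, a, b):
    """(1/4) Phi - (1/2) J.n on the face x = 0 (n = -e_x) or x = 1 (n = +e_x)."""
    normal_x = -1.0 if x == 0.0 else 1.0
    return (0.25 * scalar_flux(x, y, z, a, b)
            - 0.5 * normal_x * current_x(x, y, z, a, b))

def normal_derivative_jump(y, z, a, b):
    """Jump of d(Phi)/dx across x = 0.5: 3 (a - b) y^2 z^2."""
    return 3.0 * (a - b) * shape(y, z)
```

Evaluating `normal_derivative_jump` for the two parameter pairs shows the factor of 100 between the cases $a = 100$, $b = 1$ and $a = 1$, $b = 0.01$.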
[Figure 1: log-log plots (first panel: k = 0) of the $L^2$-norms of the jumps of $\Phi_h$ over I and V faces and of $J_h$ over I, V and R faces, plotted against n.]

Fig. 1. Smooth solution: $L^2$-errors of the jumps in $J$ and $\Phi$

Table 2 shows the errors and the numerical convergence rates $r_i$ in the $|\cdot|_h$-seminorm
and in the $L^2$-norm of $\Phi$, as well as in the $L^2$-norm of $J$, for the case
$a = 1$, $b = 0.01$. Order $k + \frac12$ in the $|\cdot|_h$-seminorm and order $k + 1$ convergence
in the $L^2$-norm for $\Phi$ are obtained. These results agree with the theoretical
estimates in Theorem 1, even though the elliptic regularity assumptions of
Theorem 1 for the estimate of the $L^2$-error in $\Phi$ are not satisfied. Furthermore,
the $L^2$-error in $J$ also converges with the optimal order $k + 1$. In Figure 2,
the errors in the jumps of $\Phi$ and $J$ are shown. Again, we see that the interior
jumps converge at the rate $O(h^{k+1/2})$ while the boundary jumps show the full
convergence rate of $O(h^{k+1})$.
Table 2. Two-material problem, a = 1, b = 0.01: errors and convergence rates.

             $|(J - J_h, \Phi - \Phi_h)|_h$    $\|\Phi - \Phi_h\|_{0,\Omega}$    $\|J - J_h\|_{0,\Omega}$
        i    error      r_i          error      r_i          error      r_i
k = 0   1    1.76e-1    -            6.15e-2    -            5.14e-2    -
        2    1.27e-1    4.67e-1      3.18e-2    9.50e-1      2.92e-2    8.17e-1
        3    8.95e-2    5.04e-1      1.61e-2    9.85e-1      1.54e-2    9.21e-1
        4    6.30e-2    5.08e-1      8.07e-3    9.95e-1      7.91e-3    9.61e-1
        5    4.44e-2    5.05e-1      4.04e-3    9.98e-1      4.01e-3    9.80e-1
k = 1   1    9.42e-2    -            1.59e-2    -            1.39e-2    -
        2    3.66e-2    1.36e+0      4.06e-3    1.97e+0      3.56e-3    1.97e+0
        3    1.33e-2    1.46e+0      1.02e-3    1.99e+0      8.73e-4    2.03e+0
        4    4.77e-3    1.49e+0      2.55e-4    2.00e+0      2.13e-4    2.04e+0
        5    1.69e-3    1.49e+0      6.38e-5    2.00e+0      5.26e-5    2.02e+0

Now consider the case $a = 100$, $b = 1$. The numerical rates in Table 3
indeed paint a different picture from the previous case shown in Table 2 for
$a = 1$ and $b = 0.01$. For both $k = 0$ and $k = 1$ the rates in the $|\cdot|_h$-seminorm
have deteriorated slightly, but they are still close to $k + \frac12$. Similar behavior is
observed in the $L^2$-norm of $J$. These results are in agreement with Theorem 1.
The differences in the $L^2$-errors in $\Phi$ are more remarkable: the convergence
rates drop below 0.5 for $k = 0$ on the finer meshes and are better than 2 for $k = 1$. However,
the oscillations in the numbers might indicate that the asymptotic regime has
not been reached yet. Finally, we show in Figure 3 the behavior of the errors
in the jumps of $\Phi$ and $J$ for $k = 0$ and $k = 1$, respectively. For $k = 0$ the
jumps of $\Phi$ on the reflective boundary in this case converge more slowly, and
suboptimally, compared with the case $a = 1$, $b = 0.01$, while the other jumps exhibit
rates of convergence similar to those for the case $a = 1$, $b = 0.01$. For $k = 1$
the jumps of $\Phi$ on the reflective boundary exhibit convergence similar to those
for $k = 0$. However, convergence is at the optimal rate of $O(h^{3/2})$, that is, half
an order less than the $L^2$-norm. All the other jumps show convergence rates
as for the $k = 0$ case.

5 Conclusions

The discontinuous Galerkin method of Warsa, Wareing and Morel introduced
in [7] and [6] is a (truly) mixed discontinuous Galerkin method that belongs
to the general class of methods analyzed in [1]. The analysis there ensures
well-posedness and a priori error estimates, which have been verified in a series
of numerical experiments. It is important for the specific applications in
which solutions of the $P_1$ equations are needed that convergence has been
demonstrated for the vector unknown, even in the case $k = 0$.
[Figure 2: two log-log panels, titled "k = 0, a = 1, b = 0.01" and "k = 1, a = 1, b = 0.01", showing the $L^2$-norms of the jumps of $\Phi_h$ and $J_h$ over the different face sets, plotted against n.]

Fig. 2. Two-material problem, a = 1, b = 0.01: $L^2$-errors of the jumps in $J$ and $\Phi$

References
1. P. Castillo, B. Cockburn, I. Perugia, and D. Schötzau. An a priori error analysis
of the local discontinuous Galerkin method for elliptic problems. SIAM J. Numer.
Anal., 38:1676-1706, 2000.
2. B. Cockburn and C.-W. Shu. The local discontinuous Galerkin method for time-dependent
convection-diffusion systems. SIAM J. Numer. Anal., 35:2440-2463,
1998.
3. B. Davison. Neutron Transport Theory. Clarendon Press, 1957.
4. E. E. Lewis and W. F. Miller, Jr. Computational Methods of Neutron Transport.
American Nuclear Society, 1993.
Table 3. Two-material problem, a = 100, b = 1: errors and convergence rates.

             $|(J - J_h, \Phi - \Phi_h)|_h$    $\|\Phi - \Phi_h\|_{0,\Omega}$    $\|J - J_h\|_{0,\Omega}$
        i    error      r_i          error      r_i          error      r_i
k = 0   1    1.47e+1    -            7.76e+0    -            3.83e+0    -
        2    1.05e+1    4.84e-1      5.05e+0    6.21e-1      2.01e+0    9.29e-1
        3    7.49e+0    4.90e-1      3.49e+0    5.34e-1      1.04e+0    9.56e-1
        4    5.32e+0    4.95e-1      2.59e+0    4.29e-1      5.33e-1    9.57e-1
        5    3.79e+0    4.87e-1      1.94e+0    4.17e-1      2.78e-1    9.42e-1
k = 1   1    5.36e+0    -            2.68e+0    -            8.23e-1    -
        2    2.09e+0    1.36e+0      6.91e-1    1.95e+0      2.17e-1    1.92e+0
        3    8.44e-1    1.31e+0      1.56e-1    2.14e+0      5.72e-2    1.93e+0
        4    3.43e-1    1.30e+0      3.35e-2    2.22e+0      1.53e-2    1.90e+0
        5    1.36e-1    1.33e+0      7.37e-3    2.19e+0      4.15e-3    1.88e+0

5. J. S. Warsa, M. Benzi, T. A. Wareing, and J. E. Morel. Preconditioning a mixed
discontinuous finite element method for radiation diffusion. In F. Brezzi, A. Buffa,
S. Corsaro, and A. Murli, editors, Numerical Mathematics and Advanced Applications,
Proceedings of ENUMATH 2001, pages 967-978. Springer-Verlag, 2003.
6. J. S. Warsa, T. A. Wareing, and J. E. Morel. Fully consistent diffusion synthetic
acceleration of linear discontinuous transport discretizations on three-dimensional
unstructured meshes. Nucl. Sci. Engrg., 141:236-251, 2002.
7. J. S. Warsa, T. A. Wareing, and J. E. Morel. Solution of the discontinuous $P_1$
equations in two-dimensional Cartesian geometry with two-level preconditioning.
SIAM J. Sci. Comput., 24:2093-2124, 2003.
[Figure 3: two log-log panels for a = 100, b = 1 (the second titled "k = 1, a = 100, b = 1"), showing the $L^2$-norms of the jumps of $\Phi_h$ and $J_h$ over the different face sets, plotted against n.]

Fig. 3. Two-material problem, a = 100, b = 1: $L^2$-errors of the jumps in $J$ and $\Phi$


Modeling of Multi-Phase Flows
with a Level-Set Method

Sander P. van der Pijl*, A. Segal and C. Vuik

Delft University of Technology, Delft, The Netherlands

Summary. Multi-phase flows are frequently modeled in engineering fluid mechanics.
In this work incompressible two-phase flows are considered. The present research
aims to model high density-ratio flows with complex interface topologies, typically
air/water flows. Applications are mixtures of bubbles and droplets. Aspects which
are taken into account are: a sharp front (density changes rapidly), arbitrary shaped
interfaces, surface tension, buoyancy and coalescence of drops/bubbles. Attention is
paid to mass-conservation and integrity of the interface.
A survey of available computational methods is performed in [1]. The computa-
tional method used in this paper is the Mass Conserving Level-Set method (MCLS,
[2]). The MCLS method is based on the Level-Set methodology, using a VOF-function
to conserve mass. This function is advected without the necessity to reconstruct the
interface. The ease of MCLS is based on an explicit relationship between the Volume-
of-Fluid function and the Level-Set function. The method is straightforward to apply
to arbitrarily shaped interfaces, which may collide and break up.

1 Introduction
Various methods have been put forward to treat multi-phase flows. A classification
is given in [1]. The two methods that are most suitable for the current
research are the Volume-of-Fluid (VOF) method and the Level-Set method.
For both methods a marker function is used to define the interface. In case of
the Volume-of-Fluid method, a marker function, say $\Psi$, indicates the fractional
volume of a certain fluid, say fluid '1', in a computational cell. It can be seen
as the concentration of the marker particles of the MAC method, when the
number of particles goes to infinity.
An alternative to the Volume-of-Fluid method is the Level-Set method
([3, 4]). The interface is now defined by the zero level-set of a marker function,
say $\Phi$: $\Phi = 0$ at the interface, $\Phi > 0$ inside fluid '1' and $\Phi < 0$ elsewhere.
The function $\Phi$ is chosen such that it is smooth near the interface. This eases
the computation of interface derivatives. Also, methods available from hyperbolic
conservation laws can be used to advect the interface. The interface is
(implicitly) advected by advecting $\Phi$ as if it were a material constant:
\[ \frac{\partial \Phi}{\partial t} + u \cdot \nabla \Phi = 0. \qquad (1) \]

* Research granted by NWO.


The Level-Set method has some advantages over the Volume-of-Fluid
method, especially where solving the flow field is concerned, since interface
normals, curvature and distance towards the interface can be expressed easily
in terms of $\Phi$ or derivatives of $\Phi$. Also, advecting the interface is possible
by the application of 'off-the-shelf' techniques available from hyperbolic conservation
laws. For these reasons, the Level-Set method has been chosen as
the basis of our work. However, mass conservation is not an intrinsic property
and is considered the major drawback of the Level-Set method. Our work
focuses on a mass-conserving way to advect the interface by means of the
Mass-Conserving Level-Set method (MCLS, [2]).
The MCLS method has a shared foundation with the CLSVOF method ([5,
6]) and to a lesser extent with the combined Level-Set/particle method ([7]) in
the sense that it is based on Level-Set and additional effort is made to conserve
mass. The difference with CLSVOF is that here there is no combination of
two existing methods. The method takes full advantage of all additional
information provided by the Level-Set function $\Phi$, rather than coupling Level-Set
with Volume-of-Fluid/PLIC. In fact we use the Volume-of-Fluid function
$\Psi$ as a help variable to conserve mass, without applying the difficult convection
(namely interface reconstruction) which makes the VOF method so elaborate. The key
issue of our method is that we define a simple relationship between the Level-Set
function $\Phi$ and Volume-of-Fluid function $\Psi$. This relation is obtained by
assuming piecewise linear interfaces within a computational cell:

\[ \Psi = f(\Phi, \nabla\Phi). \qquad (2) \]

It makes the advection of the Volume-of-Fluid function $\Psi$ easy (i.e. without
interface reconstruction) and finding $\Phi$ from $\Psi$ a straightforward task. This is
carried out by well-known numerical tools, like Picard and Newton iterations.
The PLIC method is not adopted (unlike CLSVOF), yet mass is conserved
in the same manner. Note that the CLSVOF method might not be easily
extendible to 3D space. Yet the extension of MCLS to three-dimensional space
can be done in a straightforward way. Note also that with this approach, it is
not necessary to smooth (or regularize) $\Psi$, which is common for other methods.

2 Governing Equations

Consider two fluids '0' and '1' in domain $\Omega \subset \mathbb{R}^2$ which are separated by an
interface $S$. Both fluids are assumed to be incompressible, i.e.:

\[ \nabla \cdot u = 0, \qquad (3) \]

where $u = (u, v)^t$ is the velocity vector and $u$ and $v$ are the velocities in
x- and y-direction respectively. The flow is governed by the incompressible
Navier-Stokes equations:
\[ \frac{\partial u}{\partial t} + u \cdot \nabla u = -\frac{1}{\rho}\nabla p + \frac{1}{\rho}\nabla \cdot \mu\,(\nabla u + \nabla u^t) + g, \qquad (4) \]

where $\rho$, $p$, $\mu$ and $g$ are the density, pressure, viscosity and gravity vector
respectively. The density and viscosity are constant within each fluid. Using
the Level-Set function $\Phi$ these can be expressed as

(5)
and similarly for $\rho$, where the subscript indicates the corresponding fluid and $H$
is the Heaviside step function.

2.1 Interface conditions

The interface conditions express continuity of mass and momentum at the
interface:
\[ [u] = 0, \qquad [p\,n - n \cdot \mu\,(\nabla u + \nabla u^t)] = \sigma\kappa n, \qquad (6) \]
where the brackets denote jumps across the interface, $n$ is a normal vector
at the interface, $\sigma$ is the surface tension coefficient and $\kappa$ is the curvature
of the interface. Clearly, the velocity $u$ is continuous at the interface. If the
viscosity $\mu$ is continuous at the interface, it can be shown that the derivatives
of the velocity components are continuous too ([8, 9]). In that case Eqn. (6)
reduces to $[u] = 0$ and $[p] = \sigma\kappa$. To achieve that, the viscosity is forced to be
continuous by smoothing Expression (5):

(7)

where $H_\alpha$ is the smoothed (or regularized) Heaviside step function

$x < -\alpha$;

$|x| \le \alpha$; (8)
$x > \alpha$;

and $\alpha$ is a parameter proportional to the mesh width. Here $\alpha$ is chosen as
(following [10]) $\alpha = \frac32 h$, where $h$ is the mesh width. According to [11], the
viscosity is then smoothed over three mesh widths, provided $|\nabla\Phi| = 1$. Note
that only the viscosity is smoothed, not the density $\rho$. Note also that when
the density is not regularized, mass is conserved when the volume of a certain
fluid or phase is conserved. In fact, the MCLS method conserves volumes
by construction. Due to the non-regularized density field, mass is conserved
too. Instead of taking into account the pressure jump at the interface due
to the surface tension forces, the continuous surface force/stress (CSF, [12])
methodology is adopted.
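A smoothed Heaviside function of this type can be sketched as follows. The sine-based profile used on $|x| \le \alpha$ is one common choice in the Level-Set literature and is an assumption here, since the values in the three branches are not legible in the text above; the linear blending of two viscosities `mu0` and `mu1` is likewise an illustrative assumption, and the function names are hypothetical.

```python
import math

def smoothed_heaviside(x, alpha):
    """Regularized step: 0 for x < -alpha, 1 for x > alpha, and an
    assumed smooth sine-based transition for |x| <= alpha."""
    if x < -alpha:
        return 0.0
    if x > alpha:
        return 1.0
    return 0.5 * (1.0 + x / alpha + math.sin(math.pi * x / alpha) / math.pi)

def smoothed_viscosity(phi, mu0, mu1, alpha):
    """Blend the viscosity of fluid '0' (phi < 0) and fluid '1' (phi > 0)
    across a band of width 2 * alpha around the interface (assumed form)."""
    return mu0 + (mu1 - mu0) * smoothed_heaviside(phi, alpha)
```

With $\alpha = \frac32 h$ the transition band $|\Phi| \le \alpha$ spans three mesh widths when $\Phi$ is a distance function.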
3 Computational Approach

The Navier-Stokes equations are solved on a Cartesian grid in a rectangular
domain by the pressure-correction method ([13]). The unknowns are stored in
a Marker-and-Cell (staggered) layout ([14]). For the interface representation
the Level-Set methodology is adopted. The interface conditions are satisfied by
means of the continuous surface force (CSF) methodology. The discontinuous
density field is dealt with similarly to the Ghost Fluid method for incompressible
flow ([8]). Further information about the flow-field computations can be
found in [2].

3.1 Interface advection

The strategy of modeling two-phase flows is to compute the flow with a given
interface position and subsequently evolve the interface in the given flow field.
In the foregoing, it has been described how the flow is computed with a given
interface position. Next we consider the evolution of the interface.

Level-Set The interface is implicitly defined by a Level-Set function $\Phi$. More
precisely, the interface, say $S$, is the zero level-set of $\Phi$:

\[ S(t) = \{x \in \mathbb{R}^2 \mid \Phi(x, t) = 0\}. \qquad (9) \]

The interface is evolved by advecting the Level-Set function in the flow field
as if it were a material constant (Eqn. (1)):
\[ \frac{\partial\Phi}{\partial t} + u \cdot \nabla\Phi = 0. \qquad (10) \]

A homogeneous Neumann boundary condition for $\Phi$ is imposed at the boundaries.
The accuracy of the approximation of Eqn. (10) determines
the accuracy of the interface representation, and it also
determines the mass errors. For this purpose, the discretization of the gradient
of $\Phi$ can be either first order upwind, or second or third order ENO
([10, 11, 15]). In case of the first-order spatial discretization, a forward Euler
temporal discretization is sufficient. In case of the higher order spatial discretization,
a Runge-Kutta scheme is applied (e.g. [16]).
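The first-order variant (upwind in space, forward Euler in time) can be sketched in one space dimension. This is a simplified illustration under stated assumptions, not the two-dimensional scheme used for the computations: the grid is uniform, the velocity is given per node, the homogeneous Neumann condition is imposed by copying the boundary value into a ghost node, and the function names are hypothetical.

```python
def upwind_step(phi, u, dx, dt):
    """One forward-Euler step of d(phi)/dt + u d(phi)/dx = 0 with a
    first-order upwind difference; phi and u are lists of nodal values."""
    n = len(phi)
    new_phi = list(phi)
    for i in range(n):
        # Ghost values equal to the boundary value (homogeneous Neumann).
        left = phi[i - 1] if i > 0 else phi[i]
        right = phi[i + 1] if i < n - 1 else phi[i]
        if u[i] >= 0.0:
            gradient = (phi[i] - left) / dx    # information comes from the left
        else:
            gradient = (right - phi[i]) / dx   # information comes from the right
        new_phi[i] = phi[i] - dt * u[i] * gradient
    return new_phi

def advect(phi, u, dx, dt, steps):
    """Apply `steps` upwind/Euler steps and return the final nodal values."""
    for _ in range(steps):
        phi = upwind_step(phi, u, dx, dt)
    return phi
```

For a signed-distance profile and a constant velocity the zero level-set moves with the flow, provided the time step respects the usual CFL restriction $|u|\,\Delta t \le \Delta x$.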

MCLS The difficulty with the Level-Set method is that, although $\Phi$ might
be conserved, this does not imply that mass is conserved. On the other hand,
with the Volume-of-Fluid method, mass is conserved when $\Psi$ is conserved. In
order to conserve mass with the Level-Set method, corrections to the Level-Set
function are made by considering the fractional volume $\Psi$ of a certain fluid
within a computational cell. First the usual Level-Set advection is performed:
first-order advection and unmodified re-initialization. Low order advection and
re-initialization will ensure numerical smoothness of $\Phi$. Furthermore, when the
flow field is computed, higher order accuracy might not be expected when the
CSF method is applied and viscosity is regularized. In that respect, higher
order discretization of Eqn. (10) will only lead to improved mass conservation
for the pure Level-Set methods. Since the obtained Level-Set function $\Phi^{n+1,*}$
will certainly not conserve mass, corrections to $\Phi^{n+1,*}$ are made such that
mass is conserved. This requires three steps:
1. the relative volume of a certain fluid in a computational cell (called
'volume-of-fluid' function $\Psi$) is to be computed from the Level-Set function
$\Phi^n$: $\Psi = f(\Phi, \nabla\Phi)$;
2. the volume-of-fluid function has to be advected conservatively during
a time step towards $\Psi^{n+1}$;
3. with this new volume-of-fluid function $\Psi^{n+1}$, corrections to $\Phi^{n+1,*}$ are
sought such that $f(\Phi^{n+1}, \nabla\Phi^{n+1}) = \Psi^{n+1}$ holds.
These three steps will be explained subsequently.

Step 1: Volume-of-Fluid function A relationship between the Level-Set function
$\Phi$ and the so-called volume-of-fluid function $\Psi$ is found by considering
the fractional volume of a certain fluid in a computational cell $\Omega_k$. In
this paper, a Cartesian mesh is employed consisting of computational cells
$\Omega_k$, $k = 1, 2, \ldots$. By $x_k = (x_k, y_k)^t$ the center node of $\Omega_k$ is meant and $\Delta x$
and $\Delta y$ are the mesh sizes in x and y direction respectively. The volume-of-fluid
function $\Psi_k$ is defined in terms of Level-Set function $\Phi$ by

(11)

where $H$ is the Heaviside step function. The Level-Set function $\Phi$ is linearized
around $\Phi_k$, which leads to

(12)

Note that in contrast with other approaches, the Heaviside step function is
not regularized. After some mathematical manipulations, the function $f$ is
evaluated as

$\Phi_k \le -\Phi_{\max,k}$

$-\Phi_{\max,k} < \Phi_k < -\Phi_{\mathrm{mid},k}$

(13)

where
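\[
\Phi_{\max,k} = \tfrac12\bigl(|D_{x_k}| + |D_{y_k}|\bigr), \qquad
\Phi_{\mathrm{mid},k} = \tfrac12\bigl|\,|D_{y_k}| - |D_{x_k}|\,\bigr|, \qquad
D_{x_k} = \Delta x \left.\frac{\partial\Phi}{\partial x}\right|_k, \qquad
D_{y_k} = \Delta y \left.\frac{\partial\Phi}{\partial y}\right|_k, \qquad (14)
\]
which are approximated by central differencing.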
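The relation between the cell-centre value of the Level-Set function and the volume fraction can be sketched directly from the linearization. The code below is an independent geometric derivation under stated assumptions, not a transcription of Eqn. (13): within a cell the Level-Set function is taken as $\Phi_k + D_x\xi + D_y\eta$ with $\xi, \eta \in [-\frac12, \frac12]$, the returned value is the area fraction where this linear function is positive, and the function name is hypothetical.

```python
def volume_fraction(phi_k, d_x, d_y):
    """Fraction of a cell where phi_k + d_x*xi + d_y*eta > 0 for
    xi, eta in [-1/2, 1/2]; d_x, d_y are the scaled gradients of Eqn. (14)."""
    a, b = abs(d_x), abs(d_y)
    phi_max = 0.5 * (a + b)
    phi_mid = 0.5 * abs(a - b)
    if phi_max == 0.0:
        # Zero gradient: the cell is entirely on one side of the interface.
        return 1.0 if phi_k > 0.0 else 0.0
    if phi_k <= -phi_max:
        return 0.0
    if phi_k >= phi_max:
        return 1.0
    if abs(phi_k) <= phi_mid:
        # The interface cuts two opposite cell edges (trapezoid).
        return 0.5 + phi_k / (phi_max + phi_mid)
    # The interface cuts two adjacent cell edges (triangle); here a * b > 0.
    triangle = 0.5 * (phi_max - abs(phi_k)) ** 2 / (a * b)
    return triangle if phi_k < 0.0 else 1.0 - triangle
```

The function is continuous and non-decreasing in `phi_k`, and it satisfies `volume_fraction(p, dx, dy) + volume_fraction(-p, dx, dy) == 1` up to rounding, which reflects the symmetry between the two fluids.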

Step 2: Volume-of-Fluid advection. At a certain time instant the volume-of-fluid
function can be computed from $\Phi$ by means of Eqn. (12). The volume-of-fluid
function after a time step is found by considering the flux of fluid $F$ that flows
through a boundary $\Gamma$ of a computational cell during time-step $\Delta t$:

$$
\Psi^{n+1}_{i+\frac12,j+\frac12} = \Psi^{n}_{i+\frac12,j+\frac12}
- \frac{1}{\Delta x\,\Delta y}\left(F^{x}_{i+1,j+\frac12} - F^{x}_{i,j+\frac12}
+ F^{y}_{i+\frac12,j+1} - F^{y}_{i+\frac12,j}\right).
\tag{15}
$$

The fluxes are again computed by linearizing $\Phi$ (just like Eqn. (13)). In fact,
the fluxes are computed by the straightforward application of $f$.
It is possible that fluid is fluxed more than once through different faces,
which would cause unphysical values of $\Psi$. As reported in e.g. [1], this can be
solved by employing either a multidimensional scheme or flux-splitting. For
reasons of simplicity we have chosen the second approach. The order of
fluxing is: first in the x-direction, then in the y-direction. Currently the flux-splitting
of [5] is adopted. As reported in [5], undershoots and/or overshoots can still
occur, which lead to unphysical values of $\Psi$, namely $\Psi < 0$ and $\Psi > 1$. If these
values are replaced by 0 and 1 respectively, mass errors arise which are of
order $10^{-4}$. This is also experienced in the current research. Mass errors are
completely avoided by redistributing $\Psi$ ([2]).
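
The conservative, flux-split update can be illustrated with a small sketch. It is a simplification under stated assumptions and not the scheme of [5]: the grid is periodic, the velocity $(u, v)$ is constant, and the fluxes are donor-cell (first-order upwind) volumes rather than the fluxes obtained from the linearized Level-Set function. The function names `advect_split` and `total_mass` are hypothetical. The point of the example is only that an update written in flux-difference form leaves the total mass unchanged.

```python
def advect_split(psi, u, v, dx, dy, dt):
    """One flux-split step on a periodic grid: x-sweep first, then y-sweep.
    psi[i][j] is the volume fraction of cell (i, j)."""
    nx, ny = len(psi), len(psi[0])
    cell = dx * dy

    # x-sweep: fx[i][j] is the fluid volume crossing the left face of cell (i, j);
    # psi[i - 1] wraps around for i = 0, which gives the periodic boundary.
    fx = [[u * dt * dy * (psi[i - 1][j] if u >= 0 else psi[i][j])
           for j in range(ny)] for i in range(nx)]
    psi = [[psi[i][j] - (fx[(i + 1) % nx][j] - fx[i][j]) / cell
            for j in range(ny)] for i in range(nx)]

    # y-sweep: fy[i][j] is the fluid volume crossing the bottom face of cell (i, j)
    fy = [[v * dt * dx * (psi[i][j - 1] if v >= 0 else psi[i][j])
           for j in range(ny)] for i in range(nx)]
    psi = [[psi[i][j] - (fy[i][(j + 1) % ny] - fy[i][j]) / cell
            for j in range(ny)] for i in range(nx)]
    return psi


def total_mass(psi, dx, dy):
    """Total fluid volume represented by the volume fractions."""
    return sum(sum(row) for row in psi) * dx * dy


if __name__ == "__main__":
    # A 3 x 3 block of fluid in an 8 x 8 periodic box, advected with (u, v) = (0, -1).
    psi = [[1.0 if 2 <= i <= 4 and 3 <= j <= 5 else 0.0 for j in range(8)]
           for i in range(8)]
    before = total_mass(psi, 1.0, 1.0)
    for _ in range(10):
        psi = advect_split(psi, 0.0, -1.0, 1.0, 1.0, 0.5)
    print(before, total_mass(psi, 1.0, 1.0))
```

With a Courant number of at most one, the donor-cell fluxes used here keep the values between 0 and 1; the higher-order fluxes of the actual method do not guarantee this, which is why the text discusses clipping and redistribution.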

Step 3: Inverse function. Having found a new Volume-of-Fluid function $\Psi^{n+1}$,
the initial guess of the Level-Set function $\Phi^{n+1,*}$ (after Level-Set advection) is
modified, such that mass is conserved within each computational cell. In other
words, find $(\Phi_1, \Phi_2, \ldots)$, such that

$$
\left| f(\Phi_k, \nabla\Phi_k) - \Psi_k^{n+1} \right| \le \epsilon \qquad \forall k = 1, 2, \ldots, \tag{16}
$$


where $\epsilon$ is some tolerance. It will be clear that due to the behavior of $\Psi$ no
unique solution $\Phi$ exists. However, a (small) correction to $\Phi^*$ is searched, where
$\Phi^*$ comes from Level-Set advection. A solution $\Phi$ is found by the following
iteration (until convergence): leave $\Phi$ unmodified in a grid point when the
Volume-of-Fluid constraint is satisfied and make corrections locally when this
constraint is not satisfied. This is achieved by using the inverse function $g$ of
$f$ as given in Eqn. (13) with respect to argument $\Phi_k$:

$$
f(g(\Psi, \nabla\Phi), \nabla\Phi) = \Psi. \tag{17}
$$
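
A possible form of the inverse function can be sketched as follows. This is an illustrative derivation under the same assumptions as the linearized volume fraction (unit reference cell, sharp Heaviside function), not code from the paper; the names `volume_fraction` and `inverse_volume_fraction` are hypothetical. For $\Psi = 0$ or $\Psi = 1$ the inverse is not unique, and the sketch simply returns the value closest to the interface, $\mp\Phi_{\max}$.

```python
import math


def volume_fraction(phi_k, dx_k, dy_k):
    """Volume fraction of the unit cell for the linearized level-set function."""
    a, b = abs(dx_k), abs(dy_k)
    phi_max, phi_mid = 0.5 * (a + b), 0.5 * abs(b - a)
    if phi_k <= -phi_max:
        return 0.0
    if phi_k >= phi_max:
        return 1.0
    if phi_k < -phi_mid:
        return 0.5 * (phi_max + phi_k) ** 2 / (phi_max ** 2 - phi_mid ** 2)
    if phi_k > phi_mid:
        return 1.0 - 0.5 * (phi_max - phi_k) ** 2 / (phi_max ** 2 - phi_mid ** 2)
    return 0.5 + phi_k / (phi_max + phi_mid)


def inverse_volume_fraction(psi, dx_k, dy_k):
    """Return phi_k such that volume_fraction(phi_k, dx_k, dy_k) == psi."""
    a, b = abs(dx_k), abs(dy_k)
    phi_max, phi_mid = 0.5 * (a + b), 0.5 * abs(b - a)
    if phi_max == 0.0:
        raise ValueError("zero gradient: the volume fraction cannot be inverted")
    if psi <= 0.0:
        return -phi_max            # any phi_k <= -phi_max would do
    if psi >= 1.0:
        return phi_max             # any phi_k >= phi_max would do
    # volume fraction at which the corner regime ends
    corner = 0.5 * (phi_max - phi_mid) / (phi_max + phi_mid)
    area = phi_max ** 2 - phi_mid ** 2
    if psi < corner:
        return -phi_max + math.sqrt(2.0 * psi * area)
    if psi > 1.0 - corner:
        return phi_max - math.sqrt(2.0 * (1.0 - psi) * area)
    return (psi - 0.5) * (phi_max + phi_mid)


if __name__ == "__main__":
    for target in (0.05, 0.3, 0.5, 0.9):
        phi = inverse_volume_fraction(target, 1.0, 0.5)
        print(target, phi, volume_fraction(phi, 1.0, 0.5))
```

In an iteration of the kind described above, such an inverse would only be applied in cells where the constraint (16) is violated, with the gradient taken from the current Level-Set iterate.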



4 Applications

The behavior of the MCLS approach is shown by a couple of standard advection
tests, with a prescribed velocity field. Thereafter, the method is applied to
the complete set of equations by considering a falling drop and a rising bubble
respectively.

4.1 Advection tests

Linear advection. The first advection test is a circle, which is advected by a
uniform velocity field. The velocity field is prescribed by $(u, v) = (0, -1)$. The
dimensions of the computational domain are $L_x = 10$ and $L_y = 100$, and the domain
is discretized by a 10x100 mesh. Initially a circle of radius $R_0$ is placed at
$x = L_x/2$ and $y = L_y - 2R_0$. For the case of $R_0 = 4$ (a circle with a diameter of
8 mesh sizes), the relative mass is plotted in Fig. 1 as a function of the traversed
distance of the circle. First-order, second-order and third-order pure Level-Set
simulations (with and without re-initialization) are compared with the MCLS
method. ENO discretization is adopted for the pure Level-Set method (see
aforementioned references). The order of re-initialization is in agreement with
the order of advection. The tolerance in the VOF advection is taken to be
$\epsilon = 10^{-8}$. Globally speaking, mass is always lost for the
pure Level-Set advection. Mass losses are smaller for higher-order accuracy, and
re-initialization causes much higher mass losses. The MCLS method conserves
mass up to the specified tolerance.

[Figure: relative mass versus number of traversed grid cells; curves: 1st order, 1st order re-init., 2nd order, 2nd order re-init., 3rd order, 3rd order re-init., MCLS]

Fig. 1. Relative mass for the linear advection test; $\epsilon = 10^{-8}$ (every 10th iteration
marked)

4.2 Air/water flow


In [8] a two-dimensional rising air bubble in water is considered. The dimensions
and sizes are: $L_x = 0.02$ m, $L_y = \tfrac{3}{2}L_x$, $R_0 = \tfrac{1}{6}L_x$, $x_0 = y_0 = \tfrac{1}{2}L_x$. The
gravity and material constants are: $g = 9.8$ m/s$^2$, $\sigma = 0.0728$ kg/s$^2$,
$\rho_w = 10^3$ kg/m$^3$, $\rho_a = 1.226$ kg/m$^3$,
$\mu_w = 1.137 \cdot 10^{-3}$ kg/(m s) and $\mu_a = 1.78 \cdot 10^{-5}$ kg/(m s), where subscripts
$w$ and $a$ indicate water and air respectively.
Results are shown in Fig. 2(a) for three different mesh sizes. We take $\epsilon =
10^{-8}$. Relative mass losses are of the same order and in agreement with the
advection tests. Note that the number of grid cells is much smaller than in
[8]. The results are the same for $t \le 0.025$ for all mesh sizes. Thereafter small
differences occur. The results compare well with [8]. The MCLS method seems
to result in a more coherent structure at the highly curved part of the interface
at $t = 0.05$. This is thought to be caused by the low resolution of the grids
used here.

[Figure panels: (a) Rising bubble, (b) Falling droplet]

Fig. 2. Air/water flows; - . - : 30 x 45; - - : 40 x 60; - : 60 x 90 mesh

In Fig. 2(b) results are shown for a falling droplet. The conditions are
the same as for the rising bubble, except for the sign of $\Phi$ at $t = 0$ and
$y_0 = L_x$. Mass conservation properties are the same as before. The results are
the same until the droplet hits the bottom. Thereafter differences occur. This
is thought to be due to the limited number of grid cells available to capture the
flow phenomena near the wall. The results compare well with [8]. Note that
the results in [8] span $t \le 0.05$; no results after collision are presented.

5 Conclusion
The Mass-Conserving Level-Set (MCLS) method has been presented. The method is
based on the Level-Set methodology, where mass is conserved by considering
the fractional volume of a certain fluid within a computational cell. Advection
tests were used to compare the method with the Level-Set method. Mass is
conserved up to a specified (vanishing) tolerance. The MCLS method combines
the attractiveness of the Level-Set method with the mass-conserving properties
of the Volume-of-Fluid methods, without adopting the latter. This makes
the implementation much easier than for a Volume-of-Fluid (based) method,
especially in three-dimensional space. The applicability of the MCLS method
was illustrated by the application to air-water flows. It is possible to capture
bubbles or droplets within a limited number of grid cells without mass losses
up to the prescribed tolerance. This is an important feature, since future work
will concern three-dimensional problems/geometries, where the number of grid
cells available to an individual entity will decrease considerably.

References
1. S.P. van der Pijl. Free-boundary methods for multi-phase flows. AMA report
02-13, Delft University of Technology, 2002. http://ta.twi.tudelft.nl.
2. S.P. van der Pijl, A. Segal, and C. Vuik. A mass-conserving level-set method for
modeling of multi-phase flows. AMA report 03-03, Delft University of Technology,
2003. http://ta.twi.tudelft.nl.
3. W. Mulder, S. Osher, and J.A. Sethian. Computing interface motion in compressible
gas dynamics. Journal of Computational Physics, 100:209-228, 1992.
4. S. Osher and R.P. Fedkiw. Level set methods: An overview and some recent
results. Journal of Computational Physics, 169:463-502, 2001.
5. M. Sussman and E.G. Puckett. A coupled level set and volume-of-fluid method
for computing 3D and axisymmetric incompressible two-phase flows. Journal of
Computational Physics, 162:301-337, 2000.
6. M. Sussman. A second order coupled level set and volume-of-fluid method for
computing growth and collapse of vapor bubbles. Journal of Computational
Physics, 187:110-136, 2003.
7. D. Enright, R. Fedkiw, J. Ferziger, and I. Mitchell. A hybrid particle level set
method for improved interface capturing. Journal of Computational Physics,
183:83-116, 2002.
8. M. Kang, R.P. Fedkiw, and X.-D. Liu. A boundary condition capturing method
for multiphase incompressible flow. Journal of Scientific Computing, pages
323-360, 2000.
9. Z. Li and M.-C. Lai. The immersed interface method for the Navier-Stokes equations
with singular forces. Journal of Computational Physics, 171:822-842, 2001.
10. M. Sussman, P. Smereka, and S. Osher. A level set approach for computing
solutions to incompressible two-phase flow. Journal of Computational Physics,
114:146-159, 1994.
11. Y.C. Chang, T.Y. Hou, B. Merriman, and S. Osher. A level set formulation of
Eulerian interface capturing methods for incompressible fluid flows. Journal of
Computational Physics, 124:449-464, 1996.
12. J.U. Brackbill, D.B. Kothe, and C. Zemach. A continuum method for modeling
surface tension. Journal of Computational Physics, 100:335-354, 1992.
13. J.J.I.M. van Kan. A second-order accurate pressure correction method for viscous
incompressible flow. SIAM J. Sci. Stat. Comp., 7:870-891, 1986.
14. F.H. Harlow and J.E. Welch. Numerical calculation of time-dependent viscous
incompressible flow of fluid with free surfaces. Physics of Fluids, 8:2182-2189,
1965.
15. M. Sussman, E. Fatemi, P. Smereka, and S. Osher. An improved level set method
for incompressible two-phase flows. Computers and Fluids, 27:663-680, 1998.
16. M. Sussman and E. Fatemi. An efficient, interface-preserving level set redistancing
algorithm and its application to interfacial incompressible fluid flow. SIAM
Journal on Scientific Computing, 20(4):1165-1191, 1999.
Numerical Modeling of Bypass Flow

Vladimir Prokop¹ and Karel Kozel²

¹ Czech Technical University, Faculty of Mechanical Engineering
² Czech Technical University, Faculty of Mechanical Engineering

Summary. This paper deals with the numerical solution of laminar viscous
incompressible stationary and unstationary flows through a vessel with a bypass.
These problems can be described by the Navier-Stokes equations, and the steady
solution of the unsteady system is found using a multistage Runge-Kutta method together
with a time-dependent artificial compressibility method. The non-stationary solution is
obtained from an initial stationary solution by prescribing nonstationary outlet conditions.
Some results of the numerical solution of cardiovascular problems are presented:
stationary and unstationary 2D flows in a vessel and a bypass.

1 Mathematical model
In the cardiovascular system we can find many different types of vessels, such as
large arteries, vessels of medium size and capillaries. They differ in diameter
and in the thickness and composition of the wall. In larger vessels the blood flow
can be assumed to behave as an incompressible continuum. One can describe
this type of flow using the system of momentum and continuity equations written
in conservation form:
$$
\rho \frac{Dw}{Dt} - \nabla \cdot T = \rho f \tag{1}
$$
$$
\nabla \cdot w = 0 \tag{2}
$$
where $T$ is the stress tensor of the fluid, $w$ is the velocity vector and $f$ is the vector of
external forces, which is not taken into account later. The density of the fluid $\rho$
is supposed to be constant in physiological conditions, although it depends on
the red cell concentration. The functional dependence of the stress tensor $T$
on the velocity vector $w$ and the blood pressure $p$ is described by the following
relations:

(3)

(4)

Equations (3) and (4) describe a Newtonian fluid. An important feature of blood
flow is pulsatility caused by the periodic motion of the heart. It is also known
[2] that there is scarcely any turbulence in vessels except in some special cases.

The walls of a tube, which is the model of a vessel, are supposed to be rigid
and the velocity vector $w$ is zero on them. Blood flow can be assumed to be
laminar [2]. Indeed, in physiological conditions, the values of speed involved
are low enough. Moreover, generally, the periodicity of the flow, together with
the short length of vascular districts, does not give rise to fully developed turbulence.
The Reynolds number $Re = d\,w^*/\nu$ is an important feature of the flow behaviour.
The quantity $w^*$ is a characteristic velocity, $\nu = \mu/\rho$ is the kinematic viscosity and $d$
is a length scale. In large and medium human vessels, the Reynolds number
ranges from 400 up to 10000. In the stationary case the pulsating nature of blood flow
is not considered. Elasticity of vessel tubes is not considered in either case. The
flow can then be described as viscous, incompressible, laminar and stationary
(unstationary) in 2D by the system of Navier-Stokes equations without the influence
of exterior forces and heat exchange. The system is written in conservative
non-dimensional vector form rewritten from (1), (2):
$$
\tilde{R} W_t + F_x + G_y = \frac{\tilde{R}}{Re}\left(W_{xx} + W_{yy}\right), \tag{5}
$$

where $W = (p, u, v)^T$ is the vector of solution, $\tilde{R} = \mathrm{diag}\|0, 1, 1\|$ and
$F = (u, u^2 + p, uv)^T$, $G = (v, uv, v^2 + p)^T$ denote inviscid fluxes, $(u, v)$ is the velocity
vector, $p$ denotes pressure, $Re = U^* L^*/\nu^*$ is the Reynolds number, where $U^*$ is the speed
of the upstream flow, $L^*$ denotes the width of a channel and $\nu^*$ is a reference kinematic
viscosity (* denotes a reference dimensional quantity). As the upstream boundary
condition we prescribe the velocity vector $(u, v)$; along the walls the velocity vector is
equal to zero because of the viscosity of the fluid and the impenetrability of the wall; the
downstream boundary condition is $p = p_2$, which should ensure a pressure gradient.

2 Numerical model
The solution of the system (5) is obtained using the method of artificial compressibility:
the equation of continuity is completed with the term $\frac{1}{a^2} p_t$, where $a^2 \in R^+$.
Rewritten in vector form, the modified system (5) reads
$$
W_t + F_x + G_y = \frac{\tilde{R}}{Re}\left(W_{xx} + W_{yy}\right), \tag{6}
$$

where $W = (p/a^2, u, v)^T$. To find the steady solution, one can solve the unsteady
system (6) by the finite volume method together with a time-dependent method. The system
of equations (6) can be solved using a three-stage Runge-Kutta method with
given steady boundary conditions. At the inlet, extrapolation of pressure is
used. At the outlet, the value of pressure is set constant for the stationary case, and
for the unstationary case the pressure is prescribed by a sine function in the form:

$$
p_2 = p_{20}\left(1 + \alpha \sin 2\pi\omega t\right), \tag{7}
$$

where $\omega$ is a frequency and $\alpha$ is an amplitude. The multistage Runge-Kutta method
is stabilized by an artificial viscosity term (of Jameson's type):

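$$
W^{n}_{i,j} = W^{(0)}_{i,j} \tag{8}
$$
$$
W^{(r)}_{i,j} = W^{(0)}_{i,j} - \alpha_r \Delta t\, \bar{R}W^{(r-1)}_{i,j}, \quad (r = 1, \ldots, m) \tag{9}
$$
$$
W^{n+1}_{i,j} = W^{(m)}_{i,j}, \quad m = 3, \tag{10}
$$

where
$$
\bar{R}W^{(r-1)}_{i,j} = RW^{(r-1)}_{i,j} - DW^{(r-1)}_{i,j} \tag{11}
$$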

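and coefficients $\alpha_1 = 0.5$, $\alpha_2 = 0.5$, $\alpha_3 = 1.0$, so the numerical method is
second order in time and space. The form of the steady residual $RW^{n}_{i,j}$ depends
on the method used for discretizing the space derivatives:

$$
RW_{i,j} = \frac{1}{\mu_{i,j}} \sum_{k=1}^{4}\left[\left(F^{i}_{k} - \frac{1}{Re}F^{v}_{k}\right)\Delta y_k
- \left(G^{i}_{k} - \frac{1}{Re}G^{v}_{k}\right)\Delta x_k\right], \tag{12}
$$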

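where $F^i = F$, $G^i = G$ and $F^v = (0, u_x, v_x)^T$, $G^v = (0, u_y, v_y)^T$. The artificial
viscosity term $DW^{n}_{i,j}$ in this case depends on the second derivatives of pressure
and is used for improving the stability of the solution:

$$
DW_{i,j} = E\left[\gamma_i\,(W_{i+1,j} - 2W_{i,j} + W_{i-1,j}) + \gamma_j\,(W_{i,j+1} - 2W_{i,j} + W_{i,j-1})\right] \tag{13}
$$
$$
E = \mathrm{diag}\|0, \epsilon_1, \epsilon_2\|, \quad \epsilon_1, \epsilon_2 \in R \tag{14}
$$
$$
\gamma_i = \max(\gamma_{i1}, \gamma_{i2}), \quad \gamma_j = \max(\gamma_{j1}, \gamma_{j2}) \tag{15}
$$
$$
\gamma_{i1} = \frac{|p_{i+1,j} - 2p_{i,j} + p_{i-1,j}|}{|p_{i+1,j} + 2p_{i,j} + p_{i-1,j}|}, \quad
\gamma_{i2} = \frac{|p_{i,j} - 2p_{i-1,j} + p_{i-2,j}|}{|p_{i,j} + 2p_{i-1,j} + p_{i-2,j}|}, \tag{16}
$$
$$
\gamma_{j1} = \frac{|p_{i,j+1} - 2p_{i,j} + p_{i,j-1}|}{|p_{i,j+1} + 2p_{i,j} + p_{i,j-1}|}, \quad
\gamma_{j2} = \frac{|p_{i,j} - 2p_{i,j-1} + p_{i,j-2}|}{|p_{i,j} + 2p_{i,j-1} + p_{i,j-2}|}. \tag{17}
$$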
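
The pressure-based switch of Eqns. (15)-(17) can be illustrated along a single grid line. The sketch below is an assumption-laden illustration rather than the authors' code: it works on a plain Python list of pressures in the i-direction, assumes strictly positive pressures so that the denominators do not vanish, and uses the hypothetical name `pressure_sensor`.

```python
def pressure_sensor(p, i):
    """gamma_i = max(gamma_i1, gamma_i2) along one grid line of pressures p.
    gamma_i1 is the normalized second difference centred at i,
    gamma_i2 the same quantity centred at i - 1."""
    if i < 2 or i > len(p) - 2:
        raise IndexError("sensor needs the points i-2, ..., i+1")

    def ratio(c):
        num = abs(p[c + 1] - 2.0 * p[c] + p[c - 1])
        den = abs(p[c + 1] + 2.0 * p[c] + p[c - 1])
        return num / den

    return max(ratio(i), ratio(i - 1))


if __name__ == "__main__":
    smooth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # linear pressure: sensor vanishes
    kinked = [1.0, 1.0, 1.0, 2.0, 1.0, 1.0]   # local peak: sensor switches on
    print(pressure_sensor(smooth, 3), pressure_sensor(kinked, 3))
```

The sensor is zero wherever the pressure varies linearly, so the added dissipation is confined to regions where the pressure has large second differences.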
The time step is obtained from the following formula, which expresses the stability
limitation of the RK method, where CFL = 2:

(18)

where $\mu_{ij} = \iint_{D_{ij}} dx\,dy$. The convergence of the iterative process is monitored using
the behavior of the residual in the $L_2$ space

(19)

where $MN$ is the number of finite volumes.
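
The multistage scheme (8)-(10) can be tried out on a scalar model problem. The following is a minimal sketch, assuming a scalar unknown and a user-supplied residual function, so that the semi-discrete system is $W_t + \bar{R}W = 0$; the name `multistage_rk_step` is hypothetical. For the linear residual $\bar{R}W = \lambda W$ one step multiplies $W$ by $1 + z + z^2/2 + z^3/4$ with $z = -\lambda\Delta t$, which agrees with the exponential up to second order.

```python
import math


def multistage_rk_step(w, residual, dt, alphas=(0.5, 0.5, 1.0)):
    """One step of W^(r) = W^(0) - alpha_r * dt * R(W^(r-1)), r = 1..m."""
    w0 = w
    wr = w0
    for alpha in alphas:
        wr = w0 - alpha * dt * residual(wr)
    return wr


if __name__ == "__main__":
    # Model problem W_t + W = 0 with W(0) = 1; exact solution exp(-t).
    dt = 0.1
    w = 1.0
    for _ in range(10):
        w = multistage_rk_step(w, lambda value: value, dt)
    print(w, math.exp(-1.0))
```

In the flow solver the same loop would act on the vector of cell values, with the residual including the artificial viscosity term of Eqn. (11).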



3 Some numerical results


In this section we present numerical results achieved using the numerical methods
described above. First we show results for stationary cases, where the
outlet condition is stationary. For lower Reynolds numbers such as 500 or 1000 we
observe good convergence of the scheme and smooth results; for higher Reynolds
numbers we obtain good convergence when using artificial viscosity. We must
take into account that the edges between the bypass and the vessel are sharp. This
problem should be eliminated in the future. We can see zones of separation
in the bypass after the bifurcation and also in the domain after the contraction of the vessel.
In the second part we present our first results for unstationary
flow. We start from the steady solution for a specific Reynolds number and, using
the same method as before and only changing the outlet condition from stationary
to unstationary, we obtain results for unstationary flow. The results presented here
are for a 20 percent contraction of the vessel and for bypasses whose diameter is about 40
or 30 percent of the diameter of the vessel. The distance between the vessel and the
bypass is constant in the presented cases. For the unsteady solution it is necessary
to use $a \to \infty$ or computation in dual (artificial) time.

Acknowledgement: This work was partly supported by grant GACR No. 201/00/0684
and Research Plan MSM 210000010.

Fig. 1. Bypass flows no.1a, Re=1000, vector field of velocity

Fig. 2. Bypass flows no.1b, Re=1000, isolines of velocity

Re=1500, s=40

Fig. 3. Bypass flows no.2a, Re=1500, vector field of velocity

Fig. 4. Bypass flows no.2b, Re=1500, isolines of velocity

Re=2000, s=30

Fig. 5. Bypass flows no.3a, Re=2000, vector field of velocity

Re=2000, s=30

Fig. 6. Bypass flows no.3a, Re=2000, isolines of velocity

Re=2000, s=40

Fig. 7. Bypass flows no.4, Re=2000, vector field of velocity

Fig. 8. Bypass flows no.4, Re=2000, isolines of velocity

Re=500, s=40, t=100.2, per=4

Fig. 9. Bypass unstationary flows no.1, Re=500, isolines of velocity

Re=500, s=40, t=100.8, per=4

Fig. 10. Bypass unst. flows no.2, Re=500, isolines of velocity

Re=500, s=40, t=101.4, per=4

Fig. 11. Bypass unst. flows no.3, Re=500, isolines of velocity

Re=500, s=40, t=102.0, per=4

Fig. 12. Bypass unst. flows no.4, Re=500, isolines of velocity

Re=500, s=40, t=102.6, per=4

Fig. 13. Bypass unst. flows no.5, Re=500, isolines of velocity

Re=500, s=40, t=103.2, per=4

Fig. 14. Bypass unst. flows no.6, Re=500, isolines of velocity

Re=500, s=40, t=103.8, per=4

Fig. 15. Bypass unst. flows no.7, Re=500, vector field of velocity

References
1. R. J. LeVeque: Numerical Methods for Conservation Laws, Birkhäuser Verlag,
1990
2. A. Quarteroni, M. Tuveri, A. Veneziani: Computational Vascular Fluid Dynamics:
Problems, Models and Methods
3. R. Dvorak, K. Kozel: Mathematical modeling in aerodynamics, CVUT, 1996 (in
Czech)
4. S. Canic, D. Mirkovic: A Hyperbolic System of Conservation Laws in Modeling
Endovascular Treatment of Abdominal Aortic Aneurysm, International Series of
Numerical Mathematics Vol. 140, 2001
5. K. Kozel, V. Prokop: Numerical solution of incompresible viscous flows through
a channel with cavitas and a channel with bypass, Proceedings of "The 2nd International
Conference Of Applied Mathematics And Informatics At Universities
2002", Trnava, September 2002, p. 59-63
A Posteriori Estimation of Dimension
Reduction Errors

Sergey Repin¹, Stefan Sauter² and Anton Smolianski²

¹ V.A. Steklov Institute of Mathematics, Fontanka 27, 191 011 St. Petersburg,
Russia
² Institute of Mathematics, Zurich University, CH-8057, Zurich, Switzerland

Summary. A new a posteriori error estimator is presented for the verification of
the dimensionally reduced models stemming from elliptic problems on thin domains.
The original problem is considered in a general setting, without any specific
assumptions on the domain geometry, coefficients and the right-hand sides. The
estimator provides a guaranteed upper bound for the modelling error in the energy
norm, exhibits the optimal convergence rate as the domain thickness tends to zero
and accurately indicates the local error distribution.

1 Introduction

The method of dimension reduction is a popular approach frequently used
by engineers for the approximate solution of problems posed in thin domains.
The term "thin" means that the size of the original physical domain
along one coordinate direction is much smaller than along the others; this
allows one to make some simplifying assumptions on the behaviour of the exact
solution and to replace the original, for instance, three-dimensional problem
by a two-dimensional one. It is, however, clear that the solution of the new,
"reduced" problem will, in general, differ from the solution to the original
high-dimensional problem. Thus, the dimension reduction method unavoidably
produces an error that can be referred to as the dimension reduction
or the modelling error. The essential part of the model verification is, hence,
a reliable a posteriori control of the dimension reduction error.
Despite the practical importance of the topic, only a few a posteriori esti-
mators for the dimension reduction error have been introduced so far. In [10]
and [2] (see also [1]) residual-type estimators were proposed and proved reli-
able and efficient under the assumptions that the right-hand side of the given
equation is zero and the original domain is a plate with plane parallel faces. In
[3] and [8] implicit estimators based on the solution of local Neumann prob-
lems were developed; the estimators were intended for hierarchical modelling
and involved the solution of local three-dimensional problems.
In this work we propose a reliable and efficient a posteriori estimator for the
dimension reduction error in the energy norm, making no specific assumptions
on the right-hand side of the given equation and considering a general geometry
of the given domain. We show that, for the zero-order dimension reduction
method considered here, the estimator of Babuska and Schwab (see [1], [2])
can be obtained as a particular case of our estimator when the right-hand side
of the equation is zero and the original domain is a plate with plane parallel
faces. We demonstrate the optimal convergence of the estimator as the plate
thickness tends to zero (although it is worth noting that the proposed estimator
preserves its reliability for any positive thickness). Finally, we observe how
accurately the estimator indicates the local error distribution, thus allowing
for a local improvement of the model.

2 Problem setting

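We consider a three-dimensional Lipschitz domain

$$
\Omega := \{x \in \mathbb{R}^3 \mid (x_1, x_2) \in \hat\Omega,\ d_\ominus(x_1, x_2) < x_3 < d_\oplus(x_1, x_2)\},
$$

where $\hat\Omega \subset \mathbb{R}^2$ is its projection on the $(x_1, x_2)$-plane ($\hat\Omega$ has the Lipschitz
boundary $\hat\Gamma$) and $d_\ominus$ and $d_\oplus$ are Lipschitz continuous functions $\hat\Omega \to \mathbb{R}$. The
lower and upper faces of $\Omega$ are denoted by

$$
\Gamma_\ominus := \{x \in \mathbb{R}^3 \mid (x_1, x_2) \in \hat\Omega,\ x_3 = d_\ominus(x_1, x_2)\}
$$
and
$$
\Gamma_\oplus := \{x \in \mathbb{R}^3 \mid (x_1, x_2) \in \hat\Omega,\ x_3 = d_\oplus(x_1, x_2)\},
$$
the lateral boundary by
$$
\Gamma_0 := \{x \in \mathbb{R}^3 \mid (x_1, x_2) \in \hat\Gamma,\ d_\ominus(x_1, x_2) < x_3 < d_\oplus(x_1, x_2)\}
$$
(see Figure 1).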


Remark. We consider $d_\ominus$ and $d_\oplus$ as explicit functions of the $(x_1, x_2)$-coordinates
only for the sake of simplicity. The generalization of the theory to the case of an
arbitrary Lipschitz domain $\Omega$ presents no difficulty from the conceptual
point of view.
The assumption that the given domain $\Omega$ is "thin" can now be written as

$$
\mathrm{diam}\,\hat\Omega \gg \max_{\hat\Omega} d(x_1, x_2), \tag{1}
$$

where $d = d_\oplus - d_\ominus$ is the domain thickness, $d(x_1, x_2) \ge d^* > 0$ $\forall (x_1, x_2) \in \hat\Omega$.
Although the assumption is of a purely qualitative nature, it serves as a basis
for the derivation of the corresponding two-dimensional reduced model. We
also note that Figure 1 depicts a simplified case; in the geometrical
definitions we do not assume the domain thickness $d(x_1, x_2)$ to be constant.

Fig. 1. Sketch of the domain geometry

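In the domain $\Omega$ we consider a model elliptic problem

$$
-\mathrm{Div}(A\nabla u) = f \quad \text{in } \Omega, \tag{2}
$$
$$
u = 0 \quad \text{on } \Gamma_0, \tag{3}
$$
$$
A\nabla u \cdot \nu_\ominus = F_\ominus \quad \text{on } \Gamma_\ominus, \tag{4}
$$
$$
A\nabla u \cdot \nu_\oplus = F_\oplus \quad \text{on } \Gamma_\oplus, \tag{5}
$$

where $f \in L^2(\Omega)$, $F_\ominus, F_\oplus \in L^2(\hat\Omega)$, $\nu_\ominus$ and $\nu_\oplus$ are outward normal vectors
at $\Gamma_\ominus$ and $\Gamma_\oplus$ respectively. The matrix $A = (a_{ij}(x))_{i,j=1,3}$ with the components
from $L^\infty(\Omega)$ is symmetric and uniformly positive definite, i.e. there exist
constants $0 < c < C < \infty$ such that

$$
c\,|\xi|^2 \le A(x)\xi \cdot \xi \le C\,|\xi|^2 \quad \forall \xi \in \mathbb{R}^3, \ \text{for a.e. } x \in \Omega.
$$

From now on we will frequently use the notation $\hat x = (x_1, x_2)$, $x = (\hat x, x_3)$, and
all functions depending only on $(x_1, x_2)$ will be marked by a hat; in addition, we
will distinguish between the 3- and 2-dimensional divergence operators:

$$
\mathrm{Div}\,\tau^* = \tau^*_{1,1} + \tau^*_{2,2} + \tau^*_{3,3}, \qquad \mathrm{div}\,\hat\tau = \hat\tau_{1,1} + \hat\tau_{2,2}.
$$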


The weak form of the problem (2)-(5) reads

Problem (P): Find $u \in V_0 := \{v \in H^1(\Omega) \mid v = 0 \text{ on } \Gamma_0\}$ such that

$$
\int_\Omega A\nabla u \cdot \nabla w \, dx = \int_\Omega f\, w \, dx + \int_{\Gamma_\ominus} F_\ominus\, w \, ds + \int_{\Gamma_\oplus} F_\oplus\, w \, ds \quad \forall w \in V_0. \tag{6}
$$

3 The reduced problem


The assumption (1) allows one to suppose that

the exact solution $u \approx \mathrm{const}$ in the $x_3$-direction. (7)

This gives rise to the so-called zero-order reduced model for the original problem
(6). The model is very popular due to its simplicity and purely two-dimensional
formulation. A discussion of the hierarchy of reduced models
of different orders can be found in, e.g., [9], [2].
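Then, introducing the subspace

$$
\hat V_0 := \{v \in V_0 \mid \exists\, \hat v \in H_0^1(\hat\Omega) \text{ such that } v(x) = \hat v(\hat x) \text{ for a.e. } x = (\hat x, x_3) \in \Omega\}
$$

and the operation $\tilde{(\cdot)}$ of averaging in the $x_3$-direction

$$
\tilde g(\hat x) := \frac{1}{d(\hat x)} \int_{d_\ominus(\hat x)}^{d_\oplus(\hat x)} g(\hat x, x_3)\, dx_3 \quad \text{for a.e. } \hat x \in \hat\Omega,
$$

we can deduce from (6) the reduced problem (see [7]) that reads

Problem ($\hat P$): Find $\hat u \in \hat V_0$ such that

$$
\int_{\hat\Omega} d(\hat x)\, \hat A_p(\hat x) \nabla \hat u \cdot \nabla \hat w \, d\hat x = \int_{\hat\Omega} d(\hat x)\, \hat f(\hat x)\, \hat w \, d\hat x \quad \forall \hat w \in \hat V_0, \tag{8}
$$

where $\hat f = \tilde f + \dfrac{F_\ominus \sqrt{1 + |\nabla d_\ominus|^2} + F_\oplus \sqrt{1 + |\nabla d_\oplus|^2}}{d}$ and $\hat A_p(\hat x) = (\tilde a_{ij}(\hat x))_{i,j=1,2}$ is the
averaged "plane" part ($A_p(x) = (a_{ij}(x))_{i,j=1,2}$) of the matrix $A$.
It is clear that problem (8) is a two-dimensional elliptic problem with the
homogeneous Dirichlet boundary condition:

$$
-\mathrm{div}\,(d(\hat x)\, \hat A_p(\hat x) \nabla \hat u) = d(\hat x)\, \hat f(\hat x) \quad \text{in } \hat\Omega, \tag{9}
$$
$$
\hat u = 0 \quad \text{on } \hat\Gamma. \tag{10}
$$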

4 A posteriori estimation of the modelling error


In order to control the dimension reduction error $e := u - \hat u$, we apply the
functional-type a posteriori error estimate derived in [6] (see also [4] and [5])
to the original three-dimensional problem (6):


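For all $\gamma > 0$, $\delta > 0$ and $y^* \in H^*(\Omega, \mathrm{Div})$ there holds

$$
|||u - \hat u|||^2 \le (1 + \gamma) M_1^2 + \left(1 + \frac{1}{\gamma}\right)(1 + \delta)\, C_\Omega^2 M_2^2
+ \left(1 + \frac{1}{\gamma}\right)\left(1 + \frac{1}{\delta}\right) C_\Gamma^2 \left(1 + C_\Omega^2\right) M_3^2, \tag{11}
$$

where $|||\cdot|||$ is the energy norm, $|||v||| := \left(\int_\Omega A(x)\nabla v \cdot \nabla v \, dx\right)^{1/2}$ $\forall v \in V_0$,
$C_\Omega$ is the constant from Friedrichs' inequality
($C_\Omega^{-2} = \inf_{w \in V_0 \setminus \{0\}} \dfrac{|||w|||^2}{\|w\|^2_{L^2(\Omega)}}$), $C_\Gamma$
is the constant from the trace inequality
($C_\Gamma^2 = \sup_{w \in V_0 \setminus \{0\}} \dfrac{\|w\|^2_{L^2(\Gamma_\oplus)} + \|w\|^2_{L^2(\Gamma_\ominus)}}{|||w|||^2 + \|w\|^2_{L^2(\Omega)}}$)
and the functionals $M_1^2$, $M_2^2$, $M_3^2$ are defined as follows:

$$
M_1^2 := \int_\Omega (\nabla \hat u - A^{-1} y^*) \cdot (A \nabla \hat u - y^*) \, dx,
$$
$$
M_2^2 := \|\mathrm{Div}\, y^* + f\|^2_{L^2(\Omega)},
$$
$$
M_3^2 := \|F_\ominus - y^* \cdot \nu_\ominus\|^2_{L^2(\Gamma_\ominus)} + \|F_\oplus - y^* \cdot \nu_\oplus\|^2_{L^2(\Gamma_\oplus)}.
$$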

We emphasize that the estimate is valid for any positive numbers $\gamma$ and $\delta$ and
for any vector-function $y^*$ from the space $H^*(\Omega, \mathrm{Div})$ defined as

$$
H^*(\Omega, \mathrm{Div}) :=
\{y^* \in L^2(\Omega, \mathbb{R}^3) \mid \mathrm{Div}\, y^* \in L^2(\Omega),\ y^* \cdot \nu_\ominus \in L^2(\Gamma_\ominus),\ y^* \cdot \nu_\oplus \in L^2(\Gamma_\oplus)\}.
$$

While the best possible option would be to take as $y^*$ the exact flux $A\nabla u$
(then $M_2$ and $M_3$ would vanish and $M_1$ would give us the energy norm of
the exact error $e$), we have to restrict ourselves to choosing some computable
quantity, i.e. one not containing the unknown exact solution $u$. We approximate
the flux by
$$
y^* = \hat A_p \nabla \hat u + \tau^*, \tag{12}
$$
where $\tau^* = \{0, 0, \psi(x)\}^T$, $\psi$ is the auxiliary function from $L^2(\Omega)$ such that
$\psi_{,3} \in L^2(\Omega)$, $\psi \in L^2(\Gamma_\ominus)$ and $\psi \in L^2(\Gamma_\oplus)$. Using (9), it is easy to verify that
$y^*$ from (12) belongs to $H^*(\Omega, \mathrm{Div})$. A discussion about other choices of $y^*$
can be found in [7].
Substituting (12) into the functionals $M_1^2$, $M_2^2$, $M_3^2$, we obtain (see the
details in [7])

(13)

$$
M_2^2 = \left\| f - \tilde f - \frac{F_\ominus \sqrt{1 + |\nabla d_\ominus|^2} + F_\oplus \sqrt{1 + |\nabla d_\oplus|^2}}{d}
+ \psi_{,3} - \frac{\nabla d \cdot \hat A_p \nabla \hat u}{d} \right\|^2_{L^2(\Omega)}, \tag{14}
$$

(15)

where $\hat B_p$ is the averaged "plane" part of the matrix $B := A^{-1}$ (i.e., if $B(x) =
(b_{ij}(x))_{i,j=1,3}$, then $B_p(x) = (b_{ij}(x))_{i,j=1,2}$), the vector $b_3 := \{b_{31}, b_{32}\}^T$ and
$I$ is the $2 \times 2$ identity matrix.
Now we still have the freedom of choosing the auxiliary function $\psi$ that in
the case of the Poisson equation should, obviously, approximate the derivative
$u_{,3}$ of the exact solution in the $x_3$-direction. The simplest choice is to take such
a $\psi$ that the term $M_3$ (i.e. the residual of the Neumann boundary condition)
is identically zero. This can be immediately achieved by letting $\psi(x) =
\alpha(\hat x)\, x_3 + \beta(\hat x)$ with the coefficient functions $\alpha, \beta \in L^2(\hat\Omega)$ uniquely determined
by the requirement $M_3 = 0$. Other options for the function $\psi$ are considered
in [7]. Then, minimizing the right-hand side of (11) with respect to the scalar
parameters $\gamma > 0$ and $\delta > 0$, we arrive at the estimate

(16)

where $M_1$ and $M_2$ are defined by (13) and (14). The error majorant $M$ has
been derived for quite general geometry of $\Omega$ and coefficient matrix $A(x)$.
However, to make the estimate more transparent, we consider two particular
cases.
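
The minimization over the scalar parameters can be checked numerically. The sketch below is an illustration based on elementary calculus and is not taken from [7]: it assumes $M_3 = 0$, so that the last term of (11) drops out, and uses the hypothetical names `bound_sq` and `best_gamma`. For fixed $\delta$ the right-hand side is $(1+\gamma)a^2 + (1+1/\gamma)b^2$ with $a = M_1$ and $b = \sqrt{1+\delta}\,C_\Omega M_2$, which is minimized at $\gamma = b/a$ with the value $(a+b)^2$; letting $\delta \to 0$ then gives $(M_1 + C_\Omega M_2)^2$ as the limiting value.

```python
import math


def bound_sq(m1, m2, c_omega, gamma, delta):
    """Right-hand side of the estimate for M_3 = 0."""
    return ((1.0 + gamma) * m1 ** 2
            + (1.0 + 1.0 / gamma) * (1.0 + delta) * c_omega ** 2 * m2 ** 2)


def best_gamma(m1, m2, c_omega, delta):
    """Minimizer in gamma for fixed delta (requires m1 > 0 and m2 > 0)."""
    return math.sqrt(1.0 + delta) * c_omega * m2 / m1


if __name__ == "__main__":
    m1, m2, c_omega = 3.0, 2.0, 2.0          # illustrative numbers only
    for delta in (1.0, 0.1, 0.001, 0.0):
        gamma = best_gamma(m1, m2, c_omega, delta)
        print(delta, gamma, bound_sq(m1, m2, c_omega, gamma, delta))
    print((m1 + c_omega * m2) ** 2)
```

The printed values decrease towards $(M_1 + C_\Omega M_2)^2$ as $\delta$ decreases, which shows why the third term matters only when the boundary residual $M_3$ is nonzero.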

4.1 Plate of constant thickness

We assume that
$$
d_\oplus = d_\ominus + d_0 \quad (d_0 = \mathrm{const} > 0) \tag{17}
$$
and, in addition, that

$$
A = A(\hat x) \quad (\text{this immediately implies } B = B(\hat x)), \tag{18}
$$
$$
a_{31} = a_{32} = 0 \quad (\text{this yields } B_p = A_p^{-1},\ b_{33} = a_{33}^{-1},\ b_{31} = b_{32} = 0). \tag{19}
$$

With these assumptions the terms $M_1$ and $M_2$ in estimate (16) become simpler:

(20)

One may notice that the integral in the first term $M_1$ of the error majorant
$M$ can be rewritten as

which means that the term $M_1$ is of order $O(d_0^{1/2})$ when the plate thickness $d_0$
tends to zero. If $f \in L^\infty(\Omega)$, the second term $M_2$ is obviously of the same order
$O(d_0^{1/2})$, i.e. the whole estimator $M$ converges to zero with the rate $O(d_0^{1/2})$
as $d_0 \to 0$. This is the optimal convergence rate for the modelling error $e$ in
the energy norm, as was shown in [9] for the simpler case of a plate with plane
parallel faces and $f = 0$. It is worth noting that, if $f \in C^1(\Omega)$, the second
term in $M$ is of higher order $O(d_0^{3/2})$ as compared to the first term.

4.2 Plate with plane parallel faces

If, in addition to (18), (19), we strengthen the assumption (17), replacing it by

$$
d_\oplus = \frac{d_0}{2}, \quad d_\ominus = -\frac{d_0}{2} \quad (d_0 = \mathrm{const} > 0), \tag{21}
$$

the auxiliary function $\psi$ will take the simple form $\psi = \dfrac{F_\oplus + F_\ominus}{d_0}\, x_3 + \dfrac{F_\oplus - F_\ominus}{2}$
and the error estimate (16) will read

If we set here $f = 0$, $a_{33} = 1$ and $F_\oplus = F_\ominus = F$, we obtain

(23)

which is exactly the estimator of Babuska and Schwab (see [1]) for the zero-order
reduced model. Thus, the latter estimator can be obtained as a particular
case of the error majorant (16) if one makes the assumptions (18), (19), (21)
and sets $f = 0$. This is a particularly interesting fact, since we advocate an
estimation approach that is completely different from the one utilized in [1]
(see the details in [7] and [6]).
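
The linear auxiliary function for the plate with plane parallel faces can be checked with a short computation. The sketch below is our own illustration and does not reproduce a formula from the paper: it assumes constant face fluxes $F_\oplus$ and $F_\ominus$, outward normals $(0, 0, \pm 1)$, and evaluates the through-thickness integral of $\psi^2$, which for $a_{33} = 1$ is the quantity that enters the first majorant term per unit area of the mid-plane. The names `psi`, `thickness_integral` and `closed_form` are hypothetical.

```python
def psi(x3, f_plus, f_minus, d0):
    """Linear auxiliary function matching the face fluxes at x3 = +-d0/2."""
    return (f_plus + f_minus) / d0 * x3 + 0.5 * (f_plus - f_minus)


def thickness_integral(f_plus, f_minus, d0, n=2000):
    """Midpoint-rule approximation of the integral of psi^2 over (-d0/2, d0/2)."""
    h = d0 / n
    return h * sum(psi(-0.5 * d0 + (i + 0.5) * h, f_plus, f_minus, d0) ** 2
                   for i in range(n))


def closed_form(f_plus, f_minus, d0):
    """Exact value of the same integral."""
    return d0 * ((f_plus + f_minus) ** 2 / 12.0 + (f_plus - f_minus) ** 2 / 4.0)


if __name__ == "__main__":
    for d0 in (0.4, 0.2, 0.1):
        print(d0, thickness_integral(1.0, 1.0, d0), closed_form(1.0, 1.0, d0))
```

For equal fluxes $F_\oplus = F_\ominus = F$ the integral equals $F^2 d_0 / 3$, so under these assumptions the corresponding contribution to the majorant scales like $d_0^{1/2}$ as the plate becomes thin.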

5 Numerical example

In order to analyse the performance of the proposed error estimator, we consider
a simple two-dimensional test problem in the "sine-shape" domain (see
Figure 2 (left)) whose upper and lower faces are given by

$$
d_{\oplus,\ominus}(x) = \sin(k\pi x) \pm \frac{d_0}{2}, \quad k = 1, 2, \ldots,
$$

where $d_0 > 0$ is the domain thickness. In this example, $\hat\Omega = (0, 1)$ and
$\Omega = \{(x, y) \in \mathbb{R}^2 \mid x \in \hat\Omega,\ d_\ominus(x) < y < d_\oplus(x)\}$. The considered problem is

$$
-\Delta u = f \quad \text{in } \Omega,
$$
$$
u = 0 \quad \text{at } x = 0 \text{ and } x = 1,
$$
$$
\nabla u \cdot \nu_{\oplus,\ominus} = F_{\oplus,\ominus} \quad \text{at } y = d_{\oplus,\ominus},
$$

and the right-hand sides of the equation and of the boundary condition are
computed using the exact solution

$$
u(x, y) = \sin(\pi x) \cdot y^m \quad (m = 1, 2, \ldots).
$$


The reduced problem (8) is, in this case, a one-dimensional Dirichlet problem
that, of course, can be solved very accurately (in the present work, we address
the estimation of the modelling error only, assuming that the discretization
error stemming from the solution of the reduced problem is negligible).
Figure 2 (right) shows the convergence rates of the exact modelling error in
the energy norm ($|||e|||$) and of the error majorant $M$ as the domain thickness
$d_0$ tends to zero. It is clear that both the exact error and the majorant converge
to zero with the theoretically predicted, optimal rate $O(d_0^{1/2})$, and, moreover,
the effectivity index $\frac{M}{|||e|||}$ demonstrates the asymptotics $\frac{M}{|||e|||} = 1 + O(d_0)$. It
is also important to note that the presented error estimator provides a reliable
upper bound for the exact error at any positive value of the domain thickness
$d_0$, i.e. also in the cases when the domain is not "thin" at all.
Finally, the local error distributions provided by the exact error and by the
first term, $M_1$, of the majorant are depicted in Figure 3. The figure shows that
already for the rather large value of the domain thickness $d_0 = 0.1$ the majorant
delivers sufficiently accurate information on the location of the regions of the
biggest modelling error, while for $d_0 = 0.05$ the exact and the estimated error
distributions practically coincide.

References
1. Babuska, I., Lee, I., Schwab, C. (1994): On the a posteriori estimation of the
modeling error for the heat conduction in a plate and its use for adaptive hier-
archical modeling. Appl. Numer. Math., 14, 5-21
2. Babuska, I., Schwab, C. (1996): A posteriori error estimation for hierarchic mod-
els of elliptic boundary value problems on thin domains. SIAM J. Numer. Anal.,
33, 221-246
3. Oden, J.T., Cho, J.R. (1996): Adaptive hpq-finite element methods of hierarchical
models for plate- and shell-like structures. Comput. Meth. Appl. Mech. Engrg.,
136, 317-345

Fig. 2. (left) The domain geometry, (right) Convergence rate of the exact error
and of the error majorant, k = 2, m = 2, the majorant is indicated by "0"

4. Repin, S.I. (2000): A posteriori error estimation for variational problems with
uniformly convex functionals. Math. Comp., 69, 481-500
5. Repin, S.I., Sauter, S.A., Smolianski, A.A. (2003): A posteriori error estimation
for the Dirichlet problem with account of the error in the approximation of
boundary conditions. Computing, 70, 205-233
6. Repin, S.I., Sauter, S.A., Smolianski, A.A. (2003): A posteriori error
estimation for the Poisson equation with mixed Dirichlet/Neumann boundary
conditions. Preprint 02-2003, Institut of Mathematics, Zurich University,
http://www.math.unizh.ch/index.php?preprint
7. Repin, S.I., Sauter, S.A., Smolianski, A.A. (2003): A posteriori estimation
of dimension reduction errors for elliptic problems on thin domains.
Preprint 18-2003, Institut of Mathematics, Zurich University,
http://www.math.unizh.ch/index.php?preprint
8. Stein, E., Ohnimus, S. (1997): Coupled model- and solution-adaptivity in the
finite-element method. Comput. Meth. Appl. Mech. Engrg., 150, 327-350
9. Vogelius, M., Babuska, I. (1981): On a dimensional reduction method I. The
optimal selection of basis functions. Math. Comp., 37, 31-46

Fig. 3. Local error distribution provided by the exact error (solid line) and by the
M1-term of the majorant (dash-dot line), k = 4, m = 4: (left) do = 0.1, (right)
do = 0.05

10. Vogelius, M., Babuska, I. (1981): On a dimensional reduction method III. A
posteriori error estimation and an adaptive approach. Math. Comp., 37, 361-384
Analysis of a Multi-Numerics/Multi-Physics
Problem

Beatrice Riviere 1

Department of Mathematics, University of Pittsburgh, 301 Thackeray, Pittsburgh


PA 15260 [email protected]

Summary. The coupled Stokes and Darcy flows problem is solved by locally con-
servative numerical methods. Discontinuous Galerkin methods are used in the Stokes
region and discontinuous Galerkin methods coupled with mixed finite element meth-
ods are employed in the Darcy region. Optimal a priori error estimates are derived.

1 Introduction and Model Problem

In this work, a numerical method for solving the coupled problem of Stokes
and Darcy equations is formulated and analyzed. The coupled system arises
from the study of the interaction between surface and subsurface flow. Discon-
tinuous finite elements and mixed finite elements (MFE) are used in subregions
of the subsurface domain while only discontinuous finite elements are used in
the surface domain. While mixed elements are popular and efficient on regular
grids, discontinuous Galerkin (DG) methods are accurate and easily implementable
on highly unstructured meshes. The proposed method makes it possible to
take advantage of one of these locally conservative methods in a particular
subregion of the subsurface. This work is an extension of the coupling of DG
for Stokes and MFE for Darcy [8], and DG for Stokes and Darcy [7]. Similar
couplings are studied in the work of Layton, Schieweck and Yotov [5] and
Discacciati and Quarteroni [2, 3]. Let $\Omega$ be a domain in $\mathbb{R}^2$, subdivided into three
subdomains $\Omega_1$, $\Omega_2$ and $\Omega_3$. Denote by $\Gamma_{ij}$ the interface between $\Omega_i$ and $\Omega_j$
for $i < j$ (see Figure 1). Define also the boundary $\Gamma_i = \partial\Omega_i \cap \partial\Omega$, for $i = 1,2,3$.
The velocity $u_i$ (resp. pressure $p_i$) denotes the restriction of the fluid velocity
$u$ (resp. pressure $p$) to the subdomain $\Omega_i$. We assume that the fluid satisfies
the Stokes equations in $\Omega_1$.

Fig. 1. Computational domain

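\[ -\nabla \cdot (2\mu D(u_1) - p_1 I) = f_1 \quad \text{in } \Omega_1, \tag{1} \]
\[ \nabla \cdot u_1 = 0 \quad \text{in } \Omega_1. \tag{2} \]

Here, $f_1$ is an external force acting on the fluid, $\mu > 0$ is the constant fluid
viscosity and $D(u) = \frac{1}{2}(\nabla u + \nabla u^T)$ is the strain tensor. Let $n$ denote the unit
outward normal vector to the boundary $\partial\Omega$. The single phase flow problem is
solved on the region $\Omega_2 \cup \Omega_3$.

\[ \nabla \cdot u_i = f_i, \quad u_i = -K\nabla p_i \quad \text{in } \Omega_i,\ i = 2,3. \tag{3} \]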


The permeability tensor $K$ is symmetric, positive definite, and uniformly bounded
below and above. As boundary conditions, we consider $u = 0$ on the Stokes
boundary $\Gamma_1$, and $K\nabla p \cdot n = 0$ on the Darcy boundary $\Gamma_2 \cup \Gamma_3$. At the
interfaces, the transmissibility conditions arise from mass conservation, the
balance of forces across each interface, and the Beavers-Joseph-Saffman law
[1,9]. Denoting by $n_{ij}$ (resp. $\tau_{ij}$) the normal (resp. tangential) unit vector to
the interface $\Gamma_{ij}$, for $i < j$, the conditions are written as:

\[ u_1 \cdot n_{12} = -K\nabla p_2 \cdot n_{12} \quad \text{on } \Gamma_{12}, \tag{4} \]
\[ u_1 \cdot n_{13} = u_3 \cdot n_{13} \quad \text{on } \Gamma_{13}, \tag{5} \]
\[ u_3 \cdot n_{23} = -K\nabla p_2 \cdot n_{23}, \quad p_2 = p_3 \quad \text{on } \Gamma_{23}, \tag{6} \]
\[ p_1 - 2\mu\, (D(u_1) n_{1i}) \cdot n_{1i} = p_i \quad \text{on } \Gamma_{1i},\ i = 2,3, \tag{7} \]
\[ u_1 \cdot \tau_{1i} = -2G\, (D(u_1) n_{1i}) \cdot \tau_{1i} \quad \text{on } \Gamma_{1i},\ i = 2,3. \tag{8} \]

As the pressure is unique up to an additive constant, we assume that $\int_\Omega p = 0$.
A weak solution $(u, p)$ of the coupled Stokes-Darcy equations exists (see proof
in [5]). We assume that it is also a strong solution with enough regularity.
Section 2 defines the discrete spaces and the numerical method. Section 3
contains the error analysis.

2 Numerical Method
The discontinuous Galerkin method is used in the region $\Omega_1 \cup \Omega_2$, while the
mixed finite element method is used in the region $\Omega_3$. For $i = 1,2,3$, let $\mathcal{E}_h^i$ be
a non-degenerate quasi-uniform subdivision of $\Omega_i$, let $\Gamma_h^i$ be the set of interior
edges and let $h$ denote the maximum diameter of elements. We assume that
the meshes match at the interface $\Gamma_{13}$, but they may not match at the
interfaces $\Gamma_{12}$ and $\Gamma_{23}$. Given a fixed normal vector $n_e$ on each interior edge $e$,
pointing from $E_e^1$ to $E_e^2$, the average $\{w\}$ and jump $[w]$ of a function $w$ are defined on $e$.
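The defining formulas for the average and the jump are not reproduced in this copy. A common DG convention, which is presumably the one intended here (this is an assumption, not taken from the source), reads for an interior edge $e = \partial E_e^1 \cap \partial E_e^2$:

```latex
\{w\} = \tfrac{1}{2}\left( w|_{E_e^1} + w|_{E_e^2} \right),
\qquad
[w] = w|_{E_e^1} - w|_{E_e^2} .
```

With this choice the sign of the jump depends on the orientation of $n_e$, while products such as $\{K\nabla p \cdot n_e\}[q]$ do not.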

On a boundary edge, the jump and average coincide with the trace of the
function. For any integer $k \ge 0$, the Sobolev space on a domain $\mathcal{O}$ is denoted
by $H^k(\mathcal{O}) = \{v \in L^2(\mathcal{O}) : D^m v \in L^2(\mathcal{O}),\ \forall |m| \le k\}$, with norm $\|\cdot\|_{k,\mathcal{O}}$. We
denote by $L_0^2(\mathcal{O})$ the space of square-integrable functions with zero average,
with $L^2$ inner product $(\cdot,\cdot)_{\mathcal{O}}$.
Let $k_1$, $k_2$ and $k_3$ be positive integers. The DG discrete spaces are

\[ X_h^1 = \{v_1 \in (L^2(\Omega_1))^2 : \forall E \in \mathcal{E}_h^1,\ v_1 \in (\mathbb{P}_{k_1}(E))^2\}, \]
\[ M_h^1 = \{q_1 \in L^2(\Omega_1) : \forall E \in \mathcal{E}_h^1,\ q_1 \in \mathbb{P}_{k_1-1}(E)\}, \]
\[ M_h^2 = \{q_2 \in L^2(\Omega_2) : \forall E \in \mathcal{E}_h^2,\ q_2 \in \mathbb{P}_{k_2}(E)\}. \]

The MFE discrete spaces are the classical ones, such as the Raviart-Thomas
spaces [6]. We assume that the mixed velocity space $X_h^3 \subset H(\mathrm{div}; \Omega_3)$
contains polynomials of degree $k_3$ and the pressure space $M_h^3 \subset L^2(\Omega_3)$ contains
polynomials of degree $k_3 - 1$. We associate to these spaces the following norms:

\[ \|q_2\|_{M_h^2}^2 = \sum_{E \in \mathcal{E}_h^2} \|\nabla q_2\|_{0,E}^2 + \sum_{e \in \Gamma_h^2} \frac{\sigma_{2,e}}{|e|} \|[q_2]\|_{0,e}^2, \]

\[ \|v_3\|_{X_h^3}^2 = \|v_3\|_{0,\Omega_3}^2, \quad \|q_1\|_{M_h^1}^2 = \|q_1\|_{0,\Omega_1}^2, \quad \|q_3\|_{M_h^3}^2 = \|q_3\|_{0,\Omega_3}^2. \]

Here, $|e|$ denotes the measure of each edge $e$, and the parameters $\sigma_{1,e}$ and $\sigma_{2,e}$ are
positive constants defined later. Throughout the paper, $C$ denotes a generic
positive constant that is independent of the mesh size $h$. We now define the
global finite element spaces.

\[ X_h = \Big\{ v = (v_1, v_3) : v_i \in X_h^i,\ \forall \eta \in X_h^3 \cdot n_{13},\ \sum_{e \in \Gamma_{13}} \int_e \eta\, (v_1 - v_3) \cdot n_{13} = 0 \Big\}, \tag{9} \]
(10)
We recall a result proved in [4] that generalizes a Sobolev imbedding. There
exists a constant $C$ independent of $h$ such that

\lVI EXt \lq2 EM'/., IlvIIIa,57 l ::::; CIIvIIIx~, Ilq21Ia,572::::; CIIq21IM~'


(11)

We introduce the following bilinear forms $a_1 : X_h^1 \times X_h^1 \to \mathbb{R}$, $b_1 : X_h^1 \times M_h^1 \to \mathbb{R}$,
$a_2 : M_h^2 \times M_h^2 \to \mathbb{R}$, $a_3 : X_h^3 \times X_h^3 \to \mathbb{R}$ and $b_3 : X_h^3 \times M_h^3 \to \mathbb{R}$:

-2p, L
eEr~uTr
1 e
{VUln e } . [VI] + 2WI L
eEr~uTr
1 e
{VVln e } . lUll

+~ L 1(uI''T 12 )(vI''T 12 )+ ~ L l(uI''T 13 )(vI''T d ,

1 1
eE r 12 e eEr13 e

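\[ b_1(v_1, p_1) = -\sum_{E \in \mathcal{E}_h^1} \int_E p_1 \nabla \cdot v_1 + \sum_{e \in \Gamma_h^1 \cup \Gamma_1} \int_e \{p_1\} [v_1] \cdot n_e, \]

\[ \begin{aligned}
a_2(p_2, q_2) ={}& \sum_{E \in \mathcal{E}_h^2} \int_E K\nabla p_2 \cdot \nabla q_2 + \sum_{e \in \Gamma_h^2} \frac{\sigma_{2,e}}{|e|} \int_e [p_2][q_2] \\
&- \sum_{e \in \Gamma_h^2} \int_e \{K\nabla p_2 \cdot n_e\}[q_2] + \epsilon_2 \sum_{e \in \Gamma_h^2} \int_e \{K\nabla q_2 \cdot n_e\}[p_2].
\end{aligned} \]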

By introducing the parameters $\epsilon_1$, $\epsilon_2$ that take the value 1 or -1, we allow
for non-symmetric or symmetric bilinear forms $a_1$ and $a_2$. We assume that
in the non-symmetric case ($\epsilon_1 = 1$), the parameter $\sigma_{1,e}$ is equal to 1 and
in the symmetric case ($\epsilon_1 = -1$), the parameter $\sigma_{1,e}$ is bounded below by
a sufficiently large positive value. The same assumptions hold true for $\epsilon_2$ and
$\sigma_{2,e}$. Combining the subdomain bilinear forms, we define

\[ A(u,p;v,q) = a_1(u_1,v_1) + a_2(p_2,q_2) + a_3(u_3,v_3), \quad \forall u,v \in X_h,\ \forall p,q \in M_h, \tag{12} \]
\[ B(v,q) = b_1(v_1,q_1) - b_3(v_3,q_3), \quad \forall v \in X_h,\ \forall q \in M_h. \tag{13} \]

It is easy to show that there is a coercivity constant $\kappa > 0$ such that

Finally, we define a bilinear form $\Lambda : (X_h \times M_h) \times (X_h \times M_h) \to \mathbb{R}$ acting on
the interfaces $\Gamma_{12}$ and $\Gamma_{23}$.

With these forms, the numerical method is: find $(U, P) \in X_h \times M_h$ such that

\[ \forall (v,q) \in X_h \times M_h, \quad A(U,P;v,q) + B(v,P) + \Lambda(U,P;v,q) = (f_1, v_1)_{\Omega_1} + (f_2, q_2)_{\Omega_2}, \tag{15} \]
\[ \forall q \in M_h, \quad B(U, q) = -(f_3, q_3)_{\Omega_3}. \tag{16} \]

Lemma 1. If $(u,p)$ is the strong solution of the coupled Stokes-Darcy flow
problem (1)-(8), then $(u,p)$ satisfies the variational equations: for all $(v, q) \in X_h \times M_h$,

\[ A(u,p;v,q) + B(v,p) + \Lambda(u,p;v,q) = (f_1, v_1)_{\Omega_1} + (f_2, q_2)_{\Omega_2} - \int_{\Gamma_{13}} p_3 (v_1 - v_3) \cdot n_{13}, \tag{17} \]
\[ B(u,q) = -(f_3, q_3)_{\Omega_3}. \tag{18} \]

Proof. Multiplying the Stokes equation (1) by $v_1 \in X_h^1$, integrating by parts
over one element $E$, summing over all elements in $\mathcal{E}_h^1$ and using the regularity
of the strong solution yields:

L
E iE
r(2J1,D(UI) : D(VI) - PI \1 . VI)

-L
eEr~
1
e
{-PIJ + 2fLD(UI)}n e . [VI] + ELl eEr~ e
{2fLD(vr)}n e . lUll

L 1 (-PIJ + 2fLD( UI) )n12 . VI

- L 1(-PIJ + 2J1D(ur))n .
eEr'2 ur'3 e

VI +E L 12fLD(vI)n. UI = r fl' VI·


eEr, e eEr, e is?,
Using the interface conditions (7), (8), we obtain

\[ a_1(u_1,v_1) + b_1(v_1,p_1) + \sum_{e \in \Gamma_{12}} \int_e p_2\, v_1 \cdot n_{12} + \sum_{e \in \Gamma_{13}} \int_e p_3\, v_1 \cdot n_{13} = (f_1, v_1)_{\Omega_1}. \]

Similarly, multiplying (3) by a test function $q_2 \in M_h^2$, integrating by parts
on one element, summing over all elements in $\mathcal{E}_h^2$ and using the boundary
condition together with the conditions (4) and (6), we obtain

\[ a_2(p_2, q_2) - \sum_{e \in \Gamma_{12}} \int_e u_1 \cdot n_{12}\, q_2 + \sum_{e \in \Gamma_{23}} \int_e u_3 \cdot n_{23}\, q_2 = (f_2, q_2)_{\Omega_2}. \]

Rewriting (3) as $K^{-1} u_3 = -\nabla p_3$, multiplying by $v_3 \in X_h^3$,
integrating by parts and using (6), we easily obtain

\[ a_3(u_3, v_3) - b_3(v_3, p_3) - \sum_{e \in \Gamma_{13}} \int_e p_3\, v_3 \cdot n_{13} - \sum_{e \in \Gamma_{23}} \int_e p_2\, v_3 \cdot n_{23} = 0. \]

Finally, the regularity of u and a simple integration of (3) give:

The final result is obtained by adding the previous variational equations.

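We now recall some approximation properties satisfied by the spaces $X_h$ and
$M_h$ [4,8]. Given $v \in (H^1(\Omega_1 \cup \Omega_3))^2$, there exists $\tilde v \in X_h$ such that

\[ \forall q \in M_h, \quad B(\tilde v - v, q) = 0, \tag{19} \]
\[ \forall e \in \Gamma_h^1 \cup \Gamma_1,\ \forall q \in (\mathbb{P}_{k_1-1}(e))^2, \quad \int_e [\tilde v_1] \cdot q = 0, \tag{20} \]
\[ \forall e \in \Gamma_{12} \cup \Gamma_{13},\ \forall q \in (\mathbb{P}_{k_1-1}(e))^2, \quad \int_e (\tilde v_1 - v_1) \cdot q = 0, \tag{21} \]
\[ \forall e \in \Gamma_{23},\ \forall \eta \in X_h^3 \cdot n_{23}, \quad \int_e (\tilde v_3 - v_3) \cdot n_{23}\, \eta = 0. \tag{22} \]

If, in addition, $v_1 \in (H^{k_1+1}(\Omega_1))^2$ and $v_3 \in (H^{k_3+1}(\Omega_3))^2$, the approximation $\tilde v$
satisfies

\[ \|\tilde v - v\|_{0,\Omega_1} \le C h^{k_1+1} |v|_{k_1+1,\Omega_1}, \quad \|\tilde v - v\|_{X_h^1} \le C h^{k_1} |v|_{k_1+1,\Omega_1}, \quad \|\tilde v - v\|_{X_h^3} \le C h^{k_3+1} |v|_{k_3+1,\Omega_3}. \tag{23} \]

Define also the $L^2$ projection $\tilde p$ of the pressure $p$. If $p_i \in H^{s_i}(\Omega_i)$, we have

\[ \forall q \in M_h^i,\ \forall E \in \mathcal{E}_h^i,\ i = 1,2,3, \quad \int_E q\, (\tilde p - p) = 0, \tag{24} \]
\[ m = 0,1,\ i = 1,2,3, \quad \|\tilde p_i - p_i\|_{m,\Omega_i} \le C h^{s_i - m} |p_i|_{s_i,\Omega_i}. \tag{25} \]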

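Lemma 2. The discrete solution to (15), (16) exists and is unique.

Proof. In a finite-dimensional setting, it suffices to show that the solution is
unique. Set $f_i = 0$ and $(v, q) = (U, P)$ in (15), (16). This yields
$A(U,P;U,P) + \Lambda(U,P;U,P) = 0$. Since $\Lambda(v,q;v,q) = 0$, we are left with
$a_1(U_1,U_1) + a_2(P_2,P_2) + a_3(U_3,U_3) = 0$. This clearly implies that $U_1 = U_3 = 0$, and that
$P_2$ is a global constant over $\Omega_2$.
We now define

\[ \bar P = \frac{1}{|\Omega_1| + |\Omega_3|} \int_{\Omega_1 \cup \Omega_3} P \]

and consider $\hat P \in L^2(\Omega_1 \cup \Omega_3)$ such that $P = \bar P + \hat P$. Then the equation (15)
becomes

\[ B(v, \hat P) + (P_2 - \bar P) \Big( \int_{\Gamma_{12}} v_1 \cdot n_{12} - \int_{\Gamma_{23}} v_3 \cdot n_{23} \Big) = 0, \quad \forall v \in X_h. \tag{26} \]

Since $\hat P \in L_0^2(\Omega_1 \cup \Omega_3)$, there is a function $z \in (H_0^1(\Omega_1 \cup \Omega_3))^2$ such that
$-\nabla \cdot z = \hat P$. Denote by $\tilde z$ the approximation of $z$ satisfying (19)-(22), and
choose $v = \tilde z$ in (26). With property (19) and the regularity of $z$, we obtain

\[ \|\hat P\|_{0,\Omega_1 \cup \Omega_3}^2 = B(z, \hat P) = B(\tilde z, \hat P) = 0. \]

The equation (26), with $\hat P = 0$, becomes:

\[ (P_2 - \bar P) \Big( \int_{\Gamma_{12}} v_1 \cdot n_{12} - \int_{\Gamma_{23}} v_3 \cdot n_{23} \Big) = 0, \quad \forall v \in X_h. \]

Since $P$ belongs to $L_0^2(\Omega)$, the constants $P_2$ and $\bar P$ are related by

\[ |\Omega_2| P_2 + (|\Omega_1| + |\Omega_3|) \bar P = 0. \]

Choosing $v$ with a nonzero flux in the first equation gives $P_2 = \bar P$, and the second
relation then reads $|\Omega| \bar P = 0$. Hence $P_2 = \bar P = 0$, which concludes the proof.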
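We now finish this section with some trace and inverse inequalities needed for
the analysis. Let $E$ be a mesh element with diameter $h_E$. Let $k$ be a positive
integer. Then, there exists a constant $C$ independent of $h_E$ such that

\[ \forall \phi \in \mathbb{P}_k(E),\ \forall e \subset \partial E, \quad \|\phi\|_{0,e} \le C h_E^{-1/2} \|\phi\|_{0,E}, \tag{27} \]
\[ \forall \phi \in \mathbb{P}_k(E),\ \forall e \subset \partial E, \quad \|\nabla\phi \cdot n_e\|_{0,e} \le C h_E^{-1/2} \|\nabla\phi\|_{0,E}. \tag{28} \]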
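The $h_E^{-1/2}$ scaling in (27) can be checked by hand on a simple case. The Python sketch below assumes a square element $E = (0,h)^2$, the edge $e = \{x = h\}$ and the monomial $\phi(x,y) = x^k$; these choices and the function name `trace_ratio` are illustrative and do not come from the paper. Both norms are evaluated in closed form, so the ratio $\|\phi\|_{0,e} / \|\phi\|_{0,E}$ equals $\sqrt{2k+1}\, h^{-1/2}$.

```python
import math


def trace_ratio(h, k):
    """Return ||phi||_{0,e} / ||phi||_{0,E} for phi(x, y) = x**k
    on the square E = (0, h)^2 and the edge e = {x = h}."""
    # integral over e of phi^2 dy = h**(2k) * h
    edge_sq = h**(2 * k) * h
    # integral over E of phi^2 dx dy = h * h**(2k + 1) / (2k + 1)
    elem_sq = h * h**(2 * k + 1) / (2 * k + 1)
    return math.sqrt(edge_sq / elem_sq)


if __name__ == "__main__":
    k = 2
    for h in (1.0, 0.1, 0.01):
        # ratio * sqrt(h) does not depend on h; it equals sqrt(2k + 1)
        print(h, trace_ratio(h, k) * math.sqrt(h))
```

This only illustrates the dependence on $h_E$ for one polynomial; the constant $C$ in (27) also depends on $k$ and on the element shape.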

3 Error estimates
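Theorem 1. Let $(u,p)$ be the solution of the coupled problem (1)-(8) such
that $u|_{\Omega_i} \in (H^{k_i+1}(\Omega_i))^2$ for $i = 1,3$, $p|_{\Omega_i} \in H^{k_i}(\Omega_i)$ for $i = 1,3$, and
$p|_{\Omega_2} \in H^{k_2+1}(\Omega_2)$. Then the discrete solution $(U, P)$ of (15), (16) satisfies the
following estimate

\[ \begin{aligned}
\|U_1 - u_1\|_{X_h^1} + \|P_2 - p_2\|_{M_h^2} + \|U_3 - u_3\|_{X_h^3} \le{}& C h^{k_1} (|u|_{k_1+1,\Omega_1} + |p|_{k_1,\Omega_1}) \\
&+ C h^{k_2} |p|_{k_2+1,\Omega_2} + C h^{k_3} (|u|_{k_3+1,\Omega_3} + |p|_{k_3,\Omega_3}).
\end{aligned} \]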

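Proof. Define $\tilde u$ the approximation of $u$ satisfying the properties (19)-(23) and
$\tilde p$ the approximation of $p$ satisfying (24)-(25). Define $\chi = U - \tilde u$ and $\xi = P - \tilde p$,
and $\chi_i$, $\xi_i$ their restrictions to the subdomains $\Omega_i$. Subtracting (17) and (18)
from (15) and (16), choosing $v_1 = \chi_1$, $q_2 = \xi_2$ and $v_3 = \chi_3$ and using the
coercivity result (14), we obtain

\[ \begin{aligned}
\kappa \big( \|\chi_1\|_{X_h^1}^2 + \|\xi_2\|_{M_h^2}^2 + \|\chi_3\|_{X_h^3}^2 \big) \le{}& a_1(u_1 - \tilde u_1, \chi_1) + a_2(p_2 - \tilde p_2, \xi_2) + a_3(u_3 - \tilde u_3, \chi_3) \\
&+ b_1(\chi_1, p_1 - \tilde p_1) - b_3(\chi_3, p_3 - \tilde p_3) - b_1(u_1 - \tilde u_1, \xi_1) + b_3(u_3 - \tilde u_3, \xi_3) \\
&+ \int_{\Gamma_{12}} (p_2 - \tilde p_2)\, \chi_1 \cdot n_{12} - \int_{\Gamma_{12}} \xi_2\, (u_1 - \tilde u_1) \cdot n_{12} \\
&+ \int_{\Gamma_{23}} \xi_2\, (u_3 - \tilde u_3) \cdot n_{23} - \int_{\Gamma_{23}} (p_2 - \tilde p_2)\, \chi_3 \cdot n_{23} + \int_{\Gamma_{13}} p_3\, (\chi_1 - \chi_3) \cdot n_{13}.
\end{aligned} \]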

The pressure term $b_3(\chi_3, p_3 - \tilde p_3)$ vanishes because of (24). The terms
$b_1(u_1 - \tilde u_1, \xi_1)$ and $b_3(u_3 - \tilde u_3, \xi_3)$ vanish because of property (19) of the
approximation $\tilde u$. We now bound the remaining terms. By using Cauchy-Schwarz, trace
inequalities, and the approximation results (23) and (25), we can bound the
first three terms
1
al (UI - UI, Xl) + a2(P2 - P2, 6) + a3(u3 - U3, X3) :S 81lxllli~
1
+Ch2kllult+I,.rtl + 811611~~ + Ch2k2Ipl~2+1,.rt2
+~llx31Ii~ + Ch2k3+2Iul~dl,.rt3·
Because of the L2 projection, the first pressure term is reduced and bounded
as follows:

bl(XI,PI - PI) = L
eEr~ur,
1e {PI - pI}[xIl· ne

:S ~ L ~11I[XllI16,e+Ch2kllplt,.rt,.
eEr~ur,

The remaining terms are the interface terms. Using the approximation result
(25), the trace inequality (27) and Cauchy-Schwarz's inequality, we obtain

Similarly,

With the bound (11) and the approximation result (23), we have

r 6(UI- UI) ·nI2:S ~11611~~ +Ch2k'lult+I,.rt,·


Jr'2
The approximation result (23), the inverse inequality (27) and the bound (11)
give

Define P3 the L2 projection of P3 with respect to the L2 inner product on the


edge e. Since X belongs to X h and by definition of the projection, we have

r P3(X1 - X3) . n13


1
1 1
['13

= L (P3 - P~)(X1 - X3) . n13 = L (P3 - P~)x1 . n13·


eE['13 e eE['13 e

Assume that each edge e of Fr3 is shared by the elements E; E [~ and E~ E [~.
Define the constant Ce = lel- 1 X3 . n13· Ie
L
eE['13
1 e
(P3 - P~)X1 . n13 = L
eE['13
1e
(P3 - P~)(X1 . n13 - ce )

:::; Ch k3 L IP3Ik3,E~II\7xrllo,E~ :::; ~111\7xII16,J.ll + Ch2k3Ip3ILJ.l3·


eE['13

The theorem is obtained by combining all bounds.

Theorem 2. Under the assumptions of Theorem 1 and the additional assump-


tion that for any element E E [~! the edge fJEnr23 belongs to only one element
of [~! the approximation P satisfies the error estimate

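\[ \begin{aligned}
\|P - p\|_{0,\Omega} \le{}& C h^{k_1} (|u|_{k_1+1,\Omega_1} + |p|_{k_1,\Omega_1}) \\
&+ C h^{k_2} |p|_{k_2+1,\Omega_2} + C h^{k_3} (|u|_{k_3+1,\Omega_3} + |p|_{k_3,\Omega_3}).
\end{aligned} \]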

Proof. Subtracting (17) from (15), the error equation is

Vv E Xh, B(v,fJ = a1(u1 - U 1,V1) + a3(u3 - U 3,V3)

+ r (P2 - P2)V1 . n12 - r (P2 - P2)V3 . n23 + r P3(V1 - V3) . n13.


1 ['12 1['23 1
['12
(29)

Define ~ = (15711+ 15731)-1 IJ.l UJ.l3~ 1 and the function € E L6(57 1 U 573) by
€ = ~ -~.
B(v,O=B(v,€)+B(v,~)=B(v,€)-~(r V1·n12- r V3· n 23).
1 ['12 1['23
Let v E(HJ(571 U 573))2 such that -\7 . v = €
and Ilvllx~ + Ilvllx~ :::;
CII€llo,J.lI UJ.l3. Choose in (29) v = v the approximation of v, defined by (19)-
h

(23):

11€116,J.l UJ.l3= a1(u1 - U 1, v~) + a3(u3 - U 3, v~)


1

+ r (P2 - P2)V~ . n12 - r (P2 - P2)V~ . n23 + r P3(V~ - v~) . n13.


ln2 lG3 ln2

The additional assumption for the meshes at r 23 is needed to bound 23 (P2-Ir


Ie
P2)V~ . n23. Using the fact that v~ . n23 = 0 for e = fJE n r 23 and E EEl,
and the inequality Ilvhllxk + Ilvhllx~ ::; CII€llo,D,UD 31 we obtain:

Ilello,D,UD3::; CliUI -UIll xk +Cllu3-U31Ix~ +Cllp2-P21IM~ +Ch 2k 3Ip3Ik23,D3'
We now bound 11(llo,D2UD3= ((IDII + ID31)1/2. Since P E L§(D), we have

-- -1
e - IDII + ID31
1( _-)
D2 P2 P2 -
< ID21 I - - I
IDII + ID31 P 2 P2 O,D2
::; ClIP2- p211x~ + Chk2Ip2Ik2+I,D2'
Combining the results of Theorem 1 with the bounds above gives the optimal
error bound.
As a concluding remark, one can introduce Lagrange multipliers on the
interfaces and obtain an equivalent method. This allows the decoupling of the
subdomains, which may be advantageous for a parallel implementation.

References
1. G.S. Beavers and D.D. Joseph. Boundary conditions at a naturally impermeable
wall. J. Fluid Mech., 30:197-207, 1967.
2. M. Discacciati and A. Quarteroni. Analysis of a domain decomposition method
for the coupling of Stokes and Darcy equations. In Brezzi et al., editor, Numerical
Analysis and Advanced Applications - Enumath 2001, pages 3-20. Springer, Milan,
2003.
3. M. Discacciati and A. Quarteroni. Convergence analysis of a subdomain iterative
method for the finite element approximation of the coupling of Stokes and Darcy
equations. Computing and Visualization in Science, 2004. To appear.
4. V. Girault, B. Riviere, and M. Wheeler. A discontinuous Galerkin method with
non-overlapping domain decomposition for the Stokes and Navier-Stokes
problems. Mathematics of Computation, 2003. To appear.
5. W.J. Layton, F. Schieweck, and I. Yotov. Coupling fluid flow with porous media
flow. SIAM J. Numer. Anal., 40(6):2195-2218, 2003.
6. P. A. Raviart and J. M. Thomas. A mixed finite element method for 2nd order
elliptic problems. In Mathematical Aspects of the Finite Element Method, Lecture
Notes in Mathematics, volume 606, pages 292-315. Springer-Verlag, New York,
1977.
7. B. Riviere. Analysis of a discontinuous finite element method for the coupled
Stokes and Darcy problems. Technical Report TR-MATH 03-11, University of
Pittsburgh, 2003.
8. B. Riviere and I. Yotov. Locally conservative coupling of Stokes and Darcy flows.
SIAM J. Numer. Anal. Revised.
9. P. Saffman. On the boundary condition at the surface of a porous media. Stud.
Appl. Math., 50:292-315, 1971.
The Discontinuous Galerkin Method
for Singularly Perturbed Problems

Hans-Görg Roos 1 and Helena Zarin 2

1 Institute of Numerical Mathematics, Dresden University of Technology, Germany
roos@math.tu-dresden.de
2 Department of Mathematics and Informatics, University of Novi Sad, Serbia and
Montenegro [email protected]

Summary. A nonsymmetric discontinuous Galerkin method with interior penal-


ties is considered for convection-diffusion problems with parabolic layers. On an
anisotropic mesh with bilinear elements we prove error estimates (uniformly in the
perturbation parameter) in an integral norm associated with this method. On dif-
ferent types of interelement edges we derive the values of discontinuity-penalization
parameters. Numerical experiments support the theoretical results.

1 Introduction

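Consider the following boundary value problem with homogeneous Dirichlet
boundary conditions

\[ \begin{aligned}
-\varepsilon \Delta u + b_1 u_x + c u &= f && \text{in } \Omega = (0,1)^2, \\
u &= 0 && \text{on } \Gamma = \partial\Omega.
\end{aligned} \tag{1} \]

Here $0 < \varepsilon \ll 1$ represents a perturbation parameter and $b_1$, $c$ and $f$ are
real-valued functions defined on $\bar\Omega$. Assuming

\[ b_1(x) \ge \beta_1 > 0, \quad c(x) \ge 0, \quad c(x) - \frac{1}{2} \frac{\partial b_1}{\partial x}(x) \ge \gamma > 0, \quad x \in \bar\Omega, \tag{2} \]

the problem (1) belongs to the class of convection-diffusion problems whose
solutions exhibit parabolic layers in addition to an exponential layer.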
In contrast to convection-diffusion problems with only exponential layers,
not much is known concerning sharp estimates for derivatives of $u$. However,
for the problem (1) with $b_1(x) = 1$ and $c(x) = 0$, the following result
was proved in [10] using elliptic decompositions (the corresponding result from [13]
requires more compatibility):

Theorem 1. If $b_1(x) = 1$, $c(x) = 0$ and $f \in C^{3,\alpha}(\bar\Omega)$, for some $\alpha \in (0,1)$,
satisfies the compatibility condition $f(0,0) = f(0,1) = f(1,0) = f(1,1) = 0$,
then the solution $u$ of the boundary value problem (1) can be decomposed as
$u = S + E_1 + E_2 + E_3$, where for all $(x, y) \in \bar\Omega$ and $0 \le i + j \le 3$

8 jS 8 i+jE <
8xi+
_
'" . '" . x, y )
__ 1(
_ C E - i e -(I-x)/" ,
I
i 8yj (x, y) S; C,
I
1 1
ux'uyJ

with a constant i' > O.

The different nature of the vector field $b = (b_1, 0)$ produces a complete change
in the behaviour of the boundary layers. According to Theorem 1, besides the
regular boundary layer of exponential type at the outflow $x = 1$, there are
also parabolic layers located near the sides $y = 0$ and $y = 1$. In the sequel we
assume that the solution decomposition from the previous theorem also exists
for the more general problem (1)-(2).
The only result known so far for finite element methods (FEMs) on
layer-adapted meshes for the problem (1) is the estimate

\[ \varepsilon^{1/2} |u - u_h|_{H^1(\Omega)} \le C N^{-1} \ln N, \]

from [10] for the Galerkin FEM with linear or bilinear elements on a Shishkin
mesh (see also the survey paper [6]). Surprisingly, up to now there is no result
in the literature for the streamline-diffusion finite element method (SDFEM).
Furthermore, for the SDFEM the optimal choice of the stabilization parameter
near parabolic layers is an open problem, see the discussion in [4] and [5]. This was
also a reason for us to investigate an alternative discretization technique.
Here, for the numerical solution of (1)-(2) we use the h-version of a nonsymmetric
discontinuous Galerkin finite element method with interior penalties (the
NIPG method), [2], [7], [8], [9]. The technique from [2] is applied, but with
a bilinear interpolant in the error splitting instead of an $L^2$-projection onto a
finite element space. This allows us to use the well-known interpolation error
estimates for the problem (1)-(2). We finally show that on a specially chosen
shape-irregular mesh (Shishkin mesh), this method yields an error bound that
is uniform in the perturbation parameter. Since our discretization involves
the layer-adapted mesh whose construction directly uses information from the
solution decomposition, the technique for proving $\varepsilon$-uniform error estimates
from this paper cannot be applied on more general (nonrectangular) domains
$\Omega$, when such a decomposition is not available.

2 The nonsymmetric discontinuous Galerkin method

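Following [2] and the notation therein, let $\mathcal{T}$ be a general partitioning of the
domain $\Omega = (0,1)^2$ consisting of disjoint open axiparallel rectangles $\kappa$, such
that $\bar\Omega = \bigcup_{\kappa \in \mathcal{T}} \bar\kappa$. In contrast to [2] we allow anisotropic (shape-irregular)
meshes, but assume that there are no hanging nodes.
The broken Sobolev space of composite order $s = \{s_\kappa : \kappa \in \mathcal{T}\}$ is defined
by $H^s(\Omega, \mathcal{T}) = \{v \in L^2(\Omega) : v|_\kappa \in H^{s_\kappa}(\kappa),\ \forall \kappa \in \mathcal{T}\}$. Let us assume that
each $\kappa \in \mathcal{T}$ is an affine image of a fixed reference element $\hat\kappa = (-1,1)^2$, i.e.
$\kappa = F_\kappa(\hat\kappa)$. Then the finite element space is

\[ S(\Omega, \mathcal{T}, F) = \{v \in L^2(\Omega) : v|_\kappa \circ F_\kappa \in Q_1(\hat\kappa)\}, \]

where $F = \{F_\kappa : \kappa \in \mathcal{T}\}$ and $Q_1(\hat\kappa)$ is the space of bilinear functions defined
on $\hat\kappa$.
Let $\mathcal{E}$ be the set of all open one-dimensional element interfaces associated
with $\mathcal{T}$, and $\mathcal{E}_{int} \subset \mathcal{E}$ be the set of all edges $e \in \mathcal{E}$ contained in $\Omega$. Also,
let $\Gamma_{int} = \{x \in \Omega : x \in e \text{ for some } e \in \mathcal{E}_{int}\}$. Then for each $e \in \mathcal{E}_{int}$ we
define the jump and the mean value of a function $v \in H^1(\Omega, \mathcal{T})$ across $e$ by
$[v]_e = v|_{\partial\kappa \cap e} - v|_{\partial\kappa' \cap e}$ and $\langle v \rangle_e = (v|_{\partial\kappa \cap e} + v|_{\partial\kappa' \cap e})/2$, respectively. Here $e$ is
a common edge of the elements $\kappa$ and $\kappa'$, and $\partial\kappa$ denotes the union of all open
edges of $\kappa$. With each $e \in \mathcal{E}_{int}$ we associate the unit normal vector $\nu$ pointing
from $\kappa$ to $\kappa'$; if $e \subset \Gamma$, we take $\nu$ to be the unit outward normal vector $\mu$. We also
define the inflow and outflow parts of $\partial\kappa$ by $\partial_-\kappa = \{x \in \partial\kappa : b(x) \cdot \mu_\kappa(x) < 0\}$ and
$\partial_+\kappa = \{x \in \partial\kappa : b(x) \cdot \mu_\kappa(x) \ge 0\}$, respectively, where $\mu_\kappa(x)$ represents the
unit outward normal vector to $\partial\kappa$ at the point $x \in \partial\kappa$.
For any element $\kappa \in \mathcal{T}$ and $v \in H^1(\kappa)$, we denote by $v_\kappa^+$ the interior trace
of $v|_\kappa$ on $\partial\kappa$. If $\partial_-\kappa \setminus \Gamma \ne \emptyset$ for some $\kappa \in \mathcal{T}$, then for each $x \in \partial_-\kappa \setminus \Gamma$ there
exists a unique $\kappa' \in \mathcal{T}$ such that $x \in \partial_+\kappa'$. Now for a function $v \in H^1(\Omega, \mathcal{T})$
and for some $\kappa \in \mathcal{T}$ with the property $\partial_-\kappa \setminus \Gamma \ne \emptyset$, we define the outer trace
$v_\kappa^-$ of $v$ on $\partial_-\kappa \setminus \Gamma$ as the inner trace $v_{\kappa'}^+$ relative to those elements $\kappa'$ for which
$\partial_+\kappa' \cap (\partial_-\kappa \setminus \Gamma) \ne \emptyset$. The
jump of $v$ across $\partial_-\kappa \setminus \Gamma$ is defined by $\lfloor v \rfloor_\kappa = v_\kappa^+ - v_\kappa^-$. In order to simplify
the notation, in the sequel we omit indices in the terms $[v]_e$, $\langle v \rangle_e$ and $\lfloor v \rfloor_\kappa$.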
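Now, the weak formulation of (1) that corresponds to the NIPG method
reads

\[ \text{find } u_h \in S(\Omega, \mathcal{T}, F) \text{ such that } B(u_h, v_h) = \sum_{\kappa \in \mathcal{T}} \int_\kappa f v_h \, dx \quad \text{for all } v_h \in S(\Omega, \mathcal{T}, F), \tag{3} \]

where the bilinear form $B$ is given by

\[ \begin{aligned}
B(v,w) ={}& \sum_{\kappa \in \mathcal{T}} \Big( \varepsilon \int_\kappa \nabla v \cdot \nabla w \, dx + \int_\kappa (b \cdot \nabla v + c v) w \, dx \Big) + \int_\Gamma \sigma v w \, ds + \int_{\Gamma_{int}} \sigma [v][w] \, ds \\
&+ \sum_{\kappa \in \mathcal{T}} \Big( -\int_{\partial_-\kappa \cap \Gamma} (b \cdot \mu) v^+ w^+ \, ds - \int_{\partial_-\kappa \setminus \Gamma} (b \cdot \mu_\kappa) \lfloor v \rfloor w^+ \, ds \Big) \\
&+ \varepsilon \int_\Gamma \big( v (\nabla w \cdot \mu) - (\nabla v \cdot \mu) w \big) \, ds + \varepsilon \int_{\Gamma_{int}} \big( [v] \langle \nabla w \cdot \nu \rangle - \langle \nabla v \cdot \nu \rangle [w] \big) \, ds,
\end{aligned} \]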

for $v, w \in H^1(\Omega, \mathcal{T})$. Here $\sigma$ is called the discontinuity-penalization parameter,
and is defined by $\sigma|_e = \sigma_e$, $e \in \mathcal{E}$, where $\sigma_e$ is a nonnegative constant. In the
sequel we shall present the exact choices of $\sigma_e$, for all edges $e \in \mathcal{E}$.
As in [2], assuming that $u \in H^2(\Omega, \mathcal{T})$ and that $u$ and $\nabla u \cdot \nu$ are continuous
across each interior edge $e$, we obtain that $B$ satisfies the Galerkin orthogonality
property. Also, the bilinear form allows one to introduce the so-called
DG-norm

\[ \|v\|_{DG}^2 = B(v, v). \]

In [3] the authors proved the existence and uniqueness of the solution $u_h$ of the
discrete problem (3). All error estimates in that paper and in [2] are derived
in the DG-norm.
Specializing the result from [2] to the problem (1) on a shape-regular mesh
of maximum element diameter $h$, we obtain

(4)

with the choice $\sigma_e = \varepsilon h_e^{-1}$, where $h_e$ represents the length of an edge $e \in \mathcal{E}$.
In general, the estimate (4) is useless when $\varepsilon \to 0$, see Theorem 1. Therefore,
we use an a priori constructed layer-adapted mesh (Shishkin mesh) and show
that on such a partitioning robust convergence is guaranteed. For simplicity,
we use the standard conforming Shishkin mesh and avoid hanging nodes.

3 The discretization mesh and the interpolation error

Discretization mesh. For the discretization of the boundary value problem (1),
we use an anisotropic tensor-product Shishkin mesh with $(N + 1) \times (N + 1)$
mesh nodes that is adapted to the layers at $x = 1$, $y = 0$ and $y = 1$. Let $N$
be an integer divisible by 4 and let $\lambda_x$ and $\lambda_y$ be mesh transition parameters
defined by

where $\beta_1$ is the lower bound for the function $b_1$ and $\tilde\gamma$ is a constant from
the solution decomposition (Theorem 1). The domain $\Omega$ is split into
$\Omega = \Omega_{11} \cup \Omega_{12} \cup \Omega_{21} \cup \Omega_{22}$, with

\[ \Omega_{11} = [1 - \lambda_x, 1] \times ([0, \lambda_y] \cup [1 - \lambda_y, 1]), \quad \Omega_{12} = [1 - \lambda_x, 1] \times [\lambda_y, 1 - \lambda_y], \]
\[ \Omega_{21} = [0, 1 - \lambda_x] \times ([0, \lambda_y] \cup [1 - \lambda_y, 1]), \quad \Omega_{22} = [0, 1 - \lambda_x] \times [\lambda_y, 1 - \lambda_y]. \]

Then the intervals $[0, 1 - \lambda_x]$ and $[1 - \lambda_x, 1]$ are each uniformly dissected into $N/2$
subintervals to give the mesh $\Omega_x^N$ in the $x$-direction, while for $\Omega_y^N$ we dissect $[0, \lambda_y]$
and $[1 - \lambda_y, 1]$ into $N/4$ subintervals and $[\lambda_y, 1 - \lambda_y]$ into $N/2$ subintervals.
Taking the tensor product of $\Omega_x^N$ and $\Omega_y^N$, we obtain our final rectangular Shishkin
mesh.
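The construction just described can be sketched in a few lines of Python. The transition parameters used below, $\lambda_x = \min\{1/2, (2\varepsilon/\beta_1)\ln N\}$ and $\lambda_y = \min\{1/4, (2\sqrt{\varepsilon}/\tilde\gamma)\ln N\}$, are an assumption: the logarithmic expressions follow the values quoted later in the text, while the caps $1/2$ and $1/4$ are the usual Shishkin safeguard and are not confirmed by this copy. The names `shishkin_mesh`, `beta1` and `gamma` are illustrative.

```python
import math


def _uniform(a, b, n):
    """Nodes of n equal subintervals of [a, b]; the right endpoint is set exactly."""
    return [a + (b - a) * i / n for i in range(n)] + [b]


def shishkin_mesh(N, eps, beta1, gamma):
    """Return the 1D node lists (x, y) of a tensor-product Shishkin mesh on [0, 1]^2.

    Fine cells are placed near x = 1 (exponential layer) and near
    y = 0 and y = 1 (parabolic layers).
    """
    if N % 4 != 0:
        raise ValueError("N must be divisible by 4")
    # Assumed transition parameters (see the lead-in)
    lam_x = min(0.5, 2.0 * eps / beta1 * math.log(N))
    lam_y = min(0.25, 2.0 * math.sqrt(eps) / gamma * math.log(N))
    # x-direction: N/2 coarse cells on [0, 1 - lam_x], N/2 fine cells on [1 - lam_x, 1]
    x = _uniform(0.0, 1.0 - lam_x, N // 2) + _uniform(1.0 - lam_x, 1.0, N // 2)[1:]
    # y-direction: N/4 fine, N/2 coarse, N/4 fine cells
    y = (_uniform(0.0, lam_y, N // 4)
         + _uniform(lam_y, 1.0 - lam_y, N // 2)[1:]
         + _uniform(1.0 - lam_y, 1.0, N // 4)[1:])
    return x, y


if __name__ == "__main__":
    x, y = shishkin_mesh(16, 1e-4, 1.0, 1.0)
    print(len(x), len(y))  # number of nodes per direction, N + 1 each
```

The mesh nodes are the pairs $(x_i, y_j)$; storing only the two one-dimensional lists keeps the tensor-product structure explicit.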
On such a partitioning we need to introduce several types of
edges (depending on the type, we later determine the discontinuity-penalization
parameter $\sigma_e$ for each edge $e \in \mathcal{E}$). The edges of type I belong to the set
$(1 - \lambda_x, 1] \times ([0, \lambda_y) \cup (1 - \lambda_y, 1])$. They are part of the layer region (more precisely,
they lie in the corner layers) and their length is either $h_x = 2\lambda_x N^{-1}$ or
$h_y = 4\lambda_y N^{-1}$. Assuming $\lambda_x = (2\varepsilon/\beta_1) \ln N$, $\lambda_y = (2\sqrt{\varepsilon}/\tilde\gamma) \ln N$ and $\varepsilon \le C N^{-1}$,
we have $h_x \ll H_x := 2(1 - \lambda_x) N^{-1}$ and $h_y \ll H_y := 2(1 - 2\lambda_y) N^{-1}$.
Therefore we describe these edges as short; other edges are called long.
Edges of type II are also short and belong to the rest of the layer region
$\Omega_{12} \cup \Omega_{21}$. Since this region also contains long edges, we refer to them as
being of the third type. Type III also contains long edges near the layers
from the set $\Omega_{22}^* = \Omega_{22} \setminus \Omega_{22}^0$. Finally, edges of type IV are long and lie in
$\Omega_{22}^0 = [0, 1 - \lambda_x - H_x] \times [\lambda_y + H_y, 1 - \lambda_y - H_y]$.

Interpolation error. As already stated, in the error analysis, instead
of an $L^2$-projection onto a finite element space, we shall use a bilinear
interpolant $u^I$ of $u$ that vanishes on $\Gamma$. In order to obtain an $\varepsilon$-uniform estimate
for $\|u - u_h\|_{DG}$, we shall use various bounds on the interpolation error $\eta = u - u^I$.
For example, in [11] one can find $L^2$- and $L^\infty$-estimates of $\eta$

while [14] and [15] contain results for $\nabla\eta$. All these results hold under the
assumption $\sqrt{\varepsilon} \ln^2 N \le C$ and they are proved using the decomposition from
Theorem 1 and the technique from [1].
In the following section we present some of the key points of the error analysis
for the NIPG method on the Shishkin mesh when applied to the problem (1).
More details can be found in [14] and in the forthcoming paper [15].

4 Error analysis

We start the error analysis by first introducing the error decomposition
$u - u_h = \eta + \xi$, $\eta = u - u^I$, $\xi = u^I - u_h$. The final estimate for $\|u - u_h\|_{DG}$ is
then obtained from the triangle inequality.
First notice that from the Galerkin orthogonality property one has $\|\xi\|_{DG}^2 = -B(\eta, \xi)$.
Analyzing each term in $B(\eta, \xi)$ with the technique from [2], we
conclude that $\|\xi\|_{DG}$ can be estimated by terms that depend on the interpolation
error $\eta$, the data functions, $\varepsilon$ and the mesh. Collecting these expressions and the
expressions from $\|\eta\|_{DG}$, we conclude that in the final error bound for $\|u - u_h\|_{DG}$
the following terms are to be estimated:

1/2 1/2
h == ( lO'T)2 ds ) ( / 0'[T)]2 ds )
Jrmt
Is == ILl
I<ET I<
(b· \l~)T) dxl '

First it can be proved that, [14],


It ::; eN-lInN,
The treatment of the terms h ,14, ... ,I7 depends on the type of edge; more
precisely, we look for the contribution from each edge to these terms. Let hI,
742 H.-G. Roos, H. Zarin

Table 1. Different types of edges and corresponding parameters 0" e

edge type parameter O"e

I,II horizontal ,fiN/InN


I, II vertical N/lnN
III horizontal N (e C 0;2)' ,fiN/In N (otherwise)
III vertical N(ecOL), N / In N (otherwise)
IV horizontal eN
IV vertical eN

$I_{j,II}$, $I_{j,III}$ and $I_{j,IV}$ denote the contributions of edges of type I, II, III and IV to the terms $I_j$, $j = 3, 4, \dots, 7$, respectively. Assuming that the values of the discontinuity-penalization parameters are the same for all edges of the same type, then from $L^\infty$-interpolation error estimates for $\eta$ and $\nabla\eta$ we have

+ h,I :::; CdN-~ ln2 N,


h,I + Is,I + h,I :::; Cc~N-lln2 N,
/ 4 ,1

h,n + h,I! :::; CdN-~ ln 2 N, hn + Is,n + h,I! :::; CdN- 1 ln2 N,


h,m + I 6 ,m :::; CN-"2ln N, I 4 ,m + Is,m + h,m :::; CN-lln~ N,
.:l.
3
2

h,Iv + I 6 ,IV :::; cd N-~ , I 4 ,IV + Is,IV + h,Iv :::; cd N- 1 .


The choices of the discontinuity-penalization parameter $\sigma_e$ are summarized in Table 1.
We proceed with the term $I_8$ which reduces to

The first term in $I_8$ can be estimated with

while for the second term we have

Thus,
$$I_8 \le C N^{-1} \|\xi\|_{DG}.$$
At the end, it can be proved, see [14], that

$$I_9 \le C N^{-2}, \qquad I_{10} \le C N^{-2},$$
$$I_{11} \le C N^{-2} \ln^2 N, \qquad I_{12} \le C N^{-3/2} \ln^2 N,$$
$$I_{13} \le C N^{-2} \ln^2 N, \qquad I_{14} \le C N^{-3/2} \ln^2 N.$$

Collecting the previously given estimates for $I_1, I_2, \dots, I_{14}$, we observe that the long edges from the layer region (type III edges) produce the error of the highest order. Therefore, the main result for the NIPG method on the anisotropic Shishkin mesh for the convection-diffusion problem (1) with regular and parabolic layers reads

Theorem 2. Let $u$ be a solution of the convection-diffusion problem (1) and let $u_h$ be a solution of the discrete problem (3) on the Shishkin mesh. Assuming $\sqrt{\varepsilon}\,\ln^2 N \le C$ and (2), and choosing the penalty parameter as in Table 1, we have

5 Numerical experiments

In the sequel we experimentally verify the theoretical result from Theorem 2. We test the discontinuous Galerkin finite element method (3) on the anisotropic Shishkin mesh when applied to the problem (1) with $b_1(x) = c(x) = 1$. The right-hand side $f$ is chosen such that the function

$$u(x, y) = x \left(1 - e^{-(1-x)/\varepsilon}\right) \left(1 - e^{-y/\sqrt{\varepsilon}}\right) \left(1 - e^{-(1-y)/\sqrt{\varepsilon}}\right)$$

is the exact solution.
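The layer structure of this test solution can be inspected with a few lines of code. The following is a minimal sketch, assuming the domain is the unit square (which the formula suggests but this section does not restate); the function name `exact_solution` and the sample points are illustrative choices, not part of the original experiments.

```python
import math

def exact_solution(x, y, eps):
    """Evaluate u(x, y) = x (1 - e^{-(1-x)/eps}) (1 - e^{-y/sqrt(eps)}) (1 - e^{-(1-y)/sqrt(eps)})."""
    s = math.sqrt(eps)
    regular = 1.0 - math.exp(-(1.0 - x) / eps)        # regular layer at x = 1, width O(eps)
    parabolic_low = 1.0 - math.exp(-y / s)            # parabolic layer at y = 0, width O(sqrt(eps))
    parabolic_high = 1.0 - math.exp(-(1.0 - y) / s)   # parabolic layer at y = 1
    return x * regular * parabolic_low * parabolic_high

eps = 1e-3  # assumed sample value of the perturbation parameter

# The solution vanishes on all four sides of the (assumed) unit square.
boundary_values = [
    exact_solution(0.0, 0.3, eps),
    exact_solution(1.0, 0.3, eps),
    exact_solution(0.4, 0.0, eps),
    exact_solution(0.4, 1.0, eps),
]

# Away from the layers the solution is close to the smooth factor x.
interior_value = exact_solution(0.5, 0.5, eps)
```

Away from the three layers the exponential factors are negligible, so the solution is essentially $x$ there; this is why layer-adapted meshes concentrate points near $x = 1$, $y = 0$ and $y = 1$.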


Table 2 presents the maximum values for E = 10- 3 , ... , lO- s of different
norms of the error $e_h = u - u_h$. The first column corresponds to the DG-norm $\|e_h\|_{DG}$, while the second corresponds to an estimate of the maximum norm of the error, denoted by $\|e_h\|_{d,\infty}$. We compute this estimate using an auxiliary mesh that contains 25 uniformly distributed mesh points per element. We also compute the values of the $L^2$-norm of the error $\|e_h\|_{L^2(\Omega)}$, as well as the $L^\infty$-norm of the jumps $\|[u_h]\|_\infty$ along the edges. We observe that the numerical results for the DG-norm of the error $e_h$ are better than those predicted in Theorem 2. These results also indicate second-order accuracy of the jumps along interior edges and 3/2 as the order of convergence of the $L^2$-norm of the error.

Table 2. The NIPG method on the Shishkin mesh for the test problem

        $\|e_h\|_{DG}$        $\|e_h\|_{d,\infty}$    $\|e_h\|_{L^2(\Omega)}$   $\|[u_h]\|_\infty$
  N     error      rate     error      rate     error      rate     error      rate
  8     2.200(-1)  0.850    3.465(-1)  0.572    5.099(-3)  0.762    2.113(-1)  1.301
 16     1.220(-1)  1.135    2.332(-1)  0.759    3.007(-3)  1.016    8.576(-2)  1.725
 32     5.555(-2)  1.346    1.378(-1)  0.954    1.487(-3)  1.299    2.594(-2)  1.996
 64     2.185(-2)  1.488    7.115(-2)  1.200    6.044(-4)  1.485    6.504(-3)  2.133
128     7.788(-3)           3.097(-2)           2.159(-4)           1.483(-3)

For this test problem, the NIPG method on the anisotropic Shishkin mesh is inferior in $\|e_h\|_{d,\infty}$ compared to the bilinear Galerkin FEM and the bilinear SDFEM (with the streamline-diffusion parameters as in [12], or in [5]). Nevertheless, it can be expected that this method, or a more general DGFEM, will outperform the streamline-diffusion FEM for problems with more complicated layer structures, where the flexibility of the DGFEM with respect to the mesh is useful.
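The "rate" columns of Table 2 appear to be the usual observed orders of convergence computed from consecutive rows. The sketch below assumes that N doubles from row to row, so that the rate is $\log_2(e_N / e_{2N})$; the helper name `observed_rates` is hypothetical, and the error values are copied from the DG-norm column.

```python
import math

def observed_rates(errors):
    """Observed convergence orders log2(e_N / e_2N) for errors listed with N doubling."""
    return [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]

# DG-norm errors from Table 2 for N = 8, 16, 32, 64, 128.
dg_errors = [2.200e-1, 1.220e-1, 5.555e-2, 2.185e-2, 7.788e-3]
rates = observed_rates(dg_errors)
tabulated = [0.850, 1.135, 1.346, 1.488]
```

Because the tabulated errors are rounded to four digits, the recomputed rates can differ from the printed ones in the last digit.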

References
1. Dobrowolski, M., Roos, H.-G. (1997): A priori estimates for the solution of convection-diffusion problems and interpolation on Shishkin meshes. J. Anal. Appl., 16, 1001-1012
2. Houston, P., Schwab, C., Süli, E. (2002): Discontinuous hp-Finite Element Methods for Advection-Diffusion-Reaction Problems. SIAM J. Numer. Anal., 39, 2133-2163
3. Houston, P., Schwab, C., Süli, E. (2000): Discontinuous hp-finite element methods for advection-diffusion problems. Technical Report NA-00/15, Oxford University Computing Laboratory, Oxford, UK
4. Kopteva, N.V. (2001): How accurate is the streamline-diffusion FEM inside parabolic layers? Lecture presented at the 19th Biennial Dundee Conference on Numerical Analysis
5. Linß, T. (2002): Anisotropic meshes and streamline-diffusion stabilization for convection-diffusion problem. Preprint MATH-NM-11-2002, Institut für Numerische Mathematik, Technische Universität Dresden, Germany
6. Linß, T. (2003): Layer-adapted meshes for convection-diffusion problems. Comput. Meth. Appl. Mech. Eng., 192, 1061-1105
7. Oden, J.T., Babuška, I., Baumann, C.E. (1998): A discontinuous hp finite element method for diffusion problems. J. Comput. Phys., 146, 491-519
8. Prudhomme, S., Pascal, F., Oden, J.T., Romkes, A. (2000): Review of a priori error estimation for discontinuous Galerkin methods. TICAM Report 00-27, Texas Institute for Computational and Applied Mathematics, Austin, USA
9. Rivière, B., Wheeler, M.F., Girault, V. (2001): A priori error estimates for finite element methods based on discontinuous approximation spaces for elliptic problems. SIAM J. Numer. Anal., 39, 902-931
10. Roos, H.-G. (2002): Optimal convergence of basic schemes for elliptic boundary value problems with strong parabolic layers. J. Math. Anal. Appl., 267, 194-208
11. Roos, H.-G., Skalický, T. (1997): A comparison of the finite element method on Shishkin and Gartland-type meshes for convection-diffusion problems. CWI Quarterly, 10, 277-300
12. Roos, H.-G., Stynes, M., Tobiska, L. (1996): Numerical Methods for Singularly Perturbed Differential Equations. Springer-Verlag, Berlin
13. Shishkin, G.I. (1992): Discrete Approximation of Singularly Perturbed Elliptic and Parabolic Equations. Russian Academy of Sciences, Ural Section, Ekaterinburg (in Russian)
14. Zarin, H. (2003): Finite element methods for singularly perturbed problems with special emphasis on discontinuities. PhD Thesis, University of Novi Sad, Serbia and Montenegro
15. Zarin, H., Roos, H.-G. (2003): Interior penalty discontinuous approximations of convection-diffusion problems with parabolic layers. Submitted to Numer. Math.
A Finite-Volume Mass- and
Vorticity-Conserving Shallow-Water Model
using Penta-/Hexagonal Grids

William Sawyer and Rolf Jeltsch

Swiss Federal Institute of Technology (ETH Zurich)

Summary. A finite-volume scheme using penta-/hexagonal (PH) grids is presented for the shallow water model on the sphere. The irregular structure of the PH grid presents new challenges, e.g. for the calculation of energy gradients. Radial basis functions (RBFs) are employed for the accurate and efficient approximation of these values. The resulting algorithm is shown to be mass- and vorticity-conserving, and initial numerical results are presented.

1 Introduction
We present a mass- and vorticity-conserving finite-volume algorithm to solve
the shallow-water equations on the sphere (SWES), where the sphere is
spanned by a grid of pentagonal and hexagonal cells (hereafter referred to
as a PH grid).
The vector-invariant formulation of the SWES is

$$\frac{\partial h}{\partial t} + \nabla \cdot (\mathbf{v} h) = 0, \tag{1}$$
$$\frac{\partial \mathbf{v}}{\partial t} + (\zeta + f)\,\mathbf{k} \times \mathbf{v} + \nabla(\kappa + \Phi) = 0, \tag{2}$$

where $\mathbf{v}$ is the velocity field, $\zeta = \mathbf{k} \cdot (\nabla \times \mathbf{v})$ is the relative vorticity, $f = 2\omega \sin\theta$ the Coriolis parameter ($\omega$ is the angular velocity of the earth, $\theta$ the latitude), $\mathbf{k}$ the vertical unit vector, $h$ the depth of the fluid (directly proportional to the fluid mass in the cell), $\kappa$ the kinetic energy, and $\Phi = gh$ the potential. $\eta = \zeta + f$ is also referred to as the absolute vorticity.
This paper is an extension of work by Lin and Rood [10], who proposed a mass- and vorticity-conserving algorithm for a standard orthogonal latitude-longitude (lat-lon) grid. While [10] has proved successful in atmospheric modeling, it makes assumptions about the orthogonality of the underlying grid.
Such grids have a singularity at the poles, which must be dealt with specially.
On PH grids, which have no pole problem, the algorithm is not immediately
applicable.
A short overview of PH grids is given in Section 2. In Section 3 it is then
shown that the inherent limitation in [10] to orthogonal grids lies only in its
advection algorithm. Assuming the availability of an appropriate advection algorithm for PH grids, a related formulation exists for general non-orthogonal grids. One problem with such grids, however, is the additional difficulty of calculating the gradients in equation (2). A technique for calculating these to higher order is described. First numerical results of tests proposed in [14] are presented in Section 4.
Although only the two-dimensional SWES are discussed here, the ultimate
goal of our research is to create a three-dimensional solver for atmospheric
dynamics. A technique for extending the 2D SWES to the 3D problem is
treated in [8] and will not be treated further here.

2 Overview of Penta-/Hexagonal Grids

The use of penta-/hexagonal grids - which cover the sphere with twelve spheri-
cal pentagons and an arbitrary number of hexagons - for atmospheric modeling
is not new. A summary of finite-difference methods on such grids was given by
Williamson in [13]. A milestone in the use of PH grids was achieved by Heikes
and Randall [4] who built much of the foundation for a genuine General Cir-
culation Model (GCM). Baumgartner, Majewski, et al. [6] built a production
model based on these grids which now provides numerical weather forecasts
at the German Weather Service (DWD).

Fig. 1. The standard lat-lon grid (left) illustrates the Mercator projection of surface coordinates to the rectangular domain $[-\pi/2, \pi/2] \times [-\pi, \pi]$. The PH grid (right) instead decomposes the sphere into 12 spherical pentagonal cells and the rest into hexagonal cells.

A PH grid can be obtained by first distributing n points over the globe


in a relatively even manner. A common method, used in [4, 6] among others,
is to start with an icosahedron (consisting of 20 equilateral triangles). Each
triangle is then subdivided recursively into 4 equilateral triangles until the

desired resolution is obtained. This procedure gives rise to an icosahedral grid.


Heikes et al. [4] use roughly the same technique although the grid is twisted
to maintain a northern/southern hemisphere symmetry. From this grid a PH
grid can be constructed, e.g. by considering the perpendicular bisectors of
each triangle's edge. In this case, the cells form the Voronoi complement grid of the icosahedral grid. The Voronoi cell $C_k$ is the set of points on the sphere equidistant or closer to triangle vertex $k$ than to any other.
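The recursive subdivision described above fixes the size of the resulting grids. The following sketch counts triangles, edges and vertices after a given number of 4-way subdivisions, using Euler's formula for a closed triangulated sphere; the function name and the dictionary layout are illustrative choices. Each triangle vertex becomes one PH cell, and only the 12 original icosahedron vertices have five neighbors.

```python
def icosahedral_grid_counts(levels):
    """Sizes of the icosahedral grid after `levels` recursive 4-way triangle subdivisions."""
    triangles = 20 * 4 ** levels
    edges = 3 * triangles // 2           # every edge is shared by exactly two triangles
    vertices = edges - triangles + 2     # Euler's formula V - E + F = 2 on the sphere
    return {
        "triangles": triangles,
        "edges": edges,
        "vertices": vertices,            # one PH cell per triangle vertex
        "pentagons": 12,                 # the original icosahedron vertices
        "hexagons": vertices - 12,
    }

level0 = icosahedral_grid_counts(0)
level1 = icosahedral_grid_counts(1)
level4 = icosahedral_grid_counts(4)
```

Grids built from other point distributions, as discussed in the following paragraphs, are not restricted to these particular cell counts.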
Icosahedral grids are not the only basis for the class of PH grids. Given
a non-degenerate set of points on the sphere, the spherical convex hull can be
constructed. The facets of this hull are triangles. Care must be taken with the
choice of the triangle vertices so that the intersection (subsequently referred
to as the cell vertex) of the three perpendicular bisectors will be inside the
triangle. Around any given triangle vertex the cell vertices will form either
pentagons or hexagons (not necessarily regular, but by their construction con-
vex). Finally, the surface is "inflated" to form a sphere (see Figure 1). Several
issues then arise due to the spherical geometry which are described in [11] but
will be passed over here for sake of brevity.
A non-degenerate distribution of four or more points yields a convex hull
with triangular faces. With twelve or more points, the Voronoi dual diagram
will be a PH grid. The points can be evenly distributed by minimizing a poten-
tial function, in which each point on the sphere can be considered as a point
charge. An exhaustive study of such distributions has been made by Sloane,
et al. [3]. Finally the cell centers can be defined as the cell's barycenter; there
are numerical advantages for this choice. The barycenter does not necessarily
coincide with the corresponding triangle vertex.
Heikes and Randall [5] point out that even if the perpendicular bisector
of the triangle edge is used, the midpoint of the chord joining the cell centers
will not generally coincide with the midpoint of the chord connecting the cell
vertices (cell edge midpoint or flux point). This fact leads to problems if simple
finite differences are used to calculate, for example, the gradient in the flux
point, and can have a negative influence on the order of the algorithm. In
[5] a revised method of placement for the triangle vertices is suggested which
minimizes a (somewhat artificial) norm describing midpoint alignment. In [12],
a more founded approach is taken: the vertices are conceptually connected by
springs and allowed to adjust their positions by spring dynamics. The resulting
configuration still has the advantages of the icosahedral grid, while improving
the algorithm's order.
Several authors, e.g. [6, 13], have pointed out the advantages of PH grids
for atmospheric modeling. In the first place, they avoid the "pole" problem
of lat-lon grids, namely the convergence of the meridians at the poles (small
grid cells can violate a CFL condition). Moreover, PH grid cells only have
edge neighbors. That is, there is no pair of cells which share only a cell vertex.
This allows a straightforward application of finite volumes, as will be seen
subsequently.

3 Shallow Water Model on the Sphere with PH Grids

By taking the curl and divergence of equation (2), respectively, we arrive at


the vorticity/divergence form of the SWEs:

(3)

(4)

An expression for the velocity can be found by solving elliptic equations to


determine the stream function and velocity potential:

Vorticity:   $\eta = \Delta\psi$ with stream function $\psi$
Divergence:  $\delta = \Delta\chi$ with velocity potential $\chi$
Velocity:    $\mathbf{v} = \mathbf{k} \times \nabla\psi + \nabla\chi$.

This approach is taken in [6], for example, at considerable computational expense, and it seems worthwhile to look for a more computationally expedient method.
For lat-lon grids, an explicit mass- and vorticity-conserving shallow-water model was proposed in [10]. To derive this algorithm, equation (2) was formulated in lat-lon coordinates, thus inherently limiting it to orthogonal grids. The elegance of the method lies in the numerical treatment of the vector-invariant formulation under consideration of the vorticity equation (3), which implies that the vorticity $\eta$ is merely advected. Thus a local constraint is imposed: local changes in the vorticity can only affect the region within the propagation zone of the advection.
The explicit time stepping method for the grid cell $\Omega_{i,j}$ at the $(i,j)$th location, presented in [10], is:

hn+l = hn + F(u*, Llt; he) + G(v*, Llt; h)..) (5)

u n+ 1 =un +Llt{Y(V*,Llt;TJ)..)- 1 0).. [K*+<li n+1e )..]} (6)


ALlA cos B

vn+1 = vn - Llt { X( u*, Llt; TJe) + A~ie [K* + <lin+l).. e] } , (7)

where $F$ and $G$ are flux-form operators in longitude and latitude, specifically the differences of incoming and outgoing fluxes $F = \mathcal{F}_{i+1/2} - \mathcal{F}_{i-1/2}$ and $G = \mathcal{G}_{j+1/2} - \mathcal{G}_{j-1/2}$; $\delta_\lambda(\cdot)$ and $\delta_\theta(\cdot)$ are grid differences, while $\overline{(\cdot)}^{\lambda}$ and $\overline{(\cdot)}^{\theta}$ are grid means, defined in the edge midpoints in $\lambda$ and $\theta$ respectively.

The source terms X and Y in equations (6) and (7) are the time-averaged
fluxes for the finite-volume discretization of equation (2). Ignoring higher order
terms, these can be approximated for some scalar quantity qn at time step n
as:

$$X(u^*, \Delta t; q^n) = \frac{1}{\Delta t} \int_t^{t+\Delta t} u^* q \, dt, \qquad Y(v^*, \Delta t; q^n) = \frac{1}{\Delta t} \int_t^{t+\Delta t} v^* q \, dt. \tag{8}$$

The discrete, cell-averaged relative vorticity is defined as:

(9)

Crucial to the algorithm is the use of staggered grids (C- and D-grids,
described in [7]), on which the velocities are defined in the same point as the
flux. Thus there is a one-to-one dependency between fluxes and velocities. If the
fluxes depended on more than one velocity value, it would be necessary to solve
for velocity from a system of equations. But in this case it is straightforward
to calculate predictor values, u* and v* (on a C-grid) which are then inserted
into the corrector step, on a D-grid, in (5), (6) and (7). As this technique is discussed at length in [10], we will pass over this topic.
There is some additional complexity at the pole, where a pole cap cell (a regular polygon with $m = 2\pi / \Delta\lambda$ vertices) needs to be considered separately. The poles are also treated in a way which conserves mass and vorticity.
After some analysis of the algorithm in [10], several key features present themselves. The global conservation of discrete mass follows directly from equation (1). With some calculation (see [11]), it can be shown that the global discrete vorticity is conserved as well. Furthermore, orthogonality is inherently assumed in the algorithm, particularly in its use of a flux-form semi-Lagrangian (FFSL) advection scheme presented in [9]. FFSL is a finite-volume scheme which splits the time-averaged fluxes of the generic scalar quantity $q$ into an east-west component $X$ and a north-south component $Y$. This splitting can only realistically be applied on an orthogonal grid.
Our goal is to obtain a similar algorithm to solve the SWES which is not
limited to orthogonal grids, nor even to grids which are orthogonal to their dual
grid (i.e. possess the Voronoi-Delaunay property). To this end it is necessary
to introduce nomenclature which is sufficiently general for PH grids.

$D_i$                                   cell with index $i$
$f_i$                                   Coriolis parameter in $D_i$
$N_i$                                   index set of neighboring cells to $D_i$
$h_i^n$                                 fluid depth ("mass") in cell $D_i$ at time step $n$
$\eta_i^n$                              discrete absolute vorticity in cell $D_i$ at time step $n$
$l_{i,k}$                               length of edge $\{i, k\}$
$\mathbf{n}_{i,k}$                      outward vector perpendicular to edge $\{i, k\}$
$\mathbf{t}_{i,k}$                      counter-clockwise unit vector along edge $\{i, k\}$
$w^n_{\perp i,k}$                       "C-grid" wind speed $\perp$ to edge $\{i, k\}$
$w^n_{\parallel i,k}$                   "D-grid" wind speed $\parallel$ to edge $\{i, k\}$
$\mathcal{F}^n_{i,k}(q, w, \Delta t)$   flux of quantity $q$ with wind $w$ for $\Delta t$ through edge $\{i, k\}$
$\mathcal{G}^n_{i,k}(q)$                discrete approx. to gradient of $q$ at flux-point $\{i, k\}$
Using this notation it is possible to formulate expressions for the discrete vor-
ticity and divergence, which are simply generalizations of the formulas in [10]:

(10)

These are second order if the cell center is the barycenter.


At this point the one-to-one correspondence with [10] is lost. In [10], the C- and D-grid winds (corresponding to $w^n_{\perp i,k}$ and $w^n_{\parallel i,k}$) are interchanged by two averagings. This only works if the grid is orthogonal, thus for non-orthogonal
grids another approach must be found. In [1] a linear reconstruction (a Raviart-
Thomas element of order 0) for the wind inside the cell is made, thus retaining
only the edge-parallel winds as unknowns. However there is a limitation: the
coefficients of the linearization can only be determined in the case of a trian-
gular grid. Unfortunately, for PH grids the linear system is over-defined, and
higher order approximation would have to be employed.
We propose instead to work with both $w^n_{\perp i,k}$ and $w^n_{\parallel i,k}$, treating each equally. In order to do this, we need the help of equation (4) to determine $w^n_{\perp i,k}$. The predictor-corrector character of the algorithm in [10] is retained to provide higher order in time. This gives rise to the following half-time updates:

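$$h_i^{n+1} = h_i^n - \frac{\Delta t}{|D_i|} \sum_{k \in N_i} \mathcal{F}^n_{i,k}\big(h_i^{n+1/2}, w_{\perp}\mathbf{n}_{i,k}, \Delta t\big) \tag{11}$$
$$w^{n+1}_{\parallel i,k} = w^n_{\parallel i,k} - \Delta t \left\{ \mathcal{F}^n_{i,k}\big(\eta_i, w_{\perp}^{n+1/2}\mathbf{n}_{i,k}, \Delta t\big) + \mathcal{G}_{\parallel i,k}(\kappa + \Phi) \right\} \tag{12}$$
$$w^{n+1}_{\perp i,k} = w^n_{\perp i,k} - \Delta t \left\{ \mathcal{F}^n_{i,k}\big(\eta_i, -w_{\parallel}^{n+1/2}\mathbf{n}_{i,k}, \Delta t\big) + \mathcal{G}_{\perp i,k}(\kappa + \Phi) \right\}. \tag{13}$$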

Inserting the above updates into the definitions for discrete vorticity, we find:

'T7~+1 = I~'I L (W~i.k - Llt {F~k('T7i' w1. i.kll i,k, Llt) + 911i.k (l-£+dJ)}) li,k + fi
t kENi

= 'T7f -I~:I {kENi


L :F[';k ('T7i , W1."klli,k, Llt)li,k - L 9I1i'k(l-£+dJ)li,k}
kENi

= 'T7f - I~tl L F~k('T7i' w1.i,klli,k, Llt)li,k - Llt CGi


t kENi

Vi Vi

Vi kENi
v
o
Similarly, we arrive at the anticipated form for the updated divergence:

5~+1 = 5f - I~tl L :F[';k ('T7i , -WlIlli,k, Llt)li,k - I~tl L 9J.. i,k (1-£ + dJ)li,k.
t kENi t kENi
A clear hurdle in the above PH grid updates is the calculation of the gradients $\mathcal{G}$. Not only should these be as accurate as possible, they should also satisfy the constraint that the discrete curl of $G_i$, namely $CG_i$, is identically 0. Given generic scalar cell mean quantities $q_i$, it is possible to find a higher order expression for the edge gradients [11] at the flux point. For example, one can define a spherical radial basis function (SRBF, see [2]) from a local set of $q_i$ clustered around the flux point:

$$q(\mathbf{x}) = \sum_{j=1}^{N} \lambda_j Y(r_j) + \lambda_{N+1} + (\lambda_{N+2}, \dots, \lambda_{N+d+1})^T \mathbf{x} \quad \text{with } r_j = \|\mathbf{x} - \mathbf{x}_j\|, \tag{14}$$
where $Y(r)$ is a radial basis function, e.g. $Y(r) = r^2 \log r / 16\pi$ for thin-plate splines, and the $\lambda$ are found such that $q(\mathbf{x}_j) = q_j$. If one considers the subset $q_j$ from cells centered on (and including) cell $D_i$ and uses the resulting SRBF to approximate all the edge-parallel gradients of the central cell, the discrete curl-gradient $CG_i$ vanishes:

CGi(q) = L ('lq. t)i,kli,k


kENi
N Y'(rj)
= LA
J - - rJ.. L

t'klk +(AN+2' AN+3) . L ti,kli,k = O.
1" 1"

j=l J kENi kENi


~ ~
o o

But the use of this gradient approximation will not produce consistent gradients between neighboring cells, which is a requirement. On the other hand, it is possible to use different SRBFs $q^{(k)}(\mathbf{x})$, each built from a subset $q_j^{(k)}$ clustered around a flux point $\{i, k\}$, to approximate the gradients. In this case there is only one consistent gradient value per flux point. However, $CG_i$ does not necessarily vanish: $CG_i(q) \neq 0$ in general.

SRBFs offer a flexible interpolation method with scattered data and do not require specific grid qualities, such as the orthogonality of the grid with its dual (the Voronoi-Delaunay property). If a higher order approximation, e.g. with SRBFs, for $q$ at each cell vertex is used, then a second order gradient approximation can be found which fulfills $CG_i \equiv 0$ and is consistent between cells.

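$$V_{i,j,k} = \text{some approximation of the vertex value shared by cells } i, j \text{ and } k \tag{15}$$
$$\mathcal{G}_{\parallel i,k}(q) = \frac{V_{i,k,s(i,k)} - V_{i,k,p(i,k)}}{l_{i,k}} \tag{16}$$
$$CG_i(q) = \sum_{k \in N_i} \big( V_{i,k,s(i,k)} - V_{i,k,p(i,k)} \big) = 0, \tag{17}$$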

where p( i, k) and s( i, k) are the indices of the preceding and successive neighbor
cells of i and k around i in counter-clockwise fashion. The final formulation of
the predictor is

(18)

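$$w^{n+1/2}_{\parallel i,k} = w^n_{\parallel i,k} - \frac{\Delta t}{2} \left\{ \mathcal{F}^n_{i,k}\Big(\eta_i^n, w^n_{\perp}\mathbf{n}_{i,k}, \frac{\Delta t}{2}\Big) + \mathcal{G}_{\parallel i,k}(\kappa + \Phi) \right\} \tag{19}$$
$$w^{n+1/2}_{\perp i,k} = w^n_{\perp i,k} - \frac{\Delta t}{2} \left\{ \mathcal{F}^n_{i,k}\Big(\eta_i^n, -w^n_{\parallel}\mathbf{n}_{i,k}, \frac{\Delta t}{2}\Big) + \mathcal{G}_{\perp i,k}(\kappa + \Phi) \right\}, \tag{20}$$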

with the time-averaged kinetic and potential energies,

(21)
The corrector step is
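$$h_i^{n+1} = h_i^n - \frac{\Delta t}{|D_i|} \sum_{k \in N_i} \mathcal{F}^n_{i,k}\big(h_i^{n+1/2}, w_{\perp}\mathbf{n}_{i,k}, \Delta t\big) \tag{22}$$
$$w^{n+1}_{\parallel i,k} = w^n_{\parallel i,k} - \Delta t \left\{ \mathcal{F}^n_{i,k}\big(\eta_i, w_{\perp}^{n+1/2}\mathbf{n}_{i,k}, \Delta t\big) + \mathcal{G}_{\parallel i,k}(\kappa^* + \Phi^*) \right\} \tag{23}$$
$$w^{n+1}_{\perp i,k} = w^n_{\perp i,k} - \Delta t \left\{ \mathcal{F}^n_{i,k}\big(\eta_i, -w_{\parallel}^{n+1/2}\mathbf{n}_{i,k}, \Delta t\big) + \mathcal{G}_{\perp i,k}(\kappa^* + \Phi^*) \right\}. \tag{24}$$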
Thus for PH grids, the SWES problem, as in [10], can be completely de-
termined from gradient approximations and advective fluxes Fi,k across cell
boundaries. Fluxes between vertex neighbors are avoided since these do not
exist on PH grids. In this paper the flux determination is considered a "black
box": the advection problem on PH grids is far from simple, and finding a sec-
ond order monotone method for solving it is a matter of our current research.
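The reason the vertex-difference gradients of equation (16) give a vanishing discrete curl in equation (17) is a telescoping sum around the cell. The sketch below illustrates this for a single hexagonal cell; the ordering convention (edge k runs from vertex k to vertex k+1, counter-clockwise), the function names and the sample numbers are assumptions made for illustration only.

```python
def edge_parallel_gradients(vertex_values, edge_lengths):
    """Edge-parallel gradient on each edge: (succeeding vertex value - preceding one) / edge length."""
    n = len(vertex_values)
    return [
        (vertex_values[(k + 1) % n] - vertex_values[k]) / edge_lengths[k]
        for k in range(n)
    ]

def discrete_curl(gradients, edge_lengths):
    """Sum of edge-parallel gradients weighted by edge lengths around the cell."""
    return sum(g * l for g, l in zip(gradients, edge_lengths))

# Arbitrary vertex values and edge lengths for one hexagonal cell (hypothetical data).
vertex_values = [3.2, -1.5, 0.7, 4.1, 2.2, -0.9]
edge_lengths = [1.0, 1.3, 0.8, 1.1, 0.9, 1.2]

gradients = edge_parallel_gradients(vertex_values, edge_lengths)
curl = discrete_curl(gradients, edge_lengths)  # zero up to rounding error
```

Since each vertex value enters once with a plus and once with a minus sign, the result does not depend on how the vertex values were approximated, only on the fact that neighboring cells share them.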

4 Numerical Tests and Discussion

A prototype of the suggested algorithm was programmed in Matlab; later it will


be programmed in Fortran 90. We employ the tests suggested in [14], in partic-
ular the Cosine Bell and Rossby-Haurwitz tests. For the initial tests, a simple
first-order, hence rather diffusive, advection algorithm was implemented. The
gradient approximation is performed using SRBF approximations and vertex
differences from equation (16). The results for the cosine bell rotation are
illustrated in Figures 2 and 3.
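To make the radial basis function fit of equation (14) concrete, the following is a minimal planar sketch (d = 2), not the spherical variant used in the paper. It assumes the standard side conditions that the coefficients of the radial part are orthogonal to constants and linear functions, which this text does not state explicitly, and it uses a small hand-written Gaussian elimination; all function names are hypothetical.

```python
import math

def tps_kernel(r):
    """Thin-plate spline Y(r) = r^2 log r / (16 pi), with the limit Y(0) = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r) / (16.0 * math.pi)

def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting (intended for small dense systems)."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for c in range(col, n + 1):
                a[row][c] -= factor * a[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(a[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (a[row][n] - s) / a[row][row]
    return x

def fit_rbf(points, values):
    """Solve for lambda_1..lambda_N plus the constant and linear coefficients."""
    n = len(points)
    size = n + 3
    m = [[0.0] * size for _ in range(size)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            m[i][j] = tps_kernel(math.hypot(xi - xj, yi - yj))
        # Polynomial block and the (assumed) side conditions.
        m[i][n], m[i][n + 1], m[i][n + 2] = 1.0, xi, yi
        m[n][i], m[n + 1][i], m[n + 2][i] = 1.0, xi, yi
    return solve_linear_system(m, list(values) + [0.0, 0.0, 0.0])

def evaluate_rbf(coeffs, points, x, y):
    """Evaluate the interpolant q(x) of equation (14) at the point (x, y)."""
    n = len(points)
    radial = sum(
        coeffs[j] * tps_kernel(math.hypot(x - xj, y - yj))
        for j, (xj, yj) in enumerate(points)
    )
    return radial + coeffs[n] + coeffs[n + 1] * x + coeffs[n + 2] * y

# Hypothetical scattered "cell centers" and cell-mean values.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.4), (0.2, 0.8)]
values = [math.sin(px) + py * py for px, py in points]
coeffs = fit_rbf(points, values)
```

With this polynomial tail, linear data are reproduced exactly, which is one reason the construction is attractive for gradient approximation on irregular cells.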

References
1. L. Bonaventura. Development of the ICON dynamical core modelling strategies and preliminary results. Unpublished manuscript, 2003.
2. G. Fasshauer and L. Schumaker. Scattered Data Fitting on the Sphere, pages 117-166. Vanderbilt Univ. Press, 1998.
3. R. H. Hardin, N. J. A. Sloane, and W. D. Smith. Spherical Codes. Book in preparation; see http://www.research.att.com/~njas/electrons/, 2003.
4. R. Heikes and D. A. Randall. Numerical Integration of the Shallow-Water Equations on a Twisted Icosahedral Grid. Part I: Basic Design and Results of Tests. Monthly Weather Review, 123:1862-1880, 1995.
5. R. Heikes and D. A. Randall. Numerical Integration of the Shallow-Water Equations on a Twisted Icosahedral Grid. Part II: A Detailed Description of the Grid and an Analysis of Numerical Accuracy. Monthly Weather Review, 123:1881-1887, 1995.
6. D. Majewski, D. Liermann, P. Prahl, B. Ritter, M. Buchhold, T. Hanisch, G. Paul, and W. Wergen. The Operational Global Icosahedral-Hexagonal Gridpoint Model GME: Description and High-Resolution Tests. Mon. Wea. Rev., 130:319-338, Feb. 2002.
Fig. 2. The Cosine-Bell test from [14] is applied to a grid containing 4482 cells. After one full rotation in approx. 600 hrs., the peak has diffused considerably, but retains its basic form.

Fig. 3. For the test in Fig. 2, the table contains the ratios, as a function of time, of the maxima (i.e. cone peak) and minima to the starting values.

Hours   Ratio of peak   Ratio of min
  0     1.00            1.00
 60     0.75            0.96
120     0.63            0.90
180     0.55            0.88
240     0.51            0.85
300     0.47            0.92
360     0.45            0.90
420     0.44            0.89
480     0.41            0.95
540     0.39            0.96
600     0.38            0.99

7. F. Mesinger and A. Arakawa. Numerical Methods used in Atmospheric Models, volume 1, chapter 1. GARP, 1976. Publication Series No. 17.
8. S.-J. Lin. A finite-volume integration method for computing pressure gradient force in general vertical coordinates. Q. J. R. Met. Soc., 123:1749-1762, 1997.
9. S.-J. Lin and R. B. Rood. Multidimensional Flux Form Semi-Lagrangian Transport Schemes. Mon. Wea. Rev., 124:2046-2070, 1996.
10. S.-J. Lin and R. B. Rood. An explicit flux-form semi-Lagrangian shallow water model on the sphere. Q. J. R. Met. Soc., 123:2477-2498, 1997.
11. W. Sawyer. A Shallow Water Model on the Sphere using Penta-/Hexagonal Grids. Research report, SAM ETHZ, 2004. In preparation.
12. H. Tomita, M. Tsugawa, M. Satoh, and K. Goto. Shallow Water Model on a Modified Icosahedral Geodesic Grid by Using Spring Dynamics. J. Comp. Phys., 174:579-613, 2001.
13. D. Williamson. Difference Approximations for Fluid Flow on a Sphere. GARP Publication Series, 2(17):51-120, Sept. 1978.
14. D. Williamson, J. Drake, J. Hack, R. Jacob, and P. Swarztrauber. A Standard Test Set for Numerical Approximations to the Shallow Water Equations in Spherical Geometry. J. Comp. Phys., 102:211-224, 1992.
Application of Parallel Computing Techniques
for Problems of Degenerated Diffusion

Milan Šenkýř, Jiří Mikyška and Michal Beneš

Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering,


Czech Technical University, Trojanova 13, 120 00 Prague, Czech Republic
Contact e-mails:[email protected]@kmlinux.fjfi.cvut.cz.
Michal.Benes@fjfi·cvut.cz

Summary. In this contribution, we discuss parallelization of the problem of curve dynamics in the plane. The related PDEs are based on the levelset method introduced in [5], and on the phase-field method described in [1]. The numerical schemes use a finite-difference discretization in space and explicit time solvers. The parallel algorithms are designed for systems with distributed memory, and are based on domain splitting. The achieved results indicate the strength and efficiency of the described approach in the case of such highly nonlinear problems.

1 Mean-curvature flow
We study the following motion law for closed planar curves denoted as $\Gamma$:

$$v_\Gamma = -g(\theta)\,\kappa_\Gamma + F, \tag{1}$$

in the direction of the Euclidean normal vector to $\Gamma$. Here, $n_\Gamma$ denotes the normal vector to $\Gamma$, $v_\Gamma$ the normal velocity, $\kappa_\Gamma$ the mean curvature, $F$ a forcing term, $g$ is a suitable positive $2\pi$-periodic function of curve anisotropy, and $\theta$ is the angle between $n_\Gamma$ and a prescribed direction. We take $g(\theta) = \psi(\theta) + \psi''(\theta)$, where $\psi(\theta) = 1 + \zeta \cos(N_{fold}\,\theta)$, $\zeta$ is the anisotropy strength and $N_{fold}$ a type of symmetry. The equation (1) in the form of the Gibbs-Thomson law is contained in the modified Stefan problem. For details, we refer the reader to [1]. In [4], we may find an application in noise filtering, edge detection and morphing of computer-processed image data.
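For the anisotropy chosen above, $g = \psi + \psi''$ can be written in closed form as $g(\theta) = 1 - \zeta (N_{fold}^2 - 1) \cos(N_{fold}\,\theta)$, so the positivity requirement on $g$ restricts the admissible anisotropy strength. The sketch below checks this numerically; the function names and the sample parameter values are illustrative assumptions.

```python
import math

def psi(theta, zeta, n_fold):
    """psi(theta) = 1 + zeta * cos(n_fold * theta)."""
    return 1.0 + zeta * math.cos(n_fold * theta)

def g(theta, zeta, n_fold):
    """Closed form of psi + psi'' for the psi defined above."""
    return 1.0 - zeta * (n_fold ** 2 - 1) * math.cos(n_fold * theta)

def g_is_positive(zeta, n_fold):
    """g(theta) > 0 for all theta exactly when |zeta| * (n_fold^2 - 1) < 1."""
    return abs(zeta) * (n_fold ** 2 - 1) < 1.0

# Cross-check the closed form against a finite-difference second derivative.
theta, zeta, n_fold, step = 0.7, 0.05, 4, 1e-4
psi_second = (psi(theta + step, zeta, n_fold) - 2.0 * psi(theta, zeta, n_fold)
              + psi(theta - step, zeta, n_fold)) / step ** 2
g_numeric = psi(theta, zeta, n_fold) + psi_second
```

For a four-fold symmetry, for example, this bound means the anisotropy strength has to stay below 1/15.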
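Hamilton-Jacobi equation. Assume that the curve $\Gamma(t)$ is represented by a levelset of a function $P = P(t, x)$, i.e., $\Gamma(t) = \{x \in \mathbb{R}^2 \mid P(t, x) = \text{const.}\}$. We can express the quantities appearing in (1) by means of $P$:

$$n_\Gamma = -\frac{\nabla P}{|\nabla P|}, \qquad \kappa_\Gamma = \operatorname{div}(n_\Gamma).$$

Then, we can introduce the Hamilton-Jacobi equation (see [5, 3])

$$\frac{\partial P}{\partial t} = g(\theta)\,|\nabla P|\,\nabla \cdot \left(\frac{\nabla P}{|\nabla P|}\right) + |\nabla P|\,F. \tag{2}$$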

Allen-Cahn equation. Extensive experience with non-linear reaction-diffusion equations led to the development of a phase-field approximation of (1) by the Allen-Cahn equation [2], or by a modified Allen-Cahn equation [1]. The evolution of the levelset $\tfrac{1}{2}$ of its solution approximates the evolution of the manifold $\Gamma(t)$, as discussed in [1].
First, we denote a rectangular domain $\Omega = (0, L_1) \times (0, L_2) \subset \mathbb{R}^2$, $[x, y] \in \Omega$, and the time variable $t \in (0, T)$. The problem for an unknown function $p = p(t, x, y)$ reads as follows:

$$\xi \frac{\partial p}{\partial t} = g(\theta) \left( \xi \Delta p + \frac{1}{\xi} f_0(p) \right) + F(u)\,\xi\,|\nabla p| \quad \text{in } (0, T) \times \Omega, \tag{3}$$
$$p|_{\partial\Omega} = 0 \ \text{on } (0, T) \times \partial\Omega, \qquad p|_{t=0} = p_{ini} \ \text{in } \Omega.$$

Here, $\xi > 0$ is a parameter related to the thickness of the interface layer (it is usually set to a value $\ll 1$). The polynomial $f_0(p) = a\,p(1 - p)(p - \tfrac{1}{2})$ with $a > 0$ is derived from the double-well potential $w_0$ as $w_0' = -f_0$. The function $F = F(x, y)$ is bounded. The function $p_{ini}$ is the initial condition. We refer the reader to [1] for details concerning the equation and its physical background.
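As a worked check of the stated relation between the reaction polynomial and the double-well potential, the following derivation gives one potential consistent with it. The normalization $w_0(0) = 0$ is an assumption made here, since the text does not fix the integration constant.

```latex
\begin{align*}
w_0(p)  &= \frac{a}{4}\,p^2(1-p)^2, \\
w_0'(p) &= \frac{a}{2}\,p(1-p)^2 - \frac{a}{2}\,p^2(1-p)
         = \frac{a}{2}\,p(1-p)(1-2p) \\
        &= -a\,p(1-p)\Big(p-\tfrac{1}{2}\Big) = -f_0(p).
\end{align*}
```

With this choice the two wells sit at $p = 0$ and $p = 1$, and the local maximum of the potential lies at $p = \tfrac{1}{2}$, the third root of the polynomial.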

Numerical schemes. We treat the PDE problems (2) and (3), both closely related to (1), by several numerical schemes implemented by means of parallelization tools for systems with distributed memory. The problems are solved in a spatial domain $\Omega = (0, L_1) \times (0, L_2)$, which is discretized by a rectangular uniform grid with mesh sizes $h_1$, $h_2$ in the directions $x$ and $y$.
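We introduce the following notations for a given function $u$:

$$h_1 = \frac{L_1}{N_1}, \quad h_2 = \frac{L_2}{N_2}, \quad u_{ij} = u(i h_1, j h_2),$$
$$\omega_h = \{[i h_1, j h_2] \mid i = 1, \dots, N_1 - 1;\ j = 1, \dots, N_2 - 1\},$$
$$\bar\omega_h = \{[i h_1, j h_2] \mid i = 0, \dots, N_1;\ j = 0, \dots, N_2\}, \quad \gamma_h = \bar\omega_h - \omega_h,$$
$$u_{\bar x, ij} = \frac{u_{ij} - u_{i-1,j}}{h_1}, \quad u_{x, ij} = \frac{u_{i+1,j} - u_{ij}}{h_1}, \quad u_{\mathring x, ij} = \frac{u_{i+1,j} - u_{i-1,j}}{2 h_1},$$
$$u_{\bar y, ij} = \frac{u_{ij} - u_{i,j-1}}{h_2}, \quad u_{y, ij} = \frac{u_{i,j+1} - u_{ij}}{h_2}, \quad u_{\mathring y, ij} = \frac{u_{i,j+1} - u_{i,j-1}}{2 h_2},$$
$$u_{\bar x x, ij} = \frac{1}{h_1^2} (u_{i+1,j} - 2 u_{ij} + u_{i-1,j}), \quad u_{\bar y y, ij} = \frac{1}{h_2^2} (u_{i,j+1} - 2 u_{ij} + u_{i,j-1}),$$
$$\bar\nabla_h u = [u_{\bar x}, u_{\bar y}], \quad \nabla_h u = [u_x, u_y], \quad \Delta_h u = u_{\bar x x} + u_{\bar y y}.$$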
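The difference quotients listed above translate directly into code. The following sketch stores grid values as a nested list `u[i][j]` and implements the backward, forward, central and second differences in the x direction together with the discrete Laplacian; the function names and the test function $u = x^2 + 3y$ are illustrative choices only.

```python
def backward_x(u, i, j, h1):
    """Backward difference (u_ij - u_{i-1,j}) / h1."""
    return (u[i][j] - u[i - 1][j]) / h1

def forward_x(u, i, j, h1):
    """Forward difference (u_{i+1,j} - u_ij) / h1."""
    return (u[i + 1][j] - u[i][j]) / h1

def central_x(u, i, j, h1):
    """Central difference (u_{i+1,j} - u_{i-1,j}) / (2 h1)."""
    return (u[i + 1][j] - u[i - 1][j]) / (2.0 * h1)

def second_x(u, i, j, h1):
    """Second difference in x: backward difference of the forward difference."""
    return (u[i + 1][j] - 2.0 * u[i][j] + u[i - 1][j]) / h1 ** 2

def second_y(u, i, j, h2):
    """Second difference in y."""
    return (u[i][j + 1] - 2.0 * u[i][j] + u[i][j - 1]) / h2 ** 2

def discrete_laplacian(u, i, j, h1, h2):
    """Five-point discrete Laplacian: sum of the second differences in x and y."""
    return second_x(u, i, j, h1) + second_y(u, i, j, h2)

# Sample grid on the unit square with N1 = N2 = 4 (hypothetical sizes).
L1, L2, N1, N2 = 1.0, 1.0, 4, 4
h1, h2 = L1 / N1, L2 / N2
u = [[(i * h1) ** 2 + 3.0 * (j * h2) for j in range(N2 + 1)] for i in range(N1 + 1)]
```

All these operators are only defined at interior nodes, which is why the schemes are posed on the interior grid and complemented by boundary values.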


Direct discretization of the levelset equation. The curvature expressed in terms of second-order derivatives,

$$\kappa_\Gamma = -\frac{\partial^2_{xx} P\,(\partial_y P)^2 - 2\,\partial^2_{xy} P\,\partial_x P\,\partial_y P + \partial^2_{yy} P\,(\partial_x P)^2}{\left((\partial_x P)^2 + (\partial_y P)^2\right)^{3/2}},$$

allows us to use central differences to approximate both first- and second-order derivatives. We then propose an explicit scheme in the following form ($n$ is the time-level index, $\tau$ is the time step):

which is subject to a regularization when $(P^n_{\mathring x})^2 + (P^n_{\mathring y})^2 = 0$. The relationship of $\tau$ and $h$ is given by a stability condition.
The equation (1) defines the motion law on r(t) only. On the other hand,
the function P is obtained from the equation (2) valid in D. In our work, we ex-
tend the forcing term F from (1) to (2) as it is. Other extensions (construction
of extension velocities) are discussed, e.g. in [6].
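As an illustration of the central-difference evaluation of the curvature, here is a minimal NumPy sketch. It is not the authors' scheme: the function name `levelset_curvature` and the small floor `eps` used where the gradient vanishes are assumptions (the text only states that a regularization is applied there).

```python
import numpy as np

def levelset_curvature(P, h1, h2, eps=1e-12):
    """Curvature of the level sets of P at interior nodes, using central
    differences for all first- and second-order derivatives."""
    Px = (P[2:, 1:-1] - P[:-2, 1:-1]) / (2 * h1)
    Py = (P[1:-1, 2:] - P[1:-1, :-2]) / (2 * h2)
    Pxx = (P[2:, 1:-1] - 2 * P[1:-1, 1:-1] + P[:-2, 1:-1]) / h1**2
    Pyy = (P[1:-1, 2:] - 2 * P[1:-1, 1:-1] + P[1:-1, :-2]) / h2**2
    Pxy = (P[2:, 2:] - P[2:, :-2] - P[:-2, 2:] + P[:-2, :-2]) / (4 * h1 * h2)
    grad2 = Px**2 + Py**2
    num = Pxx * Py**2 - 2 * Pxy * Px * Py + Pyy * Px**2
    # Assumed regularization: keep the denominator away from zero.
    return -num / np.maximum(grad2, eps) ** 1.5

# Usage: circles centred at (2, 2) in (0,4) x (0,4), mesh size 0.02.
h = 0.02
x = np.arange(201)[:, None] * h
y = np.arange(201)[None, :] * h
P = 1.35**2 - ((x - 2.0) ** 2 + (y - 2.0) ** 2)   # positive inside the circle
kappa = levelset_curvature(P, h, h)
```

With this orientation of $P$ (positive inside), the formula returns $1/r$ on the circle of radius $r$; flipping the sign of $P$ flips the sign of the result.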
Discretization of the regularized levelset equation. Let $\varepsilon > 0$ be a small regularization parameter. Instead of (2), we solve the following problem:

It can be approximated using the following explicit nine-point-stencil finite-difference scheme:

where $Q(u, v) = \sqrt{\varepsilon^2 + u^2 + v^2}$, $\nabla P^k_{i+\frac{1}{2},j} = [P^k_{x,i,j}, \frac{1}{2}(P^k_{\mathring{y},i+1,j} + P^k_{\mathring{y},i,j})]$, $\nabla P^k_{i-\frac{1}{2},j} = [P^k_{\bar x,i,j}, \frac{1}{2}(P^k_{\mathring{y},i,j} + P^k_{\mathring{y},i-1,j})]$, and $\nabla P^k_{i,j+\frac{1}{2}}$ and $\nabla P^k_{i,j-\frac{1}{2}}$ are evaluated analogously. The scheme is only conditionally stable.


Discretization for the Allen-Cahn equation is derived by spatial finite differences. Nodal values then remain functions of time, for which we obtain a system of ODEs (the semi-discrete scheme) in the following form:

$$\xi \frac{dp^h}{dt} = \xi g(\theta)\left(\Delta_h p^h + \frac{1}{\xi^2} f_0(p^h)\right) + \xi |\nabla_h p^h| F \quad \text{on } \omega_h,$$

$$p^h|_{\gamma_h} = 0, \qquad p^h(0) = p_{ini}.$$

The equations are solved numerically by the fourth-order Runge-Kutta-Merson method with an adaptive time step. The convergence of the scheme has been analyzed in [1].
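For readers unfamiliar with the Runge-Kutta-Merson method, the following is a minimal sketch of one common formulation with step-size control. The step-size update rule, the safety factor 0.8, the growth cap and the names `merson_step` and `integrate` are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def merson_step(f, t, y, tau):
    """One Runge-Kutta-Merson step; returns the new state and an error estimate."""
    k1 = f(t, y)
    k2 = f(t + tau / 3, y + tau / 3 * k1)
    k3 = f(t + tau / 3, y + tau / 6 * (k1 + k2))
    k4 = f(t + tau / 2, y + tau / 8 * (k1 + 3 * k3))
    k5 = f(t + tau, y + tau * (0.5 * k1 - 1.5 * k3 + 2 * k4))
    y_new = y + tau / 6 * (k1 + 4 * k4 + k5)
    err = np.max(np.abs(tau / 3 * (0.2 * k1 - 0.9 * k3 + 0.8 * k4 - 0.1 * k5)))
    return y_new, err

def integrate(f, t0, y0, t_end, tau=1e-2, tol=1e-8, safety=0.8):
    """Integrate y' = f(t, y) from t0 to t_end with adaptive time steps."""
    t, y = t0, np.asarray(y0, dtype=float)
    while t < t_end:
        tau = min(tau, t_end - t)
        y_new, err = merson_step(f, t, y, tau)
        if err <= tol:                      # accept the step
            t, y = t + tau, y_new
        # Assumed controller: rescale tau after every attempt, growth capped at 5x.
        tau *= min(5.0, safety * (tol / max(err, 1e-300)) ** 0.2)
    return y

# Usage: y' = -y, y(0) = 1, whose exact solution at t = 1 is exp(-1).
y1 = integrate(lambda t, y: -y, 0.0, [1.0], 1.0)
```

In the semi-discrete Allen-Cahn setting, `y` would hold the nodal values $p^h$ on $\omega_h$ and `f` would evaluate the right-hand side of the ODE system.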
Parallel Computing Techniques 759

2 Parallelization techniques

The algorithms described above are parallelized by means of the Message Passing Interface (MPI) library, using both the Fortran 77/90 and C programming languages. Computations using MPI version 1.1 were performed on the supercomputing systems IBM SP3 and Cray T3E at CINECA¹ and on the IBM SP and IBM SP2 at the Czech Technical University in Prague; computations using the LAM MPI library² were performed on a local network of Linux PC workstations at the Czech Technical University in Prague. In both approaches described below, the computational task is performed by one or more processes, each of them running on a separate processor (a hardware unit, or a virtual unit in the emulated mode).
Cartesian domain splitting is an approach where a rectangular domain $\Omega$ is decomposed into rectangular subdomains, each of them treated by one process. Boundaries of subdomains overlap by one grid line, on which they exchange data. The amount of communication between processes depends on the blocking strategy. We tested the row-wise blocking strategy, where the domain is decomposed row-wise. Each block interacts with its neighbouring blocks during a timestep. The other tested strategy was the chequerboard blocking. In this case, each block communicates with at most eight neighbours during a timestep.
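The row-wise splitting can be emulated in a single process to check that a blocked update reproduces the global one. The sketch below uses NumPy instead of MPI and a plain explicit heat step instead of the paper's schemes; the helper names and the representation of the one-line overlap as ghost rows are assumptions for illustration.

```python
import numpy as np

def heat_step(u, tau, h):
    """One explicit step of u_t = Laplace(u) on interior nodes; boundary rows/columns kept."""
    v = u.copy()
    v[1:-1, 1:-1] += tau / h**2 * (
        u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2] - 4 * u[1:-1, 1:-1]
    )
    return v

def row_blocks(n_rows, n_proc):
    """Split the interior rows 1..n_rows-2 into n_proc contiguous ranges [lo, hi)."""
    edges = np.linspace(1, n_rows - 1, n_proc + 1).astype(int)
    return list(zip(edges[:-1], edges[1:]))

def blocked_step(u, tau, h, n_proc):
    """Emulate the row-wise split: each block works on its own rows plus one
    ghost row on each side (the data a real code would receive from neighbours)."""
    out = u.copy()
    for lo, hi in row_blocks(u.shape[0], n_proc):
        local = u[lo - 1:hi + 1, :]              # owned rows + two ghost rows
        out[lo:hi, :] = heat_step(local, tau, h)[1:-1, :]
    return out

# Usage: the blocked update must coincide with the global one.
rng = np.random.default_rng(0)
u0 = rng.random((33, 20))
u_global = heat_step(u0, 1e-4, 0.1)
u_blocked = blocked_step(u0, 1e-4, 0.1, 4)
```

In an actual MPI code the slice `u[lo - 1:hi + 1, :]` would be local storage, and the two ghost rows would be refreshed by a send/receive with the neighbouring ranks before every timestep.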

Fig. 1. Cartesian domain splitting (left), and narrow-band splitting (right)

The narrow-band technique introduced in [6] exploits the fact that we are interested only in the evolution of the curve $\Gamma(t)$. It is therefore enough to follow the evolution of $P = P(t, x)$ in the vicinity of the levelset $\Gamma(t)$. This approach provides a significant speedup. On the other hand, it is less accurate and more difficult to implement, because it requires a reconstruction of the narrow band when $\Gamma$ approaches its edge (the operation is called reinitialization). In our implementation, we cover the curve by overlapping squares of a constant width which are assigned to processes in an intuitive way (Fig. 1).

¹ Supercomputing Center of Italian Universities, Bologna
² LAM MPI (Local Area Multicomputer) is an open source implementation of the MPI standard, http://www.lam-mpi.org

For example, in the case of 64 covering squares and 16 processes, the first four squares are computed by the first process, the second four squares by the second process, etc. Consequently, the narrow band created by such squares is not of constant width. The processes exchange data for all nodes where the squares overlap. The approach is easy to implement, including the processing of the grid in parts small enough to fit in the fast cache memory of the processors.
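The assignment of covering squares to processes described above amounts to contiguous blocking. A minimal sketch follows; the function name `assign_squares` and the ceiling-division treatment of counts that do not divide evenly are assumptions.

```python
def assign_squares(n_squares, n_proc):
    """Return a list mapping square index -> process rank (contiguous blocks)."""
    per_proc = -(-n_squares // n_proc)       # ceiling division
    return [k // per_proc for k in range(n_squares)]

# Usage: the example from the text, 64 squares on 16 processes.
owners = assign_squares(64, 16)
```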
For the purpose of algorithm evaluation, we define the following quantities:

$$\text{Speedup} = \frac{\text{run time in a single process}}{\text{run time in } n \text{ processes}}, \qquad \text{Eff.} = \frac{\text{Speedup}}{\text{number of processes}}.$$

Speedup and efficiency of parallelization for the direct algorithm of the levelset equation - Study 1 (IBM SP). In this study, we consider a circle of the initial radius $R_0 = 1.35$ placed in a domain $(0,4) \times (0,4)$, which shrinks according to the law $V_\Gamma = -\kappa_\Gamma$ (see Figure 2a). The domain is discretized with the mesh size 0.02 in both directions, and the time step is $\tau = 4 \cdot 10^{-5}$. The number of time steps is 22500; the computation stops right before the shrinking time $T = 0.9$ (see [1]). The code is parallelized by means of the domain splitting. The results achieved on the IBM SP system are shown in Table 1.

Fig. 2. (a) A circle in $(0,4) \times (0,4)$ shrinking from the initial radius $R_0 = 1.35$ to the radius $R_T = 0.15$ according to the isotropic law $V_\Gamma = -\kappa_\Gamma$. (b) An initial circle of the radius $R_0 = 3.0$ deforming according to the 5-fold anisotropy law described by Eq. (3), where $N_{fold} = 5$ and $\zeta = 0.025$.

Study 2 (IBM SP). The above problem (see Figure 2a) was recomputed using several choices of the mesh size and the time step. As can be seen from Table 2, the efficiency of parallelization depends on the size of the data exchanged between processes (e.g., it is faster to send 200 kB of data once than 100 kB twice, because each transfer incurs an initiation overhead).
Study 3 (IBM SP and Linux network). In this case, the initial condition (a circle with the initial radius $R_0 = 1.35$) evolves according to (1) with $F = 0$, $N_{fold} = 5$ and $\zeta = 0.025$ (as indicated in Figure 2b). With numerical parameters

Table 1. The results of parallelization efficiency on IBM SP.

Number of   Mesh nodes    CPU time      Mesh nodes   Communication   Eff.
processes   per process   per process   total        mesh nodes
1           40000         908           40000        0               -
4           10000         258           40000        400 (1.0%)      88%
8           5000          149           40000        800 (2.0%)      76%
12          3333          113           40000        1000 (2.5%)     67%
16          2500          94            40000        1200 (3.0%)     60%
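The efficiency column of Table 1 can be recomputed from the CPU times using the definitions of speedup and efficiency given earlier. The following short sketch does so; the function and variable names are illustrative only.

```python
def speedup(t_single, t_parallel):
    """Run time in a single process divided by run time in n processes."""
    return t_single / t_parallel

def efficiency(t_single, t_parallel, n_proc):
    """Speedup divided by the number of processes."""
    return speedup(t_single, t_parallel) / n_proc

# CPU times per process from Table 1, keyed by the number of processes.
cpu_time = {1: 908, 4: 258, 8: 149, 12: 113, 16: 94}
eff_percent = {n: round(100 * efficiency(cpu_time[1], t, n)) for n, t in cpu_time.items()}
```

The rounded values agree with the tabulated efficiencies of 88%, 76%, 67% and 60% for 4, 8, 12 and 16 processes.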

Table 2. Efficiency depends on the mesh size. Computation performed on the IBM SP system (CPU time per process and efficiency).

Mesh size    200 x 200    267 x 267    400 x 400    667 x 667
Time step    4.0·10^-5    2.3·10^-5    1.0·10^-5    3.6·10^-6
Iterations   22500        40000        90000        250000

Mesh size \ Processes   1       4             8             12           16
200 x 200               908     258 (88%)     149 (76%)     113 (67%)    94 (60%)
267 x 267               2585    697 (93%)     392 (83%)     277 (78%)    231 (70%)
400 x 400               14171   3657 (97%)    1915 (93%)    1343 (88%)   1058 (84%)
667 x 667               98904   25574 (97%)   12889 (96%)   8740 (94%)   6775 (91%)

$h_1 = h_2 = 0.01$, $\tau = 1.1 \cdot 10^{-5}$ and 64286 time levels, it terminates at $t = 0.72$. The curve is covered by squares 35 points wide. Due to the curve shrinking, the number of active nodes in the narrow band decreases from approximately 33000 to 16000, as shown in Table 3. Compared to the domain splitting, the efficiency is lower. This is caused by the fact that the overlapping areas between processes are larger, and the number of active nodes even increases with the number of processes. On the other hand, the computation is faster, as only a part of the grid is active, and the absolute amount of exchanged data is smaller.

Table 3. Narrow-band approach on IBM SP and Linux network applied to an anisotropic circle shrinking.

Number of   Min. no. of    Avg. no. of    Max. no. of    Communication       CPU time      CPU time
processes   active nodes   active nodes   active nodes   mesh nodes          per process   per process
                                                         (% of avg.)         (IBM SP)      (Linux cluster)
1           16226          26390          33268          0                   15219         2608
4           16366          26407          33252          560 (2.1206%)       5009 (76%)    884 (74%)
8           16383          26497          33368          1120 (4.2269%)      3260 (58%)    576 (57%)
12          16435          26517          33409          1680 (6.3356%)      2613 (49%)    456 (48%)
16          16581          26620          33536          2240 (8.4147%)      2365 (40%)    -

Speedup and efficiency of parallelization for the regularized levelset equation - Study 4 (IBM SP3). We considered the test problem shown in Figure 3. For a given number of processes, it is possible to split the domain either into NPROC rows, or into NPROCX columns and NPROCY rows in the chequerboard blocking (NPROCX x NPROCY = NPROC). Unless NPROC is a prime, there are several possibilities for selecting NPROCX and NPROCY. Tables 4 and 5 present the run times, numbers of communication nodes, and efficiencies for both of the above-mentioned blocking strategies attained on an IBM SP3 machine. It is clear from Table 5 that the chequerboard blocking is superior to the row-wise blocking for higher numbers of processes, which is due to a lower number of communicating nodes.

Table 4. Results of parallelization for the row-wise blocking.

Number of   Mesh nodes    CPU time          Mesh nodes   Communication      Eff.
processes   per process   per process (s)   total        mesh nodes
1           360000        2656              360000       0                  -
4           90000         778               360000       1800 (0.5000%)     85%
9           40000         307               360000       4800 (1.3333%)     96%
16          22500         180               360000       9000 (2.5000%)     92%
25          14400         145               360000       14400 (4.0000%)    73%
36          10000         125               360000       21000 (5.8333%)    57%

Table 5. Results of parallelization for the chequerboard blocking.

Number of   Mesh nodes    CPU time          Mesh nodes   Communication      Eff.
processes   per process   per process (s)   total        mesh nodes
1           360000        2656              360000       0                  -
4           90000         684               360000       1201 (0.3336%)     97%
9           40000         320               360000       2404 (0.6677%)     92%
16          22500         171               360000       3609 (1.0025%)     94%
25          14400         122               360000       4816 (1.3377%)     87%
36          10000         85                360000       6025 (1.6736%)     87%
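The communication-node counts in Tables 4 and 5 follow a simple pattern for the 600 x 600 grid. The formulas in the sketch below are an empirical fit to the tabulated values, not expressions given by the authors; in particular, the reading of the $(q-1)^2$ term as a contribution of the points where interface lines cross is an assumption.

```python
def comm_nodes_rowwise(n, n_proc):
    """Row-wise blocking: (n_proc - 1) interface lines of n nodes each."""
    return (n_proc - 1) * n

def comm_nodes_chequerboard(n, q):
    """q x q chequerboard: 2(q - 1) interface lines of n nodes each, plus
    (q - 1)^2 extra nodes (assumed to come from the line crossings)."""
    return 2 * (q - 1) * n + (q - 1) ** 2

# Usage: reproduce the 'Communication mesh nodes' columns of Tables 4 and 5.
rowwise = [comm_nodes_rowwise(600, p) for p in (4, 9, 16, 25, 36)]
chequer = [comm_nodes_chequerboard(600, q) for q in (2, 3, 4, 5, 6)]
```

The row-wise count grows linearly with the number of processes, whereas the chequerboard count grows only with its square root, which is consistent with the better efficiency of the chequerboard blocking at 25 and 36 processes.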

Speedup and efficiency of parallelization for the Allen-Cahn equation - Study 5 (IBM SP3). In this computation, we studied the isotropic curve evolution starting at a four-fold pattern in a spatial domain $(0,2) \times (0,2)$. The curve approaches the circle of radius $R = 0.6$ according to the law $V_\Gamma = -\kappa_\Gamma + F(x)$, where the forcing $F$ is a suitable radially symmetric and linear function. Other parameters are $\xi = 0.01$, $h_1 = h_2 = 0.00995$. The curve evolution is in Figure 4(a). The domain was divided into 1, 4, and 16 rectangular subdomains,

Fig. 3. Evolution of the cardioid curve driven by the regularized levelset equation, solved in the unit square with $\varepsilon = 10^{-8}$, grid 600 x 600, $\tau = 10^{-6}$ and 20000 time levels

and the computation was repeated with the corresponding number of processes. The mesh size and the total number of mesh points remained the same; with an increasing number of processes, the number of mesh points per process decreased and the number of communication mesh points increased. Measurement results are in Table 6.

Fig. 4. (a) A 4-fold initial curve in $(0,2) \times (0,2)$ approaches the circle of radius $R = 0.6$ according to $V_\Gamma = -\kappa_\Gamma + F(x)$ for a radially symmetric linear $F$; $\xi = 0.01$, $h_1 = h_2 = 0.00995$. (b) A 4-fold initial curve in $(0,2) \times (0,2)$ shrinks inside and expands outside of the circle of radius $R = 0.6$ according to $V_\Gamma = -g(\theta)\kappa_\Gamma + F(x)$ for a radially symmetric linear $F$, $g(\theta) = 1.0 - 0.8\cos(4\theta - \pi/4)$; $\xi = 0.02$, $h_1 = h_2 = 0.00995$.

Study 6 (CRAY T3E). The computation, performed on the CRAY T3E, studies the anisotropic curve evolution starting at a four-fold leaf-like curve placed in a spatial domain $(0,0.4) \times (0,0.4)$. The curve shrinks inside a circle of radius $R_0 = 0.1$, and expands outside of it thanks to a spatially dependent choice of $F$ in the law $V_\Gamma = -g(\theta)\kappa_\Gamma + F$, $g(\theta) = 1.0 - 0.8\cos(4\theta - \pi/4)$; $\xi = 0.02$, $h_1 = h_2 = 0.00995$. The curve evolution is in Figure 4(b). The domain was divided into 1, 4, 16, 25 and 64 rectangular subdomains, and the

Table 6. Table of parameters for the use of IBM SP3 - Study 5.

Number of   Mesh elements   CPU time      Mesh elements   Communication     Eff.
processes   per process     per process   total           mesh elements
1           40401           118.11        40401           0                 -
4           10201           29.89         40401           401 (0.9925%)     99%
16          2601            10.01         40401           1197 (2.9628%)    74%

computation was repeated with the corresponding number of processes. The mesh size and the total number of mesh points remained the same; with an increasing number of processes, the number of mesh points per process decreased and the number of communication mesh points increased. Measurement results are in Table 7.

Table 7. Table of parameters for the use of CRAY T3E - Study 6.

Number of   Mesh elements   CPU time      Mesh elements   Communication   Eff.
processes   per process     per process   total           mesh elements
1           40401           37.55         40401           0               -
4           10201           9.65          40401           401 (0.99%)     97%
16          2601            3.23          40401           1197 (2.96%)    73%
25          1681            2.48          40401           1592 (3.94%)    61%
64          625             2.20          40401           2765 (6.84%)    27%

Acknowledgment. The first author was partly supported by the project No.
159 of the Czech-Slovak Science Programme, the second author was partly
supported by the project CTU 0309914 of the Czech Technical University
in Prague, and the third author was partly supported by the project No.
201/01/0676 of the Grant Agency of Czech Republic. The authors gratefully
acknowledge technical support of the Center for High Performance Computing
at the Czech Technical University in Prague (Project No. 145), and technical
support of the CINECA - High Performance Computing Centre, Bologna (Pro-
gramme Minos).

References
1. M. Benes, Mathematical and computational aspects of solidification of pure sub-
stances, Acta Mathematica Universitatis Comenianae 70, No.1 (2001), 123-152.
2. C.M. Elliott, M. Paolini, and R. Schätzle, Interface estimates for the fully anisotropic Allen-Cahn equation and anisotropic mean curvature flow, Math. Models Methods Appl. Sci. 6 (1996), 1103-1118.
3. L.C. Evans and J. Spruck, Motion of level sets by mean curvature I, J. Diff. Geom.
33 (1991), 635-681.
4. V. Minarik and M. Benes, Numerical solution of degenerate parabolic equations of Hamilton-Jacobi type within the context of computer image processing, ALGORITMY 2002, Proceedings of contributed papers and posters (Bratislava), 2002, pp. 162-170.
5. J.A. Sethian, Level set methods, Cambridge University Press, New York, 1996.
6. ___ , Evolution, implementation, and application of level set and fast marching
methods for advancing fronts, Journal of Computational Physics 169 (2001), 503-
555.
The Finite Element Analysis of an Elliptic
Problem with a Nonlinear Newton Boundary
Condition

Veronika Sobotikova

Department of Mathematics, Faculty of Electrical Engineering,


Czech Technical University, Technicka 2, 166 27 Praha 6
Czech Republic
[email protected]. cz

Summary. Results of the study of an elliptic 2D problem with a nonlinear Newton boundary condition are presented. The problem is discretized with the use of the FEM and the integrals are evaluated by numerical quadratures. In the case of a nonpolygonal domain the main attention is paid to the effect of a piecewise linear approximation of the boundary. The error estimate for the solution of the discrete FE problem is derived.

1 Introduction

A number of problems in science and technology can be described by partial differential equations with a nonlinear Newton boundary condition, see e.g. [1], [8], [5] and [7]. In this contribution we deal with a finite element approximation of one such problem, which arises in the modelling of electrolysis of aluminium with the aid of the stream function. In this case the nonlinear term in the boundary condition has a "polynomial" behaviour and describes turbulent flow in a boundary layer.
Let us introduce some notation which we will use later. Let $G \subset \mathbb{R}^2$ be a bounded domain with a Lipschitz-continuous boundary $\partial G$. By $\bar G$ we denote the closure of $G$ and by $n$ the unit outward normal to $\partial G$. We use the well-known Lebesgue and Sobolev spaces $L^p(G)$, $L^p(\partial G)$, $W^{k,p}(G)$, $H^k(G) = W^{k,2}(G)$, $W^{k,p}(\partial G)$ for $k \in \{0, 1, 2, \dots\}$ and $p \in [1, \infty]$ (see e.g. [6]). By $\|\cdot\|_{k,p,G}$ and $\|\cdot\|_{k,p,\partial G}$ we denote the standard norms in $W^{k,p}(G)$ and $W^{k,p}(\partial G)$, respectively. Then $\|\cdot\|_{0,p,G}$ and $\|\cdot\|_{0,p,\partial G}$ mean, of course, the norms in $L^p(G)$ and $L^p(\partial G)$. The symbols $|\cdot|_{k,p,G}$ and $|\cdot|_{k,p,\partial G}$ stand for the seminorms in $W^{k,p}(G)$ and $W^{k,p}(\partial G)$. The space $(H^1(G))^*$ is the dual space to $H^1(G)$ and $\langle\cdot,\cdot\rangle$ is the duality pairing between $(H^1(G))^*$ and $H^1(G)$. If $K \subset \mathbb{R}^n$, we denote by $P_k(K)$ the space of all polynomials on $K$ of degree less than or equal to $k$.
The Finite Element Analysis of an Elliptic Boundary Value Problem 767

2 Continuous Problem

Let $\Omega \subset \mathbb{R}^2$ be a bounded domain. We suppose the boundary $\partial\Omega$ of $\Omega$ to be Lipschitz-continuous, and moreover, let the boundary of $\Omega$ be piecewise of the class $C^3$.
We consider the following boundary value problem:
Find $u : \bar\Omega \to \mathbb{R}$ such that

$$-\Delta u = f \quad \text{in } \Omega, \qquad (1)$$

$$\frac{\partial u}{\partial n} + \kappa |u|^\alpha u = \varphi \quad \text{on } \partial\Omega, \qquad (2)$$

where $f : \Omega \to \mathbb{R}$ and $\varphi : \partial\Omega \to \mathbb{R}$ are given continuous functions, and $\kappa > 0$, $\alpha \ge 0$ are given constants.
In a standard way we can introduce a weak formulation of the above problem:
A function $u : \Omega \to \mathbb{R}$ is said to be a weak solution of problem (1) - (2), if

a) $u \in H^1(\Omega)$, (3)
b) $a(u, v) = L(v) \quad \forall v \in H^1(\Omega)$.

The forms $a$ and $L$ from (3,b) are defined for $u, v \in H^1(\Omega)$ by

$$a(u, v) = b(u, v) + d(u, v),$$

$$L(v) = L^\Omega(v) + L^\Gamma(v),$$

where

$$b(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx, \qquad d(u, v) = \kappa \int_{\partial\Omega} |u|^\alpha u v \, dS,$$

$$L^\Omega(v) = \int_\Omega f v \, dx, \qquad L^\Gamma(v) = \int_{\partial\Omega} \varphi v \, dS.$$

It was shown in [2] that the operator $A : H^1(\Omega) \to (H^1(\Omega))^*$ corresponding to the nonlinear form $a$ by

$$\langle A(u), v \rangle = a(u, v) \quad \forall u, v \in H^1(\Omega)$$

is uniformly monotone, Lipschitz-continuous on every bounded subset of $H^1(\Omega)$ and coercive; the linear form $L$ defining the right-hand side of (3,b) is continuous. Hence, by the monotone operator theory, problem (3) has exactly one solution.
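For completeness, the step from the classical problem to the weak formulation can be sketched as follows, under the additional assumption that $u$ is smooth enough for Green's theorem to apply. Multiplying the equation $-\Delta u = f$ by a test function $v \in H^1(\Omega)$ and integrating by parts gives

$$\int_\Omega f v \, dx = -\int_\Omega \Delta u \, v \, dx = \int_\Omega \nabla u \cdot \nabla v \, dx - \int_{\partial\Omega} \frac{\partial u}{\partial n} v \, dS.$$

Substituting $\frac{\partial u}{\partial n} = \varphi - \kappa |u|^\alpha u$ from the boundary condition yields

$$\int_\Omega \nabla u \cdot \nabla v \, dx + \kappa \int_{\partial\Omega} |u|^\alpha u v \, dS = \int_\Omega f v \, dx + \int_{\partial\Omega} \varphi v \, dS,$$

which is the identity $a(u, v) = L(v)$ with the four integrals playing the roles of $b$, $d$, $L^\Omega$ and $L^\Gamma$, respectively.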
768 V. Sobotikova

3 Finite Element Discretization


We discretize the problem with the use of the finite element method. Let us consider a system $\{\Omega_h\}_{h \in (0, h_0)}$, $0 < h_0 < 1$, of polygonal approximations of $\Omega$. As $\Omega$ need not be convex and thus the inclusion $\Omega_h \subset \Omega$ need not be valid, we suppose that $\Omega^*$ is a bounded domain with a Lipschitz-continuous boundary such that $\Omega \subset \Omega^*$ and $\Omega_h \subset \Omega^*$ for every $h \in (0, h_0)$. Let the right-hand side $f$ of the equation (1) be defined on the whole domain $\Omega^*$.
On the domains $\Omega_h$ we consider triangulations $\mathcal{T}_h$ formed by a finite number of closed triangles $T$. We say that $T \in \mathcal{T}_h$ is a boundary triangle, if $T$ has a side $S \subset \partial\Omega_h$. By $S_h$ we denote the set of all sides $S \subset \partial\Omega_h$ of all boundary triangles $T \in \mathcal{T}_h$. We suppose that all vertices of $\mathcal{T}_h$ are in $\bar\Omega$, that all vertices lying on $\partial\Omega_h$ belong to $\partial\Omega$ too, and that every boundary triangle has exactly two vertices lying on $\partial\Omega_h$. Moreover, let all points of $\partial\Omega$ where the condition of $C^3$-smoothness of $\partial\Omega$ is not satisfied be vertices of $\mathcal{T}_h$. Finally, we suppose the intersection of $\partial\Omega$ and $\partial\Omega_h$ to be formed only by sides and vertices of triangles from $\mathcal{T}_h$.
We denote by $h_T$ the length of the maximum side of $T \in \mathcal{T}_h$. Let us set

$$\tilde h = \max_{T \in \mathcal{T}_h} h_T.$$

We assume the index $h$ to be chosen in such a way that $h = \tilde h$.
In what follows we denote by $|T|$ the area of a triangle $T \in \mathcal{T}_h$ and by $|S|$ the length of a side $S \in S_h$.
In order to be able to prove the solvability of the discrete problem and the convergence of the method, we assume that:

a) The system $\{\mathcal{T}_h\}_{h \in (0, h_0)}$ of triangulations is regular, which means that the magnitudes of inner angles of all triangles $T \in \mathcal{T}_h$ are bounded from zero by a positive constant $\vartheta_0$ independent of $h \in (0, h_0)$.
b) The triangulations $\mathcal{T}_h$, $h \in (0, h_0)$, locally satisfy the inverse assumption at $\partial\Omega$: There exists $\nu > 0$ such that for every $h \in (0, h_0)$, $S \in S_h$, we have

$$|S| > \nu h.$$

Due to these two assumptions there exists a constant $\lambda > 0$ such that

for every boundary triangle $T \in \mathcal{T}_h$ and every $h \in (0, h_0)$.
An approximate solution of problem (3) will be sought in the space of linear triangular conforming elements $H_h \subset H^1(\Omega_h)$:

The forms $a$ and $L$ defining the weak solution are discretized in two steps. First we integrate in all integrals over $\Omega_h$ and $\partial\Omega_h$ instead of over $\Omega$ and $\partial\Omega$, respectively. In the second step we apply quadrature formulae to evaluate these integrals. We suppose the formula used for the integration over triangles to be exact for all constant functions and the formula used for the integration over sides to be exact for all linear functions and to be monotone, i.e., all its coefficients are positive. In such a way we come to the following approximations of the forms defining the weak solution ($v_h, w_h \in H_h$):

$$a_h(v_h, w_h) = b_h(v_h, w_h) + d_h(v_h, w_h),$$

$$L_h(w_h) = L_h^\Omega(w_h) + L_h^\Gamma(w_h),$$

where

$$b_h(v_h, w_h) = \int_{\Omega_h} \nabla v_h \cdot \nabla w_h \, dx,$$

$$d_h(v_h, w_h) = \kappa \sum_{S \in S_h} |S| \sum_{\mu=1}^{m} \beta_\mu (|v_h|^\alpha v_h w_h)(x_{S,\mu}),$$

$$L_h^\Gamma(w_h) = \sum_{S \in S_h} |S| \sum_{\mu=1}^{m} \beta_\mu (\varphi_h w_h)(x_{S,\mu}),$$

$$L_h^\Omega(w_h) = \sum_{T \in \mathcal{T}_h} |T| \sum_{\mu=1}^{M} \omega_\mu (f w_h)(x_{T,\mu}).$$

Here $\varphi_h$ is an approximation of the function $\varphi$ from the boundary condition, which will be introduced later.
Let us note that in the approximation of the form $b$ we do not use numerical integration, as we integrate constant functions.
Now we can define an approximate solution of problem (3) as a function $u_h : \Omega_h \to \mathbb{R}$ such that

a) $u_h \in H_h$, (4)
b) $a_h(u_h, v_h) = L_h(v_h) \quad \forall v_h \in H_h$.
4 Ideal Triangulation
Since the problem is nonlinear in the boundary condition, we meet some difficulties in the analysis of the FEM. These are caused especially by the fact that in general the boundaries of $\Omega_h$ and $\Omega$ may not be identical, the union of all triangles of $\mathcal{T}_h$ may not form $\bar\Omega$, and that we seek the approximate solution in a space which may be different from the space in which we look for the weak solution. We handle these difficulties with the aid of Zlamal's ideal elements (see [12]).

First we introduce the concept of the ideal triangle. Let us have a triangulation $\mathcal{T}_h$ of the domain $\Omega_h$ and let $T \in \mathcal{T}_h$. If $T$ is not a boundary triangle, we set $T^{id} = T$ and we denote its vertices by $P_1$, $P_2$, $P_3$ in any way. If $T$ is a boundary triangle, then we number its vertices in such a way that the vertices $P_1$ and $P_3$ lie on the boundary. Then we replace in the triangle $T$ the straight side $S = P_1P_3 \subset \partial\Omega_h$ by the arc $\Sigma_S = \widehat{P_1P_3} \subset \partial\Omega$ to get the curved ideal triangle $T^{id}$. We denote the set of all ideal triangles by $\mathcal{T}_h^{id}$ and call it the ideal triangulation associated with $\mathcal{T}_h$. It is obvious that the union of all ideal triangles $T^{id}$ from $\mathcal{T}_h^{id}$ forms $\bar\Omega$.
As the functions from the space $H_h$ are not defined on the whole domain $\Omega$, we need to modify these functions. We proceed in the following way:
Let us consider the reference triangle $K$ in the $(\xi_1, \xi_2)$-plane with the vertices $R_1 = (0,0)$, $R_2 = (1,0)$ and $R_3 = (0,1)$. We denote by $x^0$ the affine mapping which maps the triangle $K$ one-to-one onto the triangle $T$ in such a way that $x^0(R_i) = P_i$ for $i = 1, 2, 3$, and let $\xi^0$ be its inverse.
By Zlamal [12], there exists a mapping $x$ which maps the reference triangle $K$ one-to-one onto the ideal triangle $T^{id}$ such that it as well as its inverse $\xi$ are of the class $C^2$. With the aid of these mappings we can now define a function $\hat w$ associated with a function $w \in H_h$ by

$$\hat w(x) = w(x^0(\xi(x))).$$

It is obvious that we can construct this function $\hat w$ for any function $w$ defined on $\Omega_h$ and not only for $w \in H_h$.
In a similar way we can introduce a function $\hat\gamma : \partial\Omega \to \mathbb{R}$ associated with a function $\gamma$ defined on $\partial\Omega_h$. We put

Moreover, we can define for every function $\gamma : \partial\Omega \to \mathbb{R}$ its approximation $\gamma_h$ given on $\partial\Omega_h$ by

In such a way we approximate the function $\varphi$ from the boundary condition.
Remark: It is obvious that $\hat w|_T = w|_T$ for every $T \in \mathcal{T}_h \cap \mathcal{T}_h^{id}$.
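It was shown in [4] that

$$\left| \int_{\partial\Omega} \hat v(x) \, dS - \int_{\partial\Omega_h} v(x) \, dS \right| \le \tilde c\, h \int_{\partial\Omega_h} |v(x)| \, dS$$

for every $v \in L^1(\partial\Omega_h)$. Moreover, the following relation between the seminorms of $w$ and $\hat w$ in the Sobolev spaces $W^{1,p}(S)$ and $W^{1,p}(\Sigma)$ is valid:

$$\left( \frac{|S|}{|\Sigma|} \right)^{1-p} |w|^p_{1,p,S} = |\hat w|^p_{1,p,\Sigma},$$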

where

$$1 \le \frac{|\Sigma|}{|S|} < 1 + ch.$$

These estimates allow us to derive relations (5, a, b) between norms and seminorms of $w$ and $\hat w$ from the following Lemma 1. The proof of (5, d) can be found in the same paper [4]; for the proof of (5, c) see [10].
Lemma 1. Let $p \in [1, \infty)$. Then there exist positive constants $C_1 = C_1(p)$, $C_2 = C_2(p)$, $C_3 = C_3(p)$, $C_4$ and $h_1 \in (0, h_0]$ such that for every $h \in (0, h_1)$

a) $C_1^{-1} \|\hat w\|_{0,p,\partial\Omega} \le \|w\|_{0,p,\partial\Omega_h} \le C_1 \|\hat w\|_{0,p,\partial\Omega} \quad \forall w \in L^p(\partial\Omega_h)$,
b) $C_2^{-1} |\hat w|_{1,p,\partial\Omega} \le |w|_{1,p,\partial\Omega_h} \le C_2 |\hat w|_{1,p,\partial\Omega} \quad \forall w \in W^{1,p}(\partial\Omega_h)$, (5)
c) $C_3^{-1} |\hat w|_{1,p,\Omega} \le |w|_{1,p,\Omega_h} \le C_3 |\hat w|_{1,p,\Omega} \quad \forall w \in H_h$,
d) $C_4^{-1} \|\hat w\|_{1,2,\Omega} \le \|w\|_{1,2,\Omega_h} \le C_4 \|\hat w\|_{1,2,\Omega} \quad \forall w \in H_h$.

Remark: In what follows, whenever we refer to any result from [2] or [3], where the domain $\Omega$ is polygonal, the polygonality of $\Omega$ was not used in the associated proof.

5 Existence of Approximate Solutions

Let $f \in W^{1,q}(\Omega^*)$ for some $q > 2$ and $\varphi \in W^{1,r}(\partial\Omega)$ for some $r > 1$. By [2] the forms $a_h$ are under these assumptions continuous and strictly monotone. Moreover, it was shown in [4] that the forms $L_h$ are continuous and that there exists a positive constant $C_5$ such that

for every $h \in (0, h_0)$, $v_h \in H_h$. Concerning the coercivity of the forms $a_h$, we have the following lemma:

Lemma 2. Let $h_1$ and $C_4$ be as in Lemma 1. Then there exist $h_2 \in (0, h_1]$ and $C_6 > 0$ such that

$$a_h(v_h, v_h) \ge C_6 \|v_h\|^2_{1,2,\Omega_h}$$

for every $h \in (0, h_2)$ and $v_h \in H_h$ with $\|v_h\|_{1,2,\Omega_h} \ge C_4$.
Remark: In what follows, we suppose $h_0$ to be so small that $h_2 = h_1 = h_0$ in Lemmas 1 and 2.
The coercivity of $a_h$ was obtained with the aid of the following estimate of the error in the approximation of the form $d$:

$$|d(\hat v_h, \hat w_h) - d_h(v_h, w_h)| \le C_7 h^{1/2 - \alpha/r} \|v_h\|^{\alpha+1}_{1,2,\Omega_h} \|w_h\|_{1,2,\Omega_h} \quad \forall h \in (0, h_0),\ \forall v_h, w_h \in H_h,$$

where $C_7 = C_7(r)$ is a positive constant, $r \in [1, \infty)$.
On the basis of the above-mentioned properties of the forms $L_h$ and $a_h$, similar to those of the forms $L$ and $a$, the monotone operator theory gives:

Theorem 1. Discrete problem (4) has for every $h \in (0, h_0)$ exactly one solution $u_h \in H_h$.

(See [4], Theorem 4.1.)

6 Convergence of the Finite Element Method

The monotonicity of the forms $a_h$ is sufficient in the proof of the existence and uniqueness of the approximate solutions. However, for the derivation of the error estimate of the method we need a somewhat stronger result. By [11], the forms $a_h$ are "almost uniformly monotone" in the following sense:

where the function

for $0 \le t \le (c^*)^{-1}$

for $t \ge (c^*)^{-1}$

is nonnegative and increasing in $[0, \infty)$.
In [11] another estimate of the error in the approximation of the form $d$ was also derived. We have ($1 < p \le \infty$; $C_8$, $C_9 = C_9(p)$ are positive constants):

$$|d(\hat v_h, \hat w_h) - d_h(v_h, w_h)| \le \left( C_8 h \|v_h\|^{\alpha+1}_{1,2,\Omega_h} + C_9 h^{1 - \frac{1}{p}} |v_h|_{1,p,\Omega_h} \|v_h\|^{\alpha}_{0,\infty,\Omega_h} \right) \|w_h\|_{1,2,\Omega_h}$$

for all $h \in (0, h_0)$ and $v_h, w_h \in H_h$. (We set $\frac{1}{\infty} = 0$.)
On the basis of the above inequalities an abstract error estimate was established. In what follows, we denote by $u$ the weak solution and by $u_h$, $h \in (0, h_0)$, solutions of the discrete problem.

Lemma 3. Let $Q(t) = \frac{\rho(t)}{t}$ for $t > 0$, $Q(0) = 0$, let $Q^{-1}$ be the inverse of $Q$ and let $1 < p \le \infty$. Then there exist positive constants $C_{10}$, $C_{11}$, $C_{12}$ and $C_{13} = C_{13}(p)$ such that

$$\|u - \hat u_h\|_{1,2,\Omega} \le \|u - \hat v_h\|_{1,2,\Omega} + Q^{-1}\Big\{ C_{10} h + C_{11} \big(1 + \|u\|^{\alpha}_{1,2,\Omega} + \|\hat v_h\|^{\alpha}_{1,2,\Omega}\big) \|u - \hat v_h\|_{1,2,\Omega} + C_{12} h \big(\|v_h\|_{1,2,\Omega_h} + \|v_h\|^{\alpha+1}_{1,2,\Omega_h}\big) + C_{13} h^{1 - \frac{1}{p}} |v_h|_{1,p,\Omega_h} \|v_h\|^{\alpha}_{0,\infty,\Omega_h} \Big\}$$

for all $h \in (0, h_0)$ and $v_h \in H_h$.

Finally, we can formulate the main result, the error estimate for the solution of the discrete problem (see [11], Theorem 4.1):
Theorem 2. Let the solution $u$ of the continuous problem satisfy $u \in H^2(\Omega)$ and let $u_c \in H^2(\Omega^*)$ be an extension of the function $u$ on $\mathbb{R}^2$. Then for every $p \in (2, \infty)$ there exist $\bar h \in (0, h_0]$ and a positive constant $C = C(p, \|u_c\|_{2,2,\Omega^*})$ such that

for all $h \in (0, \bar h)$.

(For the existence of $u_c$ see, e.g., [9], Theorem 3.10.)

Remark: If there exists such a function U c E H2(rl*) n Wl,oo(rl*) that
ucln = u, it is possible to show that the rate of convergence of the method is
O(hah). It is an open question, whether this stronger assumption is necessary
for improving the error estimate from Theorem 2.

Acknowledgment

The support of the Research Project J04/98/210000010 of the Ministry of Education of the Czech Republic is gratefully acknowledged.

References
1. Bialecki, R., Nowak, A.J. (1981): Boundary value problems in heat conduction
with nonlinear material and nonlinear boundary conditions. Appl. Math. Mod-
elling, 5, 417-421
2. Feistauer, M., Najzar, K. (1998): Finite element approximation of a problem with a nonlinear Newton boundary condition. Numer. Math., 78, 403-425
3. Feistauer, M., Najzar, K., Sobotikova, V. (1999): Error estimates for the finite element solution of elliptic problems with nonlinear Newton boundary condition. Numer. Funct. Anal. Optim., 20, 835-851
4. Feistauer, M., Najzar, K., Sobotikova, V. (2001): On the finite element analysis of
problems with nonlinear Newton boundary condition in nonpolygonal domains.
Appl. Math., 46, 353-382
5. Ganesh, M., Graham, I.G., Sivaloganathan, J. (1994): A pseudospectral three-
dimensional boundary integral method applied to a nonlinear model problem
from finite elasticity. SIAM J. Numer. Anal., 31, 1378-1414.
6. Kufner, A., John, O., Fucik, S. (1977): Function spaces. Academia, Prague
7. Krizek, M., Liu, L., Neittaanmaki, P. (1999): Finite element analysis of a nonlin-
ear elliptic problem with a pure radiation condition. In: Sequeira, A., de Veiga,
H.B., Videman, J.H. (eds) Applied Nonlinear Analysis. Kluwer, Amsterdam, 271-
280
8. Moreau, R., Evans, J.W. (1984): An analysis of the hydrodynamics of aluminium reduction cells. J. Electrochem. Soc., 31, 2251-2259
9. Necas, J. (1967): Les méthodes directes en théorie des équations elliptiques. Academia, Prague
10. Sobotikova, V. (1996): Finite elements on curved domains. East-West J. Numer.
Math., 4, 137-149
11. Sobotikova, V. (2003): An error estimate for the finite element solution of an elliptic problem with a nonlinear Newton boundary condition in nonpolygonal domains. Numer. Funct. Anal. Optim., 24, 621-635
12. Zlamal, M. (1973): Curved elements in the finite element method. I. SIAM J. Numer. Anal., 10, 229-240
Automatic Goal-Oriented hp-Adaptivity
Without Error Estimates

Pavel Solin 1 and Leszek Demkowicz 2

1 CAAM, Rice University [email protected]


2 ICES, The University of Texas at Austin [email protected]

Summary. We propose and test a fully automatic, goal-oriented hp-adaptive strategy for elliptic problems. The method combines two approaches: the standard goal-oriented adaptivity based on a simultaneous solution of the primal and dual problem, and a recently proposed automatic hp-adaptive strategy based on minimizing the projection-based interpolation error of a reference solution.

1 Introduction

Nowadays, the theory of the hp-version of the finite element method is well-
established and founded on solid results mostly due to the efforts of Babuska
and coworkers. However, the practical realization of fully automatic and robust
3D hp-adaptive algorithms still presents many serious difficulties mainly due
to excessive programming complexity.
We would like to introduce a novel fully automatic algorithmic approach to
goal-oriented hp-adaptivity for elliptic problems. The methodology does not
rely on estimates of error or its higher derivatives, and it is capable of achieving
exponential convergence not only in the asymptotic but also in preasymptotic
range of error level. Due to the limited length of this paper, only basic ideas of
the approach can be presented, but many details on both theory and computer
implementation can be found, e.g., in [3, 4, 5, 7, 8].

2 Different roles of error estimation in h-, p- and


hp-adaptivity

Error estimation forms an essential part of most h- and p-adaptive finite element algorithms. Recall that h-adaptivity is based on spatial refinement of the elements with the largest contributions to the error and that p-adaptivity reduces the error by increasing the polynomial order in elements. In both cases, thanks to the low number of ways in which an element can be refined (in most cases only one), an estimate of the magnitude of the error in elements is sufficient to guide the adaptive process.

The situation, however, changes with hp-adaptivity that allows both for
pure p-refinements and spatial refinements with suitable redistribution of the
polynomial order to element sons. Typically one has several options to choose
from, and the number of possibilities increases dramatically as the polynomial
order of elements in the mesh gets higher. The situation is illustrated in Fig. 1.

Fig. 1. Several options for hp-refinement of a quadratic triangular element. The
numbers in the elements indicate their polynomial orders.

It is clear that information about the magnitude of the error in elements is
not enough to drive an automatic hp-adaptive algorithm: in order to decide
between a pure p-refinement and a (let us call it) genuine hp-refinement, and
for the selection of optimal polynomial orders in the element sons, one needs
information about the actual shape of the error.
There are several possible approaches to do this, all of them in some sense
utilizing information about higher-order spatial derivatives of the error function.
In [2] the authors apply standard error estimates to the element sons, while [6]
uses duality-based estimates of higher derivatives of the error in a goal-oriented
algorithm. We will follow the idea of [4] and calculate an approximation to the
error function by means of reference solutions.

3 Reference solutions and approximate error function


Consider a bounded domain $\Omega \subset \mathbb{R}^d$ with a Lipschitz-continuous boundary
and a standard boundary value problem

$$b(u, v) = f(v) \quad \text{for all } v \in V, \tag{1}$$

where $V = V(\Omega)$ is a Hilbert space, $b$ is a symmetric, positive-definite, elliptic
bilinear form over $V \times V$, and $f \in V'$ is a linear form.
By $T_{h,p}$ and $u_{h,p}$ we denote the coarse mesh and the coarse mesh approximation
to the exact solution $u$, respectively. For simplicity, let us say that the
mesh $T_{h,p}$ covers the domain $\Omega$ exactly.
Assume that one can use the values of the coarse mesh approximation $u_{h,p}$
to calculate a function $u_{ref}$ that approximates the exact solution $u$ substantially
better than $u_{h,p}$ itself. Then the difference

$$err_{h,p} = u_{ref} - u_{h,p} \tag{2}$$

gives a meaningful approximation of the true error

$$e_{h,p} = u - u_{h,p}, \tag{3}$$

and the function $u_{ref}$ is called the reference solution.
There are various ways to calculate reference solutions. For example, highly
accurate approximations based on Babuška's extraction formulae (postprocessing
formulae applicable mainly to lower-order elements) were used to guide
hp-adaptivity in [5]. A robust way to obtain reference solutions for elliptic
problems without any limitation on the polynomial order was proposed by
Demkowicz [4]. The function $u_{ref}$ is defined as the approximate solution on a
uniformly hp-refined mesh, i.e. on a mesh where all elements are refined so that
$h \to h/2$ and $p \to p + 1$. Since $u_{h,p}$ already contains useful information about
the lower frequencies in the solution, the higher frequencies identifying $u_{h/2,p+1}$
can be obtained with a reasonable amount of work using a two-grid solver.
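The uniform hp-refinement $h \to h/2$, $p \to p + 1$ can be sketched in one spatial dimension as follows. This is only a minimal illustration: the representation of a mesh as a list of `(left, right, order)` tuples and the names `uniform_hp_refine` and `num_dofs` are assumptions made for this example, not part of the authors' code.

```python
from typing import List, Tuple

# Hypothetical 1D mesh representation: (left endpoint, right endpoint, polynomial order).
Element = Tuple[float, float, int]

def uniform_hp_refine(mesh: List[Element]) -> List[Element]:
    """Halve every element (h -> h/2) and raise its order by one (p -> p + 1)."""
    refined = []
    for left, right, order in mesh:
        mid = 0.5 * (left + right)
        refined.append((left, mid, order + 1))
        refined.append((mid, right, order + 1))
    return refined

def num_dofs(mesh: List[Element]) -> int:
    """Count DOFs for homogeneous Dirichlet conditions at both ends:
    interior vertices plus (order - 1) interior functions per element."""
    return (len(mesh) - 1) + sum(order - 1 for _, _, order in mesh)

# Three quadratic elements as the coarse mesh.
coarse = [(0.0, 1.0, 2), (1.0, 2.0, 2), (2.0, 3.0, 2)]
fine = uniform_hp_refine(coarse)
```

In this sketch the coarse mesh carries 5 DOFs and the reference mesh 17, which illustrates why the reference problem is considerably larger than the coarse one.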

4 Projection-based interpolation

This elementwise local technique, which plays an essential role in the presented
automatic adaptive algorithm, generalizes the standard Lagrange (vertex)
interpolation to higher-order finite elements by combining it with projection onto
spaces generated by hierarchic higher-order shape functions. We can confine
ourselves to a reference domain, since this is where almost all operations in an
hp-adaptive code are performed. Choose, for example, a reference triangle $T$.
Let $p^b$ denote the polynomial order in the interior of $T$ and $p^1$, $p^2$ and $p^3$ the
polynomial orders related to its edges $e_1$, $e_2$ and $e_3$, respectively. By $v_1, v_2, v_3$
denote the vertices of $T$. Consider a sufficiently regular function $w$ defined in $T$.
The projection-based interpolant $w_{h,p}$ is constructed in three steps: First
one calculates the vertex interpolant $w^v_{h,p}$ as a linear combination of the vertex
shape functions $\varphi^{v_1}$, $\varphi^{v_2}$ and $\varphi^{v_3}$, such that

$$w^v_{h,p}(v_k) = w(v_k), \qquad k = 1, 2, 3. \tag{4}$$

In the next step one subtracts the vertex interpolant from the original function
$w$ and defines a new function

$$w^{(1)} := w - w^v_{h,p} \tag{5}$$

that vanishes at all vertices. The edge interpolant $w^e_{h,p}$ is computed in the form
of a sum of contributions over all edges,

$$w^e_{h,p} = \sum_{k=1}^{3} w^{e_k}_{h,p}. \tag{6}$$
Each function $w^{e_k}_{h,p}$, $k = 1, \dots, 3$, is a linear combination of the edge shape
functions $\varphi^{e_k}_2, \dots, \varphi^{e_k}_{p^k}$, such that

$$\left\| w^{(1)} - w^{e_k}_{h,p} \right\|_{H^{1/2}_{00}(e_k)} \tag{7}$$

is minimal. The minimum is achieved if and only if the trace of the difference
$(w^{(1)} - w^{e_k}_{h,p})|_{e_k}$ is orthogonal to the traces of all edge functions
$\varphi^{e_k}_2, \dots, \varphi^{e_k}_{p^k}$ on the edge $e_k$ in the inner product of $H^{1/2}_{00}(e_k)$.
Hence the discrete minimization problem (7) translates for each edge $e_k$,
$k = 1, \dots, 3$, into a system of $p^k - 1$ linear algebraic equations. The norm
$H^{1/2}_{00}(e_k)$ is defined using harmonic extensions of functions defined on edges
to the element interior, and therefore it is difficult to evaluate exactly. For
practical computations one can replace it by a weighted $H^1_0$-norm [4, 8].
In the last step one defines a new function

$$w^{(2)} := w^{(1)} - w^e_{h,p}. \tag{8}$$

Notice that this function generally does not vanish on edges. The bubble
interpolant $w^b_{h,p}$ is obtained by projecting $w^{(2)}$ onto the space generated by the bubble
functions of orders $p \le p^b$ in the $H^1$-seminorm. Minimization of the difference

$$\left| w^{(2)} - w^b_{h,p} \right|_{H^1(T)} \tag{9}$$

in the $H^1$-seminorm translates, analogously to the previous case, into
$(p^b - 1)(p^b - 2)/2$ linear algebraic equations.
Finally, the projection-based interpolant $w_{h,p}$ is defined as the sum of the
vertex, edge and bubble interpolants,

$$w_{h,p} = w^v_{h,p} + w^e_{h,p} + w^b_{h,p}. \tag{10}$$
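The construction can be illustrated by a one-dimensional analogue on the reference interval $(-1, 1)$, where there are no edges and only the vertex and bubble steps remain. The following is a minimal sketch under stated assumptions: the bubble functions are taken to be integrated Legendre polynomials (whose derivatives are $L^2$-orthogonal, so the $H^1$-seminorm projection decouples), the integrals are approximated by a composite Simpson rule, and all function names are hypothetical.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def bubble(k, x):
    """Integrated Legendre bubble l_k = (P_k - P_{k-2}) / (2k - 1), k >= 2.
    It vanishes at x = -1 and x = 1, and its derivative is P_{k-1}."""
    return (legendre(k, x) - legendre(k - 2, x)) / (2 * k - 1)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def projection_based_interpolant_1d(w, dw, p):
    """Vertex interpolation plus H^1-seminorm projection onto bubbles of order <= p."""
    w_left, w_right = w(-1.0), w(1.0)
    coeffs = {}
    for k in range(2, p + 1):
        # Orthogonality of the P_{k-1} gives c_k = (2k-1)/2 * integral of w' P_{k-1};
        # the constant derivative of the vertex interpolant drops out for k >= 2.
        integral = simpson(lambda x, k=k: dw(x) * legendre(k - 1, x), -1.0, 1.0)
        coeffs[k] = 0.5 * (2 * k - 1) * integral

    def w_hp(x):
        vertex_part = 0.5 * w_left * (1.0 - x) + 0.5 * w_right * (1.0 + x)
        return vertex_part + sum(c * bubble(k, x) for k, c in coeffs.items())

    return w_hp

# A cubic is reproduced by the interpolant of order p = 3.
w = lambda x: x**3 - x**2
dw = lambda x: 3 * x**2 - 2 * x
w_hp = projection_based_interpolant_1d(w, dw, p=3)
```

Because the interpolant is built from the function on one element only, it can be evaluated for each candidate refinement without solving any global problem.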

5 Adaptivity as elementwise minimization of approximate error

At the beginning of each mesh adaptation step one has at one's disposal the
following information: the coarse mesh $T_{h,p}$, the coarse mesh solution $u_{h,p}$, the
uniformly refined mesh $T_{h/2,p+1}$, the reference solution $u_{ref} = u_{h/2,p+1}$ and
the approximate error function $err_{h,p} = u_{ref} - u_{h,p}$. The question is how to
use the function $err_{h,p}$ to adapt the mesh $T_{h,p}$ in an optimal way.
Recall from Fig. 1 that there are always several ways in which an element
can be hp-refined. With no further information on the exact solution $u$, it
is not known a priori which hp-refinement is the optimal one. Therefore, a
possible strategy could be to parse through all element refinement options,
each time creating a new mesh $T'_{h,p}$, computing a new approximate solution $u'_{h,p}$,
and selecting the element refinement option that maximizes the value of the error
drop $\Delta err_{h,p}$,

$$\begin{aligned}
\Delta err_{h,p} &= \|err_{h,p}\|_{e,\Omega} - \|err'_{h,p}\|_{e,\Omega} \\
&= \|u_{ref} - u_{h,p}\|_{e,\Omega} - \|u_{ref} - u'_{h,p}\|_{e,\Omega} \\
&= \sum_{i=1}^{N} \left( \|u_{ref} - u_{h,p}\|_{e,K_i} - \|u_{ref} - u'_{h,p}\|_{e,K_i} \right).
\end{aligned} \tag{11}$$

Here $\|\cdot\|_{e,\Omega}$ means the standard energy norm

$$\|w\|^2_{e,\Omega} = b(w, w), \tag{12}$$

which is defined for every form $b$ satisfying the assumptions listed at the beginning
of Section 3.
Unfortunately, $\Delta err_{h,p}$ is a global quantity and (11) cannot be maximized
elementwise. Obviously one cannot afford to solve the global discrete problem
for each element and all its hp-refinement options.
However, at the cost of introducing an asymptotically negligible error, one
can replace the coarse mesh solution $u_{h,p}$ in (11) with the coarse mesh
interpolant $\Pi_{h,p} u_{ref}$ of the reference solution and, at the same time, the function
$u'_{h,p}$ by the interpolant $\Pi'_{h,p} u_{ref}$ of the reference solution on the new mesh $T'_{h,p}$.
This transforms (11) to

$$\Delta ERR_{h,p} = \sum_{i=1}^{N} \left( \|u_{ref} - \Pi_{h,p} u_{ref}\|_{e,K_i} - \|u_{ref} - \Pi'_{h,p} u_{ref}\|_{e,K_i} \right). \tag{13}$$

This is a step of crucial importance. Thanks to the locality of the projection-based
interpolation operators described in Section 4, the approximate error $ERR_{h,p}$
can be minimized elementwise. In other words, one does not have to solve
global discrete problems for all investigated element refinement options. For
each element $K_i$, $i = 1, 2, \dots, N$, the global problem related to the maximization
of $\Delta err_{h,p}(\Omega)$ is replaced with a local problem of maximizing $\Delta ERR_{h,p}(K_i)$,
where

$$\Delta ERR_{h,p}(K_i) = \|u_{ref} - \Pi_{h,p} u_{ref}\|_{e,K_i} - \|u_{ref} - \Pi'_{h,p} u_{ref}\|_{e,K_i}. \tag{14}$$

In each mesh optimization step the maximum of the interpolation error decrease
rates over all elements is calculated, and only elements with rates exceeding,
e.g., 1/3 of the maximum are selected for refinement. The implementation,
however, involves many important details that exceed the scope of this paper;
we refer, e.g., to [4, 7].
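The selection rule just described can be sketched in a few lines. The function name `select_elements`, the default threshold argument and the sample rates below are hypothetical and serve only to illustrate the rule of refining elements whose decrease rate exceeds a fixed fraction of the maximum.

```python
def select_elements(decrease_rates, fraction=1.0 / 3.0):
    """Return indices of elements whose error decrease rate exceeds
    the given fraction of the maximal rate over all elements."""
    max_rate = max(decrease_rates)
    return [i for i, rate in enumerate(decrease_rates) if rate > fraction * max_rate]

# Hypothetical rates for three elements K1, K2, K3 (indices 0, 1, 2).
dominated = select_elements([10.0, 40.0, 350.0])
balanced = select_elements([3.0, 2.0, 3.5])
```

With one strongly dominant rate only that element is selected, whereas comparable rates lead to the refinement of all elements.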

6 One-dimensional illustration

The algorithm is best illustrated in one spatial dimension. Competitive
refinements have a very natural structure here: Adding one DOF to a linear
element ($p = 1$) can be done either as a $p \to p + 1$ refinement or as an $h \to h/2$
refinement with $p_L = 1$, $p_R = 1$. Adding one DOF to a quadratic element can
be done either as a $p \to p + 1$ refinement or as an $h \to h/2$ refinement with
$p_L = 1$, $p_R = 2$ or $p_L = 2$, $p_R = 1$. Adding one DOF to a cubic element results
in four analogous options, and so on.
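The competitive refinements of a one-dimensional element can be enumerated mechanically. The sketch below assumes that an element of order $p$ carries $p + 1$ DOFs, so that a split into sons of orders $p_L$ and $p_R$ adds exactly one DOF when $p_L + p_R = p + 1$; the function name and the tuple encoding are hypothetical.

```python
def competitive_refinements(p):
    """List all refinements of a 1D element of order p that add exactly one DOF.

    ("p", q)        : pure p-refinement to order q = p + 1
    ("h", p_l, p_r) : bisection into sons of orders p_l and p_r, p_l + p_r = p + 1
    """
    options = [("p", p + 1)]
    for p_left in range(1, p + 1):
        options.append(("h", p_left, p + 1 - p_left))
    return options

linear_options = competitive_refinements(1)
quadratic_options = competitive_refinements(2)
cubic_options = competitive_refinements(3)
```

For a cubic element this yields four options, in agreement with the count given in the text.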
Consider, e.g., the Poisson problem $-u'' = f$ in the interval $\Omega = (0, \pi)$
with homogeneous Dirichlet boundary conditions. The function $f$ is chosen in
such a way that the function $u(x) = k(1 - x)^m \sin(nx)$, $k = 2$, $m = 2$, $n = 5$,
depicted in Fig. 2, is the exact solution.

Fig. 2. The exact solution u

The following results were obtained by means of a one-dimensional automatic
goal-oriented hp-adaptive C++ code MESHOPT (see footnote 1). One starts from
an equidistant mesh consisting of three quadratic elements. The convergence
curve in the $H^1$-seminorm is shown in Fig. 3.
Figs. 4 - 7 visualize the first few steps of the automatic hp-adaptive algorithm.

7 Incorporation of goal-oriented adaptivity

In comparison with adaptivity in the energy norm, which attempts to minimize
the energy of the residual of the approximate solution, the goal-oriented
approach attempts to control concrete features of the solved problem (quantities
of interest). Very often quantities of interest can be represented as bounded linear
functionals of the solution; see the basic literature on goal-oriented adaptivity
or, for a review, [7, 8].
1 MESHOPT can be downloaded free of charge at the web page of the first author,
http://www.caam.rice.edu/~solin.
Fig. 3. Convergence curve. x-axis: number of DOF, y-axis: $|u - u_{h,p}|^2_{H^1(0,\pi)}$ in decimal
logarithmic scale

Fig. 4. Left: $u_{h,p}$ (solid line) and $u_{ref} = u_{h/2,p+1}$ (dashed line). Right: projection
error decrease rates. Only element $K_3$ exceeds 1/3 of the maximum and is selected for
refinement. The largest decrease of the interpolation error on $K_3$ is achieved by means of its
p-refinement.

Fig. 5. Left: $u_{h,p}$ (solid line) and $u_{h/2,p+1}$ (dashed line). Right: projection error
decrease rates. This time, all elements are selected for refinement. The automatically
selected best combination is p-refinement of $K_1$ and $K_2$ and hp-refinement with
$p_L = p_R = 2$ for $K_3$.

Fig. 6. Left: $u_{h,p}$ (solid line) and $u_{h/2,p+1}$ (dashed line). Right: projection error
decrease rates. Elements $K_1$ and $K_4$ will be p-refined.

Fig. 7. Left: $u_{h,p}$ (solid line) and $u_{h/2,p+1}$ (dashed line). Right: projection error
decrease rates. Element $K_4$ will be p-refined. In the following step (not depicted),
element $K_2$ will be selected for hp-refinement with $p_L = p_R = 2$. And so on.

7.1 Dual problem and estimate of error in goal

Let us recall the basic ideas leading to the formulation of the dual problem,
since they will be used to incorporate the goal-oriented adaptivity into the
energy-driven hp-adaptive strategy discussed in Section 5.
Consider problem (1) and its discrete version $b(u_{h,p}, v_{h,p}) = f(v_{h,p})$ for all
$v_{h,p} \in V_{h,p}$, where $V_{h,p} \subset V$ is a polynomial finite element approximation of the
space $V$. Define the error $e_{h,p} = u - u_{h,p}$ and consider the residual
$r_{h,p}(v_{h,p}) = f(v_{h,p}) - b(u_{h,p}, v_{h,p})$. Relate the residual $r_{h,p}$ to the error in the quantity of
interest $L$, i.e. find $G \in V''$ such that $G(r_{h,p}) = L(e_{h,p})$. By reflexivity, $G$ can be
related to an element $v$ in the original space (the influence function),

$$G(r_{h,p}) = r_{h,p}(v) = f(v) - b(u_{h,p}, v) = b(u, v) - b(u_{h,p}, v) = b(e_{h,p}, v) = L(e_{h,p}), \tag{15}$$

where $v$ is the solution to the dual problem: Find $v \in V$ such that

$$b(u, v) = L(u) \tag{16}$$

for all $u \in V$. Consider the discrete dual problem $b(u, v_{h,p}) = L(u)$ for all
$u \in V_{h,p}$. Estimate the error in the quantity of interest by means of the errors
in the energy norm for both the primal and the dual problem:

$$\begin{aligned}
|L(u) - L(u_{h,p})| &= |L(u - u_{h,p})| = |b(u - u_{h,p}, v)| \\
&= |b(u - u_{h,p}, v - v_{h,p})| \le C \sum_{K \in T_{h,p}} \|u - u_{h,p}\|_{e,K} \, \|v - v_{h,p}\|_{e,K}.
\end{aligned} \tag{17}$$

Here the standard orthogonality property for the error in the solution was used.
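For the reader's convenience, the two steps behind the last line of the estimate can be spelled out. This is only a sketch: it assumes that the form $b$ is a sum of element contributions $b_K$ (as is the case for forms given by integrals over the domain), a notation introduced here purely for illustration. Galerkin orthogonality of the primal error gives

$$b(u - u_{h,p}, v_{h,p}) = 0 \ \text{ for all } v_{h,p} \in V_{h,p} \quad\Longrightarrow\quad b(u - u_{h,p}, v) = b(u - u_{h,p}, v - v_{h,p}),$$

and the Cauchy-Schwarz inequality applied on each element yields

$$|b(u - u_{h,p}, v - v_{h,p})| = \Big| \sum_{K} b_K(u - u_{h,p}, v - v_{h,p}) \Big| \le \sum_{K} \|u - u_{h,p}\|_{e,K} \, \|v - v_{h,p}\|_{e,K}.$$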

7.2 Elementwise minimization of approximate error in goal

Recall that the energy-driven hp-adaptive algorithm from Section 5 minimizes
the global error in the energy norm by elementwise maximizing the drop
of the projection-based interpolation error (13) of the reference solution $u_{ref}$ from
the coarse mesh $T_{h,p}$ to the next optimal mesh $T'_{h,p}$.
The estimate (17) shows that the error $|L(u) - L(u_{h,p})|$ in goal is
controlled by the errors of both the primal and dual solutions in the energy norm.
Therefore, an hp-adaptive algorithm will minimize the error in goal if, instead
of elementwise maximizing (13), it elementwise maximizes the product

$$\Delta ERR^{goal}_{h,p}(K) = \Delta ERR^{primal}_{h,p}(K) \, \Delta ERR^{dual}_{h,p}(K), \tag{18}$$

where

$$\Delta ERR^{primal}_{h,p}(K) = \|u_{ref} - \Pi_{h,p} u_{ref}\|_{e,K} - \|u_{ref} - \Pi'_{h,p} u_{ref}\|_{e,K}$$

and

$$\Delta ERR^{dual}_{h,p}(K) = \|v_{ref} - \Pi_{h,p} v_{ref}\|_{e,K} - \|v_{ref} - \Pi'_{h,p} v_{ref}\|_{e,K}.$$

Here $v_{ref} = v_{h/2,p+1}$ is the reference solution to the dual problem, calculated
on the uniformly hp-refined mesh $T_{h/2,p+1}$.
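The elementwise choice driven by the product of the primal and dual error drops can be sketched as follows. The candidate labels and the numerical values are hypothetical; in an actual code the two drops would be obtained from the projection-based interpolation errors of the primal and dual reference solutions.

```python
def best_goal_refinement(candidates):
    """Pick the refinement candidate of one element that maximizes the product
    of the primal and dual interpolation error drops.

    candidates: list of (label, primal_drop, dual_drop) tuples.
    """
    return max(candidates, key=lambda c: c[1] * c[2])

# Hypothetical candidates for a single quadratic element K.
candidates_K = [
    ("p-refinement", 0.40, 0.10),
    ("h-refinement, p_L=1, p_R=2", 0.30, 0.20),
    ("h-refinement, p_L=2, p_R=1", 0.25, 0.15),
]
label, primal_drop, dual_drop = best_goal_refinement(candidates_K)
```

Note that the candidate with the largest primal drop is not selected here, since the dual solution is only weakly improved by it; this is exactly the effect of weighting by the dual problem.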

8 Reference to numerical examples

The performance of the presented goal-oriented hp-adaptive strategy is quite
impressive in comparison with the same algorithm applied to standard
goal-oriented adaptivity or to standard (energy-driven) hp-adaptivity. However,
there is little space here for a sufficient presentation of a realistic model problem
and a meaningful discussion of numerical results. We refer the reader to a
recent book [8] together with the paper [7], which uses the presented automatic
goal-oriented hp-adaptive strategy to resolve a challenging industrial application
related to axisymmetric Maxwell's equations.

Acknowledgment

The work of the first author was supported by the Grant Agency of the Czech
Republic under Grant No. GP102/01/D114. The second author acknowledges
the financial support of the Air Force under Contract F49620-98-1-0255.

References
1. Ainsworth, M., Senior, B. (1997): Aspects of an hp-adaptive Finite Element
Method: Adaptive Strategy, Conforming Approximation and Efficient Solvers.
Comput. Methods Appl. Mech. Engrg., 150, 65-87.
2. Ainsworth, M., Senior, B. (1997): Aspects of an hp-adaptive Finite Element
Method: Adaptive Strategy, Conforming Approximation and Efficient Solvers.
Comput. Methods Appl. Mech. Engrg., 150, 65-87.
3. Demkowicz, L., Oden, J. T., Rachowicz, W., Hardy, O. (1989): Toward a Universal
hp-Adaptive Finite Element Strategy. Part 1: Constrained Approximation
and Data Structure. Comput. Methods Appl. Mech. Engrg., 77, 79-112.
4. Demkowicz, L., Rachowicz, W., Devloo, Ph. (2002): A Fully Automatic
hp-Adaptivity. J. Sci. Comput., 17, Nos. 1-3, 127-155.
5. Rachowicz, W., Oden, J. T., Demkowicz, L. (1989): Toward a Universal
hp-Adaptive Finite Element Strategy. Part 3: Design of hp Meshes. Comput.
Methods Appl. Mech. Engrg., 77, No. 2, 181-212.
6. Heuveline, V., Rannacher, R. (2003): Duality-based adaptivity and the h-p finite
element method. In: M. Feistauer (ed) Proceedings of ENUMATH (August 18 -
22, 2003, Prague, Czech Republic), Springer Berlin Heidelberg
7. Solin, P., Demkowicz, L. (2003): Goal-oriented hp-adaptivity for elliptic problems.
Comput. Methods Appl. Mech. Engrg., accepted.
8. Solin, P., Segeth, K., Dolezel, I. (2003): Higher-Order Finite Element Methods.
Chapman & Hall/CRC Press, Boca Raton, London, New York, Washington, D.C.
A Compression Method for the Helmholtz
Equation

Mirjam Stolper 1 and Sergej Rjasanow 2

1 Saarland University, FR. 6.1 Mathematik, 66041 Saarbrücken
[email protected]
2 Saarland University, FR. 6.1 Mathematik, 66041 Saarbrücken
[email protected]

Summary. The collocation boundary element method for the Dirichlet boundary
value problem is considered. In order to solve efficiently the resulting linear systems
for several wave numbers, the adaptive cross approximation (ACA) method is applied
to the matrices. In particular, the algorithm is reformulated for complex problems
and the so-called Fourier method is used to compute the necessary entries. Finally,
some numerical examples for the solution are presented.

1 Introduction

The Helmholtz equation

$$\Delta u + \kappa^2 u = 0, \qquad \kappa = \frac{\omega}{c} \in \mathbb{R}_+,$$

arises in many physical problems related to wave propagation. In acoustic
applications, $\omega$ and $c$ are the frequency and the speed of sound, and $u$
corresponds to the pressure field. We are interested in the solutions of the
associated exterior Dirichlet boundary value problem (BVP)

$$\begin{aligned}
\Delta u(y) + \kappa^2 u(y) &= 0, && y \in \mathbb{R}^3 \setminus \overline{\Omega}, \\
u(y) &= g(y), && y \in \Gamma, \\
\left( \frac{\partial}{\partial r} - i\kappa \right) u(y) &= o\left(|y|^{-1}\right) && \text{for large } |y| = r,
\end{aligned} \tag{1}$$

for a spectrum of real wave numbers $0 \le \kappa \le \kappa_{max}$, where $\kappa_{max}$ corresponds
to the highest frequency. In (1), $\Gamma = \partial\Omega$ denotes the smooth boundary
of the bounded, simply connected domain $\Omega$, and $g$ is a given function. Using
boundary element methods (BEM) to treat these problems, we need to solve
a large linear system for each wave number. The memory requirement for each
problem is $Mem = O(N^2)$, and the cost of a naive procedure for the matrix-vector
multiplications (using an iterative solver) is $Op = O(M N^2)$, where $M$
denotes the number of frequencies and $N$ is the number of degrees of freedom
of the BEM discretisation. Typical values are $N = 10^3 - 10^4$ for the dimension of
the problem and $M = 10 - 10^2$ for the number of wave numbers of interest.
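To make these orders of magnitude concrete, the following small sketch evaluates the dense storage and the matrix-vector work for the typical values quoted above. It assumes double-precision complex entries (16 bytes each) and counts one multiply-add per matrix entry; both the constants and the function names are illustrative assumptions.

```python
BYTES_PER_COMPLEX_DOUBLE = 16  # assumption: two 8-byte floats per matrix entry

def dense_matrix_bytes(n):
    """Storage of one dense n x n complex matrix, i.e. O(N^2) memory."""
    return BYTES_PER_COMPLEX_DOUBLE * n * n

def matvec_operations(n, num_frequencies, iterations=1):
    """Multiply-adds for one matrix-vector product per iteration and frequency,
    i.e. O(M N^2) work for a single sweep over all frequencies."""
    return num_frequencies * iterations * n * n

storage_gb = dense_matrix_bytes(10**4) / 1e9
work = matvec_operations(10**4, 100)
```

For $N = 10^4$ a single dense matrix already occupies about 1.6 GB under these assumptions, which motivates the compression discussed in the remainder of the paper.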

Whereas in [10] and [11] we introduced a numerical method for computing the
associated matrices, which is based on the Fourier transform with respect to
the wave number $\kappa$, we now discuss an efficient method for solving the resulting
linear systems. In particular, we apply the Adaptive Cross Approximation
(ACA) method (see e.g. [1], [2]) to the transformed matrices and discuss the
behaviour of the compression factors in dependence on the wave number.
The paper is organised as follows.
In Sect. 2, we consider the boundary integral formulation for the problem and
its discrete form. A review of the Fourier method for computing the matrices
is presented in Sect. 3. In Sect. 4, we reformulate the ACA algorithm for the
complex problem. Finally, we give some numerical results (Sect. 5).

2 Boundary Integral Formulation and Collocation Method

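We define the combined single- and double-layer potential

$$(B - i\eta A)[f](y) = \int_\Gamma \left[ \frac{\partial G(x, y, \kappa)}{\partial n_x} f(x) - i\eta \, G(x, y, \kappa) f(x) \right] dF_x, \qquad y \in \mathbb{R}^3 \setminus \Gamma,$$

where $\eta \in \mathbb{R}_+$, and $G(x, y, \kappa)$ is the fundamental solution of the Helmholtz
equation defined by

$$G(x, y, \kappa) = \frac{1}{4\pi} \frac{e^{i\kappa|x - y|}}{|x - y|}, \qquad x, y \in \mathbb{R}^3.$$

Note that the fundamental solution and thus the above potential satisfy the
Sommerfeld radiation condition.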
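The fundamental solution is straightforward to evaluate numerically. The helper below is a minimal sketch with a hypothetical function name; it uses only the formula for $G$ stated above.

```python
import cmath
import math

def helmholtz_kernel(x, y, kappa):
    """Fundamental solution G(x, y, kappa) = exp(i*kappa*r) / (4*pi*r), r = |x - y|,
    for two distinct points x, y in R^3 given as coordinate tuples."""
    r = math.dist(x, y)
    return cmath.exp(1j * kappa * r) / (4.0 * math.pi * r)

# Two points at distance r = 2 and wave number kappa = pi/2, so exp(i*kappa*r) = -1.
value = helmholtz_kernel((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), math.pi / 2)
```

The modulus of the kernel equals $1/(4\pi r)$ for every real wave number, so only the phase oscillates as $\kappa$ grows; for $\kappa = 0$ the Laplace kernel is recovered.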
Using this potential to treat the Dirichlet BVP (1), we need to solve the
boundary integral equation (BIE)

$$\left( \tfrac{1}{2} I + B - i\eta A \right)[f](y) = g(y), \qquad y \in \Gamma. \tag{2}$$

For all wave numbers $\kappa$, the following uniqueness theorem holds.
Theorem 1. The exterior Dirichlet BVP (1) has a unique solution for all
$g \in H^s(\partial\Omega)$, $s \in \mathbb{R}$,

$$u(y) = \int_\Gamma \left[ \frac{\partial G(x, y, \kappa)}{\partial n_x} - i\eta \, G(x, y, \kappa) \right] f(x) \, dF_x \in H^{s+1/2}_{loc}(\Omega^c). \tag{3}$$

In (3), $f \in H^s(\partial\Omega)$ denotes the unique solution of the BIE (2).

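For more details on the analysis of the Helmholtz equation, we refer the reader
to [3], [5].
In order to solve equation (2) numerically, the surface $\Gamma$ is discretised using
a system of $N$ plane triangular panels, $\Gamma \approx \Gamma_h = \bigcup_{j=1}^{N} \Gamma_j$. It is natural to use
the approximate function $f_h$ in the form $f \approx f_h(x) = \sum_{j=1}^{N} v_j \varphi_j(x)$,
where the associated ansatz functions $\varphi_j$, $j = 1, \dots, N$, are piecewise constant
on $\Gamma_j$.
Therefore, the BIE above leads to

$$\left( \tfrac{1}{2} I + B - i\eta A \right) v = g, \tag{4}$$

where the elements of the matrices $A, B \in \mathbb{C}^{N \times N}$ are defined by

$$a_{ij} = \frac{1}{4\pi} \int_{\Gamma_j} \frac{e^{i\kappa r}}{r} \, dF_x, \qquad b_{ij} = \frac{1}{4\pi} \int_{\Gamma_j} \frac{(i\kappa r - 1) \, e^{i\kappa r}}{r^3} \, (n_x, x - y_i) \, dF_x,$$

where $r := |x - y_i|$. The vector $v \in \mathbb{C}^N$ and the right-hand side of the system
are given by $(v)_j = v_j$ and $(g)_i = g(y_i)$, where $y_i$, $i = 1, \dots, N$, denote the
corresponding collocation points. It should be remarked that the matrices in
(4) explicitly depend on the wave number $\kappa$.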

3 Fourier-Method

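In order to compute the matrices in Eq. (4) for several wave numbers, we first
apply the inverse Fourier transformation ($\kappa \to \xi$) to the matrices.
The elements of the inverse Fourier transformed matrix
$\tilde{A}(\xi) = \mathcal{F}^{-1}_{\kappa,\xi}[A](\xi) \in \mathbb{R}^{N \times N}$ are given by

$$\tilde{a}_{ij}(\xi) = \frac{1}{4\pi} \int_{\Gamma_j} \frac{\delta(\xi - r)}{r} \, dF_x, \qquad i, j = 1, \dots, N, \tag{5}$$

as a consequence of $\mathcal{F}^{-1}_{\kappa,\xi}[e^{i\kappa r}](\xi) = \mathcal{F}^{-1}_{\kappa,\xi}[1](\xi - r) = \delta(\xi - r)$. In the case of
the double-layer potential matrix, we remark

$$\mathcal{F}^{-1}_{\kappa,\xi}\left[ (i\kappa r - 1) \, e^{i\kappa r} \right](\xi) = \left. \left( r \frac{d}{dz} - 1 \right) \delta(z) \right|_{z = r - \xi}$$

and obtain $\tilde{B}(\xi) \in \mathbb{R}^{N \times N}$ with

$$\tilde{b}_{ij}(\xi) = \frac{1}{4\pi} \int_{\Gamma_j} \frac{1}{r^3} \left. \left( r \frac{d}{dz} - 1 \right) \delta(z) \right|_{z = r - \xi} (n_x, x - y_i) \, dF_x, \qquad i, j = 1, \dots, N. \tag{6}$$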
Since we consider plane triangular elements $\Gamma_j$, the analytical computation of the
entries (5) and (6) is based on an appropriate transformation of the coordinates.
For more details on the analytical computation, we refer the reader to [11].
Following this paper, we restrict our further discussion to the special case
depicted in Fig. 1

Fig. 1. The computation of the entries of the matrices

and obtain the following analytical expressions:

and

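$$\tilde{b}_{ij}(\xi) = \frac{1}{4\pi} \Bigg( \frac{d}{|d|} \, \Phi \, \delta(\xi - |d|) - \underbrace{\frac{d \, h}{(\xi^2 - d^2) \sqrt{\xi^2 - d^2 - h^2}} \, \mathbb{1}_{[\xi_{min}, \xi_{max}]}(\xi)}_{\beta_{ij}(\xi)} \Bigg), \qquad i \neq j. \tag{8}$$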
Notice that the diagonal elements of $\tilde{B}(\xi)$ are zero.
In (7) and (8), $\xi_{max}$ denotes the maximal distance of the vertices of the triangle
to the origin, and $\xi_{min}$ is defined as the minimum of $\xi_0 := \sqrt{d^2 + h^2}$ and the
minimal distance of the vertices of the triangle to the origin. Further, $\Phi$ is the
maximal angle between the triangle and the $e_1$-axis.
The expressions above imply that each element has a local support. Therefore
the matrices $\tilde{A}(\xi)$ and $\tilde{B}(\xi)$ have sparse structures for a fixed $\xi \in [0, \mathrm{diam}(\Gamma)]$.
It should be remarked that $\alpha_{ij}(\xi)$ and $\beta_{ij}(\xi)$ are integrable functions with
antiderivatives $\sigma_{ij}(\xi)$ and $\gamma_{ij}(\xi)$. Furthermore, $\beta_{ij}(\xi)$ becomes singular at the point
$\xi = \xi_0$.

We return to matrices which depend on the wave number by applying the
Fourier transformation ($\xi \to \kappa$), i.e. $C(\kappa) = \mathcal{F}_{\xi,\kappa}[\tilde{C}(\xi)](\kappa)$ with $C = A$ and
$C = B$, respectively. Using the expressions (7) and (8), we obtain new matrices
$A(\kappa), B(\kappa) \in \mathbb{C}^{N \times N}$, whose elements are defined by

(9)

and $b_{ii}(\kappa) = 0$,

bij(K) = ~sgn(d)eil<ldl- eil<~o (Yij(~)I~~ax+ ~W/~ij(6)sinc(K6)eil<~I). (10)


4n 4n ~~in 1=1

In particular, we need to solve the linear system

$$\left( \tfrac{1}{2} I + B(\kappa) - i\eta A(\kappa) \right) v = g. \tag{11}$$

Since the antiderivative $\sigma_{ij}(\xi)$ and the approximation function $\alpha_{ij}(\xi)$ in (9) and,
similarly, $\gamma_{ij}(\xi)$ and $\beta_{ij}(\xi)$ in (10) do not depend on the wave number $\kappa$, they
need to be calculated only once. Thus we treat the respective linear
systems (11) for several $\kappa \le \kappa_{max}$ always using these precomputed data.

4 Application of the ACA


In order to solve the above linear system efficiently, we consider the combination
of the described method with an approximation method. In particular,
such numerical methods provide an approximation of the solution in almost
linear complexity by solving a perturbed linear system, in which the matrix
is compressed. Schemes such as fast multipole [4], panel clustering [9], and
$\mathcal{H}$-matrices [7], [8] are based on explicitly given kernel approximations by
degenerate kernels, i.e. finite sums of separable functions. In contrast, the ACA
algorithm [1], which uses the $\mathcal{H}$-matrix format, is purely algebraic and relies on
a small part of the system matrix for its blockwise approximation by low-rank
matrices.
Theoretical and numerical aspects of the ACA for collocation matrices that
contain asymptotically smooth kernels are discussed in detail in [2]. It should
be pointed out that the devised algorithms can formally be applied to matrices
for which the kernel is not asymptotically smooth but can be approximated
by a degenerate kernel, e.g. the kernel of the Helmholtz operator. The
following estimate is shown in [6].
Lemma 1. Let $x, y \in \mathbb{R}^3$ and $\gamma := |y|/|x| < \gamma_0 < 1$. Then it holds that

eil<lx-YI I el<lxl
I Ix _ yl - Lp(x, y) :::; C
N P2 y P ,

where

and $c$ depends only on $\gamma_0$. In (12), $j_n$ denote the spherical Bessel functions, $h_n^{(1)}$
the spherical Hankel functions, and $P_n$ the Legendre polynomials.
Following the paper [2], we outline the method for the Helmholtz equation,
i.e. we reformulate the so-called partially pivoted ACA for the collocation
matrices of the Helmholtz equation.
Let $C \in \mathbb{C}^{m' \times n'}$ be a given block. The method produces vectors $u_l \in \mathbb{C}^{m'}$,
$v_l \in \mathbb{C}^{n'}$, $l = 1, \dots, k$, from which the approximant $S_k$ can be formed,

$$S_k = \sum_{l=1}^{k} u_l v_l^*,$$

and it holds that $C = S_k + R_k$. In particular, we obtain the following algorithm.

Algorithm (partially pivoted ACA)

Let $S_0 = 0$ and $R_0 = C$.
$Z$ denotes the index set of the computed rows of $R_k$, $k \ge 0$. Let $Z := \emptyset$ and
$i_1 := 1$.
For $k = 0, 1, \dots$ compute
1. $k := k + 1$
2. (*) If $i_k > m'$, then exit. Else
3. $Z := Z \cup \{i_k\}$ and $\tilde{v}_k = C^* e_{i_k} - \sum_{l=1}^{k-1} \overline{(u_l)_{i_k}} \, v_l$
4. Find $j_k$ such that $|(\tilde{v}_k)_{j_k}| = \max_j |(\tilde{v}_k)_j|$
5. If $(\tilde{v}_k)_{j_k} = 0$, then $i_k := i_k + 1$ and go to (*)
6. $v_k = (\tilde{v}_k)_{j_k}^{-1} \tilde{v}_k$ and $u_k = C e_{j_k} - \sum_{l=1}^{k-1} \overline{(v_l)_{j_k}} \, u_l$
7. $S_k = S_{k-1} + u_k v_k^*$
8. Find $i_{k+1}$ such that $|(u_k)_{i_{k+1}}| = \max_{i \notin Z} |(u_k)_i|$
until the stopping criterion is fulfilled.

Since the block $C$ will not be generated completely, the Frobenius norm of the
approximant $S_k$ will be used to obtain the stopping criterion. An appropriate
stopping criterion is to terminate the iteration if, for a given $\varepsilon > 0$, at step $k$
it holds that

(12)

where the value of $\|S_k\|_F$ can be computed recursively as follows:

$$\|S_k\|_F^2 = \|S_{k-1}\|_F^2 + 2 \, \mathrm{Re} \sum_{l=1}^{k-1} u_k^* u_l \, v_l^* v_k + \|u_k\|_F^2 \, \|v_k\|_F^2.$$

Note that we need $O((m' + n') k^2)$ operations to generate the approximant $S_k$,
and its memory requirement is $O((m' + n') k)$.
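The algorithm can be sketched in pure Python for a complex block that is accessed entry by entry. This is a minimal illustration under stated assumptions, not the authors' implementation: the block is supplied through a hypothetical callback `entry(i, j)`, indices are zero-based, the approximant is stored as $S_k = \sum_l u_l v_l^*$, a zero pivot row is handled by moving to the next unused row, and the stopping test $\|u_k\| \, \|v_k\| \le \varepsilon \|S_k\|_F$ is a common choice that is assumed here.

```python
import math

def dot(a, b):
    """Return a^H b, i.e. the inner product conjugating the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def aca_partial_pivot(entry, m, n, eps=1e-4):
    """Partially pivoted ACA of an m x n block given by entry(i, j).

    Returns lists us, vs such that C is approximated by sum_l u_l v_l^H.
    """
    us, vs = [], []
    used_rows = set()
    norm_s2 = 0.0  # running value of ||S_k||_F^2
    i = 0          # current pivot row
    while i is not None and len(us) < min(m, n):
        used_rows.add(i)
        # Conjugate-transposed row i of the residual R = C - S.
        v_tilde = [(entry(i, j) - sum(u[i] * v[j].conjugate() for u, v in zip(us, vs))).conjugate()
                   for j in range(n)]
        j = max(range(n), key=lambda col: abs(v_tilde[col]))
        if v_tilde[j] == 0:
            # Zero residual row: continue with the next unused row, if any.
            i = next((r for r in range(m) if r not in used_rows), None)
            continue
        v_new = [z / v_tilde[j] for z in v_tilde]
        # Column j of the residual.
        u_new = [entry(r, j) - sum(u[r] * v[j].conjugate() for u, v in zip(us, vs))
                 for r in range(m)]
        # Recursive update of the squared Frobenius norm of the approximant.
        cross = sum(dot(u_new, u) * dot(v, v_new) for u, v in zip(us, vs))
        nu2, nv2 = dot(u_new, u_new).real, dot(v_new, v_new).real
        norm_s2 += 2.0 * complex(cross).real + nu2 * nv2
        us.append(u_new)
        vs.append(v_new)
        if math.sqrt(nu2 * nv2) <= eps * math.sqrt(max(norm_s2, 0.0)):
            break
        rest = [r for r in range(m) if r not in used_rows]
        i = max(rest, key=lambda r: abs(u_new[r])) if rest else None
    return us, vs

# Usage on a hypothetical complex 5 x 4 block of exact rank two.
a = [1 + 1j, 2 - 1j, 0.5j, -1.0 + 0j, 3 + 2j]
b = [1.0 + 0j, 1j, 2 - 1j, -0.5 + 0j]
c = [0j, 1.0 + 0j, -1j, 2 + 1j, 1.0 + 0j]
d = [2j, 1.0 + 0j, 1.0 + 0j, -1 + 1j]
C = [[a[r] * b[s].conjugate() + c[r] * d[s].conjugate() for s in range(4)] for r in range(5)]
us, vs = aca_partial_pivot(lambda r, s: C[r][s], 5, 4)
max_err = max(abs(C[r][s] - sum(u[r] * v[s].conjugate() for u, v in zip(us, vs)))
              for r in range(5) for s in range(4))
```

Only the rows and columns selected as pivots are requested from `entry`, which is the property that makes the method attractive when each matrix entry is expensive to compute.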

5 Numerical Experiments

Fig. 2. The surfaces (I) and (II), the latter with diam($\Gamma$) = 0.8 m

We first consider the time-harmonic acoustic scattering of a given incoming
plane wave

$$u^I(x) = e^{i\kappa(x, d)}, \qquad d^T = (0, 0, 1),$$

by a sound-soft domain $\Omega$. Then the total acoustic wave takes the form $u = u^I + u^S$,
where $u^S$ denotes the scattered wave satisfying the Sommerfeld radiation
condition. Further, the homogeneous Dirichlet boundary condition holds for $u$, i.e.
$g = -u^I$. In particular, we present numerical experiments for the BIE (4).
Since we chose the surface of the unit sphere, see Fig. 2 (I), the exact solution $f(y)$
is known. The solution obtained with the Fourier method is denoted by $f_{FT}$.
We apply the algorithm to a family of surfaces converging to the unit sphere.
The sequence is generated by recursive refinement of the meshes, dividing each
surface triangle into four and projecting the new nodes onto the unit sphere,
cf. [2].
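The refinement procedure can be sketched as follows. The starting mesh (an octahedron with 8 triangles) and all names are assumptions chosen for brevity; the meshes of the paper start from 80 triangles, but the rule is the same: each triangle is split into four by its edge midpoints, and the new nodes are projected onto the unit sphere.

```python
import math

def normalize(point):
    """Project a non-zero point radially onto the unit sphere."""
    length = math.sqrt(sum(c * c for c in point))
    return tuple(c / length for c in point)

def refine_on_sphere(vertices, triangles):
    """Split every triangle into four and project the new nodes onto the unit sphere."""
    vertices = list(vertices)
    midpoint_index = {}

    def midpoint(i, j):
        key = (min(i, j), max(i, j))  # each edge is shared by two triangles
        if key not in midpoint_index:
            mid = tuple(0.5 * (p + q) for p, q in zip(vertices[i], vertices[j]))
            vertices.append(normalize(mid))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    refined = []
    for i, j, k in triangles:
        a, b, c = midpoint(i, j), midpoint(j, k), midpoint(k, i)
        refined += [(i, a, c), (a, j, b), (c, b, k), (a, b, c)]
    return vertices, refined

# Hypothetical coarse mesh: the octahedron inscribed in the unit sphere.
octa_vertices = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
octa_triangles = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
                  (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]
new_vertices, new_triangles = refine_on_sphere(octa_vertices, octa_triangles)
```

Each refinement step multiplies the number of panels by four, which matches the sequence $N = 80, 320, 1280, \dots$ used in the tables.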
For the approximation of the blocks we used the above algorithm, and $\varepsilon$ in (12)
is chosen as $10^{-4}$, while the relative accuracy of the unpreconditioned GMRES was
$10^{-6}$.
We always consider the compression factors (CF), i.e. the ratios of the amount
of storage needed when using the approximant to the amount of storage for
the original matrix.
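The definition of the compression factor can be illustrated by a short computation. The block partition below is hypothetical; the sketch assumes that a block of size $m' \times n'$ approximated with rank $k$ stores $k(m' + n')$ entries, in line with the memory requirement stated in Sect. 4, while blocks without a low-rank approximant are stored densely.

```python
def compression_factor(blocks):
    """Compression factor in percent for a list of blocks (rows, cols, rank).

    rank=None marks a block that is stored densely; otherwise the block
    is stored as rank * (rows + cols) entries.
    """
    compressed = sum(rows * cols if rank is None else rank * (rows + cols)
                     for rows, cols, rank in blocks)
    dense = sum(rows * cols for rows, cols, _ in blocks)
    return 100.0 * compressed / dense

# Hypothetical 2 x 2 block partition: dense diagonal blocks, rank-5 off-diagonal blocks.
cf = compression_factor([(100, 100, None), (100, 100, 5),
                         (100, 100, 5), (100, 100, None)])
```

In this toy partition the factor is 55 %; as the matrix grows, the share of admissible off-diagonal blocks increases and the factor drops, which is the trend visible in the tables.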
In the tables below, the factors are listed for different $N$, the first one for
a fixed wave number $\kappa = \pi/2$ (Table 1) and the second one for various wave
numbers with $\kappa \cdot h \approx 0.36$ (Table 2).

Table 1. Compression factors for $\kappa = \pi/2$

N       h      CF (%)   ||f - f_FT||_L2   ||f - f_FT||_L2 / ||f||_L2   #It.
80      0.97   100      3.91E-01          6.14E-02                     5
320     0.51   93       1.07E-01          1.62E-02                     5
1280    0.25   46       2.75E-02          4.09E-03                     5
5120    0.13   18       6.96E-03          1.03E-03                     5
20480   0.06   6        1.93E-03          2.86E-04                     4

Table 2. Compression factors for various $\kappa$ with $\kappa \cdot h \approx 0.36$

N       κ      CF (%)   ||f - f_FT||_L2   ||f - f_FT||_L2 / ||f||_L2   #It.
80      0.62   100      2.7E-01           3.74E-02                     4
320     1.20   92       1.05E-01          1.54E-02                     5
1280    2.28   49       2.65E-02          4.25E-03                     5
5120    4.60   22       1.16E-02          2.24E-03                     6
20480   9.24   8        1.93E-02          3.87E-03                     9

Figure 3 shows the behaviour of the compression factors in dependence on the
wave number for different $N$.
In the graph, the curves correspond, from top to bottom, to the cases $N = 80$, $N = 320$,
$N = 1280$ and $N = 5120$. We observe that the compression factors increase with
increasing wave number.
Similarly, for the surface depicted in Fig. 2 (II), we consider the problem

$$A[f](y) = \left( -\tfrac{1}{2} I + B \right)[w](y) \tag{13}$$

for the exterior Dirichlet BVP for the Helmholtz equation using collocation
with piecewise constant ansatz functions. Since we chose $w = G(x, y_0, \kappa)$ with
$x \in \Gamma$ and $y_0 \in \Omega$, the solution of equation (13) is known to be
$f = \partial_{n_x} G(x, y_0, \kappa)|_{x \in \Gamma}$. Table 3 shows the behaviour of the compression factors of
the single- and double-layer potential matrices for various wave numbers with
$\kappa \cdot h \approx 0.5$ for different $N$.

Fig. 3. Compression factors (approximation in %) in dependence on the wave number $\kappa$

Table 3. Compression factors for various $\kappa$ with $\kappa \cdot h \approx 0.5$

N       κ    CF(SL) (%)   CF(DL) (%)   ||f - f_FT||_L2   ||f - f_FT||_L2 / ||f||_L2   #It.
1356    11   50           64           6.52E-02          1.08E-01                     38
5424    23   23           28           9.34E-02          1.15E-01                     66
21696   46   9            10           3.18E-01          2.27E-01                     150

References
1. Bebendorf, M. (2000): Approximation of boundary element matrices. Numer.
Math., 86, 565-589
2. Bebendorf, M., Rjasanow, S. (2003): Adaptive low-rank approximation of
collocation matrices. Computing, 70, 1-24
3. Chen, G., Zhou, J. (1992): Boundary element methods. Academic Press Ltd.,
London
4. Cheng, H., Greengard, L., Rokhlin, V. (1999): A fast adaptive multipole
algorithm in three dimensions. J. Comput. Phys., 155, 468-498
5. Colton, D. L., Kress, R. (1992): Integral equation methods in scattering theory.
Krieger Publishing Company, Malabar
6. Goreinov, S. A. (1999): Mosaic-skeleton approximation of matrices generated by
asymptotically smooth and oscillatory kernels. In: Tyrtyshnikov, E. (ed.) Matrix
Methods and Algorithms, INM RAS, Moscow (in Russian)
7. Hackbusch, W. (1999): A sparse matrix arithmetic based on H-matrices. I.
Introduction to H-matrices. Computing, 62, 89-108
8. Hackbusch, W., Khoromskij, B. N. (2000): A sparse H-matrix arithmetic. II.
Application to multi-dimensional problems. Computing, 64, 21-47
9. Hackbusch, W., Nowak, Z. P. (1989): On the fast matrix multiplication in the
boundary element method by panel clustering. Numer. Math., 54, 463-491
10. Kohl, M., Rjasanow, S. (2003): Multifrequency analysis for the Helmholtz
equation. Comput. Mech., 32, 234-239
11. Stolper, M. (2004): Computing and compression of the boundary element
matrices for the Helmholtz equation. J. Numer. Math., 12, 55-75, in press.
Application of a Stabilized FEM to Problems of
Aeroelasticity

Petr Svacek 1 and Miloslav Feistauer 2

1 Department of Technical Mathematics, Faculty of Mechanical Engineering, Czech
Technical University Prague, Karlovo nám. 13, 121 35 Praha 2, Czech Republic
[email protected]. cuni. cz
2 Faculty of Mathematics and Physics, Charles University Prague, Sokolovská 83,
186 75 Praha 8, Czech Republic feist@karlin.mff.cuni.cz

Summary. This paper is concerned with the modelling of fluid-structure interaction.
We consider two-dimensional viscous incompressible flow past a moving airfoil, which
is considered as a solid body with two degrees of freedom, allowing vertical and
torsional oscillations of the airfoil. The fluid flow is simulated by the Navier-Stokes
equations in the Arbitrary Lagrangian-Eulerian formulation, discretized by the finite
element method. We describe the SUPG stabilization of the FEM, the time
discretization, the equations describing the motion of the airfoil and the solution of the discrete
problem. The solution of a test problem is presented.

1 Formulation of a flow problem in a moving domain

We analyze numerically two-dimensional viscous incompressible flow past a
moving airfoil, which is considered as a solid body with two degrees of freedom,
allowing vertical and torsional oscillations of the airfoil. The study of
this problem plays an important role in the design of aerospace vehicles. The
aero-elastic stability of aerospace vehicles and the aero-elastic responses
represented by dynamic load prediction and vibration levels in wings, tails and
other aerodynamic surfaces have a great impact on the design as well as on the
cost and operational safety.
We assume that $(0, T)$ with $T > 0$ is a time interval and by $\Omega_t$ we denote
a computational domain occupied by the fluid at time $t$. By $u = u(x, t)$ and
$p = p(x, t)$, $x \in \Omega_t$, $t \in (0, T)$, we denote the velocity and the kinematic
pressure (i.e., dynamic pressure divided by the density of the fluid), respectively,
and $\nu$ will denote the kinematic viscosity.

Fig. 1. The Lagrangian (left) and Arbitrary Lagrangian-Eulerian (right) mappings

In order to simulate flow in a moving domain, we employ the Arbitrary
Lagrangian-Eulerian (ALE) method. Let us denote by $\Omega_{ref}$ the computational
domain at a chosen fixed time, the reference or original configuration. (We can set,
e.g., $\Omega_{ref} = \Omega_0$.) A one-to-one mapping of the reference configuration onto the
computational domain $\Omega_t$ at time $t$, the current configuration, is denoted by
$\mathcal{A}_t$, i.e.

\[ \mathcal{A}_t : \Omega_{ref} \to \Omega_t, \qquad X \mapsto x(X, t) = \mathcal{A}_t(X). \tag{1} \]

Based on this mapping we can compute the domain velocity $\tilde{w}$ at all points
$X$ of the reference configuration $\Omega_{ref}$ for each time level:

\[ \tilde{w}(X, t) = \frac{\partial}{\partial t} x(X, t), \tag{2} \]

which can be transformed to the space coordinates $x$ by the relation

\[ w(x, t) = \tilde{w}(\mathcal{A}_t^{-1}(x), t). \tag{3} \]

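With the aid of the ALE mapping we define the so-called ALE derivative $D^{\mathcal{A}}/Dt$,
which is analogous to the material derivative in the Lagrangian approach. For
a function $f : \Omega_t \times (0, T) \to \mathbb{R}$, we set

\[ \frac{D^{\mathcal{A}}}{Dt} f(x, t) = \frac{\partial \tilde{f}}{\partial t}(X, t), \tag{4} \]

where $\tilde{f}(X, t) = f(\mathcal{A}_t(X), t)$ and $X = \mathcal{A}_t^{-1}(x)$. By the chain rule we find that

\[ \frac{D^{\mathcal{A}}}{Dt} f = \frac{\partial f}{\partial t} + (w \cdot \nabla) f. \tag{5} \]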
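Relation (5) can be checked numerically. The following is a minimal sketch, assuming a hypothetical one-dimensional ALE mapping and a hypothetical test function (neither is taken from the paper, which works in two dimensions); it compares a finite-difference time derivative at a fixed reference point $X$ with the right-hand side of (5).

```python
import math

def ale_map(X, t):
    """Hypothetical 1D ALE mapping x = A_t(X): a time-periodic stretching."""
    return X * (1.0 + 0.1 * math.sin(t))

def domain_velocity(X, t):
    """Domain velocity d/dt A_t(X), differentiated by hand for the mapping above."""
    return X * 0.1 * math.cos(t)

def f(x, t):
    """Hypothetical smooth test function given in spatial coordinates."""
    return math.sin(x) * math.cos(t)

def ale_derivative_fd(X, t, dt=1e-6):
    """Left-hand side of (5): central difference in time of f(A_t(X), t) at fixed X."""
    f_plus = f(ale_map(X, t + dt), t + dt)
    f_minus = f(ale_map(X, t - dt), t - dt)
    return (f_plus - f_minus) / (2.0 * dt)

def ale_derivative_formula(X, t):
    """Right-hand side of (5): df/dt + w * df/dx, evaluated at x = A_t(X)."""
    x = ale_map(X, t)
    df_dt = -math.sin(x) * math.sin(t)
    df_dx = math.cos(x) * math.cos(t)
    return df_dt + domain_velocity(X, t) * df_dx

# The two values should agree up to the finite-difference error.
print(abs(ale_derivative_fd(0.7, 0.3) - ale_derivative_formula(0.7, 0.3)))
```

The point of the check is that the time derivative is taken along the trajectory of a mesh point, not of a fluid particle, which is why the domain velocity $w$ and not the fluid velocity $u$ appears in (5).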

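Now we reformulate the Navier-Stokes equations in the ALE form

\[ \frac{D^{\mathcal{A}}}{Dt} u + [(u - w) \cdot \nabla] u + \nabla p - \nu \Delta u = 0 \quad \text{in } \Omega_t, \tag{6} \]
\[ \nabla \cdot u = 0 \quad \text{in } \Omega_t. \tag{7} \]

This system is equipped with the initial condition

\[ u(x, 0) = u_0(x), \quad x \in \Omega_0, \tag{8} \]

and boundary conditions. We assume that $\partial\Omega_t = \Gamma_D \cup \Gamma_O \cup \Gamma_{W_t}$, where $\Gamma_D$, $\Gamma_O$
and $\Gamma_{W_t}$ are mutually disjoint. On $\Gamma_D$, representing the inlet and, possibly,
impermeable fixed walls, we prescribe the Dirichlet boundary condition

\[ u|_{\Gamma_D} = u_D. \tag{9} \]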

We denote by $\Gamma_{W_t}$ the boundary of the airfoil at time $t$. (See Fig. 1, which shows
schematically the difference between the Lagrangian and Arbitrary Lagrangian-
Eulerian mappings.) On $\Gamma_{W_t}$ we assume that the fluid velocity $u$ equals the
velocity $u_\Gamma$ of the profile:

\[ u|_{\Gamma_{W_t}} = u_\Gamma = w|_{\Gamma_{W_t}}. \tag{10} \]

The part $\Gamma_O$ of the boundary represents the outlet, where we prescribe the
"do-nothing" boundary condition

\[ -(p - p_{ref})\, n + \nu \frac{\partial u}{\partial n} = 0 \quad \text{on } \Gamma_O, \tag{11} \]

where $n$ is the unit outer normal to $\partial\Omega_t$ and $p_{ref}$ is a prescribed reference
outlet pressure.

2 Discretization

There are a number of ways to carry out the space-time discretization
([5], [10]). In order to develop a stable, accurate scheme that can easily
treat complicated boundaries, we apply the finite element method (FEM). By
$Re = UL/\nu$ we define the Reynolds number. Here $U$ is a reference velocity
(usually the far field velocity) and $L$ is the length of the airfoil. The relevant
Reynolds numbers in our applications are quite large, namely between $10^5$ and
$10^6$. (For such regimes the flow is usually turbulent, but we simulate the flow
with the aid of the classical Navier-Stokes equations without any turbulence
model.) In order to obtain a physically acceptable numerical solution, it is not
possible to use a standard Galerkin FEM, but we have to introduce a suitable
stabilization. Here we apply the streamline diffusion method (also called the SUPG
method) together with grad-div stabilization of pressure, following [6], [9].

Fig. 2. Time discretization on a moving domain.

2.1 Time discretization


First let us describe the time discretization of the problem. We consider a
partition $0 = t_0 < t_1 < \dots < T$, $t_k = k\tau$, with a time step $\tau > 0$, of the time
interval $[0, T]$ and approximate the solution $u(t_n)$ (defined in $\Omega_{t_n}$) at the time
instant $t_n$ by $u^n$. For the time discretization we use a second-order two-step
scheme using the computed approximate solution $u^{n-1}$ in $\Omega_{t_{n-1}}$ and $u^n$ in $\Omega_{t_n}$
for the calculation of $u^{n+1}$ in the domain $\Omega_{t_{n+1}}$. With a given ALE mapping
$\mathcal{A}_t$ we have

\[ \mathcal{A}_{t_{n-1}}(X) = x^{n-1}, \quad \mathcal{A}_{t_n}(X) = x^n, \quad \mathcal{A}_{t_{n+1}}(X) = x^{n+1}, \tag{12} \]

where $X \in \Omega_{ref}$ is a given point from the reference configuration, e.g. a node
of the triangulation. (See Fig. 2.)
Now we define the approximation of the ALE derivative at time $t_{n+1}$ and
point $x^{n+1}$ by

\[ \frac{D^{\mathcal{A}} u}{Dt}(x^{n+1}, t_{n+1}) \approx \frac{3\tilde{u}^{n+1}(X) - 4\tilde{u}^n(X) + \tilde{u}^{n-1}(X)}{2\tau} = \frac{3u^{n+1}(x^{n+1}) - 4u^n(x^n) + u^{n-1}(x^{n-1})}{2\tau} \tag{13} \]

and obtain the problem for the unknown functions $u^{n+1} : \Omega_{t_{n+1}} \to \mathbb{R}^2$ and
$p^{n+1} : \Omega_{t_{n+1}} \to \mathbb{R}$:

\[ \frac{3u^{n+1}(x^{n+1}) - 4u^n(x^n) + u^{n-1}(x^{n-1})}{2\tau} + \big((u^{n+1}(x^{n+1}) - w^{n+1}(x^{n+1})) \cdot \nabla\big) u^{n+1}(x^{n+1}) - \nu \Delta u^{n+1}(x^{n+1}) + \nabla p^{n+1}(x^{n+1}) = 0, \tag{14} \]

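where $w^{n+1} \approx w(t_{n+1})$. This problem is equipped with the boundary
conditions (9)-(11) on $\partial\Omega_{t_{n+1}}$. Taking into account that $\mathcal{A}_{t_{n+1}}(\mathcal{A}_{t_i}^{-1}(x^i)) \in \Omega_{t_{n+1}}$,
we can transform equations (14) completely to the domain $\Omega_{t_{n+1}}$:

\[ \frac{3u^{n+1} - 4\hat{u}^n + \hat{u}^{n-1}}{2\tau} + ((u^{n+1} - w^{n+1}) \cdot \nabla) u^{n+1} - \nu \Delta u^{n+1} + \nabla p^{n+1} = 0, \qquad \operatorname{div} u^{n+1} = 0 \quad \text{in } \Omega_{t_{n+1}}, \tag{15} \]

where $\hat{u}^i = u^i \circ \mathcal{A}_{t_i} \circ \mathcal{A}_{t_{n+1}}^{-1}$, i.e. $\hat{u}^i$ is the function $u^i$ carried from $\Omega_{t_i}$ to $\Omega_{t_{n+1}}$ along the
mesh trajectories. This system is again equipped with the boundary
conditions (9)-(11).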
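The second-order accuracy of the backward difference used in (13) can be observed on a scalar example. The sketch below is only an illustration with a hypothetical test function $\sin t$ on a fixed domain; halving the time step should reduce the error by a factor of about four.

```python
import math

def bdf2_derivative(u, t, tau):
    """Two-step backward difference (3u(t) - 4u(t - tau) + u(t - 2 tau)) / (2 tau)."""
    return (3.0 * u(t) - 4.0 * u(t - tau) + u(t - 2.0 * tau)) / (2.0 * tau)

def error(tau, t=1.0):
    """Error of the backward difference for u = sin, whose exact derivative is cos."""
    return abs(bdf2_derivative(math.sin, t, tau) - math.cos(t))

taus = [0.1, 0.05, 0.025]
errors = [error(tau) for tau in taus]
# Ratios of successive errors; values close to 4 indicate second order.
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(ratios)
```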

2.2 Space discretization

In what follows, we shall carry out the space discretization of the problem
to find approximations of the functions $u := u^{n+1}$ and $p := p^{n+1}$ defined in the
domain $\Omega_{t_{n+1}}$, satisfying system (15) and boundary conditions (9)-(11). To
this end, we reformulate this problem in a weak sense. Let us set $\Omega = \Omega_{t_{n+1}}$
and define the velocity spaces $W = (H^1(\Omega))^2$, $X = \{v \in W;\ v|_{\Gamma_D \cup \Gamma_{W_t}} = 0\}$
and the pressure space $M = L_0^2(\Omega) = \{q \in L^2(\Omega);\ \int_\Omega q\,dx = 0\}$. Then it is
easy to find that the solution $U = (u, p)$ of problem (15) satisfies

\[ a(U, U, V) = f(V), \quad \forall V = (v, q) \in X \times M. \tag{16} \]

Here

\[ a(U^*, U, V) = \frac{3}{2\tau}(u, v) + \nu(\nabla u, \nabla v) + (((u^* - w^{n+1}) \cdot \nabla) u, v) - (p, \nabla \cdot v) + (\nabla \cdot u, q), \tag{17} \]
\[ f(V) = \frac{1}{2\tau}(4\hat{u}^n - \hat{u}^{n-1}, v) - \int_{\Gamma_O} p_{ref}\, v \cdot n\, dS, \]
\[ U = (u, p), \quad V = (v, q), \quad U^* = (u^*, p), \]

where $(\cdot, \cdot)$ denotes the scalar product in $L^2(\Omega)$. Moreover, we require that $u$
satisfies the Dirichlet boundary conditions (9), (10). The couple $(u, p)$
represents the solution on the time level $t_{n+1}$, i.e. $u^{n+1} := u$ and $p^{n+1} := p$.
In order to apply the Galerkin FEM, we shall restrict the weak formulation
from the spaces $W$, $X$, $M$ to approximate spaces $W_h$, $X_h$, $M_h$, $h \in (0, h_0)$, $h_0 > 0$,
$X_h = \{v_h \in W_h;\ v_h|_{\Gamma_D \cup \Gamma_{W_t}} = 0\}$. Hence, we want to find $U_h = (u_h, p_h) \in
W_h \times M_h$ such that $u_h$ satisfies approximately conditions (9), (10) and

\[ a(U_h, U_h, V_h) = f(V_h), \quad \forall V_h = (v_h, q_h) \in X_h \times M_h. \tag{18} \]
The couple $(X_h, M_h)$ of the finite element spaces should satisfy the Babuška-
Brezzi (BB) condition, which guarantees the stability of the scheme: there
exists a constant $c > 0$ such that

\[ \sup_{w \in X_h} \frac{(p, \nabla \cdot w)}{|w|_{H^1(\Omega)}} \ge c\, \|p\|_{L^2(\Omega)}, \quad \forall p \in M_h,\ h \in (0, h_0). \tag{19} \]

We proceed in the following way. Assuming that $\Omega$ is polygonal, by $\mathcal{T}_h$
we denote a triangulation of $\Omega$ with standard properties from the FEM. The
pressure space $M$ is then approximated by the space of piecewise polynomial
functions of degree $\le k$:

\[ p \approx p_h \in M_h = \{q \in M \cap C(\bar{\Omega});\ q|_K \in P^k(K),\ \forall K \in \mathcal{T}_h\} \tag{20} \]

and the velocity spaces $W$ and $X$ are approximated by the spaces of piecewise
polynomial functions of degree $\le k + 1$:

\[ u \approx u_h \in W_h = \{v \in W \cap (C(\bar{\Omega}))^2;\ v|_K \in (P^{k+1}(K))^2,\ \forall K \in \mathcal{T}_h\}, \qquad X_h = W_h \cap X. \tag{21} \]

This couple $(X_h, M_h)$ satisfies the BB condition (see [13]).
In practical computations we use the Taylor-Hood $P^2/P^1$ elements.

3 Stabilization of the FEM


The standard Galerkin discretization (18) may produce approximate
solutions suffering from spurious oscillations for high Reynolds numbers. In
order to avoid this drawback, we apply the stabilization via the streamline-
diffusion/Petrov-Galerkin technique (see, e.g., [10], [9], [6]). We define the
stabilization terms

\[ L_h(U^*, U, V) = \sum_{K \in \mathcal{T}_h} \delta_K \left( \frac{3}{2\tau} u - \nu \Delta u + (\bar{w} \cdot \nabla) u + \nabla p,\ (\bar{w} \cdot \nabla) v \right)_K, \]
\[ F_h(V) = \sum_{K \in \mathcal{T}_h} \delta_K \left( \frac{1}{2\tau}(4\hat{u}^n - \hat{u}^{n-1}),\ (\bar{w} \cdot \nabla) v \right)_K, \tag{22} \]
\[ U = (u, p), \quad V = (v, q), \quad U^* = (u^*, p), \]

where the function $\bar{w}$ stands for the transport velocity $\bar{w} = u^* - w^{n+1}$, $(\cdot, \cdot)_K$
denotes the scalar product in $L^2(K)$ and $\delta_K \ge 0$ are suitable parameters.
Moreover, we introduce the pressure stabilization terms

\[ P_h(U, V) = \sum_{K \in \mathcal{T}_h} \tau_K (\nabla \cdot u, \nabla \cdot v)_K, \quad U = (u, p),\ V = (v, q), \tag{23} \]

with suitable parameters $\tau_K \ge 0$.

The stabilized discrete problem reads: Find $U_h = (u_h, p_h) \in W_h \times M_h$ such
that $u_h$ satisfies approximately conditions (9), (10) and

\[ a(U_h, U_h, V_h) + L_h(U_h, U_h, V_h) + P_h(U_h, V_h) = f(V_h) + F_h(V_h), \quad \forall V_h \in X_h \times M_h. \tag{24} \]

The parameter $\delta_K$ is defined on the basis of the transport velocity $\bar{w} = u^* - w^{n+1}$ as

(25)

where
(26)
is the local Reynolds number and $h_K$ is the size of the element $K$ measured
in the direction of $\bar{w}$. The factor $\xi(\cdot)$ is a monotonically increasing function
of $Re^{\bar{w}}$ such that for local advection dominance ($Re^{\bar{w}} > 1$) $\xi \to 1$ and for
local diffusion dominance ($Re^{\bar{w}} < 1$) $\xi \to 0$. The parameter $\delta^* \in (0, 1]$ is an
additional free parameter. We set, e.g.

\[ \xi(Re^{\bar{w}}) = \min\left( \frac{Re^{\bar{w}}}{6},\ 1 \right). \tag{27} \]

The choice of the parameters $\tau_K$ is again different in diffusion and convection
dominated regions. We put
(28)
for local advection dominance and local diffusion dominance, respectively,
where $\tau^* \in (0, 1]$. For theoretical analysis of such a choice we refer to [6],
[9].
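The effect of the streamline diffusion term can be illustrated on a one-dimensional model problem. The sketch below is not the scheme of this paper: it solves $-\nu u'' + a u' = 0$ on $(0, 1)$ with $u(0) = 0$, $u(1) = 1$ by linear elements on a uniform mesh, for which the SUPG term reduces to the added diffusion $\delta a^2$. The expressions used for the local Reynolds number and for $\delta$ are assumptions chosen for the illustration (a commonly used form), while the factor $\xi$ follows (27).

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system with constant coefficients."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - lower * c[i - 1]
        c[i] = upper / denom
        d[i] = (rhs[i] - lower * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def xi(re_local):
    """Factor (27): min(Re / 6, 1)."""
    return min(re_local / 6.0, 1.0)

def solve_convection_diffusion(nu, a, n_cells, delta_star=None):
    """Nodal values for -nu u'' + a u' = 0, u(0) = 0, u(1) = 1.

    delta_star=None gives the plain Galerkin scheme; otherwise the
    streamline diffusion delta * a^2 is added to the viscosity.
    """
    h = 1.0 / n_cells
    nu_eff = nu
    if delta_star is not None:
        re_local = h * abs(a) / (2.0 * nu)                       # assumed local Reynolds number
        delta = delta_star * h / (2.0 * abs(a)) * xi(re_local)   # assumed form of delta
        nu_eff = nu + delta * a * a
    lower = -nu_eff / h**2 - a / (2.0 * h)
    diag = 2.0 * nu_eff / h**2
    upper = -nu_eff / h**2 + a / (2.0 * h)
    rhs = [0.0] * (n_cells - 1)
    rhs[-1] = -upper * 1.0  # boundary value u(1) = 1 moved to the right-hand side
    return [0.0] + solve_tridiagonal(lower, diag, upper, rhs) + [1.0]

galerkin = solve_convection_diffusion(nu=1e-3, a=1.0, n_cells=20)
stabilized = solve_convection_diffusion(nu=1e-3, a=1.0, n_cells=20, delta_star=1.0)
# The Galerkin solution oscillates (negative nodal values); the stabilized one does not.
print(min(galerkin), min(stabilized))
```

With these data the mesh is far too coarse to resolve the boundary layer, so the Galerkin solution oscillates from node to node, whereas the stabilized solution stays between the boundary values at the price of a smeared layer.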
The nonlinear problem (24) is (on each time level) solved iteratively.
Starting from an initial approximation $U_h^{(0)}$ and assuming that the iterate $U_h^{(k)}$
has already been computed, we define $U_h^{(k+1)} \in W_h \times M_h$ by

\[ a(U_h^{(k)}, U_h^{(k+1)}, V_h) + L_h(U_h^{(k)}, U_h^{(k+1)}, V_h) + P_h(U_h^{(k+1)}, V_h) = f(V_h) + F_h(V_h), \quad \forall V_h \in X_h \times M_h. \tag{29} \]

For each time level $t_{n+1}$ we set

\[ U_h^{(0)} := (2\hat{u}^n - \hat{u}^{n-1},\ \hat{p}^n). \tag{30} \]

As numerical experiments show, only a few iterations (29) have to be computed
on each time level.
Obviously, problem (29) is linear. It is equivalent to the linear algebraic
system
(31)
where $\underline{u} \in \mathbb{R}^{n_h}$ and $\underline{p} \in \mathbb{R}^{m_h}$ are vectors whose components represent degrees
of freedom defining the velocity $u$ and the pressure $p$, respectively, $S$ is a
nonsingular $n_h \times n_h$ matrix and $B$ is an $n_h \times m_h$ matrix. The solution of this
system was realized by the direct solver UMFPACK ([1], [2]), which works
sufficiently fast for systems with up to $10^5$ equations. For larger systems the
domain decomposition approach or algebraic multigrid ([11], [12]) will be used.

4 Description of the airfoil motion

The airfoil can oscillate in the vertical direction and in the angular direction
around the so-called elastic axis. This vertical and torsional motion is described
by the linearized system of ordinary differential equations (see [8], [3])

\[ m\ddot{H} + k_H H + S_\alpha \ddot{\alpha} = -F, \tag{32} \]
\[ S_\alpha \ddot{H} + I_\alpha \ddot{\alpha} + k_\alpha \alpha = M, \]

where the following notation is used: $H$ - vertical displacement (oriented
downward), $\alpha$ - angle of rotation around the elastic axis, $k_H$ - displacement stiffness,
$S_\alpha$ - static moment around the elastic axis, $I_\alpha$ - inertia moment, $k_\alpha$ - torsional
stiffness.
The force $F$ acting in the vertical direction and the torsional moment $M$
are defined by

\[ F = -\int_{\Gamma_{W_t}} \sum_{j=1}^{2} \tau_{2j} n_j \, dS, \tag{33} \]
\[ M = -\int_{\Gamma_{W_t}} \sum_{i,j=1}^{2} \tau_{ij} n_j r_i^{ort} \, dS, \]

where

\[ \tau_{ij} = \rho \left( -p\,\delta_{ij} + \nu \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right) \right), \tag{34} \]
\[ r_1^{ort} = -(x_2 - x_{T2}), \qquad r_2^{ort} = x_1 - x_{T1}, \]

$n = (n_1, n_2)$ is the unit outer normal to $\partial\Omega_t$ on $\Gamma_{W_t}$ (pointing into the airfoil),
$x_T = (x_{T1}, x_{T2})$ is the given position of the elastic axis (lying in the
interior of the airfoil) and $\rho$ is the fluid density.
System (32) is transformed to a first-order ODE system and then solved by
the second-order Runge-Kutta method. We proceed in such a way that the
computed approximate solution $U_h$ of (24) on time levels $t_n$ and $t_{n-1}$ and the
corresponding force $F$ and moment $M$ are extrapolated and used for obtaining
$H$ and $\alpha$ at $t_{n+1}$. This allows us to determine the mapping $\mathcal{A}_{t_{n+1}}$ and the domain
$\Omega_{t_{n+1}}$ and to approximate the domain velocity $w^{n+1}$. Then we pass to (24) on
the next time level $t_{n+1}$.
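The reduction of (32) to a first-order system and one second-order Runge-Kutta step can be sketched as follows. This is a minimal illustration under stated assumptions: the Heun variant of the second-order Runge-Kutta method is chosen here because the variant is not specified above, the force and moment are held constant over the step, and all parameter values are hypothetical.

```python
def accelerations(h, alpha, force, moment, p):
    """Solve the 2x2 system (32) for the second derivatives of H and alpha."""
    r1 = -force - p["k_h"] * h
    r2 = moment - p["k_alpha"] * alpha
    det = p["m"] * p["i_alpha"] - p["s_alpha"] ** 2
    ddh = (p["i_alpha"] * r1 - p["s_alpha"] * r2) / det
    ddalpha = (p["m"] * r2 - p["s_alpha"] * r1) / det
    return ddh, ddalpha

def rhs(y, force, moment, p):
    """First-order form with y = (H, alpha, dH/dt, dalpha/dt)."""
    h, alpha, dh, dalpha = y
    ddh, ddalpha = accelerations(h, alpha, force, moment, p)
    return (dh, dalpha, ddh, ddalpha)

def heun_step(y, dt, force, moment, p):
    """One step of the Heun (second-order Runge-Kutta) method."""
    k1 = rhs(y, force, moment, p)
    y_pred = tuple(yi + dt * ki for yi, ki in zip(y, k1))
    k2 = rhs(y_pred, force, moment, p)
    return tuple(yi + 0.5 * dt * (a + b) for yi, a, b in zip(y, k1, k2))

def energy(y, p):
    """Kinetic plus elastic energy of the unforced system."""
    h, alpha, dh, dalpha = y
    kinetic = 0.5 * (p["m"] * dh**2 + 2.0 * p["s_alpha"] * dh * dalpha + p["i_alpha"] * dalpha**2)
    elastic = 0.5 * (p["k_h"] * h**2 + p["k_alpha"] * alpha**2)
    return kinetic + elastic

# Hypothetical structural parameters, for illustration only.
params = {"m": 1.0, "s_alpha": 0.1, "i_alpha": 0.5, "k_h": 100.0, "k_alpha": 50.0}
state = (0.01, 0.02, 0.0, 0.0)
e0 = energy(state, params)
for _ in range(1000):
    state = heun_step(state, 1e-3, 0.0, 0.0, params)
print(energy(state, params) / e0)
```

Without aerodynamic loading the energy of (32) is conserved, so the printed ratio staying close to one is a simple sanity check of the time integration.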

5 Numerical results

We have performed a number of numerical simulations. In this paper we present
the results obtained for the profile NACA $63_2$-415. The length of the profile
is 0.3 m, and the far field velocity of 20 m s$^{-1}$ together with the kinematic viscosity
of air gives a Reynolds number equal to 400000. Fig. 3 shows the position of the moving
profile and velocity isolines at four time instants. Von Karman vortices leaving
the airfoil are clearly visible. Moreover, the oscillations of the vertical
position $H$ of the airfoil and of the angle $\alpha$ of rotation around the elastic axis are
shown, and the values corresponding to the above airfoil positions are marked.
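The quoted data fix the kinematic viscosity through $Re = UL/\nu$. A short check of this arithmetic, with hypothetical function names:

```python
def reynolds_number(velocity, length, kinematic_viscosity):
    """Re = U L / nu."""
    return velocity * length / kinematic_viscosity

def kinematic_viscosity(velocity, length, reynolds):
    """nu = U L / Re, the viscosity implied by a given Reynolds number."""
    return velocity * length / reynolds

# U = 20 m/s, L = 0.3 m, Re = 400000 imply nu of about 1.5e-5 m^2/s.
nu_air = kinematic_viscosity(20.0, 0.3, 4.0e5)
print(nu_air)
```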

Fig. 3. The moving profile, vertical displacement and rotation angle (both plotted against time)

Acknowledgments The authors acknowledge the financial support of the
Grant Agency of the Czech Republic by the project No. 101/02/0391 "Numerical
simulation and experimental research of aero elasticity of aircrafts considering
large displacements". This research was also supported under Research
Plans MSM 21000003 and MSM 113200007 of the Ministry of Education of
the Czech Republic.

References

1. Davis T. A.: A column pre-ordering strategy for the unsymmetric-pattern
multifrontal method, Technical report TR-03-006. Submitted to ACM Trans. Math.
Software.
2. Davis T. A., Duff I. S.: A combined unifrontal/multifrontal method for
unsymmetric sparse matrices, ACM Transactions on Mathematical Software, vol. 25,
no. 1, pp. 1-19, 1999.
3. Dowell E. H.: A Modern Course in Aeroelasticity. Kluwer Academic Publishers,
Dordrecht, 1995.
4. Elman H., Silvester D.: Fast nonsymmetric iterations and preconditioning for
Navier-Stokes equations. SIAM J. Sci. Comput. 17(1), pp. 33-46,1996.
5. Feistauer M.: Mathematical Methods in Fluid Dynamics. Longman Scientific &
Technical, Harlow, 1993.
6. Gelhard T., Lube G., Olshanskii M. A.: Stabilized finite element schemes with
LBB-stable elements for incompressible flows (preprint)
7. Heywood J. G., Rannacher R., Turek S.: Artificial boundaries and flux and
pressure conditions for the incompressible Navier-Stokes equations. Stochastische
mathematische Modelle, Universität Heidelberg, (681), July 1992. (preprint)
8. Horacek J.: Nonlinear formulation of oscillations of a profile for aero-hydroelastic
computations, Proceedings of the Colloquium Dynamics of Machines 2003, I.
Dobias, editor. Institute of Thermomechanics, Prague, 2003.
9. Lube G.: Stabilized Galerkin finite element methods for convection dominated
and incompressible flow problems, Num. Anal. and Math. Model., Banach Center
publications (29), Warszawa, 1994.
10. Turek S.: Efficient Solvers for Incompressible Flow Problems, Springer Berlin
1998.
11. Wathen A., Silvester D.: Fast iterative solution of stabilised Stokes systems,
part I: Using simple diagonal preconditioners. SIAM J. Numer. Anal., 30(5), pp.
630-649, 1993.
12. Wathen A., Silvester D.: Fast iterative solution of stabilised Stokes systems,
part II: Using general block preconditioners. SIAM J. Numer. Anal., 31, pp.
1352-1367, 1994.
13. Verfürth R.: Error estimates for mixed finite element approximation of the Stokes
equations. R.A.I.R.O. Analyse numérique/Numerical analysis, 18(2), pp. 175-
182, 1984.
A Numerical Approach to the Dynamical
Behavior of Initiated Pulses in Some Nonlinear
Diffusion Equations

Kenji Tomoeda

Department of Applied Mathematics and Information,
Osaka Institute of Technology
Asahi-ku, Osaka, 535-8585, Japan [email protected]

Summary. The dynamical behavior of pulses governed by the interaction
between diffusion and absorption exhibits several phenomena. The most remarkable
ones are the pulse splitting phenomena accompanied by pulse connecting
phenomena. In this paper such phenomena are investigated from the numerical point of view,
and a mathematical justification is stated.

1 Introduction
We consider the propagation of thermal waves in a one-dimensional absorbing
medium in which there is an interaction between diffusion and absorption. To
describe such a propagation we may use the nonlinear diffusion equation with
absorption, which is also well known as a description of the flow of liquids
through a homogeneous porous medium and is represented in the form of
the initial value problem:

\[ v_t = (v^m)_{xx} - c v^p, \quad x \in \mathbb{R}^1,\ t > 0, \tag{1.1} \]
\[ v(0, x) = v^0(x), \quad x \in \mathbb{R}^1. \tag{1.2} \]

Here we have the following assumptions:
(i) $m > 1$, $p > 0$ and $c \ge 0$ are constants and $m + p \ge 2$;
(ii) $v^0(x) \in C^0(\mathbb{R}^1)$ is nonnegative and has compact support.
In a heated plasma $v$ denotes the temperature and $-cv^p$ describes the losses
caused by radiation. We may take $p = 0.5$ for bremsstrahlung radiation and
$0.5 \le p \le 2$ for synchrotron radiation [14]. The diffusion rate of (1.1) vanishes
at points where $v = 0$. This degeneracy is responsible for the finite
propagation of the support.
From the analytical point of view, Aronson [1], Oleinik, Kalashnikov and
Chzou Yui-Lin [13], Kalashnikov [7], [8], and Herrero and Vazquez [6] proved
the existence and uniqueness of a weak solution and the property of the finite
propagation of the support under the assumptions stated above. Moreover,
$v(t, x)$ is smooth in the open set $P(v) = \{(t, x) \mid v(t, x) > 0 \text{ and } t > 0\}$, and has
the following properties:

(P-1) For $c = 0$, or $c > 0$ and $p \ge 1$, the diffusion is active and supp $v(t, \cdot)$
expands monotonically as $t$ increases;
(P-2) For $c > 0$ and $0 < p < 1$ the absorption is active and the solution
vanishes identically at some finite time $T^* > 0$.
In Case (P-1) supp $v(t, \cdot)$ never splits into multiple connected components
for $t > 0$ when supp $v^0(x)$ is connected. Thus the pulse splitting phenomena
never appear. In Case (P-2) there is a possibility of pulse splitting
phenomena caused by absorption when $v^0(x)$ has two local maxima. Rosenau
and Kamin [14] suggested this possibility by numerical computation. Chen,
Matano and Mimura [3] constructed a pulse that splits into multiple connected
components in a finite time. This motivates us to investigate the behavior of
pulses in more detail. To this end we continue the numerical computation
and find the following phenomena, where the initial pulse $v^0(x)$ has two local
maxima and a connected compact support:
(NS-1) The pulse splitting phenomena appear, and thereafter these two pulses
evolve separately until one of them vanishes (see the left hand side in
Fig. 1);
(NS-2) The pulse splitting phenomena never appear for $t > 0$ (see the right
hand side in Fig. 1);
(NS-3) After the pulse splitting phenomena appear, these pulses become
connected, and thereafter the pulse splitting phenomena appear again (see
Fig. 2).

Fig. 1. Numerical support splitting and non-splitting phenomena (horizontal axis: x from -2 to 2; vertical axis: t, up to 0.16 in the left panel and 0.18 in the right panel)

When $m + p = 2$ and $0 < p < 1$, we obtain some sufficient conditions
under which the phenomena (NS-1) and (NS-2) appear ([12], [15]). However,
we are unable to answer mathematically the question whether the phenomenon
(NS-3) occurs or not. In this paper, we try to justify a part of it, which is as
follows.
(NS-4) The pulse splitting phenomena appear, and thereafter these pulses
become connected.

Fig. 2. Numerical support splitting phenomena with connecting property (horizontal axis: x from -2 to 2; vertical axis: t from 0 to 0.24)

We call such phenomena pulse splitting phenomena with connecting property,
and assume the following conditions.
CONDITION A. $c > 0$, $m + p = 2$ and $0 < p < 1$ hold;
CONDITION B. i) $v^0(x) \in C^0(\mathbb{R}^1)$ is a nonnegative function with compact
support and $((v^0(x))^{m-1})_x \in L^\infty(\mathbb{R}^1) \cap BV(\mathbb{R}^1)$.
ii) $((v^0(x))^{m-1})_x$ is absolutely continuous on $I = \{x \mid v^0(x) > 0\}$ and
ess-inf$_I\, ((v^0(x))^{m-1})_{xx}$ is finite.
Our proof is based on the finite difference scheme ([10], [11], [12]), the
comparison theorem [2] and Kersner's exact solution [9]. Unfortunately, in the
case where $m + p \ne 2$, $m > 1$ and $0 < p < 1$, we have neither found an
exact solution nor succeeded in constructing a convergent finite difference
scheme. This is the reason why we are concerned with the specific case
stated in Condition A.

2 Finite difference schemes


We put $u = v^{m-1}$ and rewrite (1.1)-(1.2) as follows:

\[ u_t = m u u_{xx} + a (u_x)^2 - c', \tag{4.1} \]
\[ u(0, x) = u^0(x) \equiv (v^0(x))^{m-1}, \tag{4.2} \]

where $a = \frac{m}{m-1}$, $c' = (m-1)c$ and the absorption term reduces to the
constant $-c'$ by the assumption $m + p = 2$. Our difference scheme approximates
the problem (4.1)-(4.2) instead of (1.1)-(1.2), and is described as follows:
Find the sequence $\{u_h^n\}_{n=1,2,\dots} \subset V_h$ for each $u_h^0 \in V_h$ such that

\[ u_h^{n+1} = S_{h,k} u_h^n \quad \text{for } n = 0, 1, 2, \dots, \tag{4.3} \]

where $\ell(u_h^0) = \ell(u^0)$, $r(u_h^0) = r(u^0)$ and $u_h^0(x_i) = u^0(x_i)$ for all $i \in \mathbb{Z}$, $h$ is a
space mesh width and $V_h$ is the set of the nonnegative continuous functions
$u_h = u_h(x)$ with the following properties:

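(i) $u_h$ has compact support;
(ii) $u_h$ is linear on each interval $[x_i, x_{i+1}]$ ($i \in \mathbb{Z}$), where

\[ x_i = \begin{cases} ih & \text{for } i \in \mathbb{Z} \setminus \{L-1, R+1\}, \\ \ell & \text{for } i = L-1, \\ r & \text{for } i = R+1, \end{cases} \tag{4.4} \]
\[ L = L(\ell) \equiv \min\{i \in \mathbb{Z} \mid ih > \ell\}, \quad \ell = \ell(u_h), \tag{4.5} \]
\[ R = R(r) \equiv \max\{i \in \mathbb{Z} \mid ih < r\}, \quad r = r(u_h). \tag{4.6} \]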

The operator $S_{h,k}$ has a somewhat complicated form; its details are given in [10], [11] and
[12], and we omit its description here. The variable time step $k = k_{n+1} \equiv
t_{n+1} - t_n$ ($t_0 = 0$) is determined by

\[ k = \frac{1}{c'} \max(u_L, u_{L+1}) \quad \text{for the approximation to the left interface,} \tag{4.7} \]

or

\[ k = \frac{1}{c'} \max(u_R, u_{R-1}) \quad \text{for the approximation to the right interface.} \tag{4.8} \]

The left and right numerical interfaces are defined by

\[ \ell_n = \ell(u_h^n) \quad \text{and} \quad r_n = r(u_h^n) \quad \text{for } n = 0, 1, 2, \dots, \tag{4.9} \]

respectively. When $S_{h,k} u_h^{n^*} \equiv 0$ holds for some integer $n^* > 0$, we put the
numerical extinction time $T_h^* = t_{n^*+1} \equiv t_{n^*} + k_{n^*+1}$, and stop the numerical
computation. We define the left (resp. right) numerical interface curve $\ell_h(t)$ (resp.
$r_h(t)$) by piecewise-linearly interpolating $(t_n, \ell_n)$ (resp. $(t_n, r_n)$) ($0 \le n \le n^*$).
We obtain the basic estimates which enable the proofs of the convergence
of the numerical solutions and of the numerical interface
curves ([10], [11], [12]). The former can be proved by Graveleau and Jamet's
argument used in the proofs of Lemma 6.1 and Theorem 7.1 ([5]). The latter can
be proved by applying the idea of DiBenedetto and Hoff [4] to our difference
scheme. Moreover, we obtained the interface equation (see Main Theorem in
[11]). We state the basic estimates and the convergence of numerical solutions
without proof.
Theorem 1 (Basic estimates). Under Condition A assume $u_h^0 \in V_h$. Then
$u_h^n$ either becomes extinct or belongs to $V_h$ for each $n \ge 0$, and the following
estimates hold for all $n \ge 0$:

\[ T_h^* \le t_n + \frac{\|u_h^n\|_\infty}{c'}, \tag{4.10} \]
\[ \ell_0 - a\|(u_h^0)_x\|_\infty t_n \le \ell_n \le r_n \le r_0 + a\|(u_h^0)_x\|_\infty t_n, \quad \text{if } u_h^n \not\equiv 0, \tag{4.11} \]
\[ 0 \le u_h^n(x) \le \max(\|u_h^0\|_\infty - c' t_n,\ 0) \quad \text{on } \mathbb{R}^1, \tag{4.12} \]
\[ \|(u_h^n)_x\|_\infty \le \|(u_h^0)_x\|_\infty, \tag{4.13} \]
\[ TV((u_h^n)_x) \le TV((u_h^0)_x), \tag{4.14} \]
\[ \|(u_h^{n+1} - u_h^n)/k_{n+1}\|_{L^1(\mathbb{R}^1)} \le (m + a)\|u_h^0\|_\infty TV((u_h^0)_x) + c'(r_0 - \ell_0 + 2a\|(u_h^0)_x\|_\infty t_n), \tag{4.15} \]
\[ \inf_{i \in \mathbb{Z}} \delta^2 u_i^0 \le \inf_{i \in \mathbb{Z}} \delta^2 u_i^n. \tag{4.16} \]

Theorem 2 (Convergence of Numerical Solutions). Under Conditions A
and B let $\{h\}$ be an arbitrary sequence which tends to zero. Then, there exists
the unique weak solution $v$ of (1.1)-(1.2), and

where $\mathcal{H} = [0, \infty) \times \mathbb{R}^1$, $v_h = (u_h)^{1/(m-1)}$, $u_h(t, x) = u_h^n(x)$ on $[t_n, t_{n+1}) \times
\mathbb{R}^1$ for all $t_n$ and $h$, and $T^*$ is the extinction time.

3 Kersner's exact solution

When $m + p = 2$ and $m > 1$, there is Kersner's exact solution [9] given by

\[ K(t, x, \rho, \sigma) = (b_1 t + b_2(\sigma))^{-\frac{1}{m-1}} \left[ a_1(\rho, \sigma)(b_1 t + b_2(\sigma))^{\frac{2}{m+1}} - a_2 (b_1 t + b_2(\sigma))^2 - x^2 \right]_+^{\frac{1}{m-1}}, \tag{2.1} \]

where

(2.2)

(2.3)

$\rho > 0$ and $\sigma > 0$ are arbitrary numbers and $[g]_+ = \max\{g, 0\}$. This solution
satisfies (1.1)-(1.2) with $v^0(x) = K(0, x, \rho, \sigma)$ in the weak sense and becomes
a classical solution in the open set wherein $K(t, x, \rho, \sigma) > 0$. It is easily seen
that supp $K(0, x, \rho, \sigma) = [-\rho, \rho]$ and the right and left interface curves $\zeta_+$ and
$\zeta_-$ are written as follows, respectively:

\[ \zeta_\pm(t) = \pm \left\{ a_1(\rho, \sigma)(b_1 t + b_2(\sigma))^{\frac{2}{m+1}} - a_2 (b_1 t + b_2(\sigma))^2 \right\}^{\frac{1}{2}}. \tag{2.4} \]

When

\[ \sigma < \frac{2\rho\sqrt{m}}{(m-1)^2\sqrt{c}} \tag{2.5} \]

holds, supp $K(t, x, \rho, \sigma)$ expands for $t \in [0, T(\rho, \sigma)]$, where

(2.6)

For $t > T(\rho, \sigma)$, supp $K(t, x, \rho, \sigma)$ shrinks and $K(t, x, \rho, \sigma)$ vanishes identically
at the extinction time $t = T^*(\rho, \sigma)$ given by

(2.7)

Fig. 3. Kersner's Exact Solution and Free Boundary with m = 1.5, p = 0.5, c = 1
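The structure of an exact solution of this type can be checked numerically in the variable $u = v^{m-1}$. The sketch below substitutes the ansatz $u = s^{-1}[a_1 s^{2/(m+1)} - a_2 s^2 - x^2]_+$, $s = b_1 t + b_2$, into (4.1). The constants $b_1 = 2m(m+1)/(m-1)$ and $a_2 = (m-1)^2 c/(4m^2)$ used here are obtained by matching coefficients in this substitution; they are assumptions of the sketch and need not coincide with the normalization used in (2.2)-(2.3). The values of $a_1$ and $b_2$ are free and chosen arbitrarily.

```python
def constants(m, c):
    """Constants obtained by matching coefficients of the ansatz in (4.1) (assumed normalization)."""
    b1 = 2.0 * m * (m + 1.0) / (m - 1.0)
    a2 = (m - 1.0) ** 2 * c / (4.0 * m * m)
    return b1, a2

def kersner_u(t, x, m, c, a1, b2):
    """u = v^(m-1) for the ansatz: [a1 s^(2/(m+1)) - a2 s^2 - x^2]_+ / s with s = b1 t + b2."""
    b1, a2 = constants(m, c)
    s = b1 * t + b2
    bracket = a1 * s ** (2.0 / (m + 1.0)) - a2 * s * s - x * x
    return max(bracket, 0.0) / s

def residual(t, x, m, c, a1, b2, dt=1e-6):
    """Residual u_t - (m u u_xx + a u_x^2 - c') of (4.1) at a point inside the support."""
    b1, _ = constants(m, c)
    s = b1 * t + b2
    a = m / (m - 1.0)
    c_prime = (m - 1.0) * c
    u = kersner_u(t, x, m, c, a1, b2)
    u_t = (kersner_u(t + dt, x, m, c, a1, b2) - kersner_u(t - dt, x, m, c, a1, b2)) / (2.0 * dt)
    u_x = -2.0 * x / s   # exact x-derivative inside the support
    u_xx = -2.0 / s
    return u_t - (m * u * u_xx + a * u_x ** 2 - c_prime)

# m = 1.5, c = 1 as in Fig. 3; a1 = b2 = 1 are arbitrary illustrative values.
print(residual(0.1, 0.5, 1.5, 1.0, 1.0, 1.0))
```

The residual vanishes up to the finite-difference error, which confirms the exponents $2/(m+1)$ and $2$ inside the bracket and the prefactor $s^{-1}$, i.e. $s^{-1/(m-1)}$ for $v$.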

4 Pulse splitting phenomena with connecting property

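For a positive number $\gamma$ we define an even function $v^0(x)$ by

\[ v^0(x) = \begin{cases} K(0, x + \rho + \gamma, \rho, \sigma) & \text{on } (-\infty, 0], \\ v^0(-x) & \text{on } (0, \infty). \end{cases} \tag{3.1} \]

Let $\eta$ ($0 < \eta < \rho$) be an arbitrary fixed constant and $\varepsilon$ be an arbitrary
positive number such that $\varepsilon < K(0, \rho - \eta, \rho, \sigma)$. For $v^0(x)$ we introduce an
even function $v_\varepsilon^0(x)$ satisfying the following
CONDITION C. i) $v_\varepsilon^0(x) = v_\varepsilon^0(-x)$ and $v^0(x) \le v_\varepsilon^0(x)$ hold on $\mathbb{R}^1$;
ii)
\[ v_\varepsilon^0(x) = \begin{cases} v^0(x) & \text{on } (-\infty, -\gamma - \eta], \\ \varepsilon & \text{on } [-\gamma, 0], \end{cases} \tag{3.2} \]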

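and $v_\varepsilon^0(x)$ is a decreasing function on $[-\gamma - \eta, -\gamma]$;

iii) $v_{\varepsilon'}^0(x) \le v_\varepsilon^0(x)$ holds for $\varepsilon' \le \varepsilon$;

iv) $v_\varepsilon^0(x)$ is sufficiently smooth in its support $[-2\rho - \gamma, \gamma + 2\rho]$ and

\[ \|u^0_{\varepsilon x}\|_\infty \le \|u^0_x\|_\infty, \quad TV(u^0_{\varepsilon x}) \le TV(u^0_x), \quad \operatorname{ess\,inf} u^0_{\varepsilon xx} \ge \operatorname{ess\,inf} u^0_{xx}, \tag{3.3} \]

where $u^0(x) = (v^0(x))^{m-1}$ and $u_\varepsilon^0(x) = (v_\varepsilon^0(x))^{m-1}$.

Let $v(t, x)$ and $v_\varepsilon(t, x)$ be the solutions of (1.1)-(1.2) with $v(0, x) = v^0(x)$ and
$v_\varepsilon(0, x) = v_\varepsilon^0(x)$, respectively. Then we obtain the following theorem, which justifies the
appearance of the phenomena (NS-4).
Theorem 3. Under Conditions A and C let $\gamma$, $\rho$ and $\sigma$ be constants such that
supp $v(t, x)$ becomes connected at $t = T(\rho, \sigma)$. Then, for some $\varepsilon$, there exist
$\hat{t}$ ($0 < \hat{t} < T(\rho, \sigma)$) and $\hat{x}$ ($-\gamma \le \hat{x} \le \gamma$) such that $v_\varepsilon(\hat{t}, \hat{x}) = 0$ holds, and
supp $v_\varepsilon(T(\rho, \sigma), \cdot)$ is connected.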

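Proof. Putting $S = [0, T(\rho, \sigma)] \times [-\gamma, \gamma]$, we show that $S$ contains at least
one point $(\hat{t}, \hat{x})$ such that $v_\varepsilon(\hat{t}, \hat{x}) = 0$ for some positive constant $\varepsilon$. To this
end we assume the contrary; that is, suppose $v_\varepsilon(t, x) > 0$ on $S$ for all $\varepsilon > 0$. Then,
the following estimates hold by Theorems 1 and 2.

\[ 0 \le u_\varepsilon(t, \cdot) \le \max(\|u_\varepsilon^0\|_\infty - c't,\ 0) \quad \text{on } \mathbb{R}^1, \tag{3.4} \]
\[ \|u_{\varepsilon x}(t, \cdot)\|_\infty \le \|u^0_{\varepsilon x}\|_\infty, \tag{3.5} \]
\[ \int_{-\gamma}^{\gamma} |u_{\varepsilon xx}(t, x)|\,dx \le TV(u_{\varepsilon x}(t, \cdot)) \le TV(u^0_{\varepsilon x}), \tag{3.6} \]

where $u_\varepsilon = (v_\varepsilon)^{m-1}$. By using these inequalities and Condition C we obtain

\[ \int_{-\gamma}^{\gamma} u_\varepsilon(t, x)\,dx = \int_{-\gamma}^{\gamma} u_\varepsilon(0, x)\,dx + \int_0^t\!\!\int_{-\gamma}^{\gamma} \left\{ m u_\varepsilon(t, x) u_{\varepsilon xx}(t, x) + a (u_{\varepsilon x}(t, x))^2 - c' \right\} dx\,dt \]
\[ = 2\gamma\varepsilon^{m-1} - \int_0^t \left\{ 2\gamma c' - (m-2)a \int_{-\gamma}^{\gamma} u_\varepsilon(t, x) u_{\varepsilon xx}(t, x)\,dx - a \big[ u_\varepsilon(t, x) u_{\varepsilon x}(t, x) \big]_{-\gamma}^{\gamma} \right\} dt \]
\[ \le 2\gamma\varepsilon^{m-1} - \left\{ 2\gamma c' - a \max_{[0,t] \times [-\gamma, \gamma]} u_\varepsilon(t, x) \left( (2-m)TV(u^0_x) + 2\|u^0_x\|_\infty \right) \right\} t \quad \text{for } t \in [0, T(\rho, \sigma)]. \tag{3.7} \]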
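The passage from the first to the second line of (3.7) rests on a single integration by parts. Written out, under the assumption made in the proof that $u_\varepsilon$ is positive (hence smooth) on $[-\gamma, \gamma]$, and with $a = m/(m-1)$:

```latex
\begin{align*}
\int_{-\gamma}^{\gamma} a\,(u_{\varepsilon x})^2\,dx
  &= a\,\big[u_\varepsilon u_{\varepsilon x}\big]_{-\gamma}^{\gamma}
     - a \int_{-\gamma}^{\gamma} u_\varepsilon u_{\varepsilon xx}\,dx, \\
\int_{-\gamma}^{\gamma} \big\{ m\,u_\varepsilon u_{\varepsilon xx} + a\,(u_{\varepsilon x})^2 - c' \big\}\,dx
  &= (m - a) \int_{-\gamma}^{\gamma} u_\varepsilon u_{\varepsilon xx}\,dx
     + a\,\big[u_\varepsilon u_{\varepsilon x}\big]_{-\gamma}^{\gamma} - 2\gamma c', \\
m - a &= m - \frac{m}{m-1} = \frac{m(m-2)}{m-1} = (m-2)\,a .
\end{align*}
```

Since $1 < m < 2$ under Condition A, the factor $(m-2)a$ is negative, and the remaining terms are bounded by $\max u_\varepsilon$ times the total variation and the supremum norm of $u_{\varepsilon x}$, which gives the last line of (3.7).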

Let $d_1$ be an arbitrary fixed positive constant such that

\[ d_1 < \frac{2\gamma c'}{a \left( (2-m)TV(u^0_x) + 2\|u^0_x\|_\infty \right)}. \tag{3.8} \]

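Then, by the continuity of the solution $v_\varepsilon(t, x)$ and the comparison theorem
on the initial data ([2]) there exist positive constants $\varepsilon_1$ and $T_1 < T(\rho, \sigma)$ such
that

\[ \max_{[0,t] \times [-\gamma, \gamma]} u_\varepsilon(t, x) < d_1 \quad \text{for } t < T_1 \text{ and } \varepsilon < \varepsilon_1 < \min(d_1, K(0, \rho - \eta, \rho, \sigma)). \tag{3.9} \]

We put

\[ T_2(\varepsilon) = \frac{2\gamma\varepsilon^{m-1}}{2\gamma c' - a d_1 \left( (2-m)TV(u^0_x) + 2\|u^0_x\|_\infty \right)}, \tag{3.10} \]

and choose $\varepsilon < \varepsilon_1$ such that $T_2(\varepsilon) < T_1$. Hence, it follows from (3.7) that

\[ \int_{-\gamma}^{\gamma} u_\varepsilon(t, x)\,dx < 0 \quad \text{for } t \in [T_2(\varepsilon), T_1), \tag{3.11} \]

which is a contradiction. Thus, $v_\varepsilon(\hat{t}, \hat{x}) = 0$ holds for some $(\hat{t}, \hat{x}) \in S$. It is clear
by the comparison theorem that supp $v_\varepsilon(T(\rho, \sigma), \cdot)$ becomes connected, which
completes the proof.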
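The choice of $\varepsilon$ in the proof relies on the fact that $T_2(\varepsilon)$ in (3.10) tends to zero as $\varepsilon \to 0$, because $m > 1$. A small sketch of this selection follows; all numerical values are hypothetical and serve only to illustrate the mechanism.

```python
def t2(eps, gamma, m, c_prime, a, d1, tv, sup):
    """T_2(eps) from (3.10); tv and sup stand for TV(u0_x) and the sup norm of u0_x."""
    denom = 2.0 * gamma * c_prime - a * d1 * ((2.0 - m) * tv + 2.0 * sup)
    if denom <= 0.0:
        raise ValueError("d1 violates condition (3.8)")
    return 2.0 * gamma * eps ** (m - 1.0) / denom

def choose_eps(eps1, t1, **data):
    """Halve eps, starting below eps1, until T_2(eps) < T_1."""
    eps = 0.5 * eps1
    while t2(eps, **data) >= t1:
        eps *= 0.5
    return eps

# Hypothetical data: m = 1.5, c = 1, hence a = 3 and c' = 0.5.
data = dict(gamma=0.1, m=1.5, c_prime=0.5, a=3.0, d1=0.01, tv=1.0, sup=1.0)
eps = choose_eps(eps1=0.01, t1=0.05, **data)
print(eps, t2(eps, **data))
```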

5 Acknowledgments

This work was supported by Japan Society for the Promotion of Science
through Grant-in-Aid (No. 13440038) for Scientific Research (B).

References
1. D.G. Aronson, The porous medium equation, in: Some Problems in Nonlinear
Diffusion (eds. A. Fasano and M. Primicerio), Lecture Notes in Mathematics
1224, Springer-Verlag, 1986.
2. M. Bertsch, A class of degenerate diffusion equations with a singular nonlinear
term, Non-linear Anal., 7(1983),117-127.
3. X.-Y. Chen, H. Matano and M. Mimura, Finite-point extinction and continuity
of interfaces in a nonlinear diffusion equation with strong absorption, J. reine
angew. Math., 459(1995),1-36.
4. E. DiBenedetto and D. Hoff, An interface tracking algorithm for the porous
medium equation, Trans. Amer. Math. Soc., 249(1984),463-500.
5. J.L. Graveleau and P. Jamet, A finite difference approach to some degenerate
nonlinear parabolic equations, SIAM J. Appl. Math., 20 (1971), 199-223.
6. M.A. Herrero and J.L. Vazquez, The one-dimensional nonlinear heat equation with
absorption: Regularity of solutions and interfaces, SIAM J. Math. Anal., 18
(1987), 149-167.
7. A.S. Kalashnikov, The propagation of disturbances in problems of non-linear
heat conduction with absorption, Zh. Vychisl. Mat. i Mat. Fiz., 14 (1974), 891-
905.

8. A.S. Kalashnikov, Some problems of the qualitative theory of non-linear
degenerate second-order parabolic equations, Russian Math. Surveys, 42 (1987),
169-222.
9. R. Kersner, The behavior of temperature fronts in media with nonlinear thermal
conductivity under absorption, Vestnik. Mosk. Univ. Mat., 33 (1978), 44-51.
10. M. Mimura, T. Nakaki and K. Tomoeda, A numerical approach to interface
curves for some nonlinear diffusion equations, Japan J. Appl. Math., 1 (1984),
93-139.
11. T. Nakaki and K. Tomoeda, A finite difference approach to the interface equation
for some nonlinear diffusion equations with absorption, Proc. Japan Acad., 77,
Ser. A (2001), 32-37.
12. T.Nakaki and K.Tomoeda, A finite difference scheme for some nonlinear diffusion
equations in absorbing medium: support splitting phenomena, SIAM J. Numer.
Anal., 40(2002) ,945-964.
13. O.A. Oleinik, A.S.Kalashnikov and Chzou Yui-Lin, The Cauchy problem and
boundary value problems for equations of the type of nonstationary filtration,
Izv. Acad. Nauk SSSR Ser. Mat., 22 (1958), 667-704.
14. P. Rosenau and S. Kamin, Thermal waves in an absorbing and convecting
medium, Physica, 8D (1983), 273-283.
15. K. Tomoeda, The behavior of impulsively initiated thermal waves in an absorbing
medium, Dyn. Contin. Discrete Impuls. Syst. Ser. B, 10 (2003), 151-164.
Fully Two-dimensional HLLEC Riemann Solver
and Associated Difference Schemes

Pavel Vachal 1 , Richard Liska 2 and Burton Wendroff3

1 Czech Technical University in Prague, Brehova 7, 115 19 Prague 1, Czech
Republic [email protected]
2 Czech Technical University in Prague, Brehova 7, 115 19 Prague 1, Czech
Republic [email protected]
3 Los Alamos National Laboratory, Group T-7, MS B284, Los Alamos, NM 87545,
USA [email protected]

Summary. A fully two-dimensional approximate Riemann solver of HLLEC type
for the Euler equations, with sixteen states and central waves which
approximate contact discontinuities, has been developed. The Riemann solver is used in
Godunov and WAF finite difference schemes. A numerical example of their performance
is presented.

1 Introduction

The aim of this paper is to describe the class of HLLE Riemann solvers and
propose a fully two-dimensional extension of the one-dimensional HLLC
Riemann solver from [8].
In 1983, Harten, Lax and van Leer [3] suggested approximating the solution
of a Riemann problem by three constant states, separated by two waves
propagating with constant speeds. A particular algorithm for the computation of
these wave speeds was presented five years later by Einfeldt [2]. Since the
assumption of two waves is correct only for hyperbolic systems of two
equations, Toro, Spruce and Speares [8] added one more wave, creating the 4-state
one-dimensional HLLC solver. In fluid dynamics, this new wave corresponds
to a contact discontinuity. Later, Wendroff presented a series of 9-state solvers,
extending the 3-state HLLE approach to two dimensions [10], [11]. In this
paper, we construct a contact-corrected version of this approach, adding six new
waves.
The outline is as follows: First, we introduce the basic principles of HLLE
and HLLC Riemann solvers in the simple one-dimensional case. Then we pro-
ceed to two dimensions, describe the 9-state solver from [10]' [11] and introduce
its new, contact-corrected version. Finally, we demonstrate the application of
our new solver in two particular numerical methods, namely the Godunov-
type difference scheme and the WAF (Weighted Average Flux) approach, and
present results of selected numerical tests.
816 P. Vachal et al.

Fig. 1. 1D solvers: (a) 3-state HLLE; (b) 4-state HLLC

2 One Dimension

Methods presented in this paper have been derived for the Euler equations. In
one dimension we have

$$\begin{pmatrix} \rho \\ \rho u \\ E \end{pmatrix}_t + \begin{pmatrix} \rho u \\ \rho u^2 + p \\ (E+p)u \end{pmatrix}_x = 0, \qquad (1)$$

where $\rho$ is the density, $u$ the fluid velocity, $p$ the pressure and $E$ the density of the
total energy. Completing the system (1) by the equation of state for an ideal
polytropic gas, $p = (\gamma - 1)(E - \frac{1}{2}\rho u^2)$, we can write it in the general differential
form $w_t + f(w)_x = 0$.
Consider the initial Riemann problem

$$w(x, t^1) = \begin{cases} w_0 & \text{for } x_i \le x < x_{i+1/2}, \\ w_1 & \text{for } x_{i+1/2} < x \le x_{i+1}. \end{cases} \qquad (2)$$

Following [11], we will approximate the solution at $t^1 < t < t^1 + \Delta t$ with
three constant states $w_0$, $w_{1/2}$ and $w_1$, divided by two waves propagating
with constant speeds $b^0$ and $b^1$. The situation is shown in Fig. 1(a). With
this layout and notation, the integral form of the conservation law over this
space-time domain becomes

$$\frac{\Delta x}{2}(w_0 + w_1) = \left(\frac{\Delta x}{2} + b^0 \Delta t\right) w_0 + (b^1 - b^0)\,\Delta t\, w_{1/2} + \left(\frac{\Delta x}{2} - b^1 \Delta t\right) w_1 + [f(w_1) - f(w_0)]\,\Delta t. \qquad (3)$$

Note that data below the plots in Fig. 1 show the absolute position, while data
above the plots are relative distances to the center of the staggered cell. Such
simplified notation will be especially advantageous later in 2D.
Solving (3) for $w_{1/2}$ gives

$$w_{1/2} = \frac{b^1 w_1 - b^0 w_0 - f(w_1) + f(w_0)}{b^1 - b^0}. \qquad (4)$$
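
The intermediate state can be evaluated in a few lines. The following is a minimal sketch, not the authors' implementation: it assumes $\gamma = 1.4$ and one common form of the Einfeldt speeds, $b^0 = \min(u_0 - c_0, \tilde{u} - \tilde{c})$ and $b^1 = \max(u_1 + c_1, \tilde{u} + \tilde{c})$ with Roe averages $\tilde{u}$, $\tilde{c}$. All function names are hypothetical.

```python
import math

GAMMA = 1.4  # ratio of specific heats; an assumed value


def primitives(w, gamma=GAMMA):
    """Return (rho, u, p) for a conserved state w = (rho, rho*u, E)."""
    rho, mom, energy = w
    u = mom / rho
    p = (gamma - 1.0) * (energy - 0.5 * rho * u * u)
    return rho, u, p


def flux(w, gamma=GAMMA):
    """Physical flux f(w) of the 1D Euler equations."""
    rho, u, p = primitives(w, gamma)
    return (rho * u, rho * u * u + p, (w[2] + p) * u)


def einfeldt_speeds(w0, w1, gamma=GAMMA):
    """Wave speed estimates b0 <= b1 from Roe averages (assumed form)."""
    rho0, u0, p0 = primitives(w0, gamma)
    rho1, u1, p1 = primitives(w1, gamma)
    c0 = math.sqrt(gamma * p0 / rho0)
    c1 = math.sqrt(gamma * p1 / rho1)
    h0 = (w0[2] + p0) / rho0  # total specific enthalpy
    h1 = (w1[2] + p1) / rho1
    s0, s1 = math.sqrt(rho0), math.sqrt(rho1)
    u_roe = (s0 * u0 + s1 * u1) / (s0 + s1)
    h_roe = (s0 * h0 + s1 * h1) / (s0 + s1)
    c_roe = math.sqrt((gamma - 1.0) * (h_roe - 0.5 * u_roe * u_roe))
    return min(u0 - c0, u_roe - c_roe), max(u1 + c1, u_roe + c_roe)


def hlle_middle_state(w0, w1, gamma=GAMMA):
    """Intermediate state (b1*w1 - b0*w0 - f(w1) + f(w0)) / (b1 - b0)."""
    b0, b1 = einfeldt_speeds(w0, w1, gamma)
    f0, f1 = flux(w0, gamma), flux(w1, gamma)
    w_half = tuple(
        (b1 * a1 - b0 * a0 - g1 + g0) / (b1 - b0)
        for a0, a1, g0, g1 in zip(w0, w1, f0, f1)
    )
    return w_half, b0, b1


# Sod-type data: (rho, u, p) = (1, 0, 1) on the left, (0.125, 0, 0.1) on the right
w_left = (1.0, 0.0, 2.5)
w_right = (0.125, 0.0, 0.25)
w_half, b0, b1 = hlle_middle_state(w_left, w_right)
```

For identical left and right states the formula returns that state unchanged, which is a quick sanity check on any implementation.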

To have the scheme completely determined, we must decide how to choose the
wave speeds $b^0$ and $b^1$. We will again follow [11] and use the Einfeldt speeds,
based on Roe averages, as summarized in [7]. Now we know all the states and
wave speeds of the approximate Riemann solver. Such a solver can be used in
several difference schemes, as we will see in Sect. 4.
As mentioned above, Toro et al. [8] extended this solver to a 4-state
version, called HLLC, which resolves stationary contact discontinuities exactly.
The main idea is to split the intermediate state by a third wave, representing
a contact. Since velocity and pressure stay unchanged across contact
discontinuities, the two central states will differ only in density (and, of course, total
energy, but this can be computed from pressure and other known state
variables). So, consider the initial Riemann problem (2). Then the 4-state solver
can be summarized by the following algorithm:
1. Compute Einfeldt wave speeds $b^0$, $b^1$ from Roe averages as in [7].
2. Compute the intermediate state from (4) the same way as in the 3-state
solver (Fig. 1(a)). Denote this state $w_{1/2} = (\rho_{1/2}, \rho_{1/2}u_{1/2}, E_{1/2})^T$.
3. Compute $u_{1/2}$, $\rho_{1/2}$ and $p_{1/2}$ from $w_{1/2}$.
4. Assume $u_{1/2}$ and $p_{1/2}$ to be preserved across the contact discontinuity. Then
the contact also has to move with speed $u_{1/2}$, and the densities in the two new
intermediate regions can be computed from the scalar Rankine-Hugoniot
condition for the mass conservation, applied to the left, resp. right wave:

$$\rho_{1/2,L} = \rho_0\,\frac{b^0 - u_0}{b^0 - u_{1/2}}, \qquad \rho_{1/2,R} = \rho_1\,\frac{b^1 - u_1}{b^1 - u_{1/2}}.$$

5. Compute both intermediate states (Fig. 1(b)):

$$w_{1/2,n} = \left(\rho_{1/2,n},\ \rho_{1/2,n}u_{1/2},\ E_{1/2,n}\right)^T, \qquad E_{1/2,n} = \frac{p_{1/2}}{\gamma - 1} + \frac{1}{2}\rho_{1/2,n}u_{1/2}^2, \qquad n \in \{L, R\}. \qquad (5)$$

Note that there is also another way to evaluate $p_{1/2}$, $E_{1/2,L}$ and $E_{1/2,R}$ only from
Rankine-Hugoniot conditions [6, 7, 1].
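
The five steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the outer speeds `b0` and `b1` are passed in (step 1 is not repeated here), $\gamma = 1.4$ is assumed, the mass Rankine-Hugoniot condition across the left wave is taken as $b^0(\rho_{1/2,L} - \rho_0) = \rho_{1/2,L}u_{1/2} - \rho_0 u_0$ (analogously on the right), and the energies follow from the equation of state. The name `hllc_states` is hypothetical.

```python
GAMMA = 1.4  # assumed ratio of specific heats


def flux(w, gamma=GAMMA):
    """Flux of the 1D Euler equations for w = (rho, rho*u, E)."""
    rho, mom, energy = w
    u = mom / rho
    p = (gamma - 1.0) * (energy - 0.5 * rho * u * u)
    return (mom, mom * u + p, (energy + p) * u)


def hllc_states(w0, w1, b0, b1, gamma=GAMMA):
    """Split the HLLE middle state by a contact moving with speed u_half.

    b0 < b1 are the outer wave speeds (for instance Einfeldt speeds).
    Returns (w_half_L, w_half_R, u_half).
    """
    f0, f1 = flux(w0, gamma), flux(w1, gamma)
    # Steps 2-3: HLLE middle state, then its velocity and pressure.
    rho_h, mom_h, e_h = (
        (b1 * a1 - b0 * a0 - g1 + g0) / (b1 - b0)
        for a0, a1, g0, g1 in zip(w0, w1, f0, f1)
    )
    u_h = mom_h / rho_h
    p_h = (gamma - 1.0) * (e_h - 0.5 * rho_h * u_h * u_h)
    # Step 4: mass Rankine-Hugoniot condition across the left and right wave.
    rho_l = w0[0] * (b0 - w0[1] / w0[0]) / (b0 - u_h)
    rho_r = w1[0] * (b1 - w1[1] / w1[0]) / (b1 - u_h)

    # Step 5: assemble both states; the energy follows from the equation of state.
    def state(rho):
        return (rho, rho * u_h, p_h / (gamma - 1.0) + 0.5 * rho * u_h * u_h)

    return state(rho_l), state(rho_r), u_h


# Stationary contact: equal velocity (0) and pressure (1), densities 1 and 0.5.
w_left = (1.0, 0.0, 2.5)
w_right = (0.5, 0.0, 2.5)
w_l, w_r, u_contact = hllc_states(w_left, w_right, b0=-1.5, b1=1.2)
```

For the stationary contact in the example the two intermediate states coincide with the initial left and right states, which illustrates the exact resolution of stationary contacts mentioned above.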

3 Two-dimensional Riemann Solvers

In two dimensions, the Euler equations have the form

$$\begin{pmatrix} \rho \\ \rho u \\ \rho v \\ E \end{pmatrix}_t + \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ (E+p)u \end{pmatrix}_x + \begin{pmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ (E+p)v \end{pmatrix}_y = 0 \qquad (6)$$

with the equation of state for an ideal polytropic gas $p = (\gamma - 1)(E - \frac{1}{2}\rho(u^2 + v^2))$,
where $v$ is the fluid velocity in the $y$-direction.

Fig. 2. 2D Riemann solvers: (a) 9-state HLLE with highlighted initial condition; (b)
16-state HLLEC

First, let us summarize the fully two-dimensional 9-state HLLE Riemann
solver, originally presented in [11]. We want to solve the 2D Riemann problem
formed by two partitions at $x = x_{i+1/2}$ and $y = y_{j+1/2}$, subdividing the domain
$[x_i, x_{i+1}] \times [y_j, y_{j+1}]$ into four regions with constant states. As the system
evolves in time, we approximate the solution as shown in Fig. 2(a). Waves
propagating from the center with constant speeds split the domain into 9
regions. For $\Delta t$ sufficiently small, none of the waves in the $y$-direction will reach
the edge $y = y_j$, so that the state $w_{1/2,0}$ is affected only by the one-dimensional
Riemann problem in the $x$-direction, given by $w_{0,0}$ and $w_{1,0}$. Analogous
considerations along the other three edges lead us to the following method: The corner
states ($w_{0,0}$, $w_{1,0}$, $w_{0,1}$, $w_{1,1}$) stay undisturbed. We take these to compute
the edge states ($w_{1/2,0}$, $w_{1/2,1}$, $w_{0,1/2}$, $w_{1,1/2}$), using a 3-state solver based on the
one-dimensional HLLE solver from Sect. 2. Then we do not need to solve
the real two-dimensional problem in the central area, since $w_{1/2,1/2}$ is given by
a two-dimensional conservation law, applied to the whole staggered cell.
Such a 9-state Riemann solver works well for problems containing shocks and
rarefaction waves, but its weakness is poor resolution of contact discontinuities.
Our goal was to modify it into a solver which resolves contact surfaces better
(stationary contacts even exactly), and which degenerates to the 1D 4-state
solver from Sect. 2 for one-dimensional initial problems in coordinate directions
(i.e. if there is only one interface in the 2D initial condition).
The most straightforward way is to follow Toro's one-dimensional approach
and split each of the intermediate edge regions (i.e. states $w_{1/2,0}$, $w_{1/2,1}$, $w_{0,1/2}$,
$w_{1,1/2}$) by a wave corresponding to a contact discontinuity. It is obvious that
to fulfill the degeneracy condition mentioned above we must also subdivide
the central region (i.e. state $w_{1/2,1/2}$) by two additional waves. This gives us the
setup for a 16-state solver as shown in Fig. 2(b).
It is not difficult to compute the speeds of the splitting waves and the values of the new
intermediate states along the edges. Basically, we use the same process as in
one dimension (5), taking also the transversal velocity into account. In [9] it is
proven that we can simply copy these from the nearest corner states, that is,
along the edge $y = y_j$ we simply take $v_{1/2,0,L} = v_{0,0}$ and $v_{1/2,0,R} = v_{1,0}$, along the
edge $x = x_i$ we take $u_{0,1/2,L} = u_{0,0}$ and $u_{0,1/2,U} = u_{0,1}$, etc.
What we have to do next is to set the speeds of the waves splitting the central region.
One possibility is to take the velocities from the conservation laws. We compute
the central state as if it were not further subdivided, then divide the
momentum in each direction by the density and use these velocities as speeds of
the splitting waves. Besides this one, there are several other approaches to set
these partitions.
The only remaining issue is to compute new states in the four central
regions. First, we estimate the velocities. To fulfill the condition of degeneracy,
we take for each direction the average of the velocity in edge states and corner
states with interface parallel to this direction, weighted by the length of these
interfaces. That means, for $u_{1/2,1/2,UR}$ we take the velocities $u_{0,1}$, $u_{1/2,1,L}$, $u_{1/2,1,R}$
and $u_{1,1}$, weighted by the lengths of the interfaces of their regions with the
upper right central region, while for $v_{1/2,1/2,UR}$ we take the weighted average of $v_{1,0}$,
$v_{1,1/2,L}$, $v_{1,1/2,U}$ and $v_{1,1}$. Then we estimate the central densities. This time, for each
central state, we compute the density from 1D Rankine-Hugoniot conditions
across each interface with surrounding regions (i.e. with corner and edge
regions), and weight the results again by the lengths of the interfaces. Since we know the
total mass in the central states, which must be conserved, we now multiply all
four density estimates by a suitable factor to satisfy the mass conservation
law. Similarly, we correct the momentum estimates (corrected density
multiplied by estimated velocity) in both directions by adding a suitable amount to
each state, in order to keep momentum conserved. Finally, we suppose the
pressure to be the same in all four central regions and compute its value from the
conservation law for total energy.
Let us summarize the complete algorithm of our new 2D 16-state HLLEC
Riemann solver (consult Fig. 2(b)):
1. Compute wave speeds $b^0_{1/2,0}$, $b^1_{1/2,0}$, $b^0_{1/2,1}$, $b^1_{1/2,1}$, $b^0_{0,1/2}$, $b^1_{0,1/2}$, $b^0_{1,1/2}$ and $b^1_{1,1/2}$, using
Einfeldt formulas from [7].
2. Compute the intermediate (light shaded) states for each of the four 1D
problems along edges.
a) Using the 1D 4-state HLLC solver with appropriate 2D fluxes,
compute the longitudinal velocities (i.e. $u_{1/2,0}$, $u_{1/2,1}$, $v_{0,1/2}$, $v_{1,1/2}$) and both
densities (i.e. $\rho_{1/2,0,L}$, $\rho_{1/2,0,R}$, $\rho_{1/2,1,L}$, $\rho_{1/2,1,R}$, $\rho_{0,1/2,L}$, $\rho_{0,1/2,U}$, $\rho_{1,1/2,L}$ and
$\rho_{1,1/2,U}$).
b) Copy the transversal velocities always from the nearest corner state
(i.e. $v_{1/2,0,L} = v_{0,0}$, $u_{0,1/2,L} = u_{0,0}$, etc.).
c) Compute the intermediate pressures (i.e. $p_{1/2,0}$, $p_{1/2,1}$, $p_{0,1/2}$, $p_{1,1/2}$) so that
the total energy in the "1D stripe" along each edge is conserved.
3. Compute the central state $w_{1/2,1/2}$ from the 2D integral conservation law
as if it were not further subdivided. These values will be used to assure
conservation and split the central region.
4. Split the central region into four pieces (UL, UR, LL, LR) divided by waves
with speeds based on velocities from the previous step.
5. Estimate eight central velocities (i.e. $u_{1/2,1/2,n}$, $v_{1/2,1/2,n}$, $n \in \{UL, LL, UR, LR\}$)
from adjacent edge and corner regions, so that the solver degenerates to
the 1D solver for 1D initial conditions in the $x$-, resp. $y$-direction.
6. Estimate four central densities (i.e. $\rho_{1/2,1/2,n}$, $n \in \{UL, LL, UR, LR\}$), using
Rankine-Hugoniot conditions across the interfaces with surrounding (edge and
corner) states, weighted by the lengths of the particular interfaces.
7. Correct the central densities from step 6 so that the integral mass conservation
law over the whole staggered cell is satisfied.
8. Correct the central velocities from step 5 so that the integral momentum
conservation laws in both coordinate directions over the whole staggered cell are
satisfied.
9. Finally, compute the central pressure $p_{1/2,1/2}$ (assuming it is equal for all four
central states) so that the total energy is conserved.
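
The conservation corrections in steps 7 and 8 can be illustrated in isolation. The sketch below is only one possible reading: the text says the densities are multiplied by "a suitable factor" and the momenta corrected by "a suitable amount", and here it is assumed that the same amount is added to the momentum density of every central region. The region areas, the target totals and all function names are hypothetical.

```python
def correct_densities(rho_est, areas, total_mass):
    """Step 7: scale the density estimates by one common factor."""
    factor = total_mass / sum(r * a for r, a in zip(rho_est, areas))
    return [factor * r for r in rho_est]


def correct_momenta(rho, vel_est, areas, total_momentum):
    """Step 8: add one common amount to every momentum density (assumed choice)."""
    mom_est = [r * v for r, v in zip(rho, vel_est)]
    defect = total_momentum - sum(m * a for m, a in zip(mom_est, areas))
    shift = defect / sum(areas)
    return [m + shift for m in mom_est]


# Hypothetical data for the four central regions (UL, UR, LL, LR).
areas = [0.2, 0.3, 0.1, 0.4]
rho = correct_densities([1.0, 0.6, 0.9, 0.5], areas, total_mass=0.75)
mom_x = correct_momenta(rho, [0.1, 0.2, 0.0, 0.3], areas, total_momentum=0.12)
```

After both corrections the area-weighted sums of density and momentum reproduce the totals given by the undivided central state from step 3.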

4 Application in Difference Schemes


In this section, we demonstrate how our new Riemann solver can be used in
practice in difference schemes. In particular, we first present the simplest,
first-order Godunov-type method and then a more accurate WAF scheme.

4.1 Godunov-type Scheme


The scenario of the simplest Godunov scheme is as follows:
1. Construct the uniform rectangular mesh with spatial steps $\Delta x$ and $\Delta y$,
with cells centered at $(x_i, y_j)$. The dual (staggered) mesh will be formed by
cells centered at $(x_{i+1/2}, y_{j+1/2})$.
2. Discretize the initial condition, so that the values are constant inside each
original cell. The interfaces form 2D Riemann problems in staggered cells.
3. Choose the time interval $\Delta t$ so that these Riemann problems stay
separated in their respective staggered cells during the whole time step.
4. Let the Riemann problems evolve and compute their approximate solutions
after $\Delta t$ with the HLLEC Riemann solver from Sect. 3.
5. Create new constant states by integrating the solutions over the original
cells. (For us, this means only weighted averaging.)
6. Repeat from step 3 until the solution at the desired time is reached.
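
Step 3 amounts to a CFL-type restriction. The following sketch assumes that a wave starting at the centre of a staggered cell may travel at most half a cell width during one step, so that the Courant number must stay below 0.5; the function name and its arguments are hypothetical, and the default of 0.49 is only an illustrative choice.

```python
def godunov_time_step(max_speed_x, max_speed_y, dx, dy, cfl=0.49):
    """Largest dt for which no wave leaves its staggered cell.

    max_speed_x, max_speed_y: largest absolute wave speeds found in the
    x- and y-direction. cfl must stay below 0.5 because a wave starts at
    the centre of a staggered cell and may travel at most half a cell.
    """
    if not 0.0 < cfl < 0.5:
        raise ValueError("cfl must lie in (0, 0.5)")
    return cfl * min(dx / max_speed_x, dy / max_speed_y)


dt = godunov_time_step(max_speed_x=2.0, max_speed_y=1.5, dx=0.0025, dy=0.0025)
```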
Fig. 3. 1D WAF scheme with a 4-state approximate Riemann solver

4.2 WAF Scheme

Here we apply our 2D Riemann solver in Weighted Average Flux (WAF)
methods [6, 7]. Consider first the system of Euler equations in one
dimension. We start as with the Godunov approach: we create a uniform mesh and
replace the initial condition by a piecewise constant function, so that it forms
a set of Riemann problems at cell interfaces. Then we let the system evolve in
time and apply the approximate 4-state HLLC solver to each Riemann
problem as in Fig. 3. To obtain the value in the cell center $x_i$ at the new time level
($t^{n+1} = t^n + \Delta t$), we use the conservative formula

$$w_i^{n+1} = w_i^n - \frac{\Delta t}{\Delta x}\left(F_{i+1/2} - F_{i-1/2}\right), \qquad (7)$$

where $F_{i-1/2}$ and $F_{i+1/2}$ are weighted average fluxes in the staggered cell at left,
resp. at right. These can be computed at each time level from real physical
fluxes in the four states of the solver as

$$F_{i+1/2} = \sum_{k=1}^{4} a_k\, f\left(w^{(k)}_{i+1/2}\right). \qquad (8)$$

To avoid excessive oscillations, the weights $a_k$ are taken as

$$a_k = \frac{1}{2}\left(\varphi^{(k)}_{i+1/2} - \varphi^{(k-1)}_{i+1/2}\right), \qquad k = 1, \dots, 4, \qquad (9)$$

with $\varphi^{(0)}_{i+1/2} = -1$, $\varphi^{(4)}_{i+1/2} = 1$, and the limiter functions $\varphi^{(k)}_{i+1/2}$, $k = 1, \dots, 3$. (10)


One can use the van Leer limiter, minmod, or many others listed for example
in [7].
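
The weighting itself is simple to write down. The sketch below assumes that the weights are $a_k = \frac{1}{2}(\varphi^{(k)} - \varphi^{(k-1)})$ with $\varphi^{(0)} = -1$ and $\varphi^{(4)} = 1$, so that they sum to one; the three inner values $\varphi^{(1)}, \varphi^{(2)}, \varphi^{(3)}$ are taken as given inputs, because the limiter itself is not reproduced here. The function names and the sample fluxes are hypothetical.

```python
def waf_weights(phi_inner):
    """Weights a_k from the limited values phi^(1), phi^(2), phi^(3).

    The boundary values phi^(0) = -1 and phi^(4) = 1 are appended here.
    """
    phi = [-1.0] + list(phi_inner) + [1.0]
    return [0.5 * (phi[k] - phi[k - 1]) for k in range(1, len(phi))]


def waf_flux(state_fluxes, phi_inner):
    """Weighted average flux; state_fluxes[k] is the flux vector of state k+1."""
    weights = waf_weights(phi_inner)
    n_comp = len(state_fluxes[0])
    return tuple(
        sum(a * f[c] for a, f in zip(weights, state_fluxes))
        for c in range(n_comp)
    )


# Hypothetical example: inner values -0.5, 0.0 and 0.5 give four equal weights.
fluxes = [(1.0, 0.0, 2.0), (0.8, 0.1, 1.6), (0.6, 0.2, 1.2), (0.4, 0.3, 0.8)]
weights = waf_weights([-0.5, 0.0, 0.5])
flux_half = waf_flux(fluxes, [-0.5, 0.0, 0.5])
```

Setting all three inner values to 1 (or to -1) puts the whole weight on the leftmost (or rightmost) state, so the weighted flux reduces to a purely upwind flux in those two limiting cases.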
In two dimensions, the conservative formula is

$$w_{i,j}^{n+1} = w_{i,j}^n - \frac{\Delta t}{\Delta x}\left(F_{i+1/2,j} - F_{i-1/2,j}\right) - \frac{\Delta t}{\Delta y}\left(G_{i,j+1/2} - G_{i,j-1/2}\right),$$

and we compute each weighted average flux from the cell-sized region centered
at the point given by its indices (i.e. flux $F_{i+1/2,j}$ from the rectangle
$[x_i, x_{i+1}] \times [y_{j-1/2}, y_{j+1/2}]$, etc.). Rather than introduce 2D limiters and derive the WAF
method in two dimensions, we use the 1D WAF approach in each appropriate
direction, with each of the four fluxes taken as an average from a suitable set
of states. This is shown schematically in Fig. 4 and described more precisely
in [9].

Fig. 4. 2D WAF scheme: (a) computation of $F_{i+1/2,j}$; (b) computation of $G_{i,j+1/2}$


5 Numerical Results
In Fig. 5, we can see results for the density, as computed by the Godunov
scheme with the 9-state solver, and by both Godunov and WAF schemes with the
new HLLEC solver. The test is a slightly modified version of Test 16 from [4],
originally presented and described in [5]. The domain $[0, 1] \times [0, 1]$ is split by
two partitions, located at $x = 0.5$ and $y = 0.5$, into four regions with constant
states, forming a 2D Riemann problem. Initial values

$$\begin{array}{llll}
\rho_{UL} = 1.0222 & p_{UL} = 1.0 & \rho_{UR} = 0.5313 & p_{UR} = 0.4 \\
u_{UL} = -0.7179 & v_{UL} = 0.1 & u_{UR} = 0 & v_{UR} = 0.1 \\
\rho_{LL} = 0.8 & p_{LL} = 1.0 & \rho_{LR} = 1.0 & p_{LR} = 1.0 \\
u_{LL} = 0 & v_{LL} = 0.1 & u_{LR} = 0 & v_{LR} = 0.8276
\end{array}$$

are chosen so that the analytical solution of the one-dimensional problem along
each edge of the domain is a simple wave. From the top in clockwise direction these are:
left-propagating rarefaction, upward moving shock, stationary contact, upward
moving contact.
As expected, our new two-dimensional HLLEC Riemann solver gives better
results than the 9-state HLLE. The most visible improvement is a sharper
resolution of contact discontinuities; the stationary contact is even resolved
exactly. However, the shock and the rarefaction wave are also treated better by
the new solver. The price we have to pay for this improved resolution is
slight but visible artifacts at the initial locations of discontinuities, in the middle
of the upper edge in Fig. 5(b), (c). Comparing both schemes which use the
HLLEC solver, we see that WAF performs better than Godunov, which is not
surprising, since it is a second-order accurate scheme.

Fig. 5. Contour plots of density at time t = 0.2 as computed by particular difference
schemes: (a) Godunov with HLLE (9-state), CFL = 0.49, 339 time steps, 400x400 mesh
vertices; (b) Godunov with HLLEC (16-state), CFL = 0.49, 347 time steps, 400x400 mesh
vertices; (c) WAF with HLLEC (16-state), CFL = 0.49, 352 time steps, 400x400 mesh
vertices

Acknowledgements. The first two authors were supported in part by the Czech
Grant Agency grant 201/00/0586.

References
1. Batten P., Clarke N., Lambert C., Causon D.M. (1997): On the Choice of
Wavespeeds for the HLLC Riemann Solver. SIAM J. Sci. Comput., 18, 1553-1570
2. Einfeldt B. (1988): On Godunov-type Methods in Gas Dynamics. SIAM J.
Numer. Anal., 25 (2), 294-318
3. Harten A., Lax P.D., van Leer B. (1983): On Upstream Differencing and
Godunov-type Schemes for Hyperbolic Conservation Laws. SIAM Rev., 25 (1),
35-61
4. Lax P.D., Liu X.-D. (1998): Solution of Two-Dimensional Riemann Problems of
Gas Dynamics by Positive Schemes. SIAM J. Sci. Comput., 19 (2), 319-340
5. Schulz-Rinne C.W., Collins J.P., Glaz H.M. (1993): Numerical Solution of the
Riemann Problem for Two-dimensional Gas Dynamics. SIAM J. Sci. Comput.,
14, 1394-1414
6. Toro E.F. (1989): A Weighted Average Flux Method for Hyperbolic Conservation
Laws. Proc. R. Soc. London A, 423, 401-418
7. Toro E.F. (1997): Riemann Solvers and Numerical Methods for Fluid Dynamics.
Springer Verlag, Berlin, Heidelberg
8. Toro E.F., Spruce M., Speares W. (1994): Restoration of the Contact Surface in
the HLL-Riemann Solver. Shock Waves, 4, 25-34
9. Vachal P. (2003): Some Aspects of Numerical Resolution of Contact
Discontinuities in Conservation Laws. MS Thesis, Czech Technical University, Prague
10. Wendroff B. (1999): A Two-dimensional HLLE Riemann Solver and Associated
Godunov-type Difference Scheme for Gas Dynamics. Computers and Math. with
Applications, 38, 175-185
11. Wendroff B. (1999): Approximate Riemann Solvers, Godunov Schemes, and
Contact Discontinuities. In: Godunov Methods. Theory and Applications. Oxford,
UK, Oct. 18-22, 1999
Deflation Accelerated Parallel Preconditioned
Conjugate Gradient Method in Finite Element
Problems

Fred J. Vermolen, Kees Vuik and Guus Segal

Delft University of Technology, Department of Applied Mathematical Analysis,
Mekelweg 4, 2628 CD, Delft, The Netherlands, [email protected]

Summary. We describe the algorithm to implement a deflation acceleration in a
preconditioned Conjugate Gradient method to solve the system of linear equations from
a Finite Element discretization. We focus on a parallel implementation in this
paper. Subsequently we describe the data-structure. This is followed by some numerical
experiments. The experiments indicate that our method is scalable.

1 Introduction
Large linear systems occur in many scientific and engineering applications.
Often these systems result from a discretization of model equations. The systems
tend to become very large for three-dimensional problems. Some models
involve time and space as independent parameters and therefore it is necessary
to solve such a linear system efficiently at each time-step.
In this paper we only consider symmetric positive definite (SPD) discretization
matrices. Since the matrices are sparse in our applications, we use an
iterative method to solve the linear system. In order to get a fast convergence
of the method we use a preconditioned Conjugate Gradient method, where
incomplete Choleski factorization is used as a preconditioner. This method is
very suitable for parallelization.
The present study involves a parallelization of the Conjugate Gradient
method in which the inner products, the matrix-vector multiplication and
preconditioning are parallelized. This parallelization is done by the use of domain
decomposition, where the domain of computation is divided into subdomains
and the overall discretization matrix is divided over the subdomains. To each
subdomain we allocate a processor. A well-known problem is that the
parallelized method is not scalable: the number of CG-iterations and wall-clock time
increase as the number of subdomains increases. To make the method
scalable one uses a coarse grid correction (see Smith et al. [6] for an overview
and introduction) or the deflation method. In [3] it is shown that deflation gives
a larger acceleration to the parallel preconditioned CG-method. The idea to
use deflation for large linear systems of equations is not new. Among others,
Nicolaides [4] and Vuik et al. [8, 1, 10, 9] apply this method to solve large
ill-conditioned linear systems. The result of deflation is that the components of
826 F.J. Vermolen et al.

the solution in the direction of the eigenvectors corresponding to the extremely
small eigenvalues are projected to zero. The effective condition number of the
resulting singular system becomes more favourable. In the present paper we
deal with "algebraic" deflation vectors. For more details on the various types
of deflation vectors we refer to [8] and [7].
We assume that the domain of computation $\Omega$ consists of a number of
disjoint subdomains $\Omega_j$, $j \in \{1, \dots, m\}$, such that $\bigcup_{j=1}^{m} \overline{\Omega}_j = \overline{\Omega}$. To each
subdomain we allocate a processor and a deflation vector $z_j$, for which we
define

$$z_j = \begin{cases} 1, & \text{for } (x, y) \in \Omega_j, \\ 0, & \text{for } (x, y) \in \Omega \setminus \Omega_j. \end{cases} \qquad (1)$$

In the case of Finite Volume methods we have to distinguish between cell-centered
and vertex-centered discretization. In the cell-centered case the deflation vector is
not defined on the interfaces between consecutive subdomains. In the
vertex-centered case, however, we have an overlap at the interface points. In this paper
we use a Finite Element discretization, which is always vertex-centered by its
construction. The subdomains may be considered as "super"-elements
consisting of a set of finite elements. The global stiffness matrix is never constructed;
only the "super"-element matrix is constructed. Matrix-vector multiplication
is carried out per "super"-element, and the global vector is obtained only after
adding the contributions of each "super"-element. In this way parallelization
of the Finite Element method can be done in a natural way. For the interface
points we use the concept of "average overlap", which is explained as follows:
given a deflation vector $z_j$ on an interfacial node that is shared by $\Omega_j$ and $p$
neighbours of $\Omega_j$, we set at this point

$$z_j = \frac{1}{p + 1}. \qquad (2)$$
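
The "average overlap" rule is easy to express once each node knows which subdomains contain it. The sketch below is an illustration only: the 5 x 5 node grid, its split into 2 x 2 blocks, the 0-based numbering of the subdomains and both function names are hypothetical choices, not the numbering used in the paper.

```python
def deflation_vectors(node_owners, n_subdomains):
    """Build the deflation vectors with "average overlap" on shared nodes.

    node_owners[i] is the set of subdomain indices (0-based) that contain
    node i. A node shared by a subdomain and p neighbours gets 1/(p+1).
    """
    vectors = [[0.0] * len(node_owners) for _ in range(n_subdomains)]
    for node, owners in enumerate(node_owners):
        for j in owners:
            vectors[j][node] = 1.0 / len(owners)
    return vectors


def owners_on_grid(n=5, mid=2):
    """Owners for an n x n node grid cut into 2 x 2 blocks at index mid.

    Hypothetical layout: 0 = lower left, 1 = lower right,
    2 = upper left, 3 = upper right; nodes are numbered row by row.
    """
    owners = []
    for row in range(n):
        for col in range(n):
            cols = {0, 1} if col == mid else ({0} if col < mid else {1})
            rows = {0, 1} if row == mid else ({0} if row < mid else {1})
            owners.append({2 * r + c for r in rows for c in cols})
    return owners


z = deflation_vectors(owners_on_grid(), 4)
```

With this rule the deflation vectors sum to one at every node, so the constant vector lies in the column space of $Z$.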
The deflation method has been applied successfully to problems from transport in
porous media where coefficients abruptly change by several orders of magnitude
[9]. In the present paper we consider a Galerkin Finite Element discretization
of the Laplace equation with a Dirichlet and a Neumann boundary condition
at $\Gamma_D$ and $\Gamma_N$ respectively (note that $\Gamma_N \cup \Gamma_D = \partial\Omega$):

$$\begin{cases} -\Delta u = f, & (x, y) \in \Omega, \\ u = \bar{u}(x, y), & \text{for } (x, y) \in \Gamma_D, \\ \dfrac{\partial u}{\partial n} = 0, & \text{for } (x, y) \in \Gamma_N, \end{cases} \qquad (3)$$

where $u$ denotes the solution and $\bar{u}$ represents a given function. The resulting
discretization matrix is symmetric positive definite. The domain is divided
into subdomains and the resulting system of linear equations is solved by the
use of a parallelized Deflated ICCG. In the text the algorithm is given and
the issues of data-structure for the parallelization of the solution method are
described. Subsequently, we describe some numerical experiments. For more
mathematical background we refer to [1, 10, 7].
Fig. 1. Domain decomposition for a vertex centered discretization (panels: original
domain; subdomain 1 and subdomain 2)

2 Deflated Incomplete Choleski preconditioned
Conjugate Gradient Method

In this section we describe the deflated method for the symmetric positive
definite discretization matrix $A$. Let $Z = (z_1 \dots z_m)$ represent the matrix whose
columns consist of the deflation vectors $z_j$, as defined in equations (1) and
(2). The matrix $Z$ is chosen such that its column space approximates the
eigenspace spanned by the eigenvectors that correspond to the smallest eigenvalues.
We then define the projection $P := I - AZ(Z^T A Z)^{-1} Z^T =: I - AZE^{-1}Z^T$.
It is shown in [7] that the matrix $PA$ is positive semi-definite (and hence
singular). Kaasschieter [2] showed convergence of the Conjugate Gradient method
for cases in which the matrix is singular. Let $b$ be the right-hand side vector
and $x$ be the solution vector, then we solve

$$Ax = b. \qquad (4)$$

After application of deflation by left multiplication of the above equation by
$P$, we obtain

$$PAx = Pb. \qquad (5)$$
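
The basic properties of $P$ can be checked directly from its definition. The short derivation below is a sketch that assumes only that $A$ is symmetric and that $E = Z^T A Z$ is invertible (which holds when $A$ is SPD and $Z$ has full column rank):

```latex
\begin{align*}
PAZ &= AZ - AZ\,E^{-1}(Z^T A Z) = AZ - AZ = 0, \\
P^2 &= I - 2\,AZE^{-1}Z^T + AZE^{-1}(Z^T A Z)E^{-1}Z^T = I - AZE^{-1}Z^T = P, \\
(PA)^T &= \left(A - AZE^{-1}Z^T A\right)^T = A - AZE^{-1}Z^T A = PA,
\qquad\text{hence } PA = AP^T .
\end{align*}
```

The first line shows that every column of $Z$ lies in the null space of $PA$, which is why the deflated system is singular, and the last line is the symmetry that allows a Conjugate Gradient method to be applied to it.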
Since $PA$ is singular, the solution is not unique. We denote the solution that is
obtained by use of the ICCG method on equation (5) by $\tilde{x}$. To get the solution
$x$ we use

$$x = (I - P^T)x + P^T x. \qquad (6)$$

It is shown in [7] that $P^T x = P^T \tilde{x}$, hence the solution of equation (5) can
be used. The second part, $(I - P^T)x = Z(Z^T A Z)^{-1}Z^T b$, is relatively cheap to
compute. Hence the solution $x$ is obtained by addition of the two
contributions, i.e. $x = P^T \tilde{x} + Z(Z^T A Z)^{-1}Z^T b$. For completeness we give the algorithm
of the Deflated ICCG:
Algorithm 1 (DICCG [9]):
$k = 0$, $\tilde{r}_0 = P r_0$, $p_1 = z_0 = L^{-T}L^{-1}\tilde{r}_0$
while $\|\tilde{r}_k\|_2 > \varepsilon$
    $k = k + 1$, $\alpha_k = \dfrac{\tilde{r}_{k-1}^T z_{k-1}}{p_k^T P A p_k}$
    $x_k = x_{k-1} + \alpha_k p_k$, $\tilde{r}_k = \tilde{r}_{k-1} - \alpha_k P A p_k$
    $z_k = L^{-T}L^{-1}\tilde{r}_k$, $\beta_k = \dfrac{\tilde{r}_k^T z_k}{\tilde{r}_{k-1}^T z_{k-1}}$
    $p_{k+1} = z_k + \beta_k p_k$
end while
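
A compact, sequential sketch of the deflated iteration is given below. It is not the authors' implementation: a Jacobi (diagonal) preconditioner stands in for the incomplete Choleski factors $L^{-T}L^{-1}$, matrices are stored as dense lists, the starting vector is zero, and the deflation vectors are simple non-overlapping indicator vectors. All function names are hypothetical. The final lines recover the solution as $P^T\tilde{x} + ZE^{-1}Z^Tb$.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_small(E, rhs):
    """Gaussian elimination with partial pivoting for the small system E y = rhs."""
    m = len(E)
    M = [row[:] + [r] for row, r in zip(E, rhs)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    y = [0.0] * m
    for r in reversed(range(m)):
        y[r] = (M[r][m] - sum(M[r][k] * y[k] for k in range(r + 1, m))) / M[r][r]
    return y


def deflated_pcg(A, b, Z, tol=1e-12, max_iter=200):
    """Deflated CG with a Jacobi preconditioner; Z is a list of deflation vectors."""
    n = len(b)
    AZ = [matvec(A, z) for z in Z]                    # columns A z_j
    E = [[dot(zi, azj) for azj in AZ] for zi in Z]    # E = Z^T A Z

    def combine(cols, y):                             # sum_j y_j * cols[j]
        return [sum(yj * col[i] for yj, col in zip(y, cols)) for i in range(n)]

    def coarse(v):                                    # y = E^{-1} Z^T v
        return solve_small(E, [dot(z, v) for z in Z])

    def apply_P(v):                                   # P v = v - A Z y
        return [vi - ci for vi, ci in zip(v, combine(AZ, coarse(v)))]

    def precond(r):                                   # stand-in for L^{-T} L^{-1}
        return [ri / A[i][i] for i, ri in enumerate(r)]

    x = [0.0] * n
    r = apply_P(b)                                    # deflated initial residual
    z = precond(r)
    p, rz, k = z[:], dot(r, z), 0
    while dot(r, r) ** 0.5 > tol and k < max_iter:
        k += 1
        w = apply_P(matvec(A, p))                     # P A p_k
        alpha = rz / dot(p, w)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * wi for ri, wi in zip(r, w)]
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    # Recover x = P^T x~ + Z E^{-1} Z^T b, with P^T v = v - Z E^{-1} Z^T A v.
    pt_x = [xi - ci for xi, ci in zip(x, combine(Z, coarse(matvec(A, x))))]
    return [u + c for u, c in zip(pt_x, combine(Z, coarse(b)))], k


# 1D Laplacian with 8 unknowns, two subdomains of 4 nodes each.
n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
Z = [[1.0] * 4 + [0.0] * 4, [0.0] * 4 + [1.0] * 4]
x, iterations = deflated_pcg(A, b, Z)
```

Only the two lines that call `apply_P` differ from a standard preconditioned CG loop, which mirrors the remark that the algorithm is a standard ICCG except for the lines containing $P$.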

The inner products, matrix-vector multiplication and vector updates in the
above algorithm are easy to parallelize. Parallelization of the incomplete
Choleski preconditioned Conjugate Gradient method has been done before by
Perchat et al. [5]. We use a restriction and a prolongation operator and block
preconditioners for the preconditioning step in the above algorithm. Note that
it is necessary to have a symmetric preconditioner. This is obtained by choosing
the restriction and prolongation matrices as transposes of each other. Let $r_k$ be
the residual after $k$ CG-iterations and $N$ be the total number of subdomains,
then overall preconditioning is expressed in matrix form by:

$$z_k = \left(\sum_{i=1}^{N} R_i^T M_i^{-1} R_i\right) r_k. \qquad (7)$$

Here $z_k$ represents the updated residual after preconditioning. Further, $R_i$ and
$M_i$ respectively denote the restriction operator and block preconditioner. We
will limit ourselves to the issues of the implementation of the data-structure
needed for the parallel implementation of deflation. The above algorithm is
a standard ICCG except for the lines that contain the matrix $P$.

3 Data-structure of the deflation vectors for
parallelization

To create $P$ and $Pv$ we need to make $z_j$ and to compute $Az_j$, $z_i^T A z_j$ and $Pv$. To
do this efficiently we make use of the sparsity pattern of the deflation vectors $z_j$.
We create the vectors $z_j$ in the subdomain $\Omega_j$ only and send essential parts to
its direct neighbours. We explain the data-structure and communication issues
for a rectangular example. In the explanation we use the global numbering from
the left part of Figure 2. Note that in the implementation the local numbering
is used in the communication and calculation part. The global numbering is
used for post-processing purposes only. The example can be generalized easily
to other configurations. The situation is displayed in Figure 2.

Fig. 2. A sketch of the division of $\Omega$ into subdomains $\Omega_1, \dots, \Omega_4$. Left figure represents
the global numbering of the unknowns, right figure represents the local numbering.

In Figure 2 on each subdomain $\Omega_i$ a deflation vector $z_i$ is created. For
all the interface nodes (say numbers 3, 8, 13, 12, 11 of $\Omega_1$) the number of
neighbours is determined, then equation (2) is applied to determine the value
of the corresponding entry of $z_1$. This implies that it is necessary to have a
list of interface nodes for each subdomain and to have a list with the number
of neighbouring subdomains on which a particular interface node is located.
However, this information is not sufficient. For example the vector $Az_1$ must
be computed and also be multiplied with vectors $z_j$. The vector $Az_1$ has
non-zero entries not only inside the domain $\Omega_1$ and on its interfaces, but also in its
direct neighbours, i.e. the points 4, 9, 14, 16, 17, 18, 19. Therefore we also need
to extend the vector $z_1$ with these points to have a well defined matrix-vector
multiplication. In the same way the vectors $Az_j$, $j \in \{2, 3, 4\}$ have non-zero
entries in $\Omega_1$. For the other deflation vectors we proceed analogously.
We further explain the computational part which is relevant to processor
1, i.e. subdomain $\Omega_1$, only; the other subdomains are dealt with similarly. For
example $Az_2$ will have a non-zero contribution in all interface points of $\Omega_1$
and $\Omega_2$ but also in the points 2, 7 and 12. All vectors in common points of
any subdomain and $\Omega_1$ are given the global value, i.e. the value that is the
result of addition. This requires communication between $\Omega_1$ and this particular
subdomain. This is in contrast to the matrix $A$, which is stored only locally
per subdomain without the addition at common interfaces. So to compute
$Az_2$ on $\Omega_1$ we need an extra list of neighbouring points of $\Omega_2$ in domain $\Omega_1$
that are not on the common interface. This, however, is not sufficient. For
example the value of $Az_2$ in nodal point 12 also has a contribution of $\Omega_4$. So
in $\Omega_1$ we need an extra list of points on common sides of $\Omega_1$ and $\Omega_j$ that are
direct neighbours of $\Omega_k$ ($j \neq k$) but are not on the interface of $\Omega_1$ and $\Omega_k$.
For example node 12 is a common point of $\Omega_1$ and $\Omega_4$ and a neighbour of
$\Omega_2$, further $\Omega_4$ is a neighbour of $\Omega_2$. After communication and addition of the
values of $Az$ at these particular nodes, the matrix $E$, consisting of the inner
products, is calculated and sent to processor 1.
Then, for a given vector $v$ we compute at each processor its inner product
with $z_i$ ($z_i^T v$). Then all these inner products are sent to processor 1 ($\Omega_1$) where
$y = E^{-1}(z_1^T v \;\; z_2^T v \;\; z_3^T v \;\; z_4^T v)^T$ is computed by a Choleski decomposition and
subsequently the results are sent to all the other neighbouring processors. Then,
$Pv = v - AZy$ is computed locally. All the steps are displayed schematically in Algorithm 2,
where we explain the situation for a case with two processors. Note that $E$
has a profile structure where the profile is defined by the numbering pattern
of the subdomains. Hence for a block structure in two dimensions we obtain
a similar sparsity pattern for $E$ as for a two-dimensional discretization. If,
however, a layered structure is used, then $E$ gets the same pattern as a
one-dimensional discretization matrix.
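
The remark on the sparsity of $E$ can be illustrated with a very small example. The sketch below assumes a 1D Laplacian on nine nodes, cut into three consecutive "layers", and uses plain indicator vectors without the average overlap; the function name is hypothetical.

```python
def coarse_matrix(A, Z):
    """E = Z^T A Z for a dense matrix A and a list Z of deflation vectors."""
    AZ = [[sum(a * z for a, z in zip(row, zj)) for row in A] for zj in Z]
    return [[sum(zi_k * az_k for zi_k, az_k in zip(zi, azj)) for azj in AZ]
            for zi in Z]


# 1D Laplacian on 9 nodes, cut into three consecutive "layers" of 3 nodes.
n = 9
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
Z = [[1.0 if 3 * j <= i < 3 * (j + 1) else 0.0 for i in range(n)]
     for j in range(3)]
E = coarse_matrix(A, Z)
```

The entry coupling the first and the third layer vanishes because these layers are not neighbours, so $E$ is tridiagonal like a one-dimensional discretization matrix.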
Algorithm 2 (Parallelization of Deflation) $P = I - AZ(Z^T A Z)^{-1}Z^T$.

Processor 1                                 Processor 2
Make $z_1$                                  Make $z_2$
                      Communication
Make $Az_1$, $Az_\Gamma$                    Make $Az_2$, $Az_\Gamma$
                      Communication
$Az_1 = Az_1 + Az_\Gamma$                   $Az_2 = Az_2 + Az_\Gamma$
$E_{11} = z_1^T Az_1$                       $E_{22} = z_2^T Az_2$
$E_{12} = z_1^T Az_\Gamma$                  $E_{21} = z_2^T Az_\Gamma$
                                            Send $E$ to proc 1
Choleski decomp $E$
                      $Pv = v - AZE^{-1}Z^T v$
Compute $z_1^T v$                           Compute $z_2^T v$
                                            Send $z_2^T v$
Send $y$ to proc 2
$v - y_1 Az_1 - y_2 Az_\Gamma$              $v - y_2 Az_2 - y_1 Az_\Gamma$

4 Numerical experiments

To illustrate the advantage of the deflation method we present the number
of CG-iterations and wall-clock time as a function of the number of layers
(left and right graphs respectively in Figure 3). We start with one layer and
extend the domain of computation with one horizontal layer, which is placed
on top. This is done consecutively up to 7 layers. In the examples we choose
the same number of elements in each layer. The problem size increases
as the number of layers increases. It can be seen that if deflation is not used
then the convergence takes more time since the number of CG-iterations
increases. With deflation, the number of iterations and the wall-clock
time for the parallel case do not depend on the number of layers. This is
also observed for the number of CG-iterations for the sequential computations.
This makes the method scalable.
Further, we present the number of iterations as a function of the number
of layers as in the preceding example for three methods: no projection, coarse
grid correction and deflation. The results are shown in Figure 4 (left graph). It
can be seen that both the coarse grid correction and the deflation methods are
scalable; however, deflation gives the best results. This is in agreement with the
analysis presented in [3], where it is proven that the deflated method converges
faster than the coarse grid correction. The same behaviour is observed
if the domain is extended in a blockwise distribution of added subdomains (see
the right graph in Figure 4).

[Figure 3: two plots against the number of layers; curves: no deflation seq., deflation seq., no deflation par., deflation par.]

Fig. 3. Left figure: The number of iterations as a function of the number of layers for
deflated and non-deflated parallelized and sequential ICCG method. Right figure:
The wall-clock time as a function of the number of layers for deflated and non-deflated
parallelized and sequential ICCG method.

[Figure 4: two plots against the number of subdomains.]

Fig. 4. The number of iterations as a function of the number of layers for the
parallelized ICCG method for three methods: no projection, coarse grid correction
and deflation. Left graph: layered extension of the domain of computation. Right
graph: blockwise extension of the domain of computation.

5 Conclusions

The deflation technique has been implemented successfully in a parallelized
and sequential ICCG method to solve an elliptic problem by the use of finite
elements. The domain decomposition can be chosen blockwise and layerwise.
Some numerical experiments are shown in the present paper. Further, the
number of iterations and wall-clock time become independent of the number of
added layers if deflation is applied in a parallel ICCG method. Hence deflation
is favourable in both sequential and parallel computing environments.

References

1. J. Frank and C. Vuik. On the construction of deflation-based preconditioners.
SIAM J. Sci. Comput., pages 442-462, 2001.
2. E. F. Kaasschieter. Preconditioned Conjugate Gradients for solving singular
systems. Journal of Computational and Applied Mathematics, 24:265-275, 1988.
3. R. Nabben and C. Vuik. A comparison of deflation and coarse grid correction
applied to porous media flow. Technical report 03-10, Delft University of
Technology, Delft, The Netherlands, 2003.
4. R.A. Nicolaides. Deflation of Conjugate Gradients with applications to boundary
value problems. SIAM J. Numer. Anal., 24:355-365, 1987.
5. E. Perchat, L. Fourment, and T. Coupez. Parallel incomplete factorisations for
generalised Stokes problems: application to hot metal forging simulation. Report,
EPFL, Lausanne, 2001.
6. B. Smith, P. Bjørstad, and W. Gropp. Domain Decomposition. Cambridge
University Press, Cambridge, 1996.
7. F.J. Vermolen, C. Vuik, and A. Segal. Deflation in preconditioned Conjugate
Gradient methods for finite element problems. J. Comput. Meth. in Sc. and
Engng., to appear, 2003.

8. C. Vuik, A. Segal, L. el Yaakoubli, and E. Dufour. A comparison of various
deflation vectors applied to elliptic problems with discontinuous coefficients. Appl.
Numer. Math., 41:219-233, 2002.
9. C. Vuik, A. Segal, and J. A. Meijerink. An efficient preconditioned CG method
for the solution of a class of layered problems with extreme contrasts in the
coefficients. J. Comput. Phys., 152:385-403, 1999.
10. C. Vuik, A. Segal, J. A. Meijerink, and G. T. Wijma. The construction of
projection vectors for a Deflated ICCG method applied to problems with extreme
contrasts in the coefficients. J. Comput. Phys., 172:426-450, 2001.
Advantages of Binomial Checkpointing
for Memory-reduced Adjoint Calculations

Andrea Walther¹ and Andreas Griewank²

¹ Institute of Scientific Computing, Technische Universität Dresden, Germany
awalther@math.tu-dresden.de
² Institute of Mathematics, Humboldt University Berlin, Germany
[email protected]

Summary. Checkpointing techniques become more and more necessary for the
computation of adjoints. This paper presents the more common multi-level checkpointing
as well as the less known binomial checkpointing. The checkpointing approaches are
compared with respect to the number of time steps the adjoint of which can be
calculated, the run-time needed for the adjoint calculation and the memory requirement.
Some examples illustrate the shown results.

1 Introduction

For many time-dependent applications, the corresponding simulations can be
performed using ordinary or partial differential equations. Furthermore, quite
often there are quantities that influence the result of the simulation. Throughout,
we assume that these quantities are control functions, for example heating
in and/or at the boundary of a domain. To compute an approximation of the
simulated process for a time interval $[0, T]$, one applies an appropriate
integration scheme given by

i = 1, ... ,N,

where $y_i \in \mathbb{R}^n$ denotes the state and $u_i \in \mathbb{R}^m$ the control at time $t_i$ for a given
time grid $t_0, \ldots, t_N$ with $t_0 = 0$ and $t_N = T$. The operator
$F_i : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^n$
defines the time step to compute the state at time $t_i$. Note that we do not
assume a uniform grid. To optimize a specific criterion or to obtain a desired
state, the cost functional

$$J(u) = l(y(u), u)$$

measures the quality of $y(u)$ and $u = (u_1, \ldots, u_N)$.
Here, $y(u) = (y_1(u), \ldots, y_N(u))$ describes the dependence of the state $y$ on the
control $u$. For applying a calculus-based optimization method, one may use an
adjoint integration

YN = 0, Yi-l i = N, ... ,l, (1)



motivated by the adjoint differential equation that belongs to the differential
equation describing the state. Subsequently or concurrently, the desired
derivative information $J_u(u)$ can be reconstructed from $\bar{y} = (\bar{y}_0, \ldots, \bar{y}_N)$. The
specific choice of the adjoint steps $\bar{F}_i$ depends on the forward integration and
whether one prefers the continuous adjoint or the discrete adjoint formulation,
see e.g. [7, 3, 5]. For the purpose of this paper, it is only important to note
that the adjoint integration has to be performed backwards in time and that
the complete forward trajectory $y = (y_0, \ldots, y_{N-1})$ is required. Hence, storing
all states $(y_0, \ldots, y_{N-1})$ during the forward integration and reading them in
reverse order during the adjoint integration forms one simple possibility to
overcome this difficulty. Then the computing time for the adjoint calculation
consists of the evaluation of $N$ time steps $F_i$, storing the state $y_{i-1}$, and the
evaluation of $N$ adjoint steps $\bar{F}_i$.
The storage requirement of the basic approach to calculate adjoints is
proportional to the number $N$ of time steps. If we want to calculate the adjoint of
a real-world problem with thousands of time steps, this memory requirement of
the basic approach may become a serious problem. For example, for computing
3D flows with unstructured grids one may easily need 10 to 100 MBytes to
store only one state vector $y_i$ [10]. Therefore, it is reasonable to assume that
due to their size, only a very limited number of intermediate states can be
kept in memory. They may serve as checkpoints, such that the required infor-
mation for the backward integration is generated piecewise during the adjoint
calculation. Sections 2 and 3 present two different checkpointing techniques.
The resulting run-times and memory requirements are compared in Section 4.
Finally, some conclusions and an outlook are given in Section 5.

2 Uniform Checkpoint Distribution


Distributing the checkpoints equidistantly over the given number of time steps
is one obvious solution to the storage requirement problem. Subsequently
the adjoints are computed for each of the resulting groups of time steps
separately. Denoting the number of checkpoints used by $c$, the corresponding
calculation of the adjoint values can be performed using the following algorithm
where the counter $i$ is identified with the state $y_i$:
Two-level Checkpointing
Initialization: Reserve space for $c_1$ checkpoints, store the initial state $y_0$
in the first one and set

$$c_2 = \begin{cases} \lceil N/(c_1 + 1) \rceil & \text{if } c_1 \lceil N/(c_1 + 1) \rceil < N, \\ \lfloor N/(c_1 + 1) \rfloor & \text{else.} \end{cases}$$

Advance: Starting from the initial state, advance to state $c_1 \cdot c_2$ by
performing the time steps $F_i$, $1 \le i \le c_1 \cdot c_2$. While integrating forward, store
the states $(j - 1)\, c_2$ in the checkpoints $j$ for $j = 2, \ldots, c_1$.

Reverse:
do $p = c_1, 0, -1$
    Evaluate the time steps $F_i$, $p \cdot c_2 < i < N$, storing the states $i$, $p \cdot c_2 \le i < N - 1$,
    perform the adjoint steps $\bar{F}_i$, $N \ge i > p \cdot c_2$, to calculate the adjoints,
    set $N = p \cdot c_2$, if $p > 0$ read the contents of checkpoint $p$.
end do
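The two phases of the algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the toy time step (explicit Euler for $y' = \sin y$), the matching adjoint step that needs only $y_{i-1}$, the terminal adjoint value `ybar_N = 1.0`, and all function names are hypothetical. The sketch counts the time step evaluations so that they can be compared with the basic store-all approach.

```python
import math

H = 0.1  # step size of the toy integrator (assumption for illustration)

def time_step(y_prev):
    """Toy time step F_i: one explicit Euler step for y' = sin(y)."""
    return y_prev + H * math.sin(y_prev)

def adjoint_step(ybar, y_prev):
    """Toy adjoint step: needs only the state y_{i-1} and the adjoint ybar_i."""
    return (1.0 + H * math.cos(y_prev)) * ybar

def store_all_adjoint(y0, N, ybar_N=1.0):
    """Basic approach: keep y_0 .. y_{N-1}, then sweep backwards."""
    states = [y0]
    for _ in range(N - 1):                 # F_1 .. F_{N-1}; F_N is not needed
        states.append(time_step(states[-1]))
    ybar = ybar_N
    for i in range(N, 0, -1):
        ybar = adjoint_step(ybar, states[i - 1])
    return ybar, N - 1                     # adjoint at t_0, time steps evaluated

def two_level_adjoint(y0, N, c1, ybar_N=1.0):
    """Two-level checkpointing with c1 first-level checkpoints."""
    q = -(-N // (c1 + 1))                  # ceil(N / (c1 + 1))
    c2 = q if c1 * q < N else N // (c1 + 1)
    if c2 < 1:
        raise ValueError("need N > c1")
    evaluations = 0
    checkpoints = {0: y0}                  # state index -> stored state
    y = y0
    for i in range(1, c1 * c2 + 1):        # Advance
        y = time_step(y)
        evaluations += 1
        if i % c2 == 0 and i < c1 * c2:
            checkpoints[i] = y
    ybar, upper, start = ybar_N, N, y      # y is now state c1 * c2
    for p in range(c1, -1, -1):            # Reverse
        lower = p * c2
        group = [start]                    # states lower .. upper - 1
        for _ in range(lower + 1, upper):
            group.append(time_step(group[-1]))
            evaluations += 1
        for i in range(upper, lower, -1):
            ybar = adjoint_step(ybar, group[i - 1 - lower])
        upper = lower
        if p > 0:
            start = checkpoints[(p - 1) * c2]
    return ybar, evaluations

reference, basic_cost = store_all_adjoint(0.5, 16)
value, cost = two_level_adjoint(0.5, 16, c1=3)
```

Both routines return the same adjoint value, since the recomputed states are identical to the stored ones; only the number of time step evaluations and the number of states held at once differ.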
Fig. 1 sketches the two-level checkpointing for $N = 16$ time steps and
$c = c_1 + c_2 = 6$. Throughout, the time steps are plotted along the vertical
axis and the computing time required for the adjoint calculation is represented
by the horizontal axis. Each solid horizontal line including the horizontal axis
itself represents a checkpoint. The time when a state is stored in a checkpoint
is marked with a black circle for the first level and with a black square for the
second level. The slanted black lines represent the evaluation of time steps. The
adjoint steps are drawn as dashed slanted lines. Finally, black arrows depict
the usage of a state $y_i$ for an adjoint step $\bar{F}_{i+1}$ without performing the
corresponding time step $F_i$. This adjoint calculation is possible due to the assumed
structure (1) of the adjoint steps. Note that it may be required to evaluate
$F_N$ once to initialize the adjoints. This evaluation can be introduced right
after the evaluation of $F_{N-1}$ for $p = c_1$. For illustration purposes, we suppose
throughout that all time steps and all adjoint steps have the same temporal
complexity normalized to 1. However, to apply the presented optimal
checkpointing techniques, only the identical temporal complexity of all time steps
is required. In this example, 24 time steps are performed. Hence, the number
of additional time step evaluations caused by the two-level checkpointing
compared to the basic approach equals 9. Furthermore, at most 6 states have to
be kept in memory.

Fig. 1. Two-level checkpointing for $N = 16$ time steps and $c = c_1 + c_2 = 6$ checkpoints
The two-level checkpointing has been proposed several times in the literature,
e.g. [11, 1], and is easy to implement. Naturally, one can apply two-level
checkpointing repeatedly for the groups of time steps that are separated by
equidistant checkpoints. This approach is called multi-level checkpointing [6]
and sketched by Fig. 2 for the three-level case. The multi-level checkpointing
is defined by the number of levels $r$, the number of checkpoints $c_i$ that are
uniformly distributed at level $i$, $i = 1, \ldots, r - 1$, and the number of states $c_r$
that have to be stored at the highest level $r$. Hence, the parameters of the
adjoint calculation shown in Fig. 2 are $c_1 = 2$, $c_2 = 2$, and $c_3 = 1$.

Fig. 2. Three-level checkpointing for $N = 18$ time steps and $c = 5$ checkpoints

For a given $r$-level checkpointing, one easily derives the following identities
$$N_r = \prod_{i=1}^{r} (c_i + 1), \qquad M_r = \sum_{i=1}^{r} c_i, \qquad T_r = \sum_{i=1}^{r} \frac{c_i N_r}{c_i + 1} = r N_r - \sum_{i=1}^{r} \prod_{\substack{j=1 \\ j \ne i}}^{r} (c_j + 1),$$

where $N_r$ denotes the number of time steps for which the adjoint can be
calculated using the specific $r$-level checkpointing. The corresponding memory
requirement equals $M_r$. The number of time step evaluations required for the
adjoint calculation is given by $T_r$, since at the first level $c_1 N_r/(c_1 + 1)$ time steps
have to be evaluated to reach the second level. At the second level, one group
of time steps is divided into $c_2 + 1$ groups. Hence, $c_2 (N_r/(c_1 + 1))/(c_2 + 1)$ time
steps have to be evaluated in each group to reach the third level. Therefore, we
obtain $(c_1 + 1)\, c_2\, (N_r/(c_1 + 1))/(c_2 + 1) = c_2 N_r/(c_2 + 1)$ at the second level and
so on. It follows that each time step $F_i$ is evaluated at most $r$ times. Hence,
if we apply two-level checkpointing, each time step is evaluated no more than
two times.
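The identities for $N_r$, $M_r$ and $T_r$ are easy to evaluate for concrete checkpoint numbers. The following sketch (function names are hypothetical) computes them for a tuple $(c_1, \ldots, c_r)$ and evaluates both forms of $T_r$, so the parameters of Fig. 1 and Fig. 2 can be checked directly.

```python
from math import prod

def multilevel_counts(c):
    """Return (N_r, M_r, T_r) for checkpoint numbers c = (c_1, ..., c_r)."""
    n_r = prod(ci + 1 for ci in c)
    m_r = sum(c)
    # First form of T_r: sum over the levels of c_i * N_r / (c_i + 1).
    t_r = sum(ci * n_r // (ci + 1) for ci in c)
    return n_r, m_r, t_r

def t_r_alternative(c):
    """Second form of T_r: r * N_r minus the sum over i of prod_{j != i} (c_j + 1)."""
    n_r = prod(ci + 1 for ci in c)
    return len(c) * n_r - sum(n_r // (ci + 1) for ci in c)

two_level = multilevel_counts((3, 3))       # parameters matching Fig. 1
three_level = multilevel_counts((2, 2, 1))  # parameters of Fig. 2
```

The integer divisions are exact because $N_r$ is a multiple of every $c_i + 1$.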
The two- as well as the multi-level checkpointing technique have the draw-
back that at each level the checkpoints are not reused. Each checkpoint stores
at each level only one state and becomes idle as soon as the data that is stored
in the checkpoint has been used for the adjoint calculation. A method that
reuses the checkpoints as soon as possible is proposed in the next section.

3 Binomial Checkpoint Distribution

When one applies the checkpointing technique proposed in [5], the adjoint
values are again generated piece by piece but only one state is employed for
the adjoint calculation at any time. Therefore, the checkpointing procedure
has to be adapted as follows:
Binomial Checkpointing
Initialization: Reserve space for $c$ checkpoints and store the initial state
$y_0$ in the first one.
do $p = N, 1, -1$
    Advance: Starting from the last checkpoint assigned, advance to state $p - 1$.
    If checkpoints are free, set as many of them as possible to states $i$ along
    the way.
    Reverse: Perform the adjoint step $\bar{F}_p$ to calculate the adjoint.
    If state $p - 1$ is stored in a checkpoint, free the checkpoint up.
end do
The memory requirement of this checkpointing procedure equals $M_b = c$.
Naturally, the question arises where one should place the checkpoints in the action
"Advance" of the algorithm to minimize the number of time step evaluations.
The application of the routine revolve ensures that the initiated checkpointing
process is provably optimal with respect to the run-time increase for a given
number of checkpoints [5]. More specifically, for the structure (1) of the adjoint
steps considered here, the following complexity result holds:
Theorem 1. Let $N$ be the total number of time steps for which the adjoint has
to be calculated. Suppose up to $c$ checkpoints are available at any time. Then
the minimal number of time step evaluations needed for the adjoint calculation
equals

$$T_b = r N - \binom{c + r}{r - 1},$$

where $r$ is the unique integer satisfying

$$\binom{c + r - 1}{r - 1} < N \le \binom{c + r}{r}. \tag{2}$$

The proof of Theorem 1 (see [5]) constructs recursively checkpointing schedules
that attain the minimal number $T_b$.
For the optimal checkpointing procedures
the positions of the checkpoints are given by binomial coefficients. This fact
explains the name binomial checkpointing. Furthermore, the proof of
Theorem 1 shows that each time step $F_i$ is evaluated at most $r$ times. Hence, $r$
has the same meaning as in the previous section. It was proved earlier that a
logarithmic growth of memory and run-time can be achieved using binomial
checkpointing by providing an appropriate number of checkpoints [4].
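The closed form of Theorem 1 can be checked numerically for small cases. The sketch below is not the revolve routine; it assumes that $r$ is the smallest integer with $N \le \binom{c+r}{r}$ and that the minimal cost is $rN - \binom{c+r}{r-1}$, and it compares this with a brute-force dynamic program over the position $m$ of the next checkpoint. The recurrence and all names are assumptions made for this illustration; the state currently in working memory is not counted as a checkpoint.

```python
from functools import lru_cache
from math import comb

def repetition_number(N, c):
    """Smallest r >= 1 with N <= C(c + r, r); assumes N >= 2 and c >= 1."""
    r = 1
    while comb(c + r, r) < N:
        r += 1
    return r

def binomial_cost(N, c):
    """Closed form: r * N - C(c + r, r - 1) time step evaluations."""
    r = repetition_number(N, c)
    return r * N - comb(c + r, r - 1)

@lru_cache(maxsize=None)
def brute_force_cost(N, c):
    """Minimal number of time steps to reverse N steps with c checkpoints.

    One checkpoint holds the initial state of the range. Advancing m steps
    costs m evaluations; the right part (N - m steps) is then reversed with
    c - 1 checkpoints and the left part (m steps) with all c checkpoints.
    """
    if N == 1:
        return 0                  # the available state feeds the adjoint step
    if c == 0:
        return float("inf")       # nothing to restart from
    return min(m + brute_force_cost(N - m, c - 1) + brute_force_cost(m, c)
               for m in range(1, N))

r_example = repetition_number(16, 3)
cost_example = binomial_cost(16, 3)
```

For $N = 16$ and $c = 3$ the brute-force search and the closed form agree, and the exhaustive search is cheap enough to run for all small $N$ and $c$.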
The routine revolve implements the optimal binomial checkpointing and
can be incorporated easily in an existing adjoint calculation [2, 8]. Moreover,
one can build a heuristic based on revolve such that the adjoint calculation
using binomial checkpointing becomes applicable also if the number of time
steps is not known a priori, e.g. due to adaptive time stepping, and/or if the
temporal complexity of the time steps is not constant, e.g. due to implicit
methods [9].
One optimal checkpointing schedule computed with revolve for $N = 16$
time steps and $c = 3$ checkpoints is shown in Fig. 3. Once more, it might be
necessary to evaluate $F_N$ once to initialize the adjoints. Since the situation is
the same for the multi-level checkpointing and does not influence the results
in the sequel, we ignore throughout the evaluation of $F_N$. For the example
shown in Fig. 3, the number of time step evaluations equals $T_b = 33$. Compared
to the two-level checkpointing, the computing time for the adjoint calculation
increases by less than 50 %. Furthermore, only 3 states have to be kept in
memory. Hence, the storage requirement is reduced by 50 %. The relation
between the two checkpointing approaches will be discussed in more detail in
the next section.

Fig. 3. Binomial Checkpointing for $N = 16$ time steps and $c = 3$ checkpoints

4 Comparison of Both Checkpoint Distributions

The integer $r$ has the same meaning for both checkpointing approaches, namely
the maximal number of times any particular time step $F_i$ is evaluated during
the adjoint calculation. Hence for comparing both approaches, assume at the
beginning that $r$ has the same value and that the same amount of memory is
used, i.e. $M_r = M_b = c$.
Now, we examine the maximal number of time steps $N^*$ for which an
adjoint calculation can be performed using the two approaches. Assuming that
$r$ is a divisor of $c$ and $M_r = c$, one obtains the identity

$$N_r^* = \left(\frac{c}{r} + 1\right)^{r} \quad \text{with } c_i = \frac{c}{r}, \quad i = 1, \ldots, r,$$

for the uniform checkpoint distribution because of the structure of $N_r$.
Theorem 1 yields

$$N_b^* = \binom{c + r}{r} = \prod_{i=0}^{r-1} \left(\frac{c}{r - i} + 1\right)$$

for the binomial checkpoint distribution. Obviously, one has

$$\frac{c}{r} + 1 < \frac{c}{r - i} + 1 \quad \text{for } 0 < i \le r - 1.$$

These inequalities yield $N_r^* < N_b^*$ if $r \ge 2$. Hence for all $r \ge 2$ and a given
$c$, binomial checkpointing allows the adjoint calculation for a larger number of
time steps compared to uniform checkpointing. In more detail, using Stirling's
formula we obtain

$$\frac{N_b^*}{N_r^*} = \binom{c + r}{r} \left(\frac{c}{r} + 1\right)^{-r} \approx \frac{1}{\sqrt{2 \pi r}} \left(\frac{r}{c} + 1\right)^{c} \approx \frac{1}{\sqrt{2 \pi r}} \exp(r).$$
v27fr
Hence, the ratio of $N_b^*$ and $N_r^*$ grows exponentially in $r$ without any
dependence on the number of available checkpoints. Fig. 4 shows $N_r^*$ and $N_b^*$ for
the most important values $2 \le r \le 5$. Since $r$ denotes the maximal number
of times each time step is evaluated, we have the following upper bounds for
the number of time steps evaluated during the adjoint calculation using $r$-level
checkpointing and binomial checkpointing, respectively:

$$T_r = c \left(\frac{c}{r} + 1\right)^{r-1} < r N_r^* \quad \text{and} \quad T_b = r N_b^* - \binom{c + r}{r - 1} < r N_b^*.$$

[Figure 4: four panels "Maximal N for r=2", "r=3", "r=4", "r=5", showing $N^*$ against #checkpoints for binomial and uniform checkpointing.]

Fig. 4. $N_r^*$ and $N_b^*$ for $r = 2, 3, 4, 5$

For example, it is possible to compute the adjoint for $N = 23000$ time steps
with only 50 checkpoints, less than $3N$ time step evaluations, and $N$ adjoint
steps using binomial checkpointing instead of three-level checkpointing, where
$N_3^* \le 5515$. If we allow $4N$ time step evaluations then 35 checkpoints suffice
to compute the adjoint for 80000 time steps using binomial checkpointing,
where $N_4^* \le 9040$. These numbers are only two possible combinations taken
from Fig. 4 to illustrate the really drastic decrease in memory requirement
that can be achieved if binomial checkpointing is applied.
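The two numerical examples can be reproduced with a short calculation. The sketch assumes $N_b^* = \binom{c+r}{r}$ for binomial checkpointing and, for $r$-level checkpointing, integer checkpoint numbers $c_i$ that sum to $c$ and differ by at most one, which is why the uniform values lie slightly below the real-valued bound $(c/r + 1)^r$. The function names are hypothetical.

```python
from math import comb, prod

def max_steps_binomial(c, r):
    """Largest N reachable with c checkpoints and at most r evaluations per step."""
    return comb(c + r, r)

def max_steps_uniform(c, r):
    """Product of (c_i + 1) for a balanced integer split of c over r levels."""
    base, extra = divmod(c, r)
    return prod(base + 2 if i < extra else base + 1 for i in range(r))

def real_valued_bound(c, r):
    """The bound (c / r + 1) ** r, attained only if r divides c."""
    return (c / r + 1) ** r

three_level = (max_steps_binomial(50, 3), max_steps_uniform(50, 3))
four_level = (max_steps_binomial(35, 4), max_steps_uniform(35, 4))
```

With these assumptions, 50 checkpoints and $r = 3$ cover more than 23000 time steps in the binomial case, and 35 checkpoints and $r = 4$ cover more than 80000.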
However, usually the situation is the other way round, i.e. one knows $N$
and/or $c$ and wants to compute the adjoint as cheaply as possible in terms of
computing time. Here, the first observation is that $r$-level checkpointing
introduces an upper bound on the number of time steps the adjoint of which can
be computed, because the inequality $N \le (c/r + 1)^r$ must hold. Furthermore,
in numerous cases binomial checkpointing also allows a decrease in run-time
compared to the uniform checkpointing. For a given $r$-level checkpointing and
$M_r = c$, one has to compare $T_r$ and $T_b$. Let $r_b$ be the unique integer satisfying
(2). Since at least one checkpoint has to be stored at each level, one obtains
the bound $r \le c$, i.e., one must have $c \ge \log_2(N)$ to apply uniform
checkpointing. Therefore, the following combinations of $r$ and $r_b$ are possible for the
most important, moderate values of $r$:

$$r = 3 \Rightarrow r_b \in \{2, 3\}, \qquad r = 4 \Rightarrow r_b \in \{3, 4\}, \qquad r = 5 \Rightarrow r_b \in \{3, 4, 5\}.$$

For $3 \le r \le 5$, one easily checks that $T_r > T_b$ holds if $r_b < r$. For $r = r_b$, one
can prove the following, more general result:
Theorem 2. Suppose for a given $N$ and an $r$-level checkpointing with $M_r = c$
that the corresponding $r_b$ satisfying (2) coincides with $r$. Then, one has

$$T_2 = 2N - c - 2 = T_b \quad \text{if } r = r_b = 2, \qquad T_r > T_b \quad \text{if } r = r_b > 2.$$

Proof: For $r_b = r = 2$ the identity $T_2 = T_b$ is clear. For $r = r_b > 2$, the
inequality

holds. Using the definitions of $T_r$ and $T_b$, this relation yields immediately
$T_r > T_b$. ∎

Hence, except for the case $r = r_b = 2$, where $T_r$ and $T_b$ coincide, the run-time
caused by binomial checkpointing is less than the one caused by multi-level
checkpointing if $r = r_b$.
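The run-time comparison can be tried out on concrete checkpoint numbers. The sketch below is an illustration with hypothetical function names: it assumes $T_r = r N_r - \sum_i \prod_{j \ne i} (c_j + 1)$ for the $r$-level scheme and $T_b = r_b N - \binom{c + r_b}{r_b - 1}$ for binomial checkpointing with the same total number of checkpoints $c = \sum_i c_i$, where $r_b$ is taken as the smallest integer with $N \le \binom{c + r_b}{r_b}$.

```python
from math import comb, prod

def multilevel_cost(c):
    """Return (N_r, T_r) for checkpoint numbers c = (c_1, ..., c_r)."""
    n = prod(ci + 1 for ci in c)
    return n, len(c) * n - sum(n // (ci + 1) for ci in c)

def binomial_cost(N, total):
    """Return (r_b, T_b) for N >= 2 time steps and `total` >= 1 checkpoints."""
    r_b = 1
    while comb(total + r_b, r_b) < N:
        r_b += 1
    return r_b, r_b * N - comb(total + r_b, r_b - 1)

def compare(c):
    """Compare both schemes for the same N and the same memory sum(c)."""
    n, t_r = multilevel_cost(c)
    r_b, t_b = binomial_cost(n, sum(c))
    return {"N": n, "r": len(c), "r_b": r_b, "T_r": t_r, "T_b": t_b}

case_two_level = compare((3, 3))        # r = r_b = 2
case_three_level = compare((4, 4, 4))   # r = r_b = 3
case_fewer_reps = compare((2, 2, 1))    # r_b < r
```

The three cases cover equal run-time for $r = r_b = 2$, a strictly smaller binomial run-time for $r = r_b = 3$, and a case in which binomial checkpointing needs fewer repetitions than the three-level scheme.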

5 Conclusions
This article discusses several checkpointing techniques, namely multi-level
checkpointing and binomial checkpointing. A detailed analysis of the num-
ber of time steps the adjoint of which can be calculated, the run-time needed
for the adjoint calculation and the memory requirement is given.
One can conclude that binomial checkpointing allows adjoint calculations
with a surprisingly small fraction of the memory needed by the basic approach.
This storage reduction causes only a very moderate increase in run-time. On
the other hand, we see that r-level checkpointing induces for a given number of
checkpoints an upper bound on the number of time steps the adjoint of which
can be computed. This upper bound can only be increased by introducing a
next level of checkpointing. In addition it is shown that the run-time required
for the adjoint calculation with r-level checkpointing exceeds the run-time
needed for binomial checkpointing for the most important values of r > 2,
whereas for r = 2 both methods yield the same run-time. However, for r = 2
and a given amount of memory, binomial checkpointing allows the adjoint
computation for a larger number of time steps. Hence, even for r = 2 binomial
checkpointing is preferable.
Moreover, it is quite often the case that the number N of time steps is not
known a-priori, for example due to an adaptive time stepping method. Then,
it becomes difficult to distribute the checkpoints for the two- or multi-level
checkpointing such that the minimal run-time is attained. For binomial check-
pointing the extension a-revolve deals with the unknown number of time steps
by using a heuristic for the checkpoint placements. In addition, a-revolve can
also handle time steps with varying temporal complexity. For time steps the
cost of which does not change drastically, the heuristics implemented in a-revolve
work well such that the corresponding adjoint calculation is only a few percent
slower than the one based on revolve [9]. Hence, binomial checkpointing
provides memory-reduced adjoint calculation also in more general situations.

References
1. Charpentier, I. (2001): Checkpointing schemes for adjoint codes: Application to
the meteorological model Meso-NH. SIAM J. Sci. Comput., 22, 2135-2151
2. Gockenbach, M., Reynolds, D., Symes, W. (2002): Efficient and Automatic Im-
plementation of the Adjoint State Method. Trans. Math. Soft., 28, 22-44
3. Griesse, R. (2003): Parametric Sensitivity Analysis in Optimal Control of a
Reaction-Diffusion System-Part II: Practical Methods and Examples. To appear
in Opt. Meth. Soft.

4. Griewank, A. (1992): Achieving logarithmic growth of temporal and spatial
complexity in reverse automatic differentiation. Opt. Meth. Soft., 1, 35-54
5. Griewank, A., Walther, A. (2000): Revolve: An implementation of checkpointing
for the reverse or adjoint mode of computational differentiation. Trans. Math.
Soft., 26, 19-45
6. Heimbach, P., Hill, C., Giering, R. (2003): An efficient exact adjoint of the
parallel MIT general circulation model, generated via automatic differentiation.
To appear in Future Generation Computer Systems.
7. Hinze, M. (1999): Optimal and instantaneous control of the instationary
Navier-Stokes equations. Habilitationsschrift, Fachbereich Mathematik, Technische
Universität Berlin
8. Hinze, M., Walther, A. (2002): Discrete approximation schemes for reduced
gradients and reduced Hessians in Navier-Stokes control utilizing an optimal
memory-reduced procedure for calculating adjoints. Preprint MATH-NM-06-
2002, Tech. Uni. Dresden. Submitted
9. Hinze, M., Sternberg, J. (2003): A-Revolve: An adaptive memory- and
run-time-reduced procedure for calculating adjoints; with an application to the
instationary Navier-Stokes system. To appear in Opt. Meth. Soft.
10. Kaltenbach, H.-J., Jurgens, W., Spille, A. (2001): Numerische Simulation,
Beeinflussung und Eigenmoden-Analyse einer abgelösten Strömung mit
Querkomponente. In: Ergebnisberichte SFB 557 TP A6
11. Kubota, K. (1998): A Fortran77 preprocessor for reverse mode automatic dif-
ferentiation with recursive checkpointing. Opt. Meth. Soft., 10, 319-335
An Efficient Multigrid FEM Solution
Technique for Incompressible Flow with
Moving Rigid Bodies

Decheng Wan, Stefan Turek and Liudmila S. Rivkind

Institute of Applied Mathematics LS III, University of Dortmund,
Vogelpothsweg 87, 44227 Dortmund, Germany

Summary. This paper uses the fictitious boundary method described in [1] for the
solution of incompressible flow with moving rigid bodies in complex geometries. The
method is based on a special treatment of Dirichlet boundary conditions inside of
a FEM approach in the context of a hierarchical multigrid scheme such that the flow
can be efficiently computed on a fixed computational mesh while the solid boundaries
are allowed to move freely through the given mesh. In this paper, we focus on the
calculations of the drag and lift forces acting on the moving solid bodies which are
not captured by the mesh. The comparison between the present and benchmark
results for the flow around a circular cylinder with different Reynolds numbers is
first presented, and then the result for a circular cylinder oscillating in a channel
is given. The simulation results compared with corresponding reference results are
found to be very reasonable and satisfactory.

1 Introduction

Incompressible flow problems with moving rigid bodies in complex geometries
have drawn the attention of numerous investigators. Their studies have been
motivated by the desire to understand the fundamental physics of such flows as
well as their practical importance in various areas. Such flow
problems are visible everywhere in our living environment: flow
around high-rise buildings, the drag force on a driving car accelerating in
the wind, ocean currents interacting with offshore structures, sedimentation
flow in estuaries, sand flow in deserts, etc.
From the numerical point of view, incompressible flow with moving rigid
bodies in complex geometries is quite hard to simulate, since it can require
a huge amount of time for the generation or deformation of the computational
grid when the corresponding boundaries are complex or changing. Such prob-
lems have motivated the development of numerous algorithms, which can be
broadly classified into two families. One of them is a 'body-conformal approach'
which always keeps the computational mesh in accordance with the geometrical
details [2, 3]. Another one is a 'fixed grid approach' in which case the mesh
is (arbitrarily) fixed and internal objects are allowed to move freely through
the mesh [4, 5]. One big advantage of such 'fixed grid approaches' over the
conventional 'body-conformal approaches' is that the computational mesh
remains unchanged such that the CPU cost per time step can be significantly
decreased (less computational effort due to saving the expensive mesh
generation) and that such techniques can be easily incorporated into standard CFD
codes, which mostly allow only fixed computational grids without local adaptivity;
however, the resulting accuracy is not clear. Therefore, the overall aim
is to deal successfully with the moving boundaries such that the accuracy of
the numerical approximation is sufficiently high while at the same time also
the computational cost is decreased.
In the spirit of the 'fixed grid approaches', a simple and efficient 'fictitious
boundary method' for the detailed simulation of incompressible flow with
complex geometries and/or moving interfaces was developed in the paper [1]. The
method is based on a fixed unstructured FEM background grid. It starts with
a coarse mesh which already contains many of the geometrical fine-scale details,
and employs a (rough) boundary parametrization which sufficiently describes
all large-scale structures with regard to the boundary conditions. Then,
all fine-scale features are treated as interior objects such that the corresponding
components in all matrices and vectors are unknown degrees of freedom which are
implicitly incorporated into all iterative solution steps (see [1]).
In this paper, we use the fictitious boundary method for the simulation of
incompressible flow with moving rigid bodies in complex geometries. In many
cases, the calculation of forces acting on the moving rigid bodies is very
important for the further study of the interaction between fluid and body, as in
particulate flow, sedimentation flow, and fluid-structure flow. However, in
the fictitious boundary method, it is not straightforward to compute
these forces, because the drag coefficient $C_d$ and lift coefficient
$C_l$ acting on the moving solid bodies are very delicate quantities: they include
the results directly on the wall surface of the moving rigid bodies, which is
represented implicitly in the fictitious boundary method due to the use of a fixed
grid rather than a body-conformal grid. Therefore, the integral of forces
over the wall surface of rigid bodies cannot be implemented directly in the
fictitious boundary method. To overcome this difficulty, a volume integral
instead of the conventional surface integral has been suggested for the
calculation of $C_d$ and $C_l$, introducing an auxiliary function [7] or two additional
functions [8]. Obviously, in such volume integral calculations, the reconstruction
of the wall surface of the moving rigid bodies can be avoided. In this paper,
we use Duchanoy's idea of the volume integral [7], and extend his
implementation in a finite volume method to the finite element method and the
fictitious boundary method.

2 The fictitious boundary method


The details of the fictitious boundary method have been described in [1]. For
the following considerations, let $\Omega$ be a bounded domain with a piecewise
smooth boundary $\Gamma$. The equations to be solved are the incompressible
Navier-Stokes equations

$$\rho \frac{\partial u}{\partial t} + \rho\, u \cdot \nabla u = f - \nabla p + \mu \nabla^2 u, \qquad \nabla \cdot u = 0, \tag{1}$$

where $u$ is the velocity, $p$ the pressure, $\mu$ the dynamic viscosity coefficient,
$\rho$ the density, $f$ the source term which may include the gravitational force.
The above equations are to be solved with $u(x, t) = u_a(x, t)$ on parts of
the boundaries of the flow domain, where $u_a(x, t)$ is the prescribed boundary
velocity, including time-dependent moving boundaries. The details of solving
such incompressible flow problems can be found in the FeatFlow software [10,
11] which is based on (nonconforming) FEM discretizations, adaptive implicit
time-stepping, nonlinear Newton-like methods, (geometrical) multigrid solvers
(for velocity and pressure separately) on quite arbitrary domains.
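In the following, we describe a volume integral approach for the calculation
of the drag coefficient $C_d$ and lift coefficient $C_l$ acting on the moving
solid bodies. Let $S$ be the wall surface of the rigid bodies,
$n_S = (n_x, n_y)$ the inward-pointing unit normal with respect to $\Omega$,
and $\tau = (n_y, -n_x)$ the tangential vector. The drag and lift forces are
usually calculated by the surface integrals

$$F_D = \int_S \left( \mu \frac{\partial u_\tau}{\partial n_S}\, n_y - p\, n_x \right) dS, \qquad F_L = -\int_S \left( \mu \frac{\partial u_\tau}{\partial n_S}\, n_x + p\, n_y \right) dS, \tag{2}$$

while the drag and lift coefficients are calculated via

$$C_d = \frac{2 F_D}{\rho \bar{U}^2 D}, \qquad C_l = \frac{2 F_L}{\rho \bar{U}^2 D}, \tag{3}$$

where $u_\tau$ is the tangential velocity on $S$, $\bar{U}$ is the
characteristic velocity, and $D$ the characteristic length.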


From Eq. (3) and Eq. (2), we can see that a surface integral over the wall
surface of the rigid bodies has to be evaluated for the calculation of $C_d$
and $C_l$. However, in the present fictitious boundary method, the shape of the
wall surface of the moving rigid bodies is imposed only implicitly in the fluid
field. Reconstructing the shape of the wall surface is not only time-consuming,
but the accuracy is also only first order due to the piecewise constant
interpolation. To overcome this problem, we calculate $C_d$ and $C_l$ by a
method in which the surface integral is replaced by a volume integral. We
define a parameter $\alpha$ as

$$\alpha(x) = \begin{cases} 1 & \text{for } x \in \Omega_e, \\ 0 & \text{for } x \in \Omega, \end{cases} \tag{4}$$

where $x$ denotes the coordinates of the edge midpoints of cells, $\Omega_e$ is
the domain occupied by the rigid bodies, $\Omega$ is the fluid domain, and the
whole domain is $\Omega_T = \Omega \cup \Omega_e$. The importance of such a
definition of the parameter can be seen from the fact that the gradient of
$\alpha$ is zero everywhere except at the wall surface of the rigid bodies,
where it yields the normal vector $n$ defined on the grid [7, 12], i.e.

$$n = -\nabla\alpha. \tag{5}$$
The total stress tensor $\sigma$ of the fluid flow is

(6)
Hence the forces acting on the wall surface of the rigid bodies can be
computed by

$$F = \int_{\Omega_T} \sigma\, n \, d\Omega = -\int_{\Omega_T} \sigma\, \nabla\alpha \, d\Omega. \tag{7}$$

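The drag force and lift force can be obtained from Eq. (7),

$$F_D = -\int_{\Omega_T} \left[ \mu \left( \frac{\partial u}{\partial x}\frac{\partial \alpha}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial \alpha}{\partial y} \right) - p\, \frac{\partial \alpha}{\partial x} \right] d\Omega, \tag{8}$$

$$F_L = -\int_{\Omega_T} \left[ \mu \left( \frac{\partial v}{\partial x}\frac{\partial \alpha}{\partial x} + \frac{\partial v}{\partial y}\frac{\partial \alpha}{\partial y} \right) - p\, \frac{\partial \alpha}{\partial y} \right] d\Omega. \tag{9}$$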

Therefore, through Eq. (8), Eq. (9) and Eq. (3), we can calculate the drag and
lift coefficients ($C_d$ and $C_l$) via a volume integral over the whole
domain $\Omega_T$ instead of the surface integral over the wall surface of the
rigid bodies in Eq. (2). The integral over each element covering the whole
domain $\Omega_T$ is evaluated with the standard 3 x 3 point Gaussian
quadrature. Since the gradient $\nabla\alpha$ is non-zero only at the wall
surface of the rigid bodies, the volume integrals need to be computed only in
one layer of mesh cells around the rigid bodies, which makes this way of
calculating $C_d$ and $C_l$ convenient for the present fictitious boundary
method.
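The idea of replacing the surface integral by a volume integral weighted with the gradient of the indicator function can be checked on a toy problem. The following is only a minimal sketch under stated assumptions, not the FEM implementation of the paper: it uses a uniform grid on the unit square with central differences instead of finite elements and Gaussian quadrature, keeps only the pressure part of the stress ($\sigma = -pI$), and takes the manufactured pressure field $p = x$. The function name `volume_integral_force` and all parameters are hypothetical. For this setup the exact pressure force on a disc of radius $r$ is $(-\pi r^2, 0)$.

```python
import math

def volume_integral_force(n=200, radius=0.2, center=(0.5, 0.5)):
    """Approximate F = -∫ σ ∇α dΩ for σ = -p I with p = x (toy problem)."""
    h = 1.0 / n
    # Cell-centre coordinates of a uniform grid on the unit square.
    xs = [(i + 0.5) * h for i in range(n)]
    # Indicator alpha: 1 inside the rigid body, 0 in the fluid.
    alpha = [[1.0 if (x - center[0]) ** 2 + (y - center[1]) ** 2 <= radius ** 2
              else 0.0 for y in xs] for x in xs]
    fx = fy = 0.0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # Central differences for the gradient of alpha.
            dadx = (alpha[i + 1][j] - alpha[i - 1][j]) / (2.0 * h)
            dady = (alpha[i][j + 1] - alpha[i][j - 1]) / (2.0 * h)
            if dadx == 0.0 and dady == 0.0:
                continue  # only cells next to the body surface contribute
            p = xs[i]  # manufactured pressure p = x
            # For σ = -p I the integrand -σ ∇α reduces to p ∇α.
            fx += p * dadx * h * h
            fy += p * dady * h * h
    return fx, fy

fx, fy = volume_integral_force()
exact_fx = -math.pi * 0.2 ** 2
print(fx, exact_fx, fy)
```

Only the cells adjacent to the body surface enter the sum, which mirrors the remark that the volume integrals need to be evaluated in a single layer of cells around the rigid bodies.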

3 Numerical tests

This section consists of two parts. The first part presents a quantitative
examination of the benchmark case of flow around a circular cylinder at
Re = 20 and Re = 100, solved by the present fictitious boundary method. The
second part gives the computational results for a circular cylinder oscillating
in a channel. For comparison, corresponding reference results are also
presented.

3.1 Flow around a circular cylinder

We consider the benchmark case of flow around a circular cylinder described
in [9]. The body-conformal mesh of Fig. 1 (a) is used to provide reference
results, while the channel meshes of Fig. 1 (b) and (c) are employed by the
present fictitious boundary method. The channel height is $H = 0.41$ m and the
cylinder diameter is $D = 0.1$ m. The Reynolds number is defined by
$Re = \bar{U} D / \nu$ with the mean velocity $\bar{U} = 2U(0, H/2, t)/3$. The
kinematic viscosity of the fluid is given by $\nu = \mu/\rho = 10^{-3}$ m$^2$/s
and its density by $\rho = 1$ kg/m$^3$. The inflow profiles are parabolic with
different $\bar{U}$ such that the resulting Reynolds numbers are Re = 20
(steady case) and Re = 100 (nonsteady case).
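The inflow velocities implied by these parameters follow directly from the definition of the Reynolds number. The short sketch below simply inverts $Re = \bar{U} D/\nu$ and uses the stated relation between the mean velocity and the inflow velocity at mid-height; the function names are illustrative and not part of any benchmark code.

```python
def mean_inflow_velocity(reynolds, diameter=0.1, nu=1.0e-3):
    """Mean velocity U_bar obtained from Re = U_bar * D / nu."""
    return reynolds * nu / diameter

def peak_inflow_velocity(reynolds, diameter=0.1, nu=1.0e-3):
    """Velocity at mid-height, from U_bar = 2 U(0, H/2, t) / 3."""
    return 1.5 * mean_inflow_velocity(reynolds, diameter, nu)

for re in (20, 100):
    print(re, mean_inflow_velocity(re), peak_inflow_velocity(re))
```

With the values given in the text this yields a mean velocity of about 0.2 m/s (peak 0.3 m/s) for Re = 20 and about 1 m/s (peak 1.5 m/s) for Re = 100.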

Fig. 1. Different coarse meshes: (a) body-conformal mesh (LEVEL = 2), (b) channel mesh I (LEVEL = 1), (c) channel mesh II (LEVEL = 2)

Table 1. The number of elements for different refined meshes

LEVEL                     3       4        5        6         7
body-conformal mesh     384    1536     6144    24576     98304
channel mesh I         1088    4352    17408    69632    278528
channel mesh II         416    1664     6656    26624    106496
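The element counts in Table 1 grow by a factor of four from one level to the next, which is what one expects when every quadrilateral cell is split into four by connecting opposite edge midpoints. A small sketch reproducing the table rows from their LEVEL 3 entries (the helper name `elements_at_level` is hypothetical):

```python
def elements_at_level(count_at_base, base_level, level):
    """Element count after regular refinement: each cell splits into four."""
    return count_at_base * 4 ** (level - base_level)

level3_counts = {
    "body-conformal mesh": 384,
    "channel mesh I": 1088,
    "channel mesh II": 416,
}
table = {
    name: [elements_at_level(count, 3, lev) for lev in range(3, 8)]
    for name, count in level3_counts.items()
}
for name, row in table.items():
    print(name, row)
```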

We first perform a stationary simulation (Re = 20) on the body-conformal
mesh, the channel mesh I and the channel mesh II, respectively. The coarse
meshes shown are successively refined by connecting opposite midpoints.
Table 1 gives the number of elements for these meshes after such global
refinements. Here LEVEL corresponds to the number of refinements.
Table 2 compares the drag coefficient $C_d$ and the lift coefficient $C_l$
obtained on the body-conformal mesh, the channel mesh I and the channel
mesh II, respectively. The calculation of $C_d$ and $C_l$ on the
body-conformal mesh uses the surface integral formula in Eq. (2) and provides
the reference results, while for the channel mesh I and the channel mesh II
the volume integral formulas in Eq. (8) and Eq. (9) are employed. The table
shows the results for LEVEL = 3 to LEVEL = 7 together with the corresponding
benchmark values from [9]. From the comparisons, it can be seen that the
results calculated by the present fictitious boundary method agree
sufficiently well (~1%) with both the reference results and the benchmark
values. The results for the channel mesh I are not completely satisfying,
while the results for the channel mesh II are improved because the mesh is
locally refined near the wall surface of the cylinder. The results for such
low Reynolds number simulations show that an appropriate global grid
refinement as well as adequate local mesh adaptation are necessary. The
present fictitious boundary method proves to be competitive with the standard
approaches for such typical CFD applications.

Table 2. Comparison of $C_d$ and $C_l$ for Re = 20

          Cd                                       Cl
LEVEL     Ref.         ch. mesh I   ch. mesh II    Ref.         ch. mesh I   ch. mesh II
3         0.53450D+01  0.55296D+01  0.54196D+01    0.56128D-02  0.12165D-01  0.24435D-02
4         0.55066D+01  0.53537D+01  0.54207D+01    0.84683D-02  0.10742D-01  0.67612D-02
5         0.55404D+01  0.54278D+01  0.55161D+01    0.98915D-02  0.61455D-02  0.89128D-02
6         0.55581D+01  0.55012D+01  0.55571D+01    0.10384D-01  0.99024D-02  0.94709D-02
7         0.55683D+01  0.55421D+01  0.55640D+01    0.10554D-01  0.97706D-02  0.10192D-01
Ref. [9]  0.55795D+01                              0.10618D-01

Fig. 2. Periodical results of $C_d$ and $C_l$ for Re = 100

Next, we further examine the resulting accuracy for a medium-range Reynolds
number, Re = 100, which leads to periodic time-dependent vortex shedding
behind the cylinder. Since we are mainly interested in the spatial accuracy of
the fictitious boundary method, in particular in capturing the important
effects near the cylinder, we try to eliminate the temporal discretization
error by choosing very small time steps. We then continue the nonstationary
simulations until a fully periodic flow behaviour of all quantities has been
observed. Finally, we compare the results for one period. Figure 2 shows the
results for $C_d$ and $C_l$. "Standard L = 7" means that the body-conformal
mesh of Fig. 1 (a) with 7 levels of refinement was used to provide the
reference result, while "FB L = 3 ~ 7" represents the results obtained by the
present fictitious boundary method using the channel mesh I in Fig. 1 (b) at
different refinement levels. From these figures, we can see that the results
at the various refinement levels are in close agreement with the reference
results. The results also show that the present fictitious boundary method
leads to results comparable to those of the standard approaches with
'body-fitted' meshes. It can also be expected that results with higher
accuracy can be reached via local mesh adaptivity (see for example the concept
of aligned adaptive computational meshes in [1]).

3.2 An oscillating cylinder in a channel

To demonstrate the ability of the present fictitious boundary method to
handle flows with complex moving boundaries, we have chosen a flow
configuration with a cylinder undergoing sinusoidal transverse oscillation in
a channel with specified amplitude and frequency. The channel mesh of
Fig. 3 (a) is employed by the present fictitious boundary method. The
computational domain size is 2.2 x 0.41. The mean location of the cylinder
center $(X_0, Y_0)$ is (1.1, 0.2) relative to the bottom left corner of the
domain. The cylinder diameter $D$ is equal to 0.1. No-slip conditions are
prescribed on the left, right, top and bottom boundaries. The cylinder
oscillates sinusoidally such that the location of its center $(X_c, Y_c)$ is
given by $X_c(t) = X_0 + A \sin(2\pi f t)$, $Y_c(t) = Y_0$, where $t$ is the
time, and $A = 0.25$ and $f = 0.25$ are the amplitude and frequency of the
oscillation, respectively. The kinematic viscosity of the fluid is given by
$\nu = \mu/\rho = 10^{-3}$ m$^2$/s and its density by $\rho = 1$ kg/m$^3$. The
fluid in the channel is initially at rest. Since no benchmark result is
available for comparison, we carried out a reference calculation to provide
data for comparison. In the reference calculation, which uses the
body-conformal mesh of Fig. 3 (b), the cylinder is held fixed and the
coordinate system is moved instead, with the same motion as the cylinder in
the fictitious boundary calculation but in the opposite direction. Table 3
gives the number of elements for the channel mesh and the body-conformal mesh
in Fig. 3 at different refinement levels.
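The prescribed cylinder motion is simple enough to write down explicitly. The sketch below evaluates the center position given in the text and its time derivative; the function names are illustrative, and the velocity is merely the analytical derivative of the stated position, not a quantity taken from the paper. With $f = 0.25$ the period is $T = 1/f = 4$, which is consistent with the one-period window from t = 19.79 to t = 23.79 used in the comparison.

```python
import math

X0, Y0 = 1.1, 0.2          # mean location of the cylinder center
AMPLITUDE, FREQUENCY = 0.25, 0.25

def cylinder_center(t):
    """Center position (Xc, Yc) at time t."""
    return X0 + AMPLITUDE * math.sin(2.0 * math.pi * FREQUENCY * t), Y0

def cylinder_velocity(t):
    """Time derivative of the center position."""
    omega = 2.0 * math.pi * FREQUENCY
    return AMPLITUDE * omega * math.cos(omega * t), 0.0

period = 1.0 / FREQUENCY
for k in range(5):
    t = k * period / 4.0
    print(t, cylinder_center(t), cylinder_velocity(t))
```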
Fig. 3. Coarse meshes used for the oscillating cylinder in a channel: (a) channel mesh (LEVEL = 1), (b) body-conformal mesh (LEVEL = 2)

Fig. 4 gives contour plots of the vorticity distribution obtained by the
fictitious boundary method on the channel mesh. These pictures show that the
flow in the channel is disturbed by the oscillating cylinder and that vortices
are generated periodically in the wake of the cylinder. The wake is longest
when the cylinder is at either end of its motion ($t = t_0 + T/4$,
$t_0 + 3T/4$, where $T$ is the period), while the flow is strongly perturbed
and becomes more complex when the cylinder passes through the middle position
of its oscillation ($t = t_0$, $t_0 + T/2$). Fig. 5 compares the drag
coefficient $C_d$ and lift coefficient $C_l$ obtained by the fictitious
boundary method on the channel mesh with those of the reference calculation on
the body-conformal mesh. The results calculated from LEVEL = 4 to LEVEL = 7
are all shown together. The coefficients $C_d$ and $C_l$ for one period, from
t = 19.79 to t = 23.79, are shown in Fig. 5 (c) and (f); the solid line
represents the results of the reference calculation on the body-conformal mesh
at LEVEL = 7, while the dashed line denotes the results obtained by the
fictitious boundary method on the channel mesh at LEVEL = 7. From the
comparisons, we can see that the FB and Ref. results approach each other as
the mesh is refined. The FB results agree very well with the reference
results, although they exhibit small oscillations because the moving cylinder
is not aligned with the grid lines.

Table 3. The number of elements for different refined meshes

LEVEL                     4        5         6         7
channel mesh           8448    33792    135168    540672
body-conformal mesh    1792     7168     28672    114688

Fig. 4. Vorticity contour plots for an oscillating cylinder in a channel: (a) $t = t_0$, (b) $t = t_0 + T/4$, (c) $t = t_0 + T/2$, (d) $t = t_0 + 3T/4$

Fig. 5. Comparison of $C_d$ and $C_l$ between fictitious boundary (FB) and reference (Ref.) results: (a) $C_d$ of FB, (b) $C_d$ of Ref., (c) one period of $C_d$, (d) $C_l$ of FB, (e) $C_l$ of Ref., (f) one period of $C_l$

4 Conclusions
The presented fictitious boundary method for simulating incompressible flows
with moving rigid bodies in complex geometries has been validated in a
two-dimensional configuration. We showed that computations on an arbitrary
fixed (unstructured) FEM background mesh are accurate enough to calculate
sensitive quantities such as the drag and lift coefficients on the wall
surface of the cylinder. Comparisons of the results obtained on a body-fitted
mesh and on a fixed structured mesh show good agreement. The advantages of the
present method are that, since the body motion is independent of the mesh,
problems associated with mesh reconfiguration and motion are avoided; that
computations on a fixed grid are cheaper than on a body-fitted one; and,
finally, that the extension of the method to 3D is straightforward. It is also
worth noting that the ability of the present method to accurately compute the
forces acting on the moving rigid bodies provides a solid basis for further
study of particulate flow as well as of the interaction between fluid and
structure as proposed by Glowinski in [4].

References
1. Turek, S., Wan, D.C. and Rivkind, L.S.: The Fictitious Boundary Method for the
implicit treatment of Dirichlet boundary conditions with applications to
incompressible flow simulations. Lecture Notes in Computational Science and
Engineering, Volume 35, Springer Verlag (2003)
2. Feng, J., Hu, H.H. and Joseph, D.D.: Direct simulation of initial value problems
for the motion of solid bodies in a Newtonian fluid; part 2, Couette and Poiseuille
flows. J. Fluid Mech. 277 (1994) 271
3. Hu, H.H.: Direct simulation of flows of solid-liquid mixtures. Int. J. Multiphase
Flow 22 (2) (1996) 335
4. Glowinski, R., Pan, T.W., Hesla, T.I. and Periaux, J.: A fictitious domain
approach to the direct numerical simulation of incompressible viscous flow past
moving rigid bodies: application to particulate flow. J. Comput. Phys. 169 (2001)
363
5. Glowinski, R.: Handbook of numerical analysis (Volume IX): Numerical methods
for fluids (Part 3). Ciarlet, P.G. and Lions, J.L. editors, North-Holland (2003)
6. Heywood, J., Rannacher, R. and Turek, S.: Artificial boundaries and flux and
pressure conditions for the incompressible Navier-Stokes equations. Int. J. Numer.
Meth. Fluids 22 (1996) 325
7. Duchanoy, C. and Jongen, T.R.G.: Efficient simulation of liquid-solid flows with
high solids fraction in complex geometries. Computers and Fluids 32 (2003) 1453
8. John, V.: Higher order finite element methods and multigrid solvers in a
benchmark problem for the 3D Navier-Stokes equations. Int. J. Numer. Meth. Fluids 40
(2002) 775
9. Schäfer, M. and Turek, S.: Benchmark computations of laminar flow around
cylinder. In: E.H. Hirschel (editor) Flow Simulation with High-Performance Computers
II. Volume 52 of Notes on Numerical Fluid Mechanics, Vieweg (1996) 547
10. Turek, S.: Efficient Solvers for Incompressible Flow Problems. Springer Verlag,
Berlin-Heidelberg-New York (1999)
11. Turek, S. et al.: FEATFLOW. Finite element software for the incompressible
Navier-Stokes equations: User Manual, Release 1.2 (1999) (www.featflow.de)
12. Brackbill, J.U., Kothe, D.B. and Zemach, C.: A continuum method for modelling
surface tension. J. Comput. Phys. 100 (1992) 335

Higher-Order FEM for a System of Nonlinear
Parabolic PDE's in 2D with A-Posteriori Error
Estimates

Martin Zítka¹, Karel Segeth² and Pavel Šolín³

¹ Academy of Sciences, Prague zitka@math.cas.cz
² Academy of Sciences, Prague [email protected]
³ Rice University, Houston, Texas [email protected]

Summary. Initial-boundary value problems for systems of nonlinear parabolic
partial differential equations arise in many important practical applications in
electromagnetics, chemistry, modelling of diffusion and heat transfer processes
and other fields. We are concerned with their solution by means of the method of
lines with higher-order finite element spatial discretization on unstructured
triangular meshes. Obviously, development of realistic a-posteriori error
estimates plays an essential role in the application of a strategy of this type.

1 Introduction

Initial-boundary value problems for systems of nonlinear parabolic partial
differential equations (PDE's) arise in many important practical applications in
electromagnetics, chemistry, modelling of diffusion and heat transfer processes
and many other fields¹.
We are concerned with the numerical solution of such problems by the
method of lines (MOL) combined with fully automatic hp-adaptive finite
element (FE) discretization on unstructured triangular meshes in space. This
approach has the potential of reducing the size of discrete problems
significantly while preserving the accuracy of results.
Until now, automatic hp-adaptivity has been applied almost exclusively to
stationary problems (see, e.g., [1, 4, 2] and references therein). It is our aim
to extend the promising automatic hp-adaptive strategies for elliptic problems
[7, 8] to parabolic equations, and in this paper we present two basic steps
towards this goal:

- Efficient implicit time-adaptive higher-order FE solver PARSYS_2D for
systems of nonlinear parabolic PDE's with general boundary conditions for all
solution components.
- A-posteriori error estimates appropriate for the class of evolutionary
problems studied.

¹ This work was supported by the Grant Agency of the Czech Republic under
projects No. GP102/01/D114 and 201/01/1200.

2 Definition of the problem


Let $\Omega \subset \mathbb{R}^2$ be a bounded domain with piecewise-polynomial
Lipschitz-continuous boundary and $J = (0, T]$ a finite time interval. We
consider a system of $N_{eq}$ nonlinear parabolic equations

$$
\begin{aligned}
\dot{u}(x,t) - \nabla\cdot\big(a(u,\nabla u,x,t)\,\nabla u(x,t)\big) &= f(u,\nabla u,x,t) && \text{in } \Omega,\ t \in J,\\
u(x,0) &= v(x) && \text{in } \Omega,\\
u_i(x,t) &= u_i^D(x,t) && \text{on } \Gamma_i^D,\\
a_i(u,\nabla u,x,t)\,\partial u_i/\partial n &= g_i(x,t) && \text{on } \Gamma_i^N,
\end{aligned}
\tag{1}
$$

($\dot{u}$ stands for $\partial u/\partial t$) where $u = (u_1, \dots, u_{N_{eq}})$
is the solution, $a$ and $f$ are smooth vector-valued functions, $n$ is the unit
outward normal to the boundary and $\partial\Omega = \Gamma_i^D \cup \Gamma_i^N$
for all $i = 1, \dots, N_{eq}$. Further we denote $g = (g_1, \dots, g_{N_{eq}})$.
The $\nabla$ (nabla) operator is defined as usual,
$\nabla = (\partial/\partial x_1, \partial/\partial x_2)$.
The vector-valued coefficient $a = (a_1, \dots, a_{N_{eq}})$ is bounded,

$$0 < \mu \le a_i(\cdot) \le M, \tag{2}$$

and both $a$ and $f$ are Lipschitz-continuous,

$$\|a(r) - a(s)\| \le L\|r - s\|, \tag{3}$$

$$\|f(r) - f(s)\| \le L\|r - s\| \qquad \forall r, s \in \mathbb{R}^{3N_{eq}+3}. \tag{4}$$

Without loss of generality we restrict ourselves to homogeneous Dirichlet
boundary conditions for the formulation of the variational problem. In the
nonhomogeneous case, an appropriate vector-valued lift function is chosen that
yields an additional contribution to the right-hand side (see, e.g., [8] for
details). The variational form of (1) reads

$$
\begin{aligned}
(\dot{u}(x,t), \varphi) + (a(u,\nabla u,x,t)\,\nabla u(x,t), \nabla\varphi) &= (f(u,\nabla u,x,t), \varphi) + (g(x,t), \varphi)_{\Gamma^N} \qquad \forall \varphi \in V,\\
u(x,0) &= v(x),
\end{aligned}
\tag{5}
$$

where $t \in (0, T)$ and the form of the space $V \subset (H^1)^{N_{eq}}$ is
dictated by $\Gamma_i^D$, $1 \le i \le N_{eq}$. The symbol $(\cdot,\cdot)$
stands for the $L^2(\Omega)$ scalar product.
Following the concept of MOL, we discretize the spatial variable first and
leave the temporal variable continuous. Consider a finite element mesh
$T_{h,p}$ covering $\Omega$, where a polynomial order $p(K_i) \ge 1$ is
associated with each element $K_i \in T_{h,p}$, $1 \le i \le N_{elem}$. Let
$V_{h,p}(\Omega)$ be an appropriate piecewise-polynomial subspace of $V$. We
pose the semidiscrete problem to find $u_{h,p} \in V_{h,p}(\Omega)$ for all
$t \in J$ such that

$$
\begin{aligned}
(\dot{u}_{h,p}(t), \chi(x)) + (a(u_{h,p},\nabla u_{h,p},x,t)\,\nabla u_{h,p}(x,t), \nabla\chi(x)) &= (f(u_{h,p},\nabla u_{h,p},x,t), \chi(x)) + (g(x,t), \chi(x))_{\Gamma^N},\\
u_{h,p}(x,0) &= v_{h,p}(x),
\end{aligned}
\tag{6}
$$

for all $\chi \in V_{h,p}(\Omega)$ and $t \in J$. Here,
$v_{h,p} \in V_{h,p}(\Omega)$ is the $H^1$-projection of $v \in V(\Omega)$.
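The method of lines can be illustrated on a much simpler problem than the one treated here. The following sketch is only an illustration under stated assumptions: it replaces the higher-order FE discretization in 2D by second-order finite differences in 1D, takes the linear heat equation $\dot{u} = u_{xx}$ on $(0,1)$ with homogeneous Dirichlet conditions and initial condition $\sin(\pi x)$, and integrates the resulting ODE system with the implicit Euler scheme. All function names are hypothetical. The exact solution $e^{-\pi^2 t}\sin(\pi x)$ is used for comparison.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (inputs are not modified)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def heat_mol_implicit_euler(n=50, dt=1.0e-3, t_end=0.1):
    """Space first (finite differences), then time (implicit Euler)."""
    h = 1.0 / (n + 1)
    nodes = [(i + 1) * h for i in range(n)]        # interior grid points
    u = [math.sin(math.pi * x) for x in nodes]     # initial condition
    r = dt / h ** 2
    # Implicit Euler: (I - dt * A) u_new = u_old, with A the discrete Laplacian.
    lower, diag, upper = [-r] * n, [1.0 + 2.0 * r] * n, [-r] * n
    for _ in range(round(t_end / dt)):
        u = solve_tridiagonal(lower, diag, upper, u)
    return nodes, u

nodes, u = heat_mol_implicit_euler()
exact = [math.exp(-math.pi ** 2 * 0.1) * math.sin(math.pi * x) for x in nodes]
max_error = max(abs(a - b) for a, b in zip(u, exact))
print(max_error)
```

The structure carries over to the semidiscrete problem above: the spatial discretization fixes the ODE system, and any implicit time integrator can then be applied to it.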

3 Finite-dimensional spaces
Let $T_{h,p}$ be a mesh on $\Omega$ consisting of $N_{elem}$ disjoint triangles
$K_i$, $i = 1, \dots, N_{elem}$. The polynomial order $p(K_i) \ge 1$ is assigned
to each element $K_i$. Local polynomial orders on edges are determined using
the minimum rule (minimum of polynomial orders on adjacent elements).
Polynomial orders on all elements are collected into a vector
$\mathbf{p} = \{p(K_1), p(K_2), \dots, p(K_{N_{elem}})\}$. We further define

$$V^{\mathbf{p}}_{h,p,0}(\Omega) = \{\varphi \in [H_0^1(\Omega)]^{N_{eq}};\ \varphi_k|_{K_i} \in P^{p(K_i)}(K_i),\ i = 1, \dots, N_{elem};\ \varphi_k|_{e_j} \in P^{p(e_j)}(e_j),\ j = 1, \dots, N_e;\ k = 1, \dots, N_{eq}\}. \tag{7}$$

By $V^{q}_{h,p,0}(\Omega)$ we denote the space $V^{\mathbf{p}}_{h,p,0}(\Omega)$
corresponding to a uniform distribution of the polynomial order
$\mathbf{p} = \{q, q, \dots, q\}$.
The design of hierarchic basis functions of $V^{\mathbf{p}}_{h,p,0}(\Omega)$ is
well-known (see, e.g., [8]). The basis consists of vertex functions (associated
with mesh vertices), edge functions (associated with edges where $p(e) \ge 2$)
and bubble functions (associated with element interiors where $p(K) \ge 3$). To
simplify the explanation, we introduce the following notation:
- $B^{v}_{T_{h,p},1}$ ... the set of all vertex functions in the basis,
- $B^{e}_{T_{h,p},q}$ ... the set of all edge functions of the polynomial order $q$,
- $B^{b}_{T_{h,p},q}$ ... the set of all bubble functions of the polynomial order $q$,
- $B_{e_i}$ ... the set of all edge functions associated with the edge $e_i$,
- $B_{K_i}$ ... the set of all bubble functions associated with the triangle $K_i$.
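The minimum rule and the three kinds of basis functions translate into a simple count of degrees of freedom. The sketch below is an illustration for a scalar problem and rests on assumptions not stated in the text: it uses the standard counts for hierarchic triangular elements ($p - 1$ edge functions on an edge of order $p$, and $(p-1)(p-2)/2$ bubble functions on a triangle of order $p$), it does not remove functions on Dirichlet boundaries, and the data layout (`element_orders`, `edge_neighbours`) is hypothetical.

```python
def count_basis_functions(n_vertices, element_orders, edge_neighbours):
    """Count vertex, edge and bubble functions of a hierarchic basis.

    element_orders  -- polynomial order p(K) for every triangle
    edge_neighbours -- for every edge, the indices of its adjacent triangles
    """
    # Minimum rule: the order of an edge is the minimum over adjacent elements.
    edge_orders = [min(element_orders[k] for k in neighbours)
                   for neighbours in edge_neighbours]
    n_vertex = n_vertices                                   # one per vertex
    n_edge = sum(max(p - 1, 0) for p in edge_orders)        # orders 2..p(e)
    n_bubble = sum((p - 1) * (p - 2) // 2 for p in element_orders)  # orders 3..p(K)
    return n_vertex, n_edge, n_bubble

# A square split into two triangles of orders 2 and 4; the first edge is shared.
counts = count_basis_functions(
    n_vertices=4,
    element_orders=[2, 4],
    edge_neighbours=[[0, 1], [0], [0], [1], [1]],
)
print(counts)  # (4, 9, 3)
```

In this example the shared edge inherits order 2 from the lower-order triangle, so it carries a single edge function, while the two remaining edges of the order-4 triangle carry three each.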

4 A-posteriori error estimation

We extend a technique for a-posteriori error estimation that was first
proposed in [3] and further developed in [5] (in both cases for 1D problems).
For the sake of clarity we restrict the explanation to the scalar case
($N_{eq} = 1$) with homogeneous Dirichlet conditions on the whole boundary
$\partial\Omega$, and with $a$ and $f$ depending only on the exact solution $u$.
The error of the solution to the semidiscrete problem is defined as usual,

$$e(x, t) = u(x, t) - u_{h,p}(x, t). \tag{8}$$

By $\tilde{u}_{h,p}$ we denote the elliptic projection of the exact solution
onto the space $V^{p}_{h,p,0}(\Omega)$ (i.e. $V^{q}_{h,p,0}(\Omega)$ with
$q = p$). The main idea of the error estimation approach can be outlined as
follows: the problem is solved in $V^{p}_{h,p,0}(\Omega)$, which yields
$u_{h,p}$. Then the problem is solved once again in
$V^{p+1}_{h,p,0}(\Omega)$ (i.e. $V^{q}_{h,p,0}(\Omega)$ with $q = p + 1$). The
estimate is based on the difference of these two solutions.
Let us begin with the identity

$$(\dot{e}, \chi) + (a(u_{h,p} + e)\nabla e, \nabla\chi) = (f(u_{h,p} + e), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p} + e)\nabla u_{h,p}, \nabla\chi) \tag{9}$$

for almost every $t \in J$ and all functions $\chi \in H_0^1$, which follows
directly from (8), (5) and (6). Now our aim is to introduce an easily
computable error estimate $E(x, t)$ that should be close to the quantity $e$.

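Definition 1. Let us define the space

$$\tilde{V}^{q}_{h,p,0}(\Omega) = \operatorname{span}\{B^{e}_{T_{h,p},q} \cup B^{b}_{T_{h,p},q}\}.$$

We define three estimates $E_{PN}$, $E_{PL}$ and $E_{EL}$ as functions
associated with $\tilde{V}^{p+1}_{h,p,0}(\Omega)$ by means of the identity (9).

Definition 2. We say that a function $E = E_{PN}$ is a nonlinear parabolic
error estimate if the identity

$$(\dot{E}, \chi) + (a(u_{h,p} + E)\nabla E, \nabla\chi) = (f(u_{h,p} + E), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p} + E)\nabla u_{h,p}, \nabla\chi) \tag{10}$$

holds for almost every $t \in J$ and all functions
$\chi \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$, and if the identity

$$(a(v)\nabla E, \nabla\chi) = (a(v)\nabla(v - u_{h,p}), \nabla\chi) \tag{11}$$

holds for $t = 0$ and all functions $\chi \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$.

Definition 3. We say that a function $E = E_{PL}$ (or $E = E_{EL}$) is a linear
parabolic (elliptic) error estimate if the identity

$$(\dot{E}, \chi) + (a(u_{h,p})\nabla E, \nabla\chi) = (f(u_{h,p}), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p})\nabla u_{h,p}, \nabla\chi) \tag{12}$$

or

$$(a(u_{h,p})\nabla E, \nabla\chi) = (f(u_{h,p}), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p})\nabla u_{h,p}, \nabla\chi) \tag{13}$$

holds for almost every $t \in J$ and all functions
$\chi \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$, and if identity (11) holds for
$t = 0$ and the same functions $\chi$.

Further, let us introduce a function
$\varepsilon_{h,p} \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$ such that
$\tilde{u}_{h,p} + \varepsilon_{h,p}$ is the elliptic projection of $u$ to
$\tilde{V}^{p+1}_{h,p,0}(\Omega)$.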

Definition 4. By $\varepsilon_{h,p}(x, t)$ we denote a function in
$\tilde{V}^{p+1}_{h,p,0}(\Omega)$ such that

$$(a(u)\nabla(\tilde{u}_{h,p} + \varepsilon_{h,p}), \nabla\chi) = (a(u)\nabla u, \nabla\chi) \tag{14}$$

holds for almost every $t \in J$ and all functions
$\chi \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$.

Definition 5. By $\theta$ we denote the difference

$$\theta(x, t) = \tilde{u}_{h,p}(x, t) - u_{h,p}(x, t). \tag{15}$$

Further we define functions $\eta = \eta_{PN}, \eta_{PL}, \eta_{EL}$ for
$E = E_{PN}, E_{PL}, E_{EL}$ such that

$$\eta(x, t) = \varepsilon_{h,p}(x, t) - E(x, t). \tag{16}$$

The following proposition compares our estimate with the quantity
$\varepsilon_{h,p}$. By $\|\cdot\|_r$ we denote the $H^r(\Omega)$ norm.
Lemma 1. Let $E_{PN} \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$ be the error estimate
given by (10) and (11), and $\varepsilon_{h,p} \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$
be the function defined by (14). Let $\|\theta(\cdot, t)\|$ and
$\|\eta_{PN}(\cdot, t)\|$ be nondecreasing functions of the variable $t$ and let

Let $\varepsilon_{h,p}$ and $E_{PN}$ depend on $t$ in a sufficiently smooth way.
Then there exists a constant $C > 0$ such that

(18)

Remark 1. Analogous propositions can be shown also for $\|\eta_{PL}\|_1$ and
$\|\eta_{EL}\|_1$ with minor differences in the proof.

Proof. We start with the formula

$$
\begin{aligned}
&(\dot{E}, \chi) + (a(u_{h,p} + E)\nabla E, \nabla\chi) + (\dot{u}_{h,p}, \chi) + (a(u_{h,p} + E)\nabla u_{h,p}, \nabla\chi)\\
&\quad - (\dot{\tilde{u}}_{h,p}, \chi) - (\dot{\varepsilon}_{h,p}, \chi) - (\dot{\rho}, \chi) - (a(u)\nabla(\tilde{u}_{h,p} + \varepsilon_{h,p}), \nabla\chi)\\
&= (f(u_{h,p} + E), \chi) - (f(u), \chi),
\end{aligned}
\tag{19}
$$

which follows from (5), (14) and (10), and use

$$\rho(x, t) = u(x, t) - \tilde{u}_{h,p}(x, t) - \varepsilon_{h,p}(x, t). \tag{20}$$

By rearranging terms, substituting $\eta \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$
for $\chi$, applying the Schwarz inequality and using the fact that

where the operator with subscript $h,p$ is the interpolation operator from
$H_0^1$ into $V^{p}_{h,p,0}(\Omega)$, we arrive at

(22)

where $S > 0$. We omit the term with $\|\nabla\eta\|^2$ on the left-hand side.
Integrating both sides of (22), we find

$$\|\eta(t)\|^2 \le \|\eta(0)\|^2 + C_3(t)\, h^{2p+2} + 2\int_0^t C_1(s)\, \|\eta(s)\|^2 \, ds \tag{23}$$

if the corresponding functions are Lebesgue integrable on the interval
$[0, T]$. Employing now (17), assuming that the corresponding functions are
again Lebesgue integrable on $[0, T]$, and applying Gronwall's lemma, we obtain
the bound
(24)
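For reference, the integral form of Gronwall's lemma that is typically used in a step of this kind can be stated as follows. This is the standard textbook statement, not a formula recovered from the paper, and the symbols $\phi$, $A$ and $b$ are introduced here only for illustration.

```latex
% Gronwall's lemma (integral form), stated under the assumptions that
% \phi is continuous, b \ge 0 is integrable and A is nondecreasing on [0, T]:
\[
  \phi(t) \le A(t) + \int_0^t b(s)\,\phi(s)\,ds \quad \text{for all } t \in [0, T]
  \qquad\Longrightarrow\qquad
  \phi(t) \le A(t)\,\exp\!\left(\int_0^t b(s)\,ds\right).
\]
% Applied formally to (23) with \phi = \|\eta\|^2, b = 2 C_1 and
% A(t) = \|\eta(0)\|^2 + C_3(t) h^{2p+2} (assuming C_3 is nondecreasing):
\[
  \|\eta(t)\|^2 \le \left(\|\eta(0)\|^2 + C_3(t)\,h^{2p+2}\right)
  \exp\!\left(2\int_0^t C_1(s)\,ds\right).
\]
```

An estimate of order $h^{2p+2}$ for $\|\eta(t)\|^2$ then follows once the initial term $\|\eta(0)\|^2$ is itself of that order.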
Turning now back to the inequality (22) and assuming that $\|\eta\|$ is
nondecreasing, we can write
(25)
Using (24) and (25), we finally obtain from (22) that

$$\|\nabla\eta\|^2 \le C h^{2p+2}. \tag{26}$$

The statement of the lemma follows directly from (24) and (26). □
For each of the three above error estimates let us introduce an effectivity
index $\Theta_{PN}$, $\Theta_{PL}$, and $\Theta_{EL}$, respectively, defined as

$$\Theta = \frac{\|E\|_1}{\|e\|_1}. \tag{27}$$

The following theorem contains a principal result related to
$\lim_{h \to 0} \Theta$:
Theorem 1. Let $u(t) \in H^{p+1} \cap H_0^1$ and
$u_{h,p}(t) \in V^{p}_{h,p,0}(\Omega)$ be the solutions of (5) and (6) for all
$t \in J$. Let $E \in \tilde{V}^{p+1}_{h,p,0}(\Omega)$ be the solution of (10),
(11) (for $E_{PN}$), (11), (12) (for $E_{PL}$), or (13) (for $E_{EL}$). Then,
under appropriate regularity assumptions,

$$\lim_{h \to 0} \Theta = 1 \tag{28}$$

for almost every $t \in J$, where $\Theta$ is $\Theta_{PN}$, $\Theta_{PL}$ or
$\Theta_{EL}$.

Proof. The proof is rather technical (see [10]). Its main idea is the same as in
the 1D case presented in [5]. □

5 Numerical results

5.1 Model problem

To verify the theoretical results, we set up a model problem (1), where we
chose the analytical solution as

and the coefficient $a = a(s) = 1 + s$, $s \in \mathbb{R}$. The right-hand side
function $f$ was calculated to yield the exact solution $u$.
The values of $\Theta_{EL}$ and $\Theta_{PL}$ for $t = 0.4$,
$h = \sqrt{13}/36$ and for different $p$'s and mesh parameters are summarized
in Tables 1 and 2. The implicit Euler formula with the step $\Delta t = 0.02$
has been used for the time integration. According to Theorem 1, the values
should tend to 1 as $h \to 0$.

p=1 p=2 p=3 p=4

h 4.09651988491 2.68370145897 2.07009170501
h/2 1.01140677283 1.00652321679 0.98450775723 0.96472314783
h/4 1.00251576073 0.99884836859 0.99764998322 0.99629420488
h/8 0.99790790301 0.99587534487 0.99996695628 1.01800491423

Table 1. Values of $\Theta_{EL}$ for different h's and p's.

p=1 p=2 p=3 p=4

h 4.08160620473 2.67782363272 2.06631766115
h/2 1.00953184295 1.00589904628 0.98419486064 0.96447454547
h/4 1.00221228302 0.99868374826 0.99752249029 0.99620694415
h/8 0.99774286135 0.99582769340 0.99993253125 1.01798017133

Table 2. Values of $\Theta_{PL}$ for different h's and p's.

6 Brief description of PARSYS_2D

During the first stage of the project we implemented a robust higher-order
FE solver PARSYS_2D for systems of nonlinear parabolic PDE's in 2D, analogous
to our earlier 1D solver (see the PARSYS package [6]). Implicit time-integration
methods also allow the solver to handle systems of elliptic PDE's after the
time derivative is left out. We use our software XGEN (see [9]) to generate
unstructured triangular meshes of very good quality on domains with arbitrarily
complicated geometries. All the software is available on the Internet².

² The mentioned C++ software packages PARSYS, PARSYS_2D and XGEN
can be downloaded free of charge from the web page of Pavel Šolín,
http://www.caam.rice.edu/~solin.

6.1 Definition of the solved system of PDE's

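It is not trivial to find an efficient and user-friendly way to define a system
of nonlinear PDE's. We approached this problem by allowing the user to specify
a large number of matrix- and vector-valued coefficients. The solved equation
has the general form

$$
\begin{aligned}
C\,\frac{\partial U}{\partial t}(x,t) ={}& \frac{\partial}{\partial x_1}\!\left(P_1 \frac{\partial U}{\partial x_1}(x,t)\right) + \frac{\partial}{\partial x_1}\!\left(P_2 \frac{\partial U}{\partial x_2}(x,t)\right) + \frac{\partial}{\partial x_2}\!\left(P_3 \frac{\partial U}{\partial x_1}(x,t)\right)\\
&+ \frac{\partial}{\partial x_2}\!\left(P_4 \frac{\partial U}{\partial x_2}(x,t)\right) + \frac{\partial}{\partial x_1}\big(P_5\, U(x,t)\big) + \frac{\partial}{\partial x_2}\big(P_6\, U(x,t)\big)\\
&+ P_7\, U(x,t) + F,
\end{aligned}
\tag{30}
$$

where $U(x,t) = (U_1(x,t), \dots, U_{N_{eq}}(x,t))$ is the solution vector and
$U_i \in H^1(\Omega)$ for $t \in J$, $i = 1, \dots, N_{eq}$. The user is
requested to prescribe seven $N_{eq} \times N_{eq}$ matrices

$$P_i = P_i\!\left(U, \frac{\partial U}{\partial x_1}, \frac{\partial U}{\partial x_2}, x, t\right), \qquad i = 1, \dots, 7,$$

together with a diagonal matrix

$$C = \operatorname{diag}\!\left(C_1\!\left(U, \frac{\partial U}{\partial x_1}, \frac{\partial U}{\partial x_2}, x, t\right), \dots, C_{N_{eq}}\!\left(U, \frac{\partial U}{\partial x_1}, \frac{\partial U}{\partial x_2}, x, t\right)\right)$$

and a source term vector

$$F = \left(F_1\!\left(U, \frac{\partial U}{\partial x_1}, \frac{\partial U}{\partial x_2}, x, t\right), \dots, F_{N_{eq}}\!\left(U, \frac{\partial U}{\partial x_1}, \frac{\partial U}{\partial x_2}, x, t\right)\right)^T.$$

Remark 2. In (30), we apply $\partial/\partial x_1$ and $\partial/\partial x_2$
to vectors. By this notation we mean that these operators are applied to each
component of the vector.
The vector-valued initial condition has the standard form
$U_i(x, 0) = U_i^0(x)$, $i = 1, \dots, N_{eq}$. For each solution component
$U_i$, $i = 1, \dots, N_{eq}$, we can prescribe either Dirichlet or Neumann
boundary conditions. The Dirichlet boundary conditions have the form

$$U_i(x, t) = U_i^D(x, t), \qquad x \in \partial\Omega, \quad i = 1, \dots, N_{eq}. \tag{31}$$

The Neumann boundary conditions are prescribed in the form

where $n = (n_1, n_2)$ is the outer unit normal to the boundary. One can
prescribe various types of boundary conditions on different parts of the
boundary.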

6.2 Spatial and temporal discretization

The solver distributes the polynomial orders $p(K_i)$ in the mesh $T_{h,p}$
according to the user's input and uses the minimum rule mentioned above to
determine the local polynomial orders associated with edges. Partially novel
Lobatto-based hierarchic shape functions (see [8]) up to the 10th order were
used in order to obtain finite elements with excellent conditioning properties.
For the sake of efficiency, economical Gauss quadrature points and weights from
the CD-ROM [8] up to the 20th polynomial order were utilized.
The spatial semidiscretization yields a system of nonlinear ordinary
differential equations (ODE's). The user can choose either the built-in
implicit Euler scheme, which is only first-order accurate, or the absolutely
stable implicit higher-order adaptive ODEPACK subroutines. These subroutines
are very sophisticated, and one of their significant advantages is that they
are based on explicit evaluation of the right-hand side of the ODE system;
thus, no linearization whatsoever is required from the user. The solvers are
capable of numerically obtaining the information about the Jacobi matrix of
the right-hand side that is needed for the backward differentiation formulas
(BDF) on which they are based.
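The combination of an implicit scheme with a numerically approximated Jacobian can be illustrated on a scalar ODE. The sketch below does not call ODEPACK and is not the PARSYS_2D implementation; it is a minimal illustration, assuming a scalar right-hand side, of an implicit Euler step (the first-order BDF) solved by Newton's method, where the derivative of the right-hand side is obtained by a forward difference so that the user only supplies the function `f`. The test problem $\dot{u} = -u^2$, $u(0) = 1$, with exact solution $1/(1+t)$, is chosen here for illustration.

```python
def implicit_euler_step(f, t_next, u_prev, dt, tol=1.0e-12, max_iter=20):
    """Solve v = u_prev + dt * f(t_next, v) for v by Newton's method."""
    v = u_prev  # initial Newton guess: previous time level
    for _ in range(max_iter):
        residual = v - u_prev - dt * f(t_next, v)
        if abs(residual) < tol:
            break
        # Forward-difference approximation of df/du; only evaluations of f
        # are needed, no analytic linearization.
        eps = 1.0e-7 * max(1.0, abs(v))
        dfdu = (f(t_next, v + eps) - f(t_next, v)) / eps
        v -= residual / (1.0 - dt * dfdu)
    return v

def integrate(f, u0, t_end, n_steps):
    """Integrate u' = f(t, u) from 0 to t_end with implicit Euler."""
    dt = t_end / n_steps
    u = u0
    for k in range(n_steps):
        u = implicit_euler_step(f, (k + 1) * dt, u, dt)
    return u

u_end = integrate(lambda t, u: -u * u, 1.0, 1.0, 100)
print(u_end)  # close to the exact value 1 / (1 + 1) = 0.5
```

For a system of ODEs the scalar division is replaced by a linear solve with the approximated Jacobi matrix, which is the part a production integrator handles internally.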

6.3 Outlook

The combination of hp-adaptive higher-order FE discretization in space with
the adaptive higher-order time-integration methods of ODEPACK offers exciting
perspectives for the numerical solution of large evolutionary problems. We
hope to report on progress on the implementation of automatic hp-adaptive
strategies in our 2D solver soon.

References
1. Demkowicz, L., Oden, J.T., Rachowicz, W., Hardy, O. (1989): Toward a universal
hp-adaptive finite element strategy. Part 1: Constrained approximation and data
structure. Comput. Methods Appl. Mech. Engrg., 77, 79-112
2. Demkowicz, L., Rachowicz, W., Devloo, Ph. (2002): A fully automatic
hp-adaptivity. J. Sci. Comput., 17, 127-155
3. Moore, P.K. (1994): A posteriori error estimation with finite element semi- and
fully discrete methods for nonlinear parabolic equations in one space dimension.
SIAM J. Numer. Anal., 31, 149-169
4. Rachowicz, W., Oden, J.T., Demkowicz, L. (1989): Toward a universal
hp-adaptive finite element strategy. Part 3: Design of hp meshes. Comput. Methods
Appl. Mech. Engrg., 77, 181-212
5. Segeth, K. (1999): A posteriori error estimation with the finite element method
of lines for a nonlinear parabolic equation in one space dimension. Numer. Math.,
83, 455-475
6. Segeth, K., Šolín, P., Kocifik, M. (2002): Some algorithmic aspects of
higher-order finite element schemes in multidimensions. In: Software and Algorithms
of Numerical Mathematics 14 (Proceedings of Summer School, Kvilda 2001),
Plzen, University of West Bohemia, 199-221
7. Šolín, P., Demkowicz, L. (2003): Goal-Oriented hp-Adaptivity for Elliptic
Problems. Comput. Methods Appl. Mech. Engrg., accepted
8. Šolín, P., Segeth, K., Doležel, I. (2003): Higher-Order Finite Element Methods.
Chapman & Hall/CRC Press, Boca Raton, FL
9. Šolín, P. (2000): On a mesh generation technique based on a special smoothing
procedure for uniform inner point distribution. Acta Technica CSAV 45, 397-417
10. Zítka, M. (2002): Higher-Order FEM for a System of Nonlinear Parabolic PDE's.
Diploma Thesis, Charles University, Prague
