
OpenLB User Guide

Associated with Release 1.7 of the Code

Authors: Adrian Kummerländer, Tim Bingert, Fedor Bukreev,


Luiz Eduardo Czelusniak, Davide Dapelo, Simon Englert, Nicolas
Hafen, Marc Heinzelmann, Shota Ito, Julius Jeßberger, Florian
Kaiser, Eliane Kummer, Halim Kusumaatmaja, Jan E. Marquardt,
Michael Rennick, Tim Pertzel, František Prinz, Martin Sadric,
Maximilian Schecher, Stephan Simonis, Pascal Sitter, Dennis
Teutscher, Mingliang Zhong, Mathias J. Krause
Copyright © 2006-2008 Jonas Latt
Copyright © 2008-2024 Mathias J. Krause
[email protected]

Permission is granted to copy, distribute and/or modify this document under the terms of the
GNU Free Documentation License, Version 1.2 or any later version published by the Free Soft-
ware Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the Section entitled “GNU Free Documentation License”
(Section A.3).

Abstract

OpenLB is a generic implementation of lattice Boltzmann methods (LBM) that is shared with the open source community under the terms of the GPLv2 license. Since the first release in 2007 [71], the code has been continuously improved and extended, resulting in fifteen releases [59–70, 72, 73] and counting. The OpenLB framework is written in C++ and covers the full scope of simulations – from pre-processing through parallel and efficient execution to post-processing of results. It offers both the possibility of setting up new simulation cases using the existing rich collection of models and of implementing new custom models. OpenLB supports MPI, OpenMP, AVX2/AVX-512 vectorization and CUDA for parallel execution on systems ranging from low-end smartphones to multi-GPU workstations and supercomputers.
Contents

1 Introduction 10
1.1 Lattice Boltzmann Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2 OpenLB Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.1 Scope and Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.2 What Makes OpenLB Special? . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.3 Which Features are Currently Implemented? . . . . . . . . . . . . . . . . 13
1.2.4 Project Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2.5 Getting Help with OpenLB . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2.6 How to Cite OpenLB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2 Core Concepts 18
2.1 Descriptor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2 (Super,Block)Lattice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.2 BlockLattice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.3 Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.4 Cell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3 Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.1 Collision Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.4 Non-local operators or Post Processors . . . . . . . . . . . . . . . . . . . . . . . . 24
2.5 Parallelization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5.1 Platform-transparency of Models . . . . . . . . . . . . . . . . . . . . . . . 26
2.5.2 Supported Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.6 Functors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.6.1 Basic Functor Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.6.2 Functor Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3 Geometry Creation and Meshing 33


3.1 Creating a Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2 Setting Material Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 Building Simulation Domains with Geometry Primitive Functors . . . . . . . . 35

3.4 Reading STL-files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.5 Excursus: Creating STL-files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4 Simulation Models 38
4.1 Non-dimensionalization and Choice of Simulation Parameters . . . . . . . . . . 38
4.2 Porous Media Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.3 Power Law Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.4 Multiphysics Couplings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.4.1 Shan–Chen Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.4.2 Implementation of pseudo-potential multi-component Fluid . . . . . . . 43
4.4.3 Free Energy Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.4.4 Coupling Between Momentum and Energy Equations . . . . . . . . . . 47
4.5 Advection–Diffusion Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.5.1 Closer Look on Advection Diffusion Boundary Conditions . . . . . . . . 51
4.5.2 Convergence Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.5.3 Creating an Application with AdvectionDiffusionDynamics . . . . . . . 54
4.5.4 Obtaining Results in Thermal Simulations . . . . . . . . . . . . . . . . . 57
4.5.5 Conduction Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.6 Particles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.6.1 Class ParticleSystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6.2 Class SuperParticleSystem . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6.3 Class ParticleManager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6.4 Resolved Lattice Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.6.5 Particle Descriptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.6.6 Particle Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.6.7 Particle Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6.8 Discrete Contact Model for Surface Resolved Particles . . . . . . . . . . 70
4.6.9 Sub-grid Legacy Framework . . . . . . . . . . . . . . . . . . . . . . . . . 72

5 Initial and Boundary Conditions 82


5.1 Define Boundary Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1.1 Wet-node Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.1.2 Link-wise Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.1.3 AdvectionDiffusionBoundary . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.1.4 Robin-type boundary condition . . . . . . . . . . . . . . . . . . . . . . . 86
5.1.5 Additional Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2 Define Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.3 Define Boundary Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

6 Input and Output 89

6.1 Output Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2 Write Simulation Data to VTK File Format . . . . . . . . . . . . . . . . . . . . . . 90
6.3 CSV Writer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.4 Write Images Instantaneously . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.4.1 GifWriter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.4.2 Heatmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.5 Gnuplot Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.5.1 Regression with Gnuplot . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.6 Console Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.7 Console Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.8 Read and Write STL Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.9 XML Parameter Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.10 Visualization with Paraview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.10.1 Clip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.10.2 Contour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.10.3 Glyph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.10.4 Stream Tracer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.10.5 Resample To Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.11 Application of Functors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.11.1 Extract Simulation Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.11.2 Define Analytic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.11.3 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.11.4 Arithmetic and Advanced Functor Usage . . . . . . . . . . . . . . . . . . 103
6.11.5 Setting Boundary Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.11.6 Flux Functor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.11.7 Discrete Flux Functor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.11.8 Atmospheric Boundary Layer Functor . . . . . . . . . . . . . . . . . . . . 109
6.11.9 Porosity & Velocity Volume Functor . . . . . . . . . . . . . . . . . . . . . 109
6.11.10 Wall Shear Stress Functor . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.11.11 Error Norm Functors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.11.12 Grid Refinement Metric Functors . . . . . . . . . . . . . . . . . . . . . . . 111

7 Flow Control and Optimization 112


7.1 Problem formulation and solution strategy . . . . . . . . . . . . . . . . . . . . . 113
7.2 Gradient-based approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.2.1 Descent Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.2.2 Step control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.2.3 Derivative Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.3.1 Optimizer Classes: Optimization Methods . . . . . . . . . . . . . . . . . 115

7.3.2 OptiCase Classes: Gradient Computation . . . . . . . . . . . . . . . . . . 116
7.3.3 Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3.4 Parameter Explanation and Reading from XML . . . . . . . . . . . . . . 117

8 Examples 123
8.1 Example Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
8.2 adsorption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.2.1 adsorption3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.2.2 microMixer3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.3 advectionDiffusionReaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.3.1 advectionDiffusionReaction2d . . . . . . . . . . . . . . . . . . . . . . . . 126
8.3.2 reactionFiniteDifferences2d . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.3.3 advectionDiffusion1d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
8.3.4 advectionDiffusion2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
8.3.5 advectionDiffusion3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.3.6 advectionDiffusionPipe3d . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.3.7 convectedPlate3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.3.8 longitudinalMixing3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.4 laminar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.4.1 bstep2d and bstep3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.4.2 cavity2d, cavity2dSolver and cavity3d . . . . . . . . . . . . . . . . . . . . 131
8.4.3 cylinder2d and cylinder3d . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.4.4 poiseuille2d and poiseuille3d . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.4.5 poiseuille2dEoc and poiseuille3dEoc . . . . . . . . . . . . . . . . . . . . 131
8.4.6 powerLaw2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
8.4.7 testFlow3dSolver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
8.5 optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
8.5.1 domainIdentification3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
8.5.2 poiseuille2dOpti . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
8.5.3 showcaseADf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.5.4 showcaseRosenbrock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.5.5 testFlowOpti3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.6 multiComponent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.6.1 airBubbleCoalescence3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.6.2 binaryShearFlow2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.6.3 contactAngle2d and contactAngle3d . . . . . . . . . . . . . . . . . . . . . 140
8.6.4 fourRollMill2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
8.6.5 microFluidics2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
8.6.6 phaseSeparation2d and phaseSeparation3d . . . . . . . . . . . . . . . . . 140
8.6.7 rayleighTaylor2d and rayleighTaylor3d . . . . . . . . . . . . . . . . . . . 140

8.6.8 waterAirflatInterface2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.6.9 youngLaplace2d and youngLaplace3d . . . . . . . . . . . . . . . . . . . 141
8.7 particles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.7.1 bifurcation3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.7.2 dkt2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
8.7.3 magneticParticles3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.7.4 settlingCube3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.8 porousMedia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.8.1 city3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.8.2 porousPoiseuille2d and porousPoiseuille3d . . . . . . . . . . . . . . . . 143
8.8.3 resolvedRock3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.9 reaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.9.1 advectionDiffusionReaction2d . . . . . . . . . . . . . . . . . . . . . . . . 145
8.10 thermal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
8.10.1 galliumMelting2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
8.10.2 porousPlate2d, porousPlate3d and porousPlate3dSolver . . . . . . . . . 147
8.10.3 rayleighBenard2d and rayleighBenard3d . . . . . . . . . . . . . . . . . . 148
8.10.4 squareCavity2d and squareCavity3d . . . . . . . . . . . . . . . . . . . . . 149
8.10.5 stefanMelting2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.11 turbulent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.11.1 aorta3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.11.2 channel3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
8.11.3 nozzle3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
8.11.4 tgv3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
8.11.5 venturi3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
8.12 freeSurface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
8.12.1 breakingDam2d and breakingDam3d . . . . . . . . . . . . . . . . . . . . 156
8.12.2 fallingDrop2d and fallingDrop3d . . . . . . . . . . . . . . . . . . . . . . . 157
8.12.3 deepFallingDrop2d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.12.4 rayleighInstability3d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

9 Building and Running 159


9.1 Install Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
9.1.1 Linux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
9.1.2 Mac . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
9.1.3 Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
9.2 Compiling OpenLB Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.2.1 Using NVIDIA GPUs in OpenLB . . . . . . . . . . . . . . . . . . . . . . . 162

10 Step by Step: Using OpenLB for Applications 169

10.1 Lesson 1: Getting Started - Sketch of Application . . . . . . . . . . . . . . . . . . 170
10.2 Lesson 2: Define and Use Boundary Conditions . . . . . . . . . . . . . . . . . . . 172
10.3 Lesson 3: UnitConverter - Lattice and Physical Units . . . . . . . . . . . . . . . . 174
10.4 Lesson 4: Extract Data From a Simulation . . . . . . . . . . . . . . . . . . . . . . 175
10.5 Lesson 5: Convergence Check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
10.6 Lesson 6: Use an External Force . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.7 Lesson 7: Understand Genericity in OpenLB . . . . . . . . . . . . . . . . . . . . 178
10.8 Lesson 8: Use Checkpointing for Long Duration Simulations . . . . . . . . . . . 180
10.9 Lesson 9: Run Your Programs on a Parallel Machine . . . . . . . . . . . . . . . . 180
10.10 Lesson 10: Work with Indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
10.11 Alternative Approach: Using a Solver Class . . . . . . . . . . . . . . . . . . . . . 183
10.11.1 Structure of an OpenLB Simulation in Solver Style . . . . . . . . . . . . . 183
10.11.2 Set up an Application in Solver Style . . . . . . . . . . . . . . . . . . . . 184
10.11.3 Parameter Explanation and Reading from XML . . . . . . . . . . . . . . 184

11 For Developers 190


11.1 Coding rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
11.2 Compile Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
11.3 GIT Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
11.4 Creating a Merge Request in GitLab . . . . . . . . . . . . . . . . . . . . . . . . . 194
11.5 Debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

References 196

A Appendix 206
A.1 Q&A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
A.2 List of Project Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
A.3 GNU Free Documentation License . . . . . . . . . . . . . . . . . . . . . . . . . . 215

1. Introduction

1.1. Lattice Boltzmann Methods


In general, the lattice Boltzmann method (LBM) can be interpreted as a bottom-up procedure to
numerically approximate a given partial differential equation (PDE). In contrast to the conven-
tional top-down discretization of the target equation (TEQ), the LBM emerges from the space
and time discretization of a discrete velocity Boltzmann-type equation. Subsequently, the so-
lution to the TEQ is recovered in a limiting process. Different PDEs can be approximated by
altering certain features of the method. Exemplary LBM building blocks which induce the recovery of the targeted transport equation are discussed in the following sections. Thorough derivations of specific LBM approaches are given, for example, in [23, 108].
The LBM algorithm is typically divided into two steps: collision and streaming. During the collision step every distribution function $f_i$ at a grid node $x \in \Omega_{\triangle x} \subseteq \mathbb{R}^d$ receives a collision term $J_i$, i.e.
$$ f_i^{*}(t, x) = f_i(t, x) + \triangle t \, J_i(t, x), \tag{1.1} $$

where $\triangle t$ is the time step size and $t \in I_{\triangle t} \subseteq \mathbb{R}_{\geq 0}$. The most prominent collision operator, introduced by Bhatnagar, Gross and Krook (BGK) [83], reads
$$ J_i(t, x) = -\frac{1}{\tau} \left[ f_i(t, x) - f_i^{\mathrm{eq}}(t, x) \right]. \tag{1.2} $$

The relaxation time τ > 0 determines the speed at which the populations approach equilibrium
and can for example be related to the viscosity in the hydrodynamic limit. The local equilibrium function $f_i^{\mathrm{eq}}$ approximates the Maxwell–Boltzmann distribution. At the streaming step all populations are shifted to the next grid point
$$ f_i(t + \triangle t,\, x + \triangle t \, c_i) = f_i^{*}(t, x), \tag{1.3} $$

where $i = 0, 1, \ldots, q-1$ enumerates the discrete velocities $c_i$ of dimension $d$ contained in the underlying set DdQq. For the purpose of illustration, the D2Q9 and the D3Q19 discrete velocity
sets are shown in Figures 1.1c and 1.1d, respectively. As a linkage from the LBM to the TEQ,
the macroscopic conservative variables are obtained by respective moment summation over the
populations and typically yield an approximation up to second order in space and time.
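For instance, in the standard hydrodynamic case the density and momentum are recovered as the zeroth and first order moments of the populations,
$$ \rho(t,x) = \sum_{i=0}^{q-1} f_i(t,x), \qquad \rho(t,x)\, u(t,x) = \sum_{i=0}^{q-1} c_i\, f_i(t,x). $$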

(a) D2Q5 (b) D3Q7 (c) D2Q9 (d) D3Q19

Figure 1.1.: Exemplary discrete velocity sets used for the descriptors implemented in OpenLB.
(a, b): DdQ(2d + 1) for d = 2, 3, respectively; (c): D2Q9; (d): D3Q19. Coloring refers
to the corresponding energy shell, where orange, cyan and green denote zeroth, first
and second order nodes, respectively. Figure reproduced from [128].

The following references are suggested for further insight into LBM.

• The book The Lattice Boltzmann Method: Principles and Practice [2017] by Krüger et al. [108]
delivers a clear and complete introduction for beginners.

• A concise introduction is given by Mohamad [117]. The book Lattice Boltzmann Method
[2011] documents for example the formal derivation of macroscopic limit equations using
Chapman–Enskog expansion.

• An LBM predecessor—the Lattice-Gas Cellular Automata (LGCA)—is extensively described in Wolf-Gladrow’s book Lattice-Gas Cellular Automata and Lattice Boltzmann Models
[2000]. Starting with the derivation of LGCA, the author derives the LBM step by step.
Furthermore, a helpful interpretation of LBM is given in the beginning of the book.

• A quick overview of LBM can be obtained from the paper Lattice Boltzmann Method for
Fluid Flows [1998] by Chen and Doolen [90].

1.2. OpenLB Project


1.2.1. Scope and Overview
OpenLB is a generic implementation of LBM written using the C++ programming language
and is used by application programmers as well as method developers. It is the first implementation of a comprehensive platform for LBM that is freely shared with the open source community. Since its conception, OpenLB has been developed as open source under the GPL2 license.
Therefore, everyone with access to the source code has the right to use, adapt and publish the
software under the same license. Since the first release in 2007 [71], the code has been con-
tinuously improved and extended across thirteen major releases. User guides and source code
documentation for developers are available on the project website. In summary, OpenLB is a technically complex and feature-rich CFD software library with an extensive set of example
cases also written in modern C++.

Figure 1.2.: Resolved simulation with 250 million grid cells of turbulent mixing and chemical
reactions done by OpenLB on 24 GPUs on the HoreKa supercomputer funded by the
Ministry of Science, Research and the Arts Baden-Württemberg and by the Federal
Ministry of Education and Research.

OpenLB is actively used in both developing novel methodologies and their applications. De-
velopers of OpenLB, especially researchers from mathematics and computer science, focus on
the methodology-oriented approach. Here a selection of recent (after 2016) publications is given:

• General LBM: [12, 17, 34, 40–42]

• HPC LBM: [25, 53, 36]

• Particle LB – Methods: [2, 22, 27, 29, 55, 31, 32, 48–50]

• Turbulence LB – Methods: [14, 15, 37, 42, 44]

• Optimal Control LB – Methods: [20, 21, 38, 51]

The application-oriented approach is pursued especially by engineering researchers. At the top level, each simulation in OpenLB currently consists of a fully parallel mesh construction (without any need for external tools), physical parametrization, assignment of dynamics encoding the cell-specific models, configuration of operators for multi-physics coupling, non-local boundary
treatment and a selection of post-processing functions for extracting the relevant flow features
for further evaluation [23]. OpenLB enables application in a broad set of areas (physical char-
acterizations) such as incompressible Newtonian and non-Newtonian fluid flows, multi–phase
and multi–component flows (cf. Figure 1.2), light and thermal radiation as well as thermal flows,
particulate flows using both Euler–Euler and Euler–Lagrange with resolved and sub-grid scale

models, turbulent flow models (LES models and wall function approaches) and porous media
models [23]. A selection of recent (after 2016) application-oriented publications is listed here:

• Particle LB – Applications: [52, 1, 3, 7–10, 16, 26, 28, 30, 33, 58, 46, 47]

• Turbulence LB – Applications: [4–6, 11, 13, 14, 35, 56, 45]

• Optimal Control LB – Application: [18, 19, 21, 39]

OpenLB is not only an open source code, but also a community that aims to share resources
for software development, produce reproducible scientific results, and provide a good start for
students in LBM. Over the years, more than 150 scientific articles have been published, over 90 with and over 60 without the OpenLB developers’ participation (as of 2022). By January 2023, more than 50 individuals from Germany, Switzerland, France, the United Kingdom, Brazil, the Czech Republic and the USA had contributed to OpenLB.

1.2.2. What Makes OpenLB Special?


OpenLB is a numerical framework for lattice Boltzmann simulations, created by researchers
with different backgrounds in computational fluid dynamics. The code can be utilized by CFD
users to implement specific flow geometries in a straightforward way, and by developers to for-
mulate new models. For this first group of users, OpenLB offers a convenient interface through
which it is possible to set up a simulation with little effort. For the second group, the structure
of the code is kept conceptually simple, implementing basic concepts of the lattice Boltzmann
theory step-by-step. Thanks to this, the code is an excellent framework for programmers to
develop pieces of reusable code that can be exchanged in the community.
One key aspect of the OpenLB code is genericity in its many facets. The core concept of
generic programming is to offer a single code that can serve many purposes. On one hand, the
code implements dynamic genericity through the use of object-oriented interfaces. One use of
this is that the behavior of lattice sites can be modified during program execution, to distinguish
for example between bulk and boundary cells, or to modify the fluid viscosity or the value of a
body force dynamically. Furthermore, C++ templates are used to achieve static genericity. As
a result, it is sufficient to write a single generic code for various 3D lattice structures, such as
D3Q15, D3Q19, and D3Q27 (for more information on lattice structures, see Section 2.1).
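As an illustrative sketch (not an excerpt from the library), a helper templated on the descriptor compiles unchanged for D2Q9, D3Q15, D3Q19 or D3Q27, because dimensions, velocities and weights are queried from the descriptor type (cf. Section 2.1):

// Hypothetical helper, generic over the lattice structure: d, q, c_i and t_i
// are taken from the DESCRIPTOR template parameter. The coefficients assume
// the usual lattice speed of sound cs^2 = 1/3.
template <typename T, typename DESCRIPTOR>
void computeSecondOrderEquilibrium(T rho, const T u[DESCRIPTOR::d],
                                   T fEq[DESCRIPTOR::q])
{
  T uSqr{};
  for (int iD=0; iD < DESCRIPTOR::d; ++iD) {
    uSqr += u[iD]*u[iD];
  }
  for (int iPop=0; iPop < DESCRIPTOR::q; ++iPop) {
    T cu{};
    for (int iD=0; iD < DESCRIPTOR::d; ++iD) {
      cu += descriptors::c<DESCRIPTOR>(iPop)[iD] * u[iD];
    }
    fEq[iPop] = rho * descriptors::t<T,DESCRIPTOR>(iPop)
              * (T{1} + T{3}*cu + T{4.5}*cu*cu - T{1.5}*uSqr);
  }
}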

1.2.3. Which Features are Currently Implemented?


An excerpt of the current features is given below. Note that an extended list is provided in [23]. For a detailed description of individual features, see the respective sections below.

Pre-processing

• Simulation domain construction with volume meshes (VTI)
• Simulation domain construction with surface meshes (STL) (Section 3.1)
• Simulation domain construction with geometry primitives (Section 3.1)
• Automatic parallel meshing (Section 3.1)
• Automatic load balancing
• Segmentation for boundary setting (Section 3.2)
• Assigning boundary and initial values (Sections 2.6, 6.11)

Post-processing

• Interfaces to parallel VTK formats (Section 6.2)
• CSV interface (Section 6.3)
• Built-in image writing (Section 6.4)
• Evaluation with Gnuplot (Section 6.5)
• Analytical functors, e.g. for error norms, integrals, ... (Sections 2.6, 6.11)

Lattice Boltzmann Models

• BGK model for fluids (Section 2.3, Ref. [90])
• Regularized model for fluids (Section 2.3, Ref. [112])
• Multiple Relaxation Times (MRT) (Section 2.3, Refs. [93, 136])
• Entropic Lattice Boltzmann (Section 2.3, Ref. [81])
• BGK with adjustable speed of sound (Section 2.3, Refs. [74, 91])
• BGK and MRT with Smagorinsky model (Section 2.3, Ref. [57])
• Porous media model (Section 2.3)
• Power law model (Section 2.3)

Multiphysics Coupling
See Section 4.4.

• Shan–Chen two-component fluid (Section 4.4, Ref. [126])
• Free energy model for multicomponent fluids (Section 4.4, Ref. [125])
• Thermal fluid with Boussinesq approximation (Section 4.4, Ref. [100])

Lattice Structures
See Section 2.1. Exemplary discrete velocity sets (lattices) available in OpenLB are D2Q5, D2Q9,
D3Q7, D3Q13, D3Q15, D3Q19, D3Q27.

Boundary Conditions for Straight Boundaries (Including Corners)
See Section 5.1.

• Regularized (local), default choice in examples
• Finite difference (FD) velocity gradients (non-local), default choice in examples
• Inamuro (local)
• Zou/He (local)
• Non-linear FD velocity gradients (non-local)

Boundary Conditions for Curved Boundaries


See Section 5.1.

• Bouzidi (non-local), Reference [85]

Input / Output
The basic mechanism behind I/O operations in OpenLB is the (de)serialization of core data
structures as well as VTK output of simulation results via the functor concept. This mechanism
is used to snapshot the state of a simulation and to produce output for post-processing with
external tools. For further details, see for example Section 6.1.
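As an illustration of both mechanisms, a minimal sketch along the lines of the example cases follows; the lattice sLattice, the unit converter and the time step iT are assumed to be set up as usual.

// Checkpointing: (de)serialize the core data structures of the lattice
sLattice.save("simulation.checkpoint");   // write a snapshot
sLattice.load("simulation.checkpoint");   // restore it, e.g. in a restarted run

// VTK output via functors for post-processing with external tools
SuperVTMwriter3D<T> vtmWriter("simulation");
SuperLatticePhysVelocity3D<T,DESCRIPTOR> velocityF(sLattice, converter);
vtmWriter.addFunctor(velocityF);
vtmWriter.write(iT);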

1.2.4. Project Participants


The OpenLB project was initiated in 2006. Between 2006 and 2008 Jonas Latt was the project
coordinator. Since 2009, Mathias J. Krause has been coordinating the project. Since 2006, more than 45 people have contributed to OpenLB. A list is provided in Section A.2 of the Appendix.

1.2.5. Getting Help with OpenLB


The following resources are available for OpenLB users:

Web site: Most recent releases of the code and documentation, including this user guide, can
be found on the website https://www.openlb.net.

Git repository: While most active development takes place in private repositories, we are
starting to open up this important aspect by providing a public repository on Gitlab
https://gitlab.com/openlb/release.

Forum: If you encounter problems or have any questions on how to use OpenLB, feel free to
use the forum on the OpenLB homepage to reach out to the OpenLB community.

Bug reports: If you found a bug in OpenLB, we encourage you to submit a report in the pub-
lic repository, in the forum or to [email protected]. Useful bug reports include the full
source code of the program in question, a description of the problem, an explanation of
the circumstances under which the problem occurred, and a short description of the hard-
ware and the compiler used. Moreover, other Makefile switches, such as the mode of parallelization found in config.mk, can provide useful information too.

Spring School: Lowering the entry hurdle and assisting people new to this field is the motivation for organizing the yearly spring school, the Lattice Boltzmann Methods with OpenLB Software Lab. The first spring school was organized in 2017 in Hammamet (Tunisia), followed by Karlsruhe in 2018, Mannheim in 2019, Berlin in 2020, Krakow (Poland) in 2022, London (UK) in 2023 and Heidelberg in 2024. All spring schools are organized as open international workshops. The intention of creating an international platform with courses for beginners in LBM, who are researchers from academia and industry, has been fulfilled with an average of 40 participants from 15 countries over the years. The event is based on the interlaced educational concept of comprehensive and methodology-oriented courses offered by the core developer team of OpenLB together with the local partner group as well as professional guest lecturers from the field of LBM. The first half of the week is dedicated to the theoretical fundamentals of LBM up to ongoing research on selected topics. This is followed by mentored training on case studies using OpenLB in the second half, where the participants gain deep insights into LBM and its various applications in different disciplines. More information about the next spring
school can be found on the OpenLB website.

Consortium: If you want detailed support or want to get deeply involved in the further development
of OpenLB, you can become a member of the consortium. If you are interested, please
send an email to [email protected]. The consortium manages the continuous
development and maintenance of OpenLB, especially (1) improvement of the general usage according to the GNU GPL2 license and regular publication, (2) maintenance of the executability on current standard HPC hardware and (3) conservation of the current status of LBM research in OpenLB.

1.2.6. How to Cite OpenLB


For references to OpenLB in general, we recommend citing the most recent overview paper [23].

@article{openlb2020,
author = "Krause, M. J. and Kummerl{\"a}nder, A. and Avis, S. J.
and Kusumaatmaja, H. and Dapelo, D. and Klemens, F.
and Gaedtke, M. and Hafen, N. and Mink, A.
and Trunk, R. and Marquardt, J. E. and Maier, M.-L.
and Haussmann, M. and Simonis, S.",
title = "OpenLB--Open source lattice Boltzmann code",
year = "2020",
journal = "Computers \& Mathematics with Applications",
doi = "10.1016/j.camwa.2020.04.033"
}

For references to a specific release of the code, each version is associated with a citeable Zenodo
publication. The latest release OpenLB 1.7 [70] is covered by the present user guide and citeable
via:

@Software{olbRelease17,
author = {Kummerl\"{a}nder, Adrian and
Bingert, Tim and
Bukreev, Fedor and
Czelusniak, Luiz Eduardo and
Dapelo, Davide and
Hafen, Nicolas and
Heinzelmann, Marc and
Ito, Shota and
Je\ss{}berger, Julius and
Kusumaatmaja, Halim and
Marquardt, Jan E. and
Rennick, Michael and
Pertzel, Tim and
Prinz, Franti\v{s}ek and
Sadric, Martin and
Schecher, Maximilian and
Simonis, Stephan and
Sitter, Pascal and
Teutscher, Dennis and
Zhong, Mingliang and
Krause, Mathias J.},
title = {{OpenLB Release 1.7: Open Source Lattice Boltzmann
Code}},
month = feb,
year = 2024,
publisher = {Zenodo},
version = {1.7.0},
doi = {10.5281/zenodo.10684609},
url = {https://doi.org/10.5281/zenodo.10684609}
}

Starting with release 1.6, metadata is also defined in the standardized Citation File Format and
included as a CITATION.cff in the release tarball.

2. Core Concepts
The basic data structure of any Lattice Boltzmann code is the eponymous Lattice or grid. For the
most widely accepted formulations, such a grid is a regular, homogeneous lattice $\Omega_h$ with equal spacing $h \in \mathbb{R}_{>0}$ in all directions. Each spatial location on this lattice can be referred to as a Cell
which is the core unit of the LB algorithm.
One of the main aspects motivating the use of LBM is its unique suitability for massively parallel processing. This parallelization aspect naturally leads to the need for further spatial decom-
position of the lattice data structure in order to be able to distribute the processing to multiple
independent processing units. In OpenLB, this spatial decomposition into so-called Blocks mo-
tivates the core naming convention of distinguishing between Super- and Block-level structures.
This approach respects the spirit of LBM well and leads to elegant and efficient implemen-
tations. For complex geometries, a multi-block approach provides another advantage: A given
domain can be represented by a certain number of regular blocks, which delivers a good com-
promise between highly efficient block-local processing and sparse memory consumption.
For most practical applications, the lattice data structure not only manages the population
values assigned to the individual cells and declared by the descriptor (cf. Section 2.1), but also
associated per-cell data such as force fields or reaction coefficients.

2.1. Descriptor
Descriptors are the structures used to define and access LB model specific information
such as the number of dimensions and discrete velocities as well as weights and declarations
of additional fields. As such, a descriptor is a central choice in any OpenLB application and is passed as a template argument throughout the code base.
1 using T = FLOATING_POINT_TYPE;
2 using DESCRIPTOR = descriptors::D2Q9<>;
3
4 // number of spatial dimensions
5 const int d = descriptors::d<DESCRIPTOR>(); // == 2
6 // number of discrete velocities
7 const int q = descriptors::q<DESCRIPTOR>(); // == 9
8
9 // second discrete velocity vector
10 const Vector<int,2> c1 = descriptors::c<DESCRIPTOR>(1); // == {-1,1}
11
12 // weight of the first discrete velocity
13 const T w = descriptors::t<T,DESCRIPTOR>(0); // == 4./9.

OpenLB provides a rich set of such descriptors, most of which are defined in src/dynamics/
latticeDescriptors.h. Despite the central role of this concept, the concrete definitions of
for example the D2Q9 descriptor, are quite compact. To illustrate this point, Listing 2.1 provides
the full definition of this descriptor including all of its data.
1 template <typename... FIELDS>
2 struct D2Q9 : public LATTICE_DESCRIPTOR<2,9,POPULATION,FIELDS...> {
3 D2Q9() = delete;
4 };
5
6 namespace data {
7
8 template <>
9 constexpr int vicinity<2,9> = 1;
10
11 template <>
12 constexpr int c<2,9>[9][2] = {
13 { 0, 0},
14 {-1, 1}, {-1, 0}, {-1,-1}, { 0,-1},
15 { 1,-1}, { 1, 0}, { 1, 1}, { 0, 1}
16 };
17
18 template <>
19 constexpr int opposite<2,9>[9] = {
20 0, 5, 6, 7, 8, 1, 2, 3, 4
21 };
22
23 template <>
24 constexpr Fraction t<2,9>[9] = {
25 {4, 9}, {1, 36}, {1, 9}, {1, 36}, {1, 9},
26 {1, 36}, {1, 9}, {1, 36}, {1, 9}
27 };
28
29 template <>
30 constexpr Fraction cs2<2,9> = {1, 3};
31
32 }

Listing 2.1: Definition of the D2Q9 descriptor

Many LBM based solutions to practical problems require the ability to store not just the popu-
lations of each cell but also additional data such as an external force (see e.g. Section 10.6). For
this reason every descriptor may explicitly declare such fields.
1 using DESCRIPTOR = descriptors::D2Q9<descriptors::FORCE>;
2
3 // Check whether DESCRIPTOR contains the field FORCE
4 DESCRIPTOR::provides<descriptors::FORCE>(); // == true
5 // Get cell-local memory location of the FORCE field
6 const int offset = DESCRIPTOR::index<descriptors::FORCE>(); // == 9
7 // Get size of the descriptor’s FORCE field
8 const int size = DESCRIPTOR::size<descriptors::FORCE>(); // == 2

A list of various predefined fields can be found in the src/dynamics/descriptorField.h header. Note that one may add an arbitrary list of such fields to a given descriptor. Moreover, it is fully supported to add fields on a per-app basis. One only needs to make sure that the
type is defined prior to any custom descriptor that depends on it, i.e. a user could write inside
of a case.
1 struct MY_CUSTOM_FIELD: public FIELD_BASE<42,0,0> { };
2 using DESCRIPTOR = D2Q9<FORCE,MY_CUSTOM_FIELD>;

This custom descriptor can then be used in the same way as any other descriptor. This might
even be preferable for very specific fields that are only used by a small number of apps or
specific user-provided features.
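The declared fields can then be read and written per cell, for example inside a custom operator. A minimal sketch, assuming the field definitions from above and using the getField/setField cell methods referenced in Section 2.5.1 (the helper itself is hypothetical):

// Hypothetical helper: read and update the fields declared above on a single cell.
template <typename CELL>
void updateMyCustomField(CELL& cell) any_platform {
  // read the FORCE vector stored alongside the populations
  const auto force = cell.template getField<descriptors::FORCE>();
  // read the custom field declared above (42 components per FIELD_BASE<42,0,0>)
  auto custom = cell.template getField<MY_CUSTOM_FIELD>();
  custom[0] += force[0];  // illustrative update rule only
  cell.template setField<MY_CUSTOM_FIELD>(custom);
}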

2.2. (Super,Block)Lattice
The SuperLattice is OpenLB’s main class for representing and maintaining a concrete lattice
with all associated data, cell local models and non-local operators. It is constructed from a
load balanced geometry, i.e. given a SuperGeometry instance which is in turn obtained from a
CuboidGeometry and its associated LoadBalancer.

2.2.1. Context
The CuboidGeometry decomposes a given indicator geometry into individual cuboids. By design,
there is a bijection between the set of cuboids and the set of block lattices resp. block geome-
tries. Given a cuboid geometry, the LoadBalancer assigns the contained cuboids to the available MPI processes. Based on this structure, each spatial cell location encompassed by the cuboid geometry is assigned a so-called material number managed by the SuperGeometry (cf. Chapter 3 on meshing). Finally, a concrete BlockLattice is instantiated
by the SuperLattice instance of the responsible LoadBalancer-assigned MPI process. This class
maintains the actual data associated with all its lattice cells. As such, each process only has
access to the data of its own blocks, extended by a small overlap for inter-block communication.
This communication (i.e. synchronization and overlap data) is performed by SuperCommunicator
instances maintained in stages of the SuperLattice.
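To make this chain of responsibilities concrete, a rough sketch following the structure of the example applications is given below; the names indicator, deltaX and noOfCuboids are placeholders, and the exact constructor signatures may differ between cases and releases.

// Decompose the domain, balance the cuboids over MPI processes and
// construct the geometry and lattice on top (sketch only).
CuboidGeometry3D<T> cuboidGeometry(indicator, deltaX, noOfCuboids); // cuboid decomposition
HeuristicLoadBalancer<T> loadBalancer(cuboidGeometry);              // assign cuboids to MPI processes
SuperGeometry<T,3> superGeometry(cuboidGeometry, loadBalancer);     // material numbers per cell location
SuperLattice<T,DESCRIPTOR> sLattice(superGeometry);                 // instantiates the block lattices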

2.2.2. BlockLattice
The BlockLattice class is the high-level interface abstracting the platform-specific implemen-
tation details of the actual core structures managed by ConcreteBlockLattice class template
specializations.
Specifically, there are specializations for each supported platform: CPU_SISD, CPU_SIMD, and
GPU_CUDA. At its core, adding support for a new platform consists of specializing all platform-
templated classes.

2.2.2.1. Core Data

The very core data structure of OpenLB is the per-field (cf. Section 2.1) column or Structure-
of-Arrays (SoA) storage FieldArrayD. This FieldArrayD maintains (Cyclic)Column instances for
each component of its FIELD template argument in a fashion specific to the concrete platform.
In other words, the same FIELD is stored differently for each block / platform, while sharing the
same interface of column-wise access. This fundamental structure allows for exchanging the
entire core data structure implementation without touching the higher level interfaces against
which for example cell specific models are implemented (cf. Section 2.3).
On the next level, all FieldArrayD instances of a block are maintained within the Data class.
This class provides an on-demand allocating getFieldArray method for access to the field array
specific to a given FIELD.
The details of rendering this most performance-critical structure efficient across all platforms
exceed the scope of this guide, but you can be assured that the developer team invests a lot of
effort in preserving and increasing OpenLB’s performance.

2.2.3. Communication
While the cuboid decomposition maintained by the CuboidGeometry is disjoint, block geometries
and lattices overlap by a configurable overlap layer. Synchronizing this overlap layer, that is
communicating the data owned by a given block to its neighbor’s overlap layer, is how any
inter-block communication is handled in OpenLB.
The most frequent type of overlap communication is the synchronization of a single layer of
populations during the PostCollide stage of the core collideAndStream iteration.
Communication of given stages can be manually triggered using the following.
1 sLattice.getCommunicator(PostCollide{}).execute();

Similarly, communication stages may be configured w.r.t. their extent and included fields.
For example, we can consider the setup of the default PostCollide stage.
1 auto& communicator = sLattice.getCommunicator(PostCollide{});
2 communicator.template requestField<descriptors::POPULATION>();
3 communicator.requestOverlap(1);
4 communicator.exchangeRequests();

2.2.4. Cell
Users familiar with previous versions of OpenLB may remember the original hierarchy of the
SuperLattice containing BlockLattice instances containing a D-dimensional array of Cell in-
stances containing a Q-dimensional array of population values.
This data-holding Cell was refactored by OpenLB 1.4 [67] to only provide a view of the
data maintained in field arrays (cf. Section 2.2.2.1). This provided an essential ingredient for the implementation of both SIMD and GPU support in OpenLB 1.5 [68] (cf. Talk [54] for a summary
of the refactoring journey).
Today, there are multiple per-platform and use case versions of the cell interface. Core struc-
tures such as Dynamics and non-local operators are templated against the concept of a cell instead
of a specific implementation thereof. This is essential for being able to instantiate the same ab-
stract model implementations on diverse hardware, e.g. both in vectorized CPU collision loops
and CUDA GPU kernel functions.

2.3. Dynamics
In the context of OpenLB, the term Dynamics refers to the implementation of cell-local LB mod-
els, i.e. equilibrium distribution, collision step and momenta computation. Each cell of a lattice
is assigned a dynamics describing its local behavior. This way, it is easy to implement inhomo-
geneous fluids which use a different type of physics from one cell to another.
The non-local streaming step between cells is modeled independent of the choice of cell-local
models and remains invariant. OpenLB’s block-local propagation is implemented using the
Periodic Shift pattern [25] on all platforms.
This concept of per-cell dynamics is tied together in the Dynamics interface. Every dynamics
implementation describes the various components required to model the local behavior of the
assigned cell locations. Specifically, this includes the set of momenta, the equilibrium distribu-
tion as well as the actual collision operator. Fittingly, most dynamics are declared as a tuple of
those three components.
1 template <typename T, typename DESCRIPTOR>
2 using TRTdynamics = dynamics::Tuple<
3 T, DESCRIPTOR,
4 momenta::BulkTuple,
5 equilibria::SecondOrder,
6 collision::TRT
7 >;

In this example declaration of the TRTdynamics, we can tell at a glance that its assigned cells will
expose bulk momenta reconstructed directly from the population values and relax towards the
second order equilibrium using the TRT (two-relaxation-time) collision operator. The dynamics
tuple concept is quite powerful, e.g. we may apply a forcing scheme by adding a single line.
1 template <typename T, typename DESCRIPTOR>
2 using ForcedTRTdynamics = dynamics::Tuple<
3 T, DESCRIPTOR,
4 momenta::BulkTuple,
5 equilibria::SecondOrder,
6 collision::TRT,
7 forcing::Guo
8 >;

This forcing::Guo combination rule is the fourth and optional component of the dynamics tuple.
Combination rules may arbitrarily manipulate the previous momenta, equilibrium and colli-
sion components. In the case of Guo forcing, this means that the momenta argument is shifted
via momenta::Forced and the TRT collision operator is wrapped to force the post collision popu-
lations according to the Guo scheme.
OpenLB offers a comprehensive library of momenta, equilibria, collision operators and com-
bination rules that can be easily combined into many different dynamics tuples. See

• src/dynamics/momenta/aliases.h

• src/dynamics/equilibrium.h

• src/dynamics/collision.h

• src/dynamics/forcing.h

for some examples.


While this framework covers most collision steps, a fallback option is provided via dynamics
::CustomCollision. This class enables the combination of momenta with arbitrary collision and
equilibrium implementations. One example for this are the CombinedRLBdynamics that are used
as the foundation for various boundary conditions.
The full definition of the interface is available in src/dynamics/interface.h. Note
that due to recent large refactoring to support execution on GPUs and SIMD CPUs, Dynamics
currently contains various legacy methods that will be deprecated in future releases. New
dynamics implementations should be formulated either as a dynamics::Tuple or dynamics::
CustomCollision template.
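Concretely, a dynamics tuple such as the ForcedTRTdynamics declared above is assigned to cells by material number and parametrized on the lattice. A minimal sketch following the example applications; superGeometry, the material number 1 and omega are assumed to be set up beforehand:

// Assign the dynamics to all cells with material number 1 and supply the
// relaxation frequency requested in its parameters list (sketch only).
sLattice.defineDynamics<ForcedTRTdynamics>(superGeometry, 1);
sLattice.setParameter<descriptors::OMEGA>(omega);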

2.3.1. Collision Operators


A partial summary of the currently supported collision operators can be found in the recent
publication on OpenLB [23]. Furthermore, five commonly used collision schemes are compared
in [15] for a typical 3D benchmark case of homogeneous isotropic turbulence. For derivations,
analysis and theoretical comparisons the interested reader is referred to [108].

2.3.1.1. Implementation in Dynamics

As was touched upon in Section 2.3, local collision operators are expressed as Dynamics in the
context of OpenLB. Specifically, the common dynamics tuple concept expresses collision oper-
ators as reusable elements alongside equilibria and momenta.
1 struct BGK {
2 using parameters = typename meta::list<descriptors::OMEGA>;
3
4 static std::string getName() {
5 return "BGK";
6 }
7
8 template <typename DESCRIPTOR, typename MOMENTA, typename EQUILIBRIUM>
9 struct type {
10 using EquilibriumF = typename EQUILIBRIUM::template type<DESCRIPTOR,MOMENTA>;
11
12 template <typename CELL, typename PARAMETERS, typename V=typename CELL::value_t>
13 CellStatistic<V> apply(CELL& cell, PARAMETERS& parameters) any_platform {
14 V fEq[DESCRIPTOR::q] { };
15 const auto statistic = EquilibriumF().compute(cell, parameters, fEq);
16 const V omega = parameters.template get<descriptors::OMEGA>();
17 for (int iPop=0; iPop < DESCRIPTOR::q; ++iPop) {
18 cell[iPop] *= V{1} - omega;
19 cell[iPop] += omega * fEq[iPop];
20 }
21 return statistic;
22 };
23 };
24 };

This is the complete listing of the well known BGK collision operator that is used by many
different dynamics. Each collision operator consists of three elements: A parameters type list
of fields that is used to parameterize the collision, a getName method that is used to generate
human readable names for dynamics tuples and a nested type template that contains the actual
apply method specific to each operator.
The nested type template is used to enable composition into dynamics tuples and will be
automatically instantiated for the required descriptor, momenta and equilibrium types. Addi-
tionally, this is used as a place for injecting partial specializations which enable usage of auto-
generated CSE-optimized kernels. CSE (common subexpression elimination) therefore allows the
transformation of the composable Dynamics-toolbox into efficient code.
As is the case for all other elements, the apply template method follows a fixed signature for
all collision operators. Each call is provided an instance of some platform-specific implementa-
tion of the cell concept alongside a parameters structure containing all requested values. Using
these two inputs the method can perform the local collision and return the computed density
and velocity magnitudes for usage in lattice statistics. This pattern is repeated at various places
of the library. Examples for other instances are equilibria and momenta elements as well as post
processors. Any implementations of this style are usable on any of OpenLB’s target platforms
(currently this means the scalar and vectorized CPU code as well as GPU support). They are
also amenable to automatic code generation.
For an introduction on how to write your own dynamics, see the FAQ in Section A.1.

2.4. Non-local operators or Post Processors


While the basic concept of dynamics assigned to cells in a block lattice is conceptually close to
the theory of LBM, it is not sufficiently general to address all possible requirements arising in complex applications. As a case in point, some boundary conditions are non-local and need
to access neighboring nodes. Their execution is taken care of by a post processing step, which
instead of traversing the entire lattice a second time, applies to selected cell locations only.
While collision steps are easily parallelized due to their local nature, this doesn’t hold once
neighborhood access is required. For this reason, two adjacent concepts are used to group post
processors. Each post processor is assigned to a stage and a priority within this stage. For ex-
ample both OuterVelocityCornerProcessor3D and OuterVelocityEdgeProcessor3D are processed
in the PostStream stage, but the former is executed after the latter to avoid access conflicts. In
turn, post processors within the same stage and priority may be executed in parallel depending
on the execution platform. Note that both the number of priorities and stages may be freely
customized – e.g. the free surface code introduces a number of additional stages to interleave
post processing and custom communications steps.
Each (non-local) operator consists of scope and priority declarations in addition to an apply
template method. For the aforementioned OuterVelocityCornerProcessor3D this is declared
as follows (see src/boundary/boundaryPostProcessors3D.hh for the full implementa-
tion).
1 template<typename T, typename DESCRIPTOR,
2 int xNormal, int yNormal, int zNormal>
3 struct OuterVelocityCornerProcessor3D {
4 static constexpr OperatorScope scope = OperatorScope::PerCell;
5
6 int getPriority() const {
7 return 1;
8 }
9
10 template <typename CELL>
11 void apply(CELL& cell) any_platform {
12 // [...] implementation
13 }
14 };

Here, the OperatorScope::PerCell declares that the apply function will be provided a cell im-
plementation with neighborhood access for each assigned location. Other scopes such as
OperatorScope::PerBlock (used for example for statistics computation) enable different access
patterns. In any case, the assigned cell locations are maintained by OpenLB’s post processor
framework. For example
1 sLattice.addPostProcessor<PostStream>(indicator, meta::id<SomePerCellPostProcessor>{});

schedules a post processor for application to all indicated cells during the PostStream stage.
For an introduction on how to write your own post processors, see the FAQ in Section A.1.
Note that this section describes the new post processor concept adopted by OpenLB 1.5.
Some existing legacy post processors still use a different paradigm. Specifically, they are derived
from PostProcessor(2,3)D and override virtual methods such as void PostProcessor(2,3)D::
process(BlockLattice<T,DESCRIPTOR>&). All remaining legacy post processors will be ported to the new approach in time. For CPU targets, legacy post processors can be used without
restrictions, but they are not supported on the GPU platform.

2.5. Parallelization
As CFD applications tend to require a large amount of computational resources, it is essential
to have the flexibility to transparently switch to a parallel platform in order to efficiently utilize
said resources. This section concentrates on parallelism on distributed memory machines using
MPI, as distributed memory is the most common model on large-scale, parallel machines. All
example cases in the OpenLB distribution can be compiled with MPI and executed in paral-
lel. Data which is spatially distributed, such as lattice fields, is handled through a data-parallel
paradigm. The data space is partitioned into smaller regions that are distributed over the nodes
of a parallel machine. In the following, these types of structures are referred to as data-parallel
structures. Other data types that require a small amount of storage space are duplicated on ev-
ery node. These are referred to as duplicated data. All native C++ data types are automatically
duplicated by virtue of the Single-Program-Multiple-Data model of MPI. These types should be
used to handle scalar values, such as the parameters of the simulation, or integral values over
the solution (e.g. the average energy).
For output on the console it is recommended to use OstreamManager since it transparently
manages output in case of parallel execution (cf. Section 6.6).
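A typical usage pattern, as found throughout the example cases:

// OstreamManager prefixes each line with a tag and avoids duplicated
// console output when the application runs on multiple MPI processes.
OstreamManager clout(std::cout, "main");
clout << "Starting simulation" << std::endl;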

2.5.1. Platform-transparency of Models


All model details such as functions to perform a certain collision step or a certain non-local
boundary condition should be implemented in a platform-transparent fashion. Such functions
commonly look similar to
1 template <typename CELL>
2 auto compute(CELL& cell) any_platform {
3 return cell.computeRho();
4 }

i.e. they accept the concept of a Cell by templatization (to be explicitly declared once OpenLB is
switched to C++20) and are declared as any_platform.
This any_platform keyword is not standard C++ but a macro that is replaced by nothing
for CPU targets and by the CUDA-specific non-standard declarators __device__ __host__ for
CUDA GPU targets. That is, any_platform is an OpenLB-specific macro to hide the platform-
specific and non-standard-compliant ways that different frameworks use to declare functions
as device-executable.
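Conceptually, the macro can be pictured as in the following simplified sketch; the actual definition in the OpenLB sources handles further platform and compiler specifics:
1 // Simplified sketch of the idea behind any_platform (not the literal OpenLB definition):
2 #ifdef __CUDACC__
3   #define any_platform __device__ __host__
4 #else
5   #define any_platform
6 #endif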
The main restriction is the __device__ declaration: unfortunately, the specifics of what is
supported in device code are not easily summarized and change between CUDA versions. An
incomplete overview of what works without issues:

• Basic C++ constructs including loops and branches

• Basic arithmetic

• olb::Vector operations

• Most mathematical olb::util::* functions (full coverage is work in progress)

• Access to cell, parameter methods such as CELL::(get,set)Field

One big restriction is that the entire C++ standard library is not guaranteed to work; parts of it
may work accidentally. Common standard library containers such as std::vector and container
algorithms (std::find and friends) do not work. In any case, dynamic memory allocation should
be avoided.
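As a hedged illustration of these restrictions, a platform-transparent helper might look like the following sketch (the helper itself is hypothetical and not part of OpenLB):
1 // Hypothetical helper illustrating the restrictions above
2 template <typename CELL>
3 auto kineticEnergy(CELL& cell) any_platform {
4   using V = typename CELL::value_t;
5   V rho{};
6   V u[CELL::descriptor_t::d] { };            // fixed-size array instead of std::vector
7   cell.computeRhoU(rho, u);                  // cell access works in device code
8   V uSqr{};
9   for (int iD=0; iD < CELL::descriptor_t::d; ++iD) {
10    uSqr += u[iD]*u[iD];                     // plain loops and arithmetic are fine
11  }
12  return V{0.5} * rho * uSqr;                // no dynamic allocation, no std containers
13 }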
A starting place for reading NVIDIA’s documentation on device code restrictions is
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#c-language-support.

2.5.2. Supported Platforms


OpenLB models heterogeneous parallelization using its predominant block decomposition ar-
chitecture. Each individual block of a super lattice is assigned one of currently three possible
target platforms: Platform::CPU_SISD, Platform::CPU_SIMD or Platform::GPU_CUDA. The availabil-
ity of each platform is controlled by adding its name to the PLATFORMS variable in the config.mk
build configuration. Note that further system-specific changes to the compiler settings are al-
most certainly required for CPU_SIMD and GPU_CUDA. See the config/ folder for some examples.
Note that CPU_SISD must always be enabled as some host-side data structures rely on this
platform. In the absence of heterogeneous load balancing, the most efficient enabled platform is
used by default for all blocks: CPU_SIMD if only the two CPU platforms are enabled, otherwise
GPU_CUDA (even if CPU_SIMD is enabled at the same time).

2.5.2.1. SIMD

Modern CPUs offer vector instructions with a width of up to 512 bits in the case of AVX-512.
This means that 8 double-precision respectively 16 single-precision scalar values can be processed
in a single instruction. In some situations this can significantly speed up the bulk collision step,
which is why OpenLB supports this option for processing its Dynamics. This option is at its most
powerful when combined with the HYBRID parallelization mode such that OpenMP is used to
further parallelize the vectorized collision on each shared memory node. However, setting up the
hybrid mode correctly on a HPC system is non-trivial, which is why we suggest sticking to the
MPI-only mode for users inexperienced in working with HPC systems.
Once enabled in the build configuration, vectorization is applied to the dominant collision of
each block lattice transparently without requiring any additional code changes.

2.5.2.2. GPU

General purpose graphics processing units (GPGPUs) are an ideal platform for many LBM-
based simulations due to their high memory bandwidth and high degree of parallelization.
OpenLB currently supports transparent usage of Nvidia GPUs via CUDA for almost all dy-
namics and a core set of boundary post processors and coupling operators. Similarly to both
other available platforms, enabling GPU support requires only adding GPU_CUDA to the PLATFORMS
variable in config.mk and some system-specific updates to the compiler settings. MPI paral-
lelization is fully supported for GPU blocks, enabling simulations on multi-GPU clusters.
Note that the resolution commonly needs to be increased significantly compared to the CPU-
accommodating default values in order to take full advantage of GPU acceleration. Especially on
desktop-grade GPUs changing the fundamental floating point type T to float is advisable. Also
see Section 9.2.1 for guidance on how to install, compile and run GPU-aware OpenLB cases.
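For instance, switching the value type is usually a one-line change in the application code (a sketch; the descriptor is only an example):
1 // Sketch: switching to single precision, e.g. for desktop-grade GPUs
2 using T = float;                          // instead of double
3 using DESCRIPTOR = descriptors::D3Q19<>;  // example descriptor
4 SuperLattice<T,DESCRIPTOR> sLattice(superGeometry);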

2.6. Functors
In general, a functor is a class that behaves like a function. Objects of a functor class perform
computations by overloading the operator() in a standardized fashion, allowing for arithmetic
combination. One big advantage of functors over functions is that they allow the creation of a
hierarchy and bundle "classes of functions".

2.6.1. Basic Functor Types


The functor concept is a user-friendly and efficient technique to process lattice data and extract
relevant data for post-processing. OpenLB meanwhile also employs functors for the geometry,
which is a very intuitive and powerful choice.
Basically, functors are mappings that operate either on the lattice N3 or, more generally, on
R3 . The values of such a functor may be three-dimensional, as for example for the velocity,
where the mapping is defined as

Functor : Ω → Rd , for d ∈ N . (2.1)

The nomenclature is based on the dimension of the domain. If, say, the functor acts on a 3D
(super) lattice, then the functor is called SuperLatticeF3D. If the functor value is the density, this
functor is called SuperLatticeDensity3D.

2.6.1.1. GenericF

The GenericF functor stands at the top of the hierarchy and is a virtual base class that provides
the interfaces. Template parameter S defines the input data type and template parameter T the
output data type. The essential interface is the pure virtual operator(). Commonly, this
()-operator is used to evaluate a certain functor, e.g. the pressure at a position x.

2.6.1.2. AnalyticalF

This is a subclass of GenericF for functions that live in SI units, e.g. for setting velocities in m/s.
Part of this class family are, for example, constant, linear, interpolation and random functors,
which can be evaluated by the ()-operator. There is an AnalyticalCalc class, which inherits from
AnalyticalF and establishes arithmetic operations (+, −, ·, /) between every type of

AnalyticalF3D : R3 → Rd , d∈N. (2.2)
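For illustration, a constant analytical functor can be defined and evaluated at a physical position as in the following sketch (the values are arbitrary placeholders):
1 AnalyticalConst3D<T,T> uF(0.01, 0.0, 0.0);   // constant velocity of 0.01 m/s in x-direction
2 T position[3] = {0.1, 0.2, 0.3};             // physical coordinates in m
3 T u[3];
4 uF(u, position);                             // u == {0.01, 0.0, 0.0}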

2.6.1.3. IndicatorF

This is another subclass of GenericF that returns a vector with elements 0 or 1, i.e.

IndicatorF3D : R3 → {0, 1} . (2.3)

These are used to construct geometries, e.g. IndicatorSphere3D creates a sphere using an origin
and radius. Evaluation returns 1 if the vector is inside the sphere and 0 otherwise. In analogy to
the AnalyticalF, there are arithmetic operations as well, but with a slightly different definition.
The returned object of an addition is the union, multiplication returns the intersection, and
subtraction represents the relative complement.

2.6.1.4. SmoothIndicatorF

Smooth indicators define a small epsilon region around the object such that it has a smooth
transition from 0 to 1. In general, the mapping is defined as

SmoothIndicatorF3D : R3 → [0, 1] . (2.4)

2.6.1.5. SuperLatticeF

These functors are defined on the lattice via

SuperLatticeF3D : N3 → Rd , d∈N, (2.5)

and commonly represent the raw simulation data, e.g. macroscopic moments such as pressure
and velocity. SuperLattice functors are part of the parallelism layer and they delegate the cal-
culations to the corresponding BlockLattice functors.
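As a sketch, a density functor can be created and evaluated at a single lattice node as follows (assuming an existing SuperLattice named sLattice; the input format {global cuboid number, iX, iY, iZ} is the usual convention for 3D lattice functors):
1 SuperLatticeDensity3D<T,DESCRIPTOR> densityF(sLattice);
2 int latticeR[4] = {0, 10, 10, 10};   // {global cuboid number, iX, iY, iZ}
3 T rho[1];
4 densityF(rho, latticeR);             // rho[0] holds the local density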

2.6.1.6. InterpolationF

Interpolation functors establish the conversion between analytical and lattice functors. They
are very important for setting analytical boundary conditions, by evaluating the given analytical
function on the lattice points. The reverse direction, from lattice to analytical functors, is where
this functor receives its name, as the conversion is achieved by interpolation between the lattice
points.
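A hedged sketch of the lattice-to-analytical direction (the exact constructor signature of the interpolation functor may differ between releases):
1 // Sketch: interpolating simulation data at an arbitrary physical position
2 SuperLatticeVelocity3D<T,DESCRIPTOR> velocityF(sLattice);
3 AnalyticalFfromSuperF3D<T> interpolatedVelocityF(velocityF, true);
4 T physPosition[3] = {0.05, 0.02, 0.01};   // position in m
5 T u[3];
6 interpolatedVelocityF(u, physPosition);   // velocity (in lattice units) interpolated at that point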

2.6.2. Functor Arithmetic


Functor arithmetic expressions are based on functor instances wrapped in std::shared_ptr<F>
smart pointers. This layer is necessary to enable automatic memory management for trees of
interdependent functors.
1 std::shared_ptr<SuperF3D<T>> aF(
2 new SuperConst3D<T>(superStructure, {1.0, 2.0}));
3 std::shared_ptr<SuperF3D<T>> bF(
4 new SuperConst3D<T>(superStructure, {2.0, 1.0}));
5 std::shared_ptr<SuperF3D<T>> cF = aF + bF;
6 // cF->operator() returns {3.0, 3.0}

Listing 2.2: Basic showcase for std::shared_ptr based functor arithmetic

Note that cF can be passed out of scope without any regard for aF and bF, as managed pointers
are stored internally. At first glance this new functor arithmetic may seem unnecessarily verbose
for basic usage such as simply adding two functors and directly using the result. For this reason,
the legacy functor arithmetic is still available for basic use cases. Usage of std::shared_ptr functor
arithmetic is supported by both FunctorPtr and a DSL to ease development of more complex
functor compositions.
The FunctorPtr helper template is used throughout the functor codebase to transparently ac-
cept functors independently of how their memory is managed. This means that functors man-
aged by std::shared_ptr are accepted as arguments in any place where raw functor references
were used previously. As a nice benefit FunctorPtr transparently forwards any calls of its own
operator function to the operator of the underlying functor.
1 T error(FunctorPtr<SuperF3D<T>>&& f, T reference) {
2 T output[1] = { };
3 const int origin[4] = {0,0,0,0};
4 f(output, origin);
5 return fabs(output[0] - reference);
6 }
7
8 std::shared_ptr<SuperF3D<T>> managedF(
9 new SuperConst3D<T>(superStructure, 1.0));
10 SuperConst3D<T> rawF(superStructure, 1.0);
11
12 // error(managedF, 1.1) == error(rawF, 1.1) == 0.1

Listing 2.3: FunctorPtr and std::shared_ptr based functor arithmetic

Functor arithmetic expressions may also contain constants in addition to functors. Any
scalar constant used in the context of managed functor arithmetic is implicitly wrapped into
a SuperConst(2,3)D instance.
1 std::shared_ptr<SuperF3D<T>> aF(/*...*/);
2 auto bF = 0.5 * aF + 2.0; // scalar multiplication and addition

Listing 2.4: Constant scalars in managed functor arithmetic

Constant vectors are also supported if they are explicitly passed to the SuperConst(2,3)D con-
structor. Note that arithmetic operations of equidimensional functors are performed compo-
nentwise (i.e. aF * aF is not the scalar product).
1 std::shared_ptr<SuperF3D<T>> vectorF(
2 new SuperConst3D<T>(superStructure, {1.0, 2.0, 3.0}));
3 auto cF = aF / vectorF; // componentwise division

Listing 2.5: Constant vectors in managed functor arithmetic

2.6.2.1. Functor Composition

Composing multiple managed functors that in turn require several arguments themselves, e.g.
when calculating error norms in a reusable fashion, can quickly lead to expressions that are
fairly hard to read. For this reason, the functor_dsl namespace offers a set of conveniently
named helper functions to deobfuscate such functor compositions.
Consider for example the following snippet which constructs and evaluates a relative error
functor based on the L2 norm.
1 using namespace functor_dsl;
2 // decltype(wantedF) == std::shared_ptr<AnalyticalF3D<double,double>>
3 // decltype(f) == std::shared_ptr<SuperF3D<double>>
4 // decltype(indicatorF) == std::shared_ptr<SuperIndicatorF3D<double>>
5 auto wantedLatticeF = restrictF(wantedF, sLattice);
6 auto relErrorNormF = norm<2>(wantedLatticeF - f, indicatorF)
7 / norm<2>(wantedLatticeF, indicatorF);
8 const int input[4] { };
9 double result[1];
10 relErrorNormF->operator()(result, input);
11 std::cout << "Relative error: " << result[0] << std::endl;

Listing 2.6: functor_dsl supported functor composition

Note that lines 5 to 7 contain the full implementation of the expression

∥wantedF − f∥2 / ∥wantedF∥2 ,

i.e. the L2-normed relative error of an arbitrary functor f compared to the analytical solution
wantedF. This allows moving even one-off functor compositions into reusable and easily
verifiable functors whose implementation is as close to the actual mathematical definition as
reasonably possible. Correspondingly, a more developed version of Listing 2.6 can be found in
SuperRelativeErrorLpNorm3D, which is used extensively by the poiseuille3d example to compare
simulated and analytical solutions.
1 template <typename T, typename W, int P>
2 template <template <typename U> class DESCRIPTOR>
3 SuperRelativeErrorLpNorm3D<T,W,P>::SuperRelativeErrorLpNorm3D(
4 SuperLattice<T,DESCRIPTOR>& sLattice,
5 FunctorPtr<SuperF3D<T,W>>&& f,
6 FunctorPtr<AnalyticalF3D<T,W>>&& wantedF,
7 FunctorPtr<SuperIndicatorF3D<T>>&& indicatorF)
8 : SuperIdentity3D<T,W>([&]()
9 {
10 using namespace functor_dsl;
11
12 auto wantedLatticeF = restrictF(wantedF.toShared(), sLattice);
13
14 return norm<P>(wantedLatticeF-f.toShared(), indicatorF.toShared())
15 / norm<P>(wantedLatticeF, indicatorF.toShared());
16 }())
17 {
18 this->getName() = "relErrorNormL" + std::to_string(P);
19 }

Listing 2.7: Functor composition in SuperRelativeErrorLpNorm3D’s constructor

Disregarding the addition of FunctorPtr as well as further templatization, lines 12 to 15 are


equivalent to the ad-hoc error norm in Listing 2.6. Also note how the actual composition hap-
pens inside of a lambda expression and is then returned to be stored by SuperIdentity3D. This
allows for assigning composed functors their own name and renders them indistinguishable
from primitive functors.

3. Geometry Creation and Meshing
This chapter presents how geometry data can be loaded in or created by OpenLB. Furthermore,
the concept of material numbers is shown.

3.1. Creating a Geometry


OpenLB provides an interface for STL-based geometry data and generates a structured regular
mesh fully automatically. Alternatively, geometries can be built out of primitive shapes such as
cuboids, spheres and cylinders. Using the implemented arithmetic, which includes intersection,
union and complement, these primitives can be assembled very generally. A computational
domain such as the SuperLattice is created in six simple steps (see also Figure 3.1):

Step 1: Create an Indicator3D instance by


1. Reading an STL file, see example aorta3d 8.11.1 and section 3.4.
2. Pre-defined primitive shapes (cuboid, circle, cylinder, sphere) and their combinations
(+, −, ·, /), as described in the example venturi3d, see Section 8.11.5.

Step 2: Construct CuboidGeometry3D. During construction, the geometry from step 1 is divided
into the predefined number of cuboids, which are thereafter automatically removed (if they
contain no relevant cells), shrunk and weighted for a good load balance. For the weighting,
the user can choose between weight- and volume-based strategies; for complex shapes the
latter is preferable. A larger number of cuboids removes more unnecessary nodes, but
implies higher communication costs.

Step 3: Construct a LoadBalancer that assigns the cuboids to threads. The standard option is the
HeuristicLoadBalancer, but other variants are available as well.

Step 4: Construct SuperGeometry3D that links material numbers to voxels (see Section 3.2).

Step 5: Set material numbers to different simulation space regions, where afterwards dynamics
and boundaries are defined.

Step 6: Construct SuperLattice to perform the stream and collide algorithm.

1 // Step 1: Create Indicator
2 STLreader<T> stlReader( "filename.stl", voxelSize, stlSize, method );
3 IndicatorCuboid3D<T> indicator( extend, origin );
4 // Step 2: Construct cuboidGeometry.
5 CuboidGeometry3D<T> cuboidGeometry( stlReader /* or indicator */, voxelSize, noOfCuboids, weightingStrategy );
6 // Step 3: Construct LoadBalancer.
7 HeuristicLoadBalancer<T> loadBalancer( cuboidGeometry );
8 // Step 4: Construct SuperGeometry.
9 SuperGeometry<T,3> superGeometry( cuboidGeometry, loadBalancer );
10 // Step 5: Set material numbers.
11 // set material number 2 for the whole geometry
12 superGeometry.rename( 0, 2 );
13 // change material number from 2 to 1 for inner (fluid) cells, so that only boundary cells keep material number 2
14 superGeometry.rename( 2, 1, {1,1,1} );
15 // or simply use an indicator that renames the covered cells to 1
16 superGeometry.rename( 2, 1, fluidIndicator );
17 // additional material numbers for other boundary conditions; the 3rd argument is the material number which the boundary cells should face
18 superGeometry.rename( 2, 3, 1, cylinderInflow );
19 superGeometry.rename( 2, 4, 1, outflowIndicator0 );
20 superGeometry.rename( 2, 5, 1, outflowIndicator1 );
21 // Step 6: Construct SuperLattice.
22 SuperLattice<T,DESCRIPTOR> sLattice( superGeometry );

Listing 3.1: Create geometry based on STL or primitive shapes. All six steps are presented
briefly as source code.

The power of OpenLB's geometry generation can be demonstrated with the example aorta3d.
This example is based on a very complex geometry and illustrates the highly user-friendly and
automated process from the STL file to the computational grid, the SuperLattice; see Figure 3.1.

3.2. Setting Material Numbers


OpenLB has a general concept for representation of a geometry. A specific number called the
material number is assigned to each cell, defining whether that cell belongs to the boundary or
to the fluid domain or whether it is superfluous in the computations. Figure 3.2 illustrates
this using the example of a channel flow with an obstacle. The different collision and streaming
steps on the boundary and the fluid are defined with respect to the material number. The benefit
of using material numbers in CFD simulations is the automatic determination of streaming
directions on boundary nodes, as this is not always practical to do by hand, e.g. if the material
numbers of a complex geometry are obtained from an STL file.
Besides creating the domain, IndicatorFXD functions can be used to set material numbers with
the help of one of the rename functions in SuperGeometryXD.
1 /// replace one material with another
2 void rename(int fromM, int toM);
3 /// replace one material that fulfills an indicator functor condition with another
4 void rename(int fromM, int toM, IndicatorF3D<bool,T>& condition);
5 /// replace one material with another respecting an offset (overlap)
6 void rename(int fromM, int toM, { unsigned offsetX, unsigned offsetY, unsigned offsetZ });
7 /// renames all voxels of material fromM to toM if the number of voxels given by testDirection is of material testM
8 void rename(int fromM, int toM, int testM, std::vector<int> testDirection);
9 /// renames all voxels of material fromM to toM if two neighbour voxels in the direction of the discrete normal are voxels with material testM in the region where the indicator function is fulfilled
10 void rename(int fromM, int toM, int testM, IndicatorF3D<bool,T>& condition);

Listing 3.2: Different rename functions to set material numbers.

Figure 3.1.: Six steps to create a Geometry. It starts by reading an STL file with the help of an
STLreader and ends with the creation of a SuperLattice.

3.3. Building Simulation Domains with Geometry Primitive


Functors
For the purpose of setting up simulation domains (here called geometries), OpenLB provides
several functors (see Section 2.6) for the creation of basic geometric primitives such as cuboids,
circles, spheres, cones etc. These can be combined using the mathematical operators (+ union, −
set difference, · intersection) to create more complex domains. It can be done in the application
setup (see below) or in the XML interface (see Section 8.11.5).
1 Vector<T,3> C0(0,50,50);
2 Vector<T,3> C1(5,50,50);
3 Vector<T,3> C2(40,50,50);
4 ///...
5 Vector<T,3> C13(115,7,50);
6
7 T radius1 = 10 ;
8 T radius2 = 20 ;
9 T radius3 = 4 ;
10
11 IndicatorCylinder3D<T> inflow(C0, C1, radius2);
12 std::shared_ptr<IndicatorCylinder3D<T>> cyl1 (
13 new IndicatorCylinder3D<T>(C1, C2, radius2));
14 std::shared_ptr<IndicatorCone3D<T>> co1 (
15 new IndicatorCone3D<T>(C2, C3, radius2, radius1));
16 ///...
17 IndicatorCylinder3D<T> outflow1(C12, C13, radius1);
18 /// IndicatorIdentity3D collects indicator functors in one object
19 IndicatorIdentity3D<T> venturi(cyl1 + co1 + others );

Listing 3.3: Geometry operations.

Figure 3.2.: Lattice nodes of the geometry are associated with material numbers. In the depicted
channel flow with an obstacle, material numbers 0 to 5 correspond to the external (do-nothing)
region, fluid, bounce-back (no-slip) boundary, velocity inflow, constant-pressure outflow and
curved obstacle boundary cells, respectively.

3.4. Reading STL-files


For a correct STL representation, the STL reader should be properly set up.
1 STLreader<T> stlreader( "filename.stl", voxelSize, stlSize, method, verbose );

Listing 3.4: STL reader

The STL file does not have to be stored in the current application folder; a path to it can be given
in the first argument of the reader. The scaling factor stlSize should be set according to the units
of the STL part: if it is exported in meters, the scaling factor is 1; if in millimeters, it is 0.001. The
reading method can be chosen depending on the geometry complexity. For simple geometries
option 0 can be chosen; for complex and possibly non-watertight shapes option 1 should be used,
although it is slower. The verbose argument can be true or false; if true, information about the
STL file is printed to the terminal.
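Putting these arguments together, a call could look like the following sketch (the file name and values are placeholders):
1 // Hedged example: STL part exported in millimetres, robust (slower) reading
2 // method and verbose terminal output
3 STLreader<T> stlReader( "../geometry/part.stl", voxelSize, 0.001, 1, true );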

3.5. Excursus: Creating STL-files


The general process chain assumes that the geometry is already given in the form of an STL file,
if it is not created by the IndicatorFXD functions. More complex geometries can be created using
a full CAD tool like FreeCAD (www.freecad.org). An introduction to modeling with FreeCAD
can be found, for example, at http://www.youtube.com/watch?v=geIrH1cOCzc. The general
procedure is mostly similar to the following description.
Firstly, a 2D sketch is created on a selected plane (e.g. the xy-plane) using different two-
dimensional shapes. In the next step the sketch is extruded in the third dimension. Several
such 3D objects can be combined using operations like union, cut, intersection, rotation, trace,
etc. to obtain the target geometry. Creating a square and a circle for the example cylinder3d
in Figure 3.3 is an easy task. Also, complex geometries such as that of a filter or a porous medium
can be set up easily with OpenLB's indicator approach.

Figure 3.3.: The geometry of the example cylinder3d from Section 8.4.3 opened in FreeCAD.

4. Simulation Models

4.1. Non-dimensionalization and Choice of Simulation Parameters
Basically, to describe a physical quantity you need a reference scale. By dividing a quantity by
a reference value of the same dimension, a dimensionless quantity is derived. In the context of
the lattice Boltzmann method, the result is a number which is called the lattice value of the
quantity or the value of the quantity in lattice units (denoted with ·LB ). The reference scale is
called the conversion factor between the two reference systems. The conversion factor for a
quantity ϕ is denoted Cϕ and the non-dimensional quantity gets the symbol ϕ∗ .
To get into this topic, the book The Lattice Boltzmann Method [108], especially Chapters 7.1
and 7.2 therein, and https://en.wikipedia.org/wiki/Dimensional_analysis are
strongly recommended.
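In OpenLB, these conversion factors are collected in a UnitConverter object. The following is a hedged sketch of a typical instantiation as used in many of the shipped examples; the numerical values are placeholders:
1 UnitConverterFromResolutionAndRelaxationTime<T,DESCRIPTOR> const converter(
2   int {50},      // resolution: number of lattice cells per characteristic length
3   (T) 0.53,      // lattice relaxation time tau (has to be larger than 0.5)
4   (T) 0.1,       // characteristic physical length in m
5   (T) 0.2,       // characteristic physical velocity in m/s
6   (T) 1.0e-5,    // physical kinematic viscosity in m^2/s
7   (T) 1.0        // physical density in kg/m^3
8 );
9 converter.print();   // report conversion factors and derived lattice quantities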

4.2. Porous Media Model


The permeability parameter K is a physical parameter that describes the macroscopic drag in a
porous media model. For laminar flows it is defined by Darcy’s law

K = − Q µ L / △P ,    (4.1)

where Q = U A is the flow rate, U is a characteristic velocity, A is a cross-sectional area, µ denotes


the dynamic viscosity, L is a characteristic length, and △P specifies the pressure difference in
between starting point and endpoint of the volume.
The permeability can also be derived from the porosity ϵ = Vfl /Vt , which is the ratio of the
volume occupied by the fluid, Vfl , to the total volume Vt . For that, different equations exist
depending on the application case. For flow through a packed bed, for example, the Kozeny–
Carman equation can be applied,

K = Φs² ϵ³ dp² / (180 (1 − ϵ)²) ,    (4.2)
where Φs is the particle sphericity and dp is their relevant diameter.
The lattice porosity-value d ∈ [0, 1] is a lattice-dependent value, where d = 0 means the

medium is solid and d = 1 denotes a liquid. According to Brinkman [87, 88], Borrvall and
Petersson [84] and Pingen et al. [123], the Navier-Stokes equation is transformed (see Dornieden
[94] and Stasius [131]). The discrete formulation of d describes a flow region by its permeability

d = 1 − h^(dim−1) νLB τLB / K ,    (4.3)

where τLB is the relaxation time, νLB is the discrete kinematic viscosity and h is the cell length.
Therefore K ∈ [νLB τLB hdim−1 , ∞]. To describe the porosity or permeability of a medium, a
descriptor for porosity must be used, such as the one below.
1 using DESCRIPTOR = descriptors::PorousD3Q19Descriptor;

Be aware that the porous media model only works in the generic compilation mode. In the
function prepareLattice, the dynamics for the material number of the porous region are defined,
for example, as follows.
1 void prepareLattice(..., Dynamics<T, DESCRIPTOR>& porousDynamics, ...){
2 /// Material=3 --> porous material
3 sLattice.defineDynamics(superGeometry, 3, &porousDynamics);
4 ...
5 }

In the function setBoundaryValues, the initial porosity value and external field are defined.
1 void setBoundaryValues(..., T physPermeability, int dim, ...){
2 // d in [0,1] is a lattice-dependent porosity-value
3 // depending on physical permeability K = physPermeability
4 T d = converter->latticePorosity(physPermeability);
5 AnalyticalConst3D<T,T> porosity(d);
6 sLattice.defineField<descriptors::POROSITY>(superGeometry, 3, porosity);
7 ...
8 }

In the main function, the required parameters as well as the porous media dynamics are defined.
1 int main(int argc, char* argv[]) {
2 ...
3 T physPermeability = 0.0003;
4 ...
5 PorousBGKdynamics<T, DESCRIPTOR> porousDynamics(converter->getOmega(),
6 instances::getBulkMomenta<T, DESCRIPTOR>());
7 ...
8 }

Additionally, the class UnitConverter in src/core/units.h provides useful functions for


conversion between physical and lattice values:
1 /// converts a physical permeability K to a lattice-dependent porosity d
2 /// (a velocity scaling factor depending on Maxwellian distribution
3 /// function), needs PorousBGKdynamics
4 T latticePorosity(T K) const
5 { return 1 - pow(physLength(),getDim()-1)*getLatticeNu()*getTau()/K; }
6

7 /// converts a lattice-dependent porosity d (a velocity scaling factor
8 /// depending on Maxwellian distribution function) to a physical
9 /// permeability K, needs PorousBGKdynamics
10 T physPermeability(T d) const
11 { return pow(physLength(),getDim()-1)*getLatticeNu()*getTau()/(1-d); }

4.3. Power Law Model


The two most common deviations from Newton’s Law observed in real systems are pseudo-
plastic fluids and dilatant fluids. For pseudo-plastic fluids the viscosity of the system decreases
as the shear rate is increased. On the other hand, as the shear rate by dilatant fluids is increased,
the viscosity of the system also increases. The simplest model, that describes these two types of
deviations, is called the (Ostwald–de Waele) Power Law model and defines the viscosity as

µ = mγ̇ n−1 , (4.4)

where m is the flow consistency index, γ̇ is the shear rate, and n denotes the flow behavior
index. Then

• n < 1: pseudoplastic fluids,

• n = 1: Newtonian fluids,

• n > 1: dilatant fluids.

To simulate a power law fluid, a descriptor for dynamic omega must be used, such as:
1 using DESCRIPTOR = descriptors::DynOmegaD2Q9Descriptor;

In the function setBoundaryValues, the initial value of the omega field is defined.


1 AnalyticalConst2D<T,T> omega0(converter.getOmega());
2 sLattice.defineField<descriptors::OMEGA>(
3 superGeometry.getMaterialIndicator({1,2,3,4}), omega0);

In the main function, the power law dynamics are defined.


1 int main(int argc, char* argv[]) {
2 ...
3 PowerLawBGKdynamics<T, DESCRIPTOR> bulkDynamics(converter.getOmega(), instances::
getBulkMomenta<T, DESCRIPTOR>(), m, n, converter.physTime());
4 }

In (1.1) the kinematic viscosity is no longer constant, and hence the omega argument is no longer
constant either. Using the power law model (4.4), the kinematic viscosity is computed in each
step as

ν = (1/ρ) m γ̇^(n−1) .    (4.5)

The shear rate γ̇ can be computed using the second invariant of the strain rate tensor DII , i.e.

γ̇ = √(2 DII) ,    (4.6)

where

DII = Σ_{α,β=1}^{d} Eαβ Eαβ ,    (4.7)

and

Eαβ = − (1 − 1/(2τ)) (1/(2ϱν)) Σ_{i=0}^{q−1} fi^h ξiα ξiβ .    (4.8)

This concept is very significant, since fi^h ξiα ξiβ is usually computed during the collision pro-
cess and therefore, in comparison to other CFD methods, adds almost no additional com-
putational cost. The computation of a new omega argument is done in src/dynamics/
powerLawBGKdynamics.h via:
1 T computeOmega(T omega0_, T preFactor_, T rho_, T pi_[util::TensorVal<DESCRIPTOR>::n]);
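The updated viscosity is then translated into a local relaxation frequency via the usual lattice relation τ = 3 νLB + 1/2 and ω = 1/τ. A generic sketch (not the OpenLB implementation):
1 // Generic sketch (not the OpenLB implementation): relaxation frequency from a
2 // local lattice viscosity nuLB, using tau = 3*nuLB + 0.5 and omega = 1/tau
3 template <typename T>
4 T omegaFromLatticeViscosity(T nuLB) {
5   return T{1} / (T{3}*nuLB + T{0.5});
6 }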

4.4. Multiphysics Couplings


4.4.1. Shan–Chen Model
For the simulation of both multiphase and multicomponent flow the Shan–Chen model is im-
plemented in OpenLB. Since its first introduction [126], many variants of the model have been
developed. In this implementation, there are several forcing schemes [99, 127] and interaction
potentials to choose from.

4.4.1.1. Implementation of Shan–Chen Two-phase Fluid

The two phases can be simulated on the same lattice instance.


1 SuperLattice<T, DESCRIPTOR> sLattice(superGeometry);

Then the dynamics are chosen, which have to support external forces.
1 ForcedShanChenBGKdynamics<T, DESCRIPTOR> bulkDynamics1 (
2 omega1, instances::getExternalVelocityMomenta<T,DESCRIPTOR>() );

Possible choices for the dynamics are ForcedBGKdynamics and ForcedShanChenBGKdynamics.


Then the interaction potential is chosen.
1 ShanChen93<T,T> interactionPotential;

Viable interaction potentials for one component multiphase flow are ShanChen93, ShanChen94,
CarnahanStarling and PengRobinson. In this model PsiEqualsRho should not be used, because
this would make all the mass gather in the same place.
To enable interaction between the fluid phases, they have to be coupled, so the kind of coupling
has to be chosen (here: ShanChenForcedSingleComponentGenerator3D) and the material numbers
to which it applies. Since in the case of single component flow there is only one lattice, it is
coupled with itself.
1 const T G = -120.;
2 ShanChenForcedSingleComponentGenerator3D<T,DESCRIPTOR> coupling(
3 G,rho0,interactionPotential);
4 sLattice.addLatticeCoupling(superGeometry, 1, coupling, sLattice);

The interaction strength G has to be negative and the correct choice depends on the chosen
interaction potential. When using PengRobinson or CarnahanStarling interaction potential, G is
canceled out during computation, so the result is not affected by it (though it still has to be
negative).
Finally, during the main loop the lattices have to interact with each other (or in the case of
only one fluid component the lattice with itself).
1 sLattice.communicate();
2 sLattice.executeCoupling();

These steps are placed immediately after the collideAndStream command.


Examples for the implementation of a LB simulation using the Shan–Chen model for two-
phase flow are examples/phaseSeparation2d and examples/phaseSeparation3d.

4.4.1.2. Implementation of Shan–Chen Two-component Fluid

Two lattice instances are needed, one for each component (though operating on one geometry).
1 SuperLattice<T, DESCRIPTOR> sLatticeOne(superGeometry);
2 SuperLattice<T, DESCRIPTOR> sLatticeTwo(superGeometry);

Then the dynamics are chosen, which have to support external forces.
1 ForcedShanChenBGKdynamics<T, DESCRIPTOR> bulkDynamics1 (
2 omega1, instances::getExternalVelocityMomenta<T,DESCRIPTOR>() );
3 ForcedShanChenBGKdynamics<T, DESCRIPTOR> bulkDynamics2 (
4 omega2, instances::getExternalVelocityMomenta<T,DESCRIPTOR>() );

Possible choices for the dynamics are ForcedBGKdynamics and ForcedShanChenBGKdynamics. One
should keep in mind that tasks like definition of dynamics, external fields and initial values and
the collide and stream execution have to be carried out for each lattice instance separately. The
same is true for data output. Then the interaction potential is chosen.
1 PsiEqualsRho<T,T> interactionPotential;

In the multicomponent case, the most frequently used interaction potential is PsiEqualsRho, but
ShanChen93 for example, would also be a viable choice.
To enable interaction between the fluids, they have to be coupled, so the kind of coupling has
to be chosen (here: ShanChenForcedGenerator3D) and the material numbers to which it applies.
1 const T G = 3.;
2 ShanChenForcedGenerator3D<T,DESCRIPTOR> coupling(
3 G,rho0,interactionPotential);
4 sLatticeOne.addLatticeCoupling(superGeometry, 1, coupling, sLatticeTwo);
5 sLatticeOne.addLatticeCoupling(superGeometry, 2, coupling, sLatticeTwo);

The interaction strength G has to be positive. If the chosen interaction potential is PsiEqualsRho,
G > 1 is needed for separation of the fluids, but it should not be much higher than 3 for stability
reasons.
Finally, during the main loop the lattices have to interact with each other.
1 sLatticeOne.communicate();
2 sLatticeTwo.communicate();
3 sLatticeOne.executeCoupling();

These steps are placed immediately after the collideAndStream command.


Examples for the implementation of a LB simulation using the Shan–Chen model for two-
component flow are examples/multiComponent2d and examples/multiComponent3d.

4.4.2. Implementation of pseudo-potential multi-component Fluid


With n components, n lattice instances are needed – one for each component (though there is
still only one geometry):
1 SuperLattice<T, DESCRIPTOR> sLattice1( superGeometry );
2 SuperLattice<T, DESCRIPTOR> sLattice2( superGeometry );
3 SuperLattice<T, DESCRIPTOR> sLattice3( superGeometry );

Then the dynamics are chosen, which have to support external forces and shared velocity:
1 using BulkDynamics = MultiComponentForcedBGKdynamics<T, DESCRIPTOR>;
2 ...
3 sLattice1.defineDynamics<BulkDynamics>(superGeometry, 1);
4 sLattice2.defineDynamics<BulkDynamics>(superGeometry, 1);
5 sLattice3.defineDynamics<BulkDynamics>(superGeometry, 1);

One should keep in mind that tasks like definition of dynamics, external fields, initial values,
the collide and stream and coupling execution have to be carried out for each lattice instance
separately. The same is true for data output.
The interaction potential is chosen when initialized with the SuperLatticeCoupling:
1 SuperLatticeCoupling coupling(
2 MCMPForcedPostProcessor<interaction::MCPRpseudoPotential>{},
3 names::Component1{}, sLattice1,
4 names::Component2{}, sLattice2,
5 names::Component3{}, sLattice3);

In the lattice preparation, the binary interaction parameters of the mixture components as well
as general equation of state parameters of all components are initialized:
1 ...
2 coupling.template setParameter<COUPLING::GI>(gI_L);
3 coupling.template setParameter<COUPLING::GII>(gII_L);

In the particular pseudo-potential model of MCMPForcedPostProcessor, the surface tension can
be tuned by adjusting the sigma parameter.
1 coupling.template setParameter<COUPLING::SIGMA>(sigma);

This parameter is in lattice units, so the true physical surface tension should be measured in a
Young–Laplace test.
Finally, during the main loop the lattices have to interact with each other:
1 sLattice1.getCommunicator(stage::PreCoupling()).communicate();
2 sLattice1.executePostProcessors(stage::PreCoupling());
3 sLattice2.getCommunicator(stage::PreCoupling()).communicate();
4 sLattice2.executePostProcessors(stage::PreCoupling());
5 sLattice3.getCommunicator(stage::PreCoupling()).communicate();
6 sLattice3.executePostProcessors(stage::PreCoupling());
7 coupling.execute();

These steps are placed immediately after the collideAndStream command.


Examples for the implementation of an LB simulation using the pseudo-potential model
(Shan–Chen force with a thermodynamic equation of state) for three-component flow are
examples/multiComponent/airBubbleCoalescence3d and examples/multiComponent/
waterAirFlatInterface2d.

4.4.3. Free Energy Model


As an alternative option for simulating multi-component flow, the free energy model has been
implemented in OpenLB, and can be used for either two or three fluid components. Examples
for the binary case are given in youngLaplaceXd and contactAngleXd, while an example
of the ternary case with boundaries is provided in microFluidics2d. These are all contained
within the examples/multiComponent/ folder.
The approach taken in OpenLB is similar to that given in [125] and assumes equal densities
and viscosities for each of the fluids. In the next sections the method will be outlined briefly for
three components. The two component case is identical to taking the third fluid component to
be zero and instead only uses two lattices.

4.4.3.1. Bulk Free Energy Model

Three lattices are required to track the density ρ, and order parameters ϕ and ψ. These are
related to the individual component densities Ci by

ρ = C1 + C2 + C3 , ϕ = C1 − C2 , ψ = C3 , (4.9)

respectively. By considering the free energy, a force is derived to drive the fluid towards the
thermodynamic equilibrium. The density therefore obeys the Navier–Stokes equation with
this added force. The equation of motion for the order parameters is the Cahn–Hilliard equa-
tion. The dynamics chosen for the first lattice must therefore include an external force, such as
ForcedBGKdynamics, while for the second and third lattices FreeEnergyBGKdynamics is required.
1 using bulkDynamics1 = ForcedBGKdynamics<T, DESCRIPTOR>;
2 using bulkDynamics2 = FreeEnergyBGKdynamics<T, DESCRIPTOR>;
3 ...
4 sLattice1.defineDynamics<bulkDynamics1>(superGeometry, 1);
5 sLattice2.defineDynamics<bulkDynamics2>(superGeometry, 1);
6 sLattice3.defineDynamics<bulkDynamics2>(superGeometry, 1);

To compute the force, two lattice couplings are required. The first computes the chemical po-
tentials for each lattice using the equations

µρ = A1 + A2 + (α²/4) [ (κ1 + κ2)(∇²ψ − ∇²ρ) + (κ2 − κ1) ∇²ϕ ] ,    (4.10)

µϕ = A1 − A2 + (α²/4) [ (κ2 − κ1)(∇²ρ − ∇²ψ) − (κ1 + κ2) ∇²ϕ ] ,    (4.11)

µψ = −A1 − A2 + κ3 ψ(ψ − 1)(2ψ − 1)
     + (α²/4) [ (κ1 + κ2) ∇²ρ − (κ2 − κ1) ∇²ϕ − (κ1 + κ2 + 4κ3) ∇²ψ ] ,    (4.12)

where A1 and A2 are defined as

A1 = (κ1/8) (ρ + ϕ − ψ)(ρ + ϕ − ψ − 2)(ρ + ϕ − ψ − 1) ,    (4.13)

A2 = (κ2/8) (ρ − ϕ − ψ)(ρ − ϕ − ψ − 2)(ρ − ϕ − ψ − 1) ,    (4.14)

respectively. The α the κ parameters are input parameters for the lattice coupling and can
be used to tune the interfacial width and surface tensions. The interfacial width is given by
α and the surface tensions are γmn = α(κm + κn )/6. The chemical potential values are stored
using FIELDS named CHEM_POTENTIAL and declared with the descriptor: using DESCRIPTOR = D2Q9
<CHEM_POTENTIAL,FORCE>.
The second lattice coupling then computes the force using

F = −ρ∇µρ − ϕ∇µϕ − ψ∇µψ . (4.15)

The two lattice couplings are applied using SuperLatticeCoupling:
1 SuperLatticeCoupling coupling1(
2 ChemicalPotentialCoupling2D{},
3 names::Component1{}, sLattice1,
4 names::Component2{}, sLattice2,
5 names::Component3{}, sLattice3);
6 coupling1.restrictTo(superGeometry.getMaterialIndicator({1}));
7
8 SuperLatticeCoupling coupling2(
9 ForceCoupling2D{},
10 names::Component1{}, sLattice1,
11 names::Component2{}, sLattice2,
12 names::Component3{}, sLattice3);
13 coupling2.restrictTo(superGeometry.getMaterialIndicator({1}));

The component parameters must be provided after the coupling1 declaration:


1 ...
2 coupling1.template setParameter<ChemicalPotentialCoupling2D::ALPHA>(alpha);
3 coupling1.template setParameter<ChemicalPotentialCoupling2D::KAPPA1>(kappa1);
4 coupling1.template setParameter<ChemicalPotentialCoupling2D::KAPPA2>(kappa2);
5 coupling1.template setParameter<ChemicalPotentialCoupling2D::KAPPA3>(kappa3);

The fields CHEM_POTENTIAL and RhoStatistics must be communicated to allow proper computa-
tion between blocks. The following is then used in the main loop to calculate the force at each
time step.
1 sLattice1.executePostProcessors(stage::PreCoupling());
2 sLattice2.executePostProcessors(stage::PreCoupling());
3 sLattice3.executePostProcessors(stage::PreCoupling());
4
5 sLattice1.getCommunicator(stage::PreCoupling()).communicate();
6 sLattice2.getCommunicator(stage::PreCoupling()).communicate();
7 sLattice3.getCommunicator(stage::PreCoupling()).communicate();
8
9 coupling1.execute();
10
11 sLattice1.getCommunicator(stage::PostCoupling()).communicate();
12 sLattice1.executePostProcessors(stage::PostCoupling());
13
14 coupling2.execute();

4.4.3.2. Boundaries in Free Energy Models

Bounce-back wall boundaries with controllable contact angles may be added using the
setFreeEnergyWallBoundary function.
1 setFreeEnergyWallBoundary<T,DESCRIPTOR>(sLattice1, superGeometry, 2,
2 alpha, kappa1, kappa2, kappa3, h1, h2, h3, 1);
3 setFreeEnergyWallBoundary<T,DESCRIPTOR>(sLattice2, superGeometry, 2,
4 alpha, kappa1, kappa2, kappa3, h1, h2, h3, 2);
5 setFreeEnergyWallBoundary<T,DESCRIPTOR>(sLattice3, superGeometry, 2,
6 alpha, kappa1, kappa2, kappa3, h1, h2, h3, 3);

The final parameter, latticeNumber, is necessary to treat each lattice differently, while the hi
parameters are related to the contact angles at the boundary. The contact angles θmn are given
by the following relation:

cos θmn = [ (ακn + 4hn)^(3/2) − (ακn − 4hn)^(3/2) ] / [ 2(κm + κn) √(ακn) ]
        − [ (ακm + 4hm)^(3/2) − (ακm − 4hm)^(3/2) ] / [ 2(κm + κn) √(ακm) ] .    (4.16)

Notably, to set neutral wetting (90◦ angles), the values can be set to hi = 0.
A demonstration of using these solid boundaries for a binary fluid case is provided in the
contactAngle(2,3)d examples. The examples compare the simulated angles to those given
by equation 4.16, respectively for dimensions d = 2, 3.
Open boundary conditions can also be implemented using the setFreeEnergyInletBoundary
and setFreeEnergyOutletBoundary functions. These can be used to specify constant density or
velocity boundaries. The first lattice is used to define the density or velocity boundary condi-
tion, while on the second and third lattices ϕ and ψ must instead be defined. For example, to
set a constant velocity inlet, see the code snippet below.
1 setFreeEnergyInletBoundary(
2   sLattice1, omega, inletIndicator, "velocity", 1 );
3 setFreeEnergyInletBoundary(
4   sLattice2, omega, inletIndicator, "velocity", 2 );
5 setFreeEnergyInletBoundary(
6   sLattice3, omega, inletIndicator, "velocity", 3 );
7
8 sLattice1.defineU( inletIndicator, 0.002 );
9 sLattice2.defineRho( inletIndicator, 1. );
10 sLattice3.defineRho( inletIndicator, 0. );

However, this alone is insufficient to set a constant density outlet because ρ, ϕ, and ψ are rede-
fined by a convective boundary condition on each time step. In this case an additional lattice
coupling is required, using DensityOutletCoupling2D.
There are two additional requirements for open boundaries. The first is that the velocity
must be coupled between the lattices using InletOutletCoupling2D because this is required for
the collision step. The second is that the communication of the external field must now include
two values. This ensures that ρ, ϕ, and ψ are properly set on block edges at the outlet. To see a
full example of applying these boundary conditions, see the microFluidics2d example.

4.4.4. Coupling Between Momentum and Energy Equations


As explained in reference [118], there are different schemes to couple the momentum and en-
ergy equations by means of a buoyancy force (also called Boussinesq approximation). Some
schemes add an extra force term to the collision term, other methods shift the velocity field ac-
cording to Newton’s second law, and others combine an extra force term and a velocity shift.
The implementation applied in OpenLB belongs to this last group of schemes.

Once the boundary values for the velocity and temperature fields are set, collision and stream-
ing functions are called. The dynamics with an external force F used for the velocity calculation
(e.g. ForcedBGKdynamics) shifts the velocity v before executing the collision step. The shift fol-
lows equation (4.17),
ushift = u + F/2 .    (4.17)
The code snippet responsible for this shift is defined in the collision function of the file /dynam
ics/dynamics.hh for the class ForcedBGKdynamics.
1 this->momenta.computeRhoU( cell, rho, u );
2 FieldPtr<T,DESCRIPTOR,FORCE> force = cell.template getFieldPointer<descriptors::FORCE>();
3 for (int iVel=0; iVel < DESCRIPTOR::d; ++iVel) {
4   u[iVel] += force[iVel] / T{2};
5 }

Listing 4.1: Velocity shift

After the corresponding collision step using the shifted velocity, the value of the density distri-
bution functions fi is modified by the external force with the call to the function.
1 lbm<Lattice>::addExternalForce( cell, u, omega )

This function follows

f̃i = fi + (1 − ω/2) wi [ (ci − u)/cs² + ((ci · u)/cs⁴) ci ] · F ,    (4.18)

where f˜i represents the new distribution function (see reference [99] for the BGK model and
[97] for MRT models), wi are the weights of the discrete velocities ci , and ω = τ −1 denotes the
relaxation frequency. The coupling in the collision step for the temperature field is given by the
use of the velocity from the isothermal field.
1 auto u = cell.template getFieldPointer<descriptors::VELOCITY>();

The equilibrium density distribution function for the temperature only has terms of first order
(see e.g. [45]). After the collision step, the coupling function NSlattice.executeCoupling() is
called, where the values of the external force in the NSlattice and of the advected velocity in the
ADlattice are updated.
1
2 auto u = tPartner->get(iX, iY).template getFieldPointer<descriptors::VELOCITY>();
3 blockLattice.get(iX, iY).computeU(u);

Listing 4.2: Velocity coupling

The new force is computed via the Boussinesq approximation

F = ρ g (T − T0) / △T .    (4.19)

The temperature T is obtained from the ADlattice, T0 is the average temperature between the
defined cold and hot temperatures, whereas △T is the difference between the hot and cold
temperatures.
1
2 auto force = blockLattice.get(iX, iY).template getFieldPointer<descriptors::FORCE>();
3 T temperature = tPartner->get(iX, iY).computeRho();
4 T rho = blockLattice.get(iX, iY).computeRho();
5 for (unsigned iD = 0; iD < L::d; ++iD) {
6 force[iD] = gravity * rho * (temperature - T0) / deltaTemp * dir[iD];
7 }

Listing 4.3: Computation of the Boussinesq force

4.5. Advection–Diffusion Equation


The advective and diffusive transport of a macroscopic density, energy or temperature is gov-
erned by the advection–diffusion equation

∂c/∂t + ∇ · (vc) = ∇ · (D∇c)   in Ω × I,    (4.20)

where c : Ω × I → R, (x, t) 7→ c(x, t) is the considered physical quantity (temperature, chemical


concentration, particle density), D > 0 is the diffusion coefficient and v is a velocity field affect-
ing c. It is possible to approximate this equation with LBM by using an equilibrium distribution
function different from the one for the Navier–Stokes equations [91, 40, 41]
 
gi^eq = wi ρ ( 1 + ci · v / cs² ) ,    (4.21)

that takes the advective transport into account. In equation (4.21), wi is a weighting factor,
ci a unit vector along the lattice directions, cs the speed of sound, and i denotes the discrete
velocity counter. To use this implementation the dynamics object has to be replaced by special
advection–diffusion dynamics:
1 template<typename T, typename DESCRIPTOR, typename MOMENTA=momenta::
AdvectionDiffusionBulkTuple>
2 using AdvectionDiffusionBGKdynamics = dynamics::Tuple<
3 T,DESCRIPTOR,
4 MOMENTA,
5 equilibria::FirstOrder,
6 collision::BGK,
7 AdvectionDiffusionExternalVelocityCollision
8 >;

Listing 4.4: Advection diffusion dynamics object

Additionally, a different descriptor with fewer lattice velocities is used [103]:
1 using DESCRIPTOR = D3Q7<VELOCITY,SOURCE>;

Listing 4.5: Examplary advection diffusion descriptor with field for a source term

In OpenLB, the descriptors D2Q5 and D3Q7 are implemented for the advection–diffusion equa-
tion, although higher numbers of lattice velocities can be used as well. Since the advection–
diffusion equation is a scalar transport equation differing from the Navier–Stokes momentum
conservation, another set of boundary conditions is needed. Dirichlet, Neumann and Robin
boundary conditions are implemented in OpenLB. The Dirichlet condition sets a constant scalar
value in the cells with the chosen material number, the Neumann zero-gradient boundary copies
the flux from the neighboring fluid cells, and with the Robin boundary condition a zero or
non-zero flux can be set in the boundary cells.
1 void prepareLattice(...) {
2 ...
3 /// Material=3 -> Dirichlet boundary with constant scalar value for axes aligned
surfaces (vertical or horizontal)
4 setAdvectionDiffusionTemperatureBoundary<T,ADDESCRIPTOR>(
5 sLatticeAD, superGeometry, 3);
6
7 /// Material=3 -> Dirichlet boundary with constant scalar value for curved or inclined
surfaces
8 sLatticeAD.defineDynamics<FirstOrderEquilibriumBoundary>(superGeometry, 3);
9
10 /// Material=4 -> Neumann boundary with zero flux gradient
11 setZeroGradientBoundary<T,ADDESCRIPTOR>(
12 sLatticeAD, superGeometry, 4);
13
14 /// Material=5 -> Robin boundary with constant reaction rate on the boundary surface
15 setRobinBoundary<T,ADDESCRIPTOR>(
16 sLatticeAD, omegaAD, superGeometry, 5);
17 T reactionRate = converter.getCharLatticeVelocity(); //determines speed of reaction,
here equal to velocity
18 AnalyticalConst3D <T,T> coefficients(reactionRate, -converter.getLatticeDiffusivity(),
reactionRate * Ceq); //set Robin coefficients
19 sLatticeAD.template defineField<descriptors::G>(superGeometry.getMaterialIndicator({5})
, (coefficients)); //save coefficient in a field
20 ...
21 }

Listing 4.6: Setup of advection diffusion boundary conditions

To apply convective transport, a velocity vector has to be passed. This can either be done indi-
vidually on each cell by using the following.
1 T velocity[3] = {vx,vy,vz};
2 ...
3 cell.defineField<descriptors::VELOCITY>(velocity);

Listing 4.7: Add advective velocity on a cell

Alternatively, it can be passed to the whole SuperLattice using:
1 AnalyticalConst3D<T,T> velocity(vel);
2 ...
3 /// sets advective velocity for material 1
4 superLattice.defineField<descriptors::VELOCITY>(superGeometry, 1, velocity);

Listing 4.8: Add advective velocity on a superlattice

Here, vel is a std::vector<T>.

4.5.1. Closer Look at Advection–Diffusion Boundary Conditions


4.5.1.1. Dirichlet Boundary Condition

At the boundaries of a lattice, only the outgoing directions of the distribution functions are
known, while those towards the domain need to be computed. Several types of implementa-
tions for Dirichlet boundary conditions are summarized in Section 5.1. At a Dirichlet bound-
ary for the advection–diffusion equation, the observable, e.g. temperature, is set to a constant
value. This boundary condition can be applied to flat walls, corners and edges (for three-
dimensional domains). The algorithm to set a certain temperature on a wall is defined in
the dynamics class AdvectionDiffusionBoundariesDynamics in the file boundary/dynamics/
advectionDiffusionBoundaries.h and works as described below.
First, the index i of the unknown distribution function gi incoming to the fluid domain is
determined.
1 template <typename CELL, typename PARAMETERS, typename V=typename CELL::value_t>
2 CellStatistic<V> apply(CELL& cell, PARAMETERS& parameters) any_platform {
3 constexpr auto unknownIndices = util::subIndexOutgoing<DESCRIPTOR, direction,
orientation>();
4 constexpr auto knownIndices = util::subIndexOutgoingRemaining<DESCRIPTOR, direction
, orientation>();
5 }

Listing 4.9: Collision step for a temperature boundary

Then, the sum of the rest of the populations is computed.


1 V sum = V{0};
2 for (unsigned iPop : knownIndices) {
3 sum += cell[iPop];
4 }

The difference between the desired scalar value (given when setting the boundary condition)
and this sum is the value assigned to the unknown distribution.
1 V dirichletTemperature = MomentaF().computeRho(cell);
2 V difference = dirichletTemperature - V{1} - sum;
3 cell[unknownIndices[0]] = difference;

After that, all distribution functions are determined and a regular collision step is performed.
1 return typename DYNAMICS::template exchange_momenta<MOMENTA>::CollisionO().apply(cell,
parameters);

As an example, take the case of a left wall in 2D. After the streaming step, all populations are
known except for g3 . Once the desired transported scalar, e.g. temperature Twall , at the wall is
known, the value of the unknown distribution is computed via

Twall = Σ_{i=0}^{4} gi    (4.22)

⇔  g3 = Twall − (g0 + g1 + g2 + g4) .    (4.23)

In the case of a curved surface, the Bouzidi Dirichlet boundary condition can be applied.

4.5.1.2. Neumann Zero Gradient Boundary Condition

For flat walls there is also an opportunity to prescribe a zero flux through the wall as a boundary
condition which is

∇c · n = 0 . (4.24)

Here n stands for the outer normal vector of the boundary.


The setter setZeroGradientBoundary adds a post processor
zeroGradientLatticePostProcessor3D, which propagates the average population value of
the two neighboring cells to the boundary cell. As an example, we take an outlet wall and have
that

gi(xout) = (1/2) ( gi(xout − △x) + gi(xout − 2△x) ) .    (4.25)

4.5.1.3. Robin Boundary Condition

The most complex boundary condition that can prescribe convective and diffusive fluxes at a
flat surface is Robin boundary condition, which can be written as

un c + D ∂c/∂n = s .    (4.26)

Here, un , D and s can be defined by the user.

4.5.1.4. Adiabatic Boundary Condition

Additionally, thanks to the simplified lattice velocity sets used in the thermal descriptors (D2Q5
and D3Q7, cf. the discrete velocities in Figure 1.1 colored in red and orange), it is possible to
implement an adiabatic boundary using bounce-back dynamics [115].

An adiabatic boundary condition requires no scalar transport in the normal direction of the
boundary. In a general situation the adiabatic boundary is set on a solid wall, meaning that the
normal velocity to the wall is zero. To implement an adiabatic wall, take a 2D south wall as
example. The distribution function g4 and the scalar, e.g. temperature, at the wall are unde-
termined. The population g4 can be computed from the distribution function in the opposite
direction, in order to ensure that at the macroscopic level there is no heat conduction, i.e.

g4 = g2 . (4.27)

This procedure corresponds to the bounce-back scheme. With all the distribution functions
known, the temperature at the wall can be determined from its definition

Twall = Σ_{i=0}^{4} gi .    (4.28)

For curved surfaces, the Bouzidi boundary condition can be applied to obtain second-order
convergence. In that case, populations are weighted according to the real distance from the
boundary point to the surface. Equation (4.27) can then be rewritten as

g2(t + 1) = 2q g2(t) + (1 − 2q) g4(t) ,   if q < 1/2 ,    (4.29)
g2(t + 1) = (1/(2q)) g2(t) + ((2q − 1)/(2q)) g4(t) ,   if q ≥ 1/2 .    (4.30)

Here, q is the ratio between the distance from the boundary point to the surface and the cell size.

4.5.2. Convergence Criterion


For thermal applications, the following convergence criterion can be applied to one of the computed
fields or to both of them. Generally, a value tracer on the average energy is used, which
is also available for any other quantity and is thus applicable for any TEQ (target equation) approximated with
LBM in OpenLB (see Section 10.5). Here, the average energy is defined proportional to the velocity
squared, so it can be used with either the NSlattice or the ADlattice, since both
share the same macroscopic velocity field u.
The parameters to initialize the tracer object are the characteristic velocity of the system
converter.getU( ), the characteristic length of the system converter.getNy( ), and the desired
precision of the convergence eps. The listing 4.10 shows how the object is defined in the main
function, and how its value is updated and checked at each time step.
1 util::ValueTracer<T> converge( converter.getU( ), converter.getNy( ), eps );
2 for ( iT=0; iT<maxIter ; ++iT) {
3 converge.takeValue( ADlattice.getStatistics( ).getAverageEnergy( ), true );
4 if ( converge.hasConverged() ) {
5 break;

6 }
7 }

Listing 4.10: Convergence check

4.5.3. Creating an Application with AdvectionDiffusionDynamics


4.5.3.1. Lattice Descriptors

Lattice descriptors used for the AdvectionDiffusionDynamics are D2Q5 and D3Q7, which have fewer
degrees of freedom in velocity space than the classical discrete velocity sets for the Navier–
Stokes equations (see Figure 1.1). With the Chapman–Enskog expansion it can be shown
that approximating the advection–diffusion equation as a target does not require fourth-order
isotropic lattice tensors (see for example [137]); therefore, descriptors with fewer discrete velocities
can be used without loss of accuracy.
To approximate the Navier–Stokes velocity field u, a first descriptor should be defined
that can include an external force, e.g. ForcedD3Q19Descriptor. Another descriptor is neces-
sary to approximate the temperature T governed by an advection–diffusion equation, e.g.
AdvectionDiffusionD3Q7Descriptor.
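In an application, the two descriptors could be defined as follows. The typedef names NSDESCRIPTOR and TDESCRIPTOR are merely the conventions used in the thermal examples, not a requirement:

typedef ForcedD3Q19Descriptor            NSDESCRIPTOR;  // flow lattice with external force
typedef AdvectionDiffusionD3Q7Descriptor TDESCRIPTOR;   // lattice for the temperature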

4.5.3.2. Preparing the Geometry

This step is similar to the isothermal procedure (without the second lattice for T ). With the help
of indicators and STL files the desired geometry can be created. Cells can be assigned different
material numbers, which in turn can be used to specify several bulk, initial and boundary
collision schemes and/or dynamics.

Reading STL files To conveniently read STL files, the OpenLB class STLreader is provided.
However, there are differences when compared to the isothermal case, due to the differing con-
verter objects. An example of its use could be:
1 STLreader<T> nameIndicator( "fileName.stl", converter.getConversionFactorLength(),
      stlSize );

Listing 4.11: Initialization of a STLreader object

The offsets between the STL file and the global geometry are handled much more easily if they
are defined directly when creating the STL file, rather than trying to correct them in the application
code afterwards.

4.5.3.3. Preparing the Lattices

Recall that in a typical thermal application there are two independent lattices: one for the
isothermal flow (usually referred to as NSlattice), and one for the thermal variables (e.g. the
temperature, usually referred to as ADlattice). For each material number the desired dynam-
ics behavior has to be defined. Commonly used possibilities are instances::getNoDynamics (do
nothing), instances::getBounceBack (no slip), or bulkDynamics (previously set collision for the
bulk). For a thermal lattice, we can define a boundary with a given temperature.
1 setAdvectionDiffusionTemperatureBoundary<T,TDESCRIPTOR>(
2 ADlattice, Tomega, superGeometry, 2);

Listing 4.12: Definition of a temperature boundary

The chosen dynamics for a material number may differ between the isothermal and the thermal
lattices, e.g. an obstacle with a given temperature inside a flow channel would have a no-slip
behavior for the fluid part, but be part of the bulk and have a given temperature in the thermal
lattice.
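A sketch of assigning dynamics by material number might look as follows. The concrete dynamics class names and the template syntax are assumptions here and should be checked against the thermal examples of the release at hand:

// bulk and boundary dynamics on the flow lattice (material numbers as assigned in prepareGeometry)
NSlattice.defineDynamics<ForcedBGKdynamics>( superGeometry, 1 );
NSlattice.defineDynamics<BounceBack>( superGeometry, 2 );
// bulk dynamics on the thermal lattice
ADlattice.defineDynamics<AdvectionDiffusionBGKdynamics>( superGeometry, 1 );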

4.5.3.4. Initialization of the Lattices (iT=0)

NSlattice For all material numbers defined as bulkDynamics, an initial velocity and density have
to be set (usually fluid flow at rest). Additionally, since the velocity field and the temperature
are related by a force term, an external field has to be defined. The easiest way to do this is by
the material number.
1 NSlattice.defineField<descriptors::FORCE>( superGeometry, 1, force );

Listing 4.13: Initialization of an external force field

Here force is an element of type AnalyticalF, which can initially be set to zero.

ADlattice For the advection–diffusion lattice, an initial temperature is set (similar to the den-
sity variable on the Navier–Stokes lattice), as well as the distribution functions corresponding
to this temperature value:
1 T Texample = 0.5;
2 T zerovel[descriptors::d<T,DESCRIPTORS>()] = {0., 0.};
3 ConstAnalyticalF2D<T,T> Example( Texample );
4 std::vector<T> tEqExample(descriptors::q<T,DESCRIPTORS>() );
5 for ( int iPop = 0; iPop < descriptors::q<T,DESCRIPTORS>(); ++iPop )
6 {tEqExample[ iPop ] = advectionDiffusionLbHelpers<T,TDESCRIPTOR>::
7     equilibrium( iPop, Texample, zerovel ); }
8 ConstAnalyticalF2D<T,T> EqExample( tEqExample );
9 ADlattice.defineRho( superGeometry, 1 ,Example );
10 ADlattice.definePopulations( superGeometry, 1, EqExample );

Listing 4.14: Initialization of the temperature field

To apply convective transport, a velocity vector has to be passed, which can also be done by
material number.
1 std::vector<T> zero ( 2, T( ) );
2 ConstAnalyticalF2D<T,T> velocity ( zero );
3 ADlattice.defineField<descriptors::VELOCITY>( superGeometry, 1, velocity );

The last step is to make the lattice ready for the simulation:
1 NSlattice.initialize( );
2 ADlattice.initialize( );

Listing 4.15: Initialization of the lattices

4.5.3.5. Setting the Boundary Conditions

If the value of a boundary condition has to be updated during the simulation, e.g. via increasing
the velocity at the inflow or changing the temperature of a boundary, this can be achieved
following the same procedure as for the initial conditions (see Section 5.2 further below).

4.5.3.6. Getting the Results

The desired data is saved using the VTKwriter objects, which can write the value of functors in
VTI files (VTK format used e.g. by Paraview [75]). The functors which are usually saved are
the velocity field from the NSlattice, and the temperature field (referred to as density) of the
ADlattice. Thermal and isothermal information must be saved in two different objects, since
they have two different descriptors.
1 SuperLatticeVelocity2D<T,NSDESCRIPTOR> velocity( NSlattice );
2 SuperLatticeDensity2D<T,TDESCRIPTOR> density( ADlattice );
3 vtkWriterNS.addFunctor( velocity );
4 vtkWriterAD.addFunctor( density );
5 vtkWriterNS.write(iT);
6 vtkWriterAD.write(iT);

Listing 4.16: Saving results in VTK files

It is important to emphasize that the data saved is in lattice units. However, the data
can also be directly saved in SI units, by substituting e.g. SuperLatticeVelocity2D with
SuperLatticePhysVelocity2D where appropriate.

4.5.3.7. Structure of the Program

It is advisable to structure the .cpp file of an application along the following steps; a skeletal
sketch is given after the list.

1. Initialization The converter between dimensionless and lattice units is set via e.g. N, Δt and
the parameters for the simulation Ra, Pr, T_cold, T_hot, L_{x,y,z}.

2. Prepare geometry The mesh is created and voxels are classified with different material
numbers according to their behavior (inflow, outflow, etc.).

3. Prepare lattice The lattice dynamics are set according to the material numbers assigned
before. The boundary conditions are initialized. Since there are two different lattices,
the definition of the dynamics and the kind of boundary conditions (though not the ac-
tual values yet) have to be made for each of them separately. At this point the coupling
generator is initialized (usually on the NSlattice) and then it is indicated which material
numbers are to be coupled with the ADlattice.

4. Main loop with timer The functions setBoundaryValues, collideAndStream, and getResults
are called repeatedly until the maximum number of iterations is reached or the simulation has
converged (if a convergence criterion is set).

5. Definition of initial and boundary conditions The values for the boundary functions are
set. In some applications the values are to be refreshed at each time step. Thermal
and isothermal lattices are treated separately. As indicated before, velocity and density
(NSlattice), as well as temperature (ADlattice) have to be defined. Additionally, the cou-
plings, and external forces and velocities should be initialized and reused as required.

6. Collide and stream execution The collision and the streaming steps are performed. This
function is called for each of the lattices separately. After the streaming step, the coupling
between the lattices (here, based on the Boussinesq approximation) is executed.

7. Computation and output of the results Console and data outputs of the results at certain
time steps are created.
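The following skeleton illustrates this structure; function and object names are placeholders taken from the steps above rather than a complete, compilable OpenLB program:

int main(int argc, char* argv[]) {
  olbInit(&argc, &argv);                        // 1. initialization: converter, Ra, Pr, T_cold, T_hot
  prepareGeometry(/*...*/);                     // 2. assign material numbers
  prepareLattice(/*...*/);                      // 3. dynamics, boundary types, coupling generator
  Timer<T> timer(/*...*/);                      // 4. main loop with timer
  for (std::size_t iT = 0; iT < maxIter; ++iT) {
    setBoundaryValues(/*...*/);                 // 5. (time-dependent) boundary values
    NSlattice.collideAndStream();               // 6. collide and stream on both lattices,
    ADlattice.collideAndStream();               //    followed by the Boussinesq coupling
    NSlattice.executeCoupling();
    getResults(/*...*/);                        // 7. console and VTK output
  }
}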

4.5.4. Obtaining Results in Thermal Simulations


Here, the Rayleigh and the Prandtl numbers are the dimensionless numbers which control the
physics of a convection problem. The Rayleigh number for a fluid is associated with buoyancy
driven flow. When the Rayleigh number is below the critical value for that fluid, heat transfer is
primarily in the form of conduction. When it exceeds the critical value, heat transfer is primarily
in the form of convection. For natural convection, it is defined as


Ra = \frac{g \beta \, \Delta T \, L^3}{\nu \alpha} ,   (4.31)

where g is the acceleration magnitude due to gravity, β is the thermal expansion coefficient, ν
is the kinematic viscosity, α is the thermal diffusivity, △T is the temperature difference, and
L denotes the characteristic length. The Prandtl number is defined as the ratio of momentum
diffusivity ν to thermal diffusivity α
Pr = \frac{\nu}{\alpha} .   (4.32)

To handle differences between the converter objects for isothermal and thermal simulations
some of the isothermal functions have been re-implemented for thermal simulations in a way
that they only depend on lattice parameters and the Rayleigh and Prandtl numbers by modify-
ing existing functors in the following files:

• /functors/lattice/blockLatticeIntegralF3D.(h,hh)

• /functors/lattice/superLatticeIntegralF3D.(h,hh)

4.5.4.1. Velocity

The resulting velocity magnitude vres , independent of the lattice velocity latticeU selected, can
be computed by
v_{res} = \frac{v_{LB}}{\text{latticeU}} \sqrt{Ra \, Pr} = \frac{v_{LB}}{N \Delta t} \sqrt{Ra \, Pr} .   (4.33)
The lattice velocity latticeU is obtained from the function converter.getCharLatticeVelocity
() in the thermal converter object.
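As a direct transcription of Eq. (4.33), the conversion could be written as follows; vLB, Ra and Pr are placeholder variables:

T latticeU = converter.getCharLatticeVelocity();
T vres     = vLB / latticeU * util::sqrt( Ra * Pr );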

4.5.4.2. Pressure

The pressure in physical units is derived from the lattice pressure by using its definition from
the isothermal converter object

p_{phys} = p_{LB} \, \frac{\text{physForce}}{\text{physL}^{d-1}}   (4.34)
         = p_{LB} \, \frac{\text{physL}^{d+1}}{\text{physT}^2} \, \frac{1}{\text{physL}^{d-1}}   (4.35)
         = p_{LB} \left( \frac{\text{physL}}{\text{physT}} \right)^2   (4.36)
         = p_{LB} \left( \frac{\text{charU}}{\text{latticeU}} \right)^2   (4.37)
         = p_{LB} \, \frac{Ra \, Pr}{\text{latticeU}^2} .   (4.38)

The lattice pressure can easily be computed from the lattice density using

p_{LB} = \frac{\rho - 1}{3} .   (4.39)
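A direct transcription of Eqs. (4.38) and (4.39) might read as follows; rhoLB and latticeU are placeholder variables:

T pLB   = (rhoLB - T{1}) / T{3};
T pPhys = pLB * Ra * Pr / (latticeU * latticeU);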

The physical force can also be obtained from the computed lattice force

F_{phys} = F_{LB} \, \frac{\text{physL}^{d+1}}{\text{physT}^2}   (4.40)
         = F_{LB} \, \frac{\left( \text{charL} \, \text{latticeL} / \text{charL} \right)^{d+1}}{\left( \text{charL} \, \text{latticeU} \, \text{latticeL} / (\text{charU} \, \text{charL}) \right)^2}   (4.41)
         = F_{LB} \, \text{latticeL}^{d-1} \left( \frac{\text{charU}}{\text{latticeU}} \right)^2   (4.42)
         = F_{LB} \, \text{latticeL}^{d-1} \, \frac{Ra \, Pr}{\text{latticeU}^2} ,   (4.43)

where d is the number of dimensions in the problem.


For most applications the value of the force coefficients in the different coordinate directions
can be of interest, which can be computed with

C_{F_i} = \frac{F_{i,phys}}{\frac{1}{2} \, \text{charU}^2 \cdot \text{count}_i \cdot \text{latticeL}^{d-1}} = \frac{F_{i,LB}}{\frac{1}{2} \, \text{latticeU}^2 \cdot \text{count}_i} ,   (4.44)

where counti is the number of cells in the surface perpendicular to the direction i of the force
coefficient computed.
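Transcribing the lattice part of Eq. (4.44) into code gives, with placeholder variables:

T CFi = FiLB / ( T{0.5} * latticeU * latticeU * counti );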

4.5.5. Conduction Problems


For heat conduction problems there is no velocity field that advects the temperature (see [116]
and [105] ). In absence of convection, radiation and heat generation, the energy equation for a
homogeneous medium is given by
\frac{\partial T}{\partial t} = \alpha \Delta T .   (4.45)
A conduction simulation can be executed using an independent advection–diffusion lattice,
without any velocity field coupled. In the same way as in the convection–diffusion heat transfer,
the temperature is obtained after summing the distribution functions over all directions. The
equilibrium distribution function in this case with the BGK approximation is given by

g_i^{eq} = \omega_i \rho = \omega_i T ,   (4.46)

which is equivalent to the one used in advection–diffusion simulations with the flow velocity set
to zero. This means that conduction problems can be computed with the available OpenLB
code by using only a lattice with advection–diffusion dynamics and setting the external
velocity field to zero at all times.
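A minimal sketch of such a set-up keeps the advected velocity at zero via an analytical functor (material number 1 is assumed for the bulk):

AnalyticalConst3D<T,T> zeroVelocity( 0., 0., 0. );
ADlattice.defineField<descriptors::VELOCITY>( superGeometry, 1, zeroVelocity );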

4.5.5.1. Multiple-Relaxation-Time (MRT)

The implementation of the thermal lattice Boltzmann equation using the multiple-relaxation-
time collision model is done similarly to the procedure used with the BGK collision model.
A double MRT-LB is used, which consists of two sets of distribution functions: an isothermal
MRT model for the mass-momentum equations, and a thermal MRT model for the temperature
equation. Both sets are coupled by a force term according to the Boussinesq approximation. The
macroscopic governing equation for the temperature is

\frac{\partial T}{\partial t} + \boldsymbol{v} \cdot \nabla T = \alpha \Delta T ,   (4.47)

where α is the thermal diffusivity coefficient.


The isothermal MRT model with an external force is already implemented in OpenLB for the
D2Q9 and D3Q19 lattice models (dynamics class ForcedMRTdynamics). This means that only the
thermal MRT counterparts for 2D and 3D have to be developed.
The computation of the force term in the MRT model in the ForcedMRTdynamics class uses the
body force as described in [109]. However, it does not include a velocity shift like the BGK
model, due to negligible differences in benchmark tests.

D2Q5 thermal model The formulation for the D2Q5 thermal MRT model is based on [114]. The
temperature field distribution functions gi are governed by the following equation

g_i(\boldsymbol{x} + \boldsymbol{c}_i \Delta t, t + \Delta t) - g_i(\boldsymbol{x}, t) = -N_i^{-1} \theta_i \left[ n(\boldsymbol{x}, t) - n^{eq}(\boldsymbol{x}, t) \right] ,   (4.48)

where g and n are column vectors with entries gi and ni for i = 0, 1, . . . , q − 1, and denote
the distribution functions and the moments, respectively. The vectors Ni are the rows of the
orthogonal transformation matrix N and θi are the entries of a non-negative, diagonal relaxation
matrix. The macroscopic temperature T can be calculated by

T = \sum_{i=0}^{4} g_i .   (4.49)

The weight coefficients for each lattice direction are given in equation 4.50 as

\omega_i = \begin{cases} \frac{3}{5}, & \text{if } i = 0, \\ \frac{1}{10}, & \text{if } i = 1, 2, 3, 4. \end{cases}   (4.50)

The transformation matrix N maps the distribution functions for the temperature gi to the cor-
responding moments ni , i.e.
n = Ng. (4.51)

The transformation matrix N and its inverse matrix N^{-1} are shown in equations 4.52 and 4.53.
There are some differences in the order of the columns with respect to what is specified in the
reference [114]. This is due to the different sequence used in numbering the velocity directions.

N = \begin{pmatrix} \langle 1| \\ \langle e_x| \\ \langle e_y| \\ \langle 5e^2 - 4| \\ \langle e_x^2 - e_y^2| \end{pmatrix}
  = \begin{pmatrix}
      1 &  1 &  1 &  1 &  1 \\
      0 & -1 &  0 &  1 &  0 \\
      0 &  0 & -1 &  0 &  1 \\
     -4 &  1 &  1 &  1 &  1 \\
      0 &  1 & -1 &  1 & -1
    \end{pmatrix} ,   (4.52)

N^{-1} = \begin{pmatrix}
      1/5 &  0   &  0   & -1/5  &  0   \\
      1/5 & -1/2 &  0   &  1/20 &  1/4 \\
      1/5 &  0   & -1/2 &  1/20 & -1/4 \\
      1/5 &  1/2 &  0   &  1/20 &  1/4 \\
      1/5 &  0   &  1/2 &  1/20 & -1/4
    \end{pmatrix} .   (4.53)

The equilibrium moments neq are defined as

n^{eq} = (T, u_x T, u_y T, \varpi T, 0)^{T} ,   (4.54)

where ϖ is a constant of the D2Q5 model, which we set to −2. The diagonal relaxation matrix
θ is defined by
θ = diag (0, ζa , ζa , ζe , ζν ) . (4.55)

The first relaxation rate, corresponding to the temperature, is set to zero for simplicity, since the
first moment is conserved. The relaxation rates ζe and ζν are set to 1.5, whereas the relaxation
rate ζa is a function of the thermal diffusivity α as in (4.57). The speed of sound of the D2Q5
model is c_s^2 = 0.2:

\alpha = c_s^2 \left( \tau_a - \frac{1}{2} \right) = \frac{1}{5} \left( \tau_a - \frac{1}{2} \right)   (4.56)
\Rightarrow \; \zeta_a = \frac{1}{\tau_a} = \frac{1}{5\alpha + \frac{1}{2}} .   (4.57)
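In code, the relaxation rate of Eqs. (4.56)-(4.57) could be computed as follows; alpha denotes the lattice thermal diffusivity:

T tau_a  = alpha / T{0.2} + T{0.5};  // c_s^2 = 1/5
T zeta_a = T{1} / tau_a;             // equals 1 / (5*alpha + 1/2)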

4.5.5.2. Particle Flows as an Advection–Diffusion Problem

The quantity c in the advection–diffusion equation can be considered as a particle density,
thereby giving a continuous ansatz for simulating particle flows. To solve for the particle dis-
tribution, an additional lattice is required with an appropriate descriptor and dynamics, which
are only implemented for the 3D case.
1 using ADDESCRIPTOR = D3Q7<VELOCITY,VELOCITY2>;

Listing 4.17: Advection diffusion descriptor for particle flows

The descriptor in Listing 4.17 allocates additional memory, since, for the computation of the
particle velocity, the velocity of the last time step has to be stored as well. These calculations
are also non-local, therefore the communication of the additional data has to be ensured by an
additional object, which is constructed according to line 1 of Listing 4.18 and communicates the
data by a function as shown in line 2, which has to be called in the time loop.
1 SuperExternal3D<T, ADDESCRIPTOR, descriptors::VELOCITY> sExternal( superGeometry,
      sLatticeAD, sLatticeAD.getOverlap() );
2 sExternal.communicate();

Listing 4.18: SuperExternal3D object for the communication of additional data

Although the same unit converter can be used for the advection–diffusion lattice, another re-
laxation parameter has to be handed to the dynamics, as shown in Listing 4.19. Additionally,
some of the boundary conditions have to take the diffusion coefficient into account. Therefore
a new ωADE is computed by
\omega_{ADE} = \left( 4 D \, \frac{U_{LB}}{L_{LB} U_C} + 0.5 \right)^{-1} .   (4.58)

with characteristic lattice velocity ULB , characteristic velocity UC , lattice length LLB , as well as
the desired diffusion coefficient D.
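Following Eq. (4.58), the relaxation parameter handed to the dynamics in Listing 4.19 might be computed like this; the converter getter names are assumptions and should be checked against the unit converter in use:

T omegaAD = T{1} / ( T{4} * D * converter.getCharLatticeVelocity()
                     / ( converter.getPhysDeltaX() * converter.getCharPhysVelocity() )
                   + T{0.5} );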
1 ParticleAdvectionDiffusionBGKdynamics<T, ADDESCRIPTOR> bulkDynamicsAD ( omegaAD,
instances::getBulkMomenta<T, ADDESCRIPTOR>() );

Listing 4.19: Dynamics for the simulation of particle flows via advection–diffusion equations

Applying the advection–diffusion equation to particle flow problems requires a new dynamics
due to the handling of the particle velocity by the coupling processor of the two lattices, which
differs for reasons of efficiency. When constructing the coupling post-processor as shown in
Listing 4.20, forces acting on the particle can be added like the Stokes drag force as shown in
lines 2 and 3 of Listing 4.20. The implementation of new forces is straightforward, since only a
new class which provides a function applyForce(...), computing the force in a cell, needs to be
written analogously to the existing advDiffDragForce3D. Finally the lattices are linked by line 4
of Listing 4.20, which needs to be applied to the Navier–Stokes lattice for reasons of accessibility.
1 AdvectionDiffusionParticleCouplingGenerator3D<T,NSDESCRIPTOR> coupling( ADDESCRIPTOR::
index<descriptors::VELOCITY>());
2 advDiffDragForce3D<T, NSDESCRIPTOR> dragForce( converter,radius,partRho );
3 coupling.addForce( dragForce );
4 sLatticeNS.addLatticeCoupling( superGeometry, 1, coupling, sLatticeAD );

Listing 4.20: Coupling of an advection–diffusion and a Navier–Stokes lattice for particle flow
simulations

For the boundary conditions, the same basic objects as for the advection–diffusion equation can
be used; however, there is an additional boundary condition, shown in Listing 4.21, which has
to be applied at all boundaries to ensure the correctness of the finite difference scheme used to
compute the particle velocity. Further information as well as results can be found in Trunk et
al. [48] as well as in the examples section.
1 setExtFieldBoundary<T,ADDESCRIPTOR,descriptors::VELOCITY,descriptors::VELOCITY2>(
2 sLatticeAD, superGeometry.getMaterialIndicator({2, 3, 4, 5, 6}));

Listing 4.21: Example of a boundary condition for the particle velocity for particle flow
simulations

4.6. Particles
The following chapter summarizes OpenLB's functionality regarding the consideration of dis-
crete particles in a Lagrangian framework. This includes both sub-grid particles assuming spher-
ical shapes and surface resolved particles with arbitrary shapes, which can be handled by a com-
mon particle framework. As the framework follows advances in the data concept of the lattice
(cf. Section 2), it provides a dimension agnostic, flexible and easily extendable implementa-
tion. While abstract template meta functionality characterizes the data handling level, acces-
sible high-level user-functions are provided for e.g. particle creation or coupling handling. In
order to guarantee support for previously developed applications, the 3D-only sub-grid particle
framework from the previous releases is included as sub-grid (legacy) framework (cf. Section 4.6.9)
as well.
To get a good overview of the particle framework, the code of examples settlingCube3d (Sec-
tion 8.7.4) and bifurcation3d (Section 8.7.1) is reviewed, focusing on the simulation of particles.
The example settlingCube3d examines the settling of a cubical silica particle under the influ-
ence of gravity in surrounding water. It starts by including some headers and namespaces,
followed by the definition of different types (Listing 4.22), e.g. the descriptor and the particle type.
Afterwards, some variables are set to a concrete value, used in the fluid and particle calculation
(Listing 4.23). Particle settings include all the data to solve the equations of motion, such as the
particle’s starting position and density.
1 #include " olb3D . h "
2 #include " olb3D . hh "
3
4 using namespace olb;
5 using namespace olb::descriptors;
6 using namespace olb::graphics;
7 using namespace olb::util;
8 using namespace olb::particles;
9 using namespace olb::particles::dynamics;
10
11 using T = FLOATING_POINT_TYPE;

12 typedef D3Q19<POROSITY,VELOCITY_NUMERATOR,VELOCITY_DENOMINATOR> DESCRIPTOR;
13
14 //Define lattice type
15 typedef PorousParticleD3Q19Descriptor DESCRIPTOR;
16
17 //Define particleType
18 typedef ResolvedParticle3D PARTICLETYPE;

Listing 4.22: Includes and type definitions at the beginning of the example settlingCube3d

1 //Particle Settings
2 T centerX = lengthX*.5;
3 T centerY = lengthY*.5;
4 T centerZ = lengthZ*.9;
5 T const cubeDensity = 2500;
6 T const cubeEdgeLength = 0.0025;
7 Vector<T,3> cubeCenter = {centerX,centerY,centerZ};
8 Vector<T,3> cubeOrientation = {0.,15.,0.};
9 Vector<T,3> cubeVelocity = {0.,0.,0.};
10 Vector<T,3> externalAcceleration = {.0, .0, -T(9.81) * (T(1) - physDensity /
cubeDensity)};
11
12
13 // Characteristic Quantities
14 T const charPhysLength = lengthX;
15 T const charPhysVelocity = 0.15; // Assumed maximal velocity

Listing 4.23: Particle settings in example settlingCube3d

Like other simulations, particle flow simulations need basic, non-particle-specific functions
like prepareGeometry or prepareLattice. After those functions are defined, the main function
starts. The main section begins with the initialization of physical units in the unit converter, which
is explained in the Q&A in Section A.1. The unit converter is followed by the preparation of
the geometry using the prepareGeometry function and afterwards the prepareLattice function.
After those general simulation functions, the particle simulation starts. First, the ParticleSystem
(Listing 4.24, explained in Section 4.6.1) is created, followed by the calculation of the particle
quantities like a smoothing factor and the extent of the particles. After those calculations, the
particles are created. In the following lines, dynamics are assigned to the particles.
1 // Create ParticleSystem
2 ParticleSystem<T,PARTICLETYPE> particleSystem;
3
4 //Create particle manager handling coupling, gravity and particle dynamics
5 ParticleManager<T,DESCRIPTOR,PARTICLETYPE> particleManager(
6 particleSystem, superGeometry, sLattice, converter, externalAcceleration);
7
8 // Create and assign resolved particle dynamics
9 particleSystem.defineDynamics<
10 VerletParticleDynamics<T,PARTICLETYPE>>();
11
12 // Calculate particle quantities

13 T epsilon = 0.5*converter.getConversionFactorLength();
14 Vector<T,3> cubeExtend( cubeEdgeLength );
15
16 // Create Particle 1
17 creators::addResolvedCuboid3D( particleSystem, cubeCenter,
18 cubeExtend, epsilon, cubeDensity, cubeOrientation );
19
20 // Create Particle 2
21 cubeCenter = {centerX,lengthY*T(0.51),lengthZ*T(.7)};
22 cubeOrientation = {0.,0.,15.};
23 creators::addResolvedCuboid3D( particleSystem, cubeCenter,
24 cubeExtend, epsilon, cubeDensity, cubeOrientation );
25
26 // Check ParticleSystem
27 particleSystem.checkForErrors();

Listing 4.24: Creation of particles and assigning dynamics

Before the main loop starts (Listing 4.25), we create a timer and set initial values for
the distribution functions by calling setBoundaryValues. After this, the following is processed at
every time step. The fluid’s influence on the particles is calculated by evaluating hydrodynamic
forces acting on the particle surface. Afterwards, an external acceleration, e.g. gravity, is applied
onto the particles (Listing 4.25) and the equations of motion are solved for each one. The back
coupling from the particles to the fluid follows afterwards. Finally, the main loop ends with
the getResults-function, which prints the results to the console and writes VTK data for post-
processing with ParaView (Section 6.10) at previously defined time intervals.
1 /// === 4th Step: Main Loop with Timer ===
2 Timer<T> timer(converter.getLatticeTime(maxPhysT), superGeometry.getStatistics().
getNvoxel());
3 timer.start();
4
5
6 /// === 5th Step: Definition of Initial and Boundary Conditions ===
7 setBoundaryValues(sLattice, converter, 0, superGeometry);
8
9 clout << " MaxIT : " << converter.getLatticeTime(maxPhysT) << std::endl;
10
11 for (std::size_t iT = 0; iT < converter.getLatticeTime(maxPhysT)+10; ++iT) {
12
13 // Execute particle manager
14 particleManager.execute<
15 couple_lattice_to_particles<T,DESCRIPTOR,PARTICLETYPE>,
16 apply_gravity<T,PARTICLETYPE>,
17 process_dynamics<T,PARTICLETYPE>,
18 couple_particles_to_lattice<T,DESCRIPTOR,PARTICLETYPE>
19 >();
20
21 // Get Results
22 getResults(sLattice, converter, iT, superGeometry, timer, particleSystem );
23
24 // Collide and stream
25 sLattice.collideAndStream();

26 }
27
28 timer.stop();
29 timer.printSummary();

Listing 4.25: Main Loop with Timer

While following the particle simulation example settlingCube3d, some of the functions necessary
for such simulations were introduced. In the following sections, the individual parts
of the framework are therefore examined in more detail.

4.6.1. Class ParticleSystem


The ParticleSystem stores all data concerning the particles in containers. Therefore, the class is
used multiple times in a particle simulation. First, the ParticleSystem is created according to
the desired PARTICLETYPE (Listing 4.24). However, the container of particles is empty. Therefore,
we add two particles to it using creator functions and add dynamics via the ParticleSystem.
Additionally, it is utilized in the ParticleManager (cf. Section 4.6.3) to access the particles and
perform predefined operations on them.
One focus of the new particle system is the separation of data and operations, following the
lattice framework (cf. Section 2). Only the data is stored in the ParticleSystem; the operations
are implemented independently of it and merely use it to store the data of their calculations.

4.6.2. Class SuperParticleSystem


The example bifurcation3d (Section 8.7.1) makes use of OpenLB’s domain decomposition
approach, which can also be used for surface resolved particle simulations (Section 8.7.4). In
order to use this, a SuperParticleSystem has to be created by passing the SuperGeometry, which
holds all information regarding the lattice decomposition:
1 SuperParticleSystem<T,PARTICLETYPE> superParticleSystem(superGeometry);

Listing 4.26: Initialization of a SuperParticleSystem

4.6.3. Class ParticleManager


The ParticleManager can be used to encapsulate relevant reoccurring particle tasks as e.g. the
particle-lattice-coupling. After its initial instantiation by providing the access to relevant par-
ticle, lattice and set-up specific data, its execute() method can be called with the respective
tasks specified as template arguments in the desired order. The individual tasks (included
in particleTasks.h) provide an execute() method as well and a parameter set specifying

the coupling type and the potential embedding into a loop over all available particles. The
ParticleManager also takes care of combining respective tasks into a single particle loop.
When using a domain decomposition, the particle core distribution has to be updated in
every time step. The use of the ParticleManager in the bifurcation3d example (Section 8.7.1)
then looks as follows:
1 ParticleManager<T,DESCRIPTOR,PARTICLETYPE> particleManager(
2 superParticleSystem, superGeometry, superLattice, converter);

Listing 4.27: Initialization of a ParticleManager

1 particleManager.execute<
2 couple_lattice_to_particles<T,DESCRIPTOR,PARTICLETYPE>,
3 process_dynamics<T,PARTICLETYPE>,
4 update_particle_core_distribution<T,PARTICLETYPE>
5 >();

Listing 4.28: Execution of the ParticleManager

4.6.4. Resolved Lattice Interaction


In the directory resolved, all surface-resolved specific functionality is bundled. The
blockLatticeInteraction.h (header only) and blockLatticeInteraction.hh
files consist of five functions. All of those functions are needed to calculate and check the posi-
tion of the particles inside the geometry. For example, the checkSmoothIndicatorOutOfGeometry
function checks whether every part of the particle lies inside the domain; if the particle is partly
outside of the geometry, its position needs to be changed. Another important function is
setBlockParticleField, where all cells that lie inside the particle are set as a particle field.
Analogously to blockLatticeInteraction.hh, a superLatticeInteraction.hh
exists; in this file, setBlockParticleField is lifted to the super lattice structure by the
function setSuperParticleField. The file momentumExchangeForce.h provides functions to
calculate hydrodynamic forces on the particle's surface via an adapted momentum exchange
algorithm. The file smoothIndicatorInteraction.h is needed for the simulation of the
area directly at the surface of the particles.

4.6.5. Particle Descriptors


The first file is particleDescriptorAlias.h, in which aliases are defined for the dif-
ferent types of PARTICLETYPE. In the example settlingCube3d, right at the beginning,
the PARTICLETYPE ResolvedParticle3D is chosen. After the choice of this alias, the dynam-
ics and other main properties are set. Other possible particle types that can be chosen are
ResolvedParticle2D or ResolvedSphere3D.
For the bifurcation3d example (Section 8.7.1), a respective descriptor is chosen:

1 typedef D3Q19<> DESCRIPTOR;
2 typedef SubgridParticle3D PARTICLETYPE;

Listing 4.29: Particle type and descriptor

4.6.6. Particle Dynamics


Another important part of the particle system is the dynamics. The corresponding files define
these properties for the chosen particle type. For example, in particleDynamics.h and
particleDynamics.hh all dynamics functions for the particle type are implemented; all
information for the calculation of dynamic quantities, e.g. acceleration or angular acceleration,
can be found there. Those functions are called in the main part of a simulation, e.g. in the
example settlingCube3d (VerletParticleDynamics) in Listing 4.24, where the dynamics are
assigned to the particle type.
In the example bifurcation3d (Section 8.7.1), different capture methods can be used by
choosing the respective setting beforehand:
1 //Set capture method:
2 // materialCapture: based on material number
3 // wallCapture: based on more accurate stl description
4 typedef enum {materialCapture, wallCapture} ParticleDynamicsSetup;
5 const ParticleDynamicsSetup particleDynamicsSetup = wallCapture;

Listing 4.30: Enum ParticleDynamicsSetup and example initialization

If the wallCapture is chosen, a SolidBoundary object has to be created and passed to the re-
spective dynamics:
1 STLreader<T> stlReader( "../bifurcation3d.stl", converter.getConversionFactorLength() );
2 IndicatorLayer3D<T> extendedDomain( stlReader, converter.
getConversionFactorLength() );
3 // Create solid wall
4 const unsigned latticeMaterial = 2; //Material number of wall
5 const unsigned contactMaterial = 0; //Material identifier (only relevant for contact
model)
6 SolidBoundary<T,3> wall( std::make_unique<IndicInverse<T,3>>(stlReader),
7 latticeMaterial, contactMaterial );

Listing 4.31: Solid wall creation

When using the materialCapture instead, a MaterialIndicator is necessary to identify material
numbers that initialize the capture treatment.
1 std::vector<int> materials {2,4,5};
2 SuperIndicatorMaterial<T,3> materialIndicator (superGeometry, materials);

Listing 4.32: Initialization of a SuperIndicatorMaterial

Both SolidBoundary and MaterialIndicator are then used in the function prepareParticles to
define the chosen dynamics:
1 if (particleDynamicsSetup==wallCapture){
2 //Create verlet dynamics with material aware wall capture
3 superParticleSystem.defineDynamics<
4 VerletParticleDynamicsMaterialAwareWallCapture<T,PARTICLETYPE>>(
5 wall, materialIndicator);
6 } else {
7 //Create verlet dynamics with material capture
8 superParticleSystem.defineDynamics<
9 VerletParticleDynamicsMaterialCapture<T,PARTICLETYPE>>(materialIndicator);
10 }

Listing 4.33: Creation of verlet dynamics

4.6.7. Particle Functions


In the functions-directory, additional free functions are defined. These functions are callable
anywhere in the code. The first set of files including particleCreatorFunctions.h,
particleCreatorFunctions2D.h, particleCreatorFunctions3D.h and
particleCreatorHelperFunctions.h concentrate on functions concerning the cre-
ation of particles with different types of surface structures. These functions are therefore
called first to create particles in the desired shape. In the example settlingCube3d, the
function addResolvedCuboid3D is called and creates a particle in the shape of a cuboid. Other
geometries, like circles in 2D or cylinders in 3D, can be created as well; all of those functions are
implemented in these files.
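For instance, a resolved spherical particle might be added analogously to the cuboid in Listing 4.24; the exact signature should be checked in particleCreatorFunctions3D.h:

creators::addResolvedSphere3D( particleSystem, sphereCenter, sphereRadius,
                               epsilon, sphereDensity );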
The file particleMotionFunctions.h concentrates on the main algorithms for solving
the equations of motion. Two functions exist, using different integration types: the velocity
Verlet algorithm (velocityVerletIntegration) and Euler integration (eulerIntegration). The former
is used in the VerletParticleDynamics class (Section 4.6.6) and is therefore called in
the main part of the example. Often two functions for the same calculation exist, as they need to
match the dimension of the problem; those are differentiated by partial template specialization.
The file particleDynamicsFunctions.h also contains other important functions for simulating
particle flows. Tasks such as couple_lattice_to_particles or couple_resolved_particles_to_lattice
are included in particleTasks.h. Both functions are used in the main loop of the example to
realize a two-way coupling.
To sum up, many of the most important functions for the simulation of particle flows
are implemented in particleDynamicsFunctions.h. Other functions, e.g. concerning
the calculation of the rotation of the particle body, are implemented in
bodyMotionFunctions.h. The file particleIoFunctions.h contains functions to print the output of the cal-
culation to the console. It consists of two important functions (printResolvedParticleInfo and

printResolvedParticleInfoSimple), which are used in the getResults-function. The getResults-
function is called at the end of the main part of every simulation.

4.6.8. Discrete Contact Model for Surface Resolved Particles


In order to simulate particulate flows accurately, it is often necessary to incorporate a contact
model. The here used discrete contact model [31] allows for the treatment of particle-particle
and particle-wall interactions, enabling the calculation of contact forces and their application to
the particles. The discrete contact model consists of several steps that are integrated into the
general algorithm. Let’s discuss each step in detail:

• Rough contact detection during coupling: During the coupling stage, where particle information
is transferred to the fluid lattice, a rough contact detection mechanism is employed.
This step identifies potential contact regions: potential contacts are determined by identifying
particles that couple to the same cell.

• Communication of found contacts: Once the potential contact regions are identified, the in-
formation regarding the found contact is communicated across all processes.

• Correction of contact bounding box: To improve the accuracy of the contact treatment, the
contact bounding boxes are refined based on the information obtained during the commu-
nication step. This correction step helps in precisely defining the contact regions, ensuring
that the subsequent calculations consider the actual contacts.

• Determination of contact properties: With the refined contact bounding boxes, the discrete
contact model determines various contact properties. These properties include the contact
volume, contact point, contact normal and other relevant parameters.

• Calculation of contact force and application to particles: Using the contact properties, the con-
tact force is calculated from the parameters determined before and applied to the particles
so that it’s available when solving the equations of motion.

• Removal of empty contact objects (optional): After the contact forces have been determined
and applied, empty contact objects, which no longer represent an existing contact, may
be removed. This step helps in optimizing the computational efficiency by eliminating
unnecessary iterations.

The usage within OpenLB is exemplified by dkt2d. First, we set types for the particle-particle
and particle-wall interactions, which define how the contact is treated. This is represented in
Listing 4.34.
1 typedef ParticleContactArbitraryFromOverlapVolume<T, DESCRIPTOR::d, true>
PARTICLECONTACTTYPE;
2 typedef WallContactArbitraryFromOverlapVolume<T, DESCRIPTOR::d, true> WALLCONTACTTYPE;

Listing 4.34: Contact types

Additionally, we define the walls for the contact treatment, as shown in Listing 4.35. Here,
we create a SolidBoundary from an indicator and specify the minimal and maximal coordinates
as well as the material number that represents the wall on the lattice and the material identifier,
which defines the wall's mechanical properties.
1 std::vector<SolidBoundary<T, DESCRIPTOR::d>> solidBoundaries;
2 solidBoundaries.push_back( SolidBoundary<T, DESCRIPTOR::d>(
3 std::make_unique<IndicInverse<T, DESCRIPTOR::d>>(
4 cuboid,
5 cuboid.getMin() - 5 * converter.getPhysDeltaX(),
6 cuboid.getMax() + 5 * converter.getPhysDeltaX()),
7 wallLatticeMaterialNumber,
8 wallContactMaterial));

Listing 4.35: Solid boundaries

Similarly, we set a number that relates the particles to mechanical properties, see Listing 4.36.
1 for (std::size_t iP = 0; iP < particleSystem.size(); ++iP) {
2 auto particle = particleSystem.get(iP);
3 setContactMaterial(particle, particleContactMaterial);
4 }

Listing 4.36: Particle material number

To store contact objects later on, we create an empty ContactContainer as shown in List-
ing 4.37.
1 ContactContainer<T, PARTICLECONTACTTYPE, WALLCONTACTTYPE> contactContainer;

Listing 4.37: Contact container

By creating a lookup table ContactProperties (Listing 4.38) that contains constant parameters
which depend solely on the material combination, we save computational effort. The properties
must be set for each material combination separately.
1 ContactProperties<T, 1> contactProperties;
2 contactProperties.set(particleContactMaterial,
3 wallContactMaterial,
4 evalEffectiveYoungModulus(
5 youngsModulusParticle,
6 youngsModulusWall,
7 poissonRatioParticle,
8 poissonRatioWall),
9 coefficientOfRestitution,
10 coefficientKineticFriction,
11 coefficientStaticFriction);

Listing 4.38: Contact properties

Finally, we process the contacts as shown in Listing 4.39.


1 processContacts<T, PARTICLETYPE, PARTICLECONTACTTYPE, WALLCONTACTTYPE,
ContactProperties<T, 1>>(
2 particleSystem, solidBoundaries,
3 contactContainer, contactProperties,
4 superGeometry, contactBoxResolutionPerDirection);

Listing 4.39: Contact processing

4.6.9. Sub-grid Legacy Framework


In this Section the use of Lagrangian particles with the legacy framework is shown. Due to
similar naming of classes and functions in the new common framework, it is worth noting that
all terms are primarily referring to the naming convention used in the legacy framework itself
and should not be mixed up with those of the new one.
Similar to the BlockLattice and SuperLattice structure, a ParticleSystem3D and
SuperParticleSystem3D structure exists. In line 2 of Listing 4.40 the SuperParticleSystem3D is
instantiated, taking a SuperGeometry as a parameter. In line 4 the SuperParticleSysVtuWriter is
instantiated. It takes the SuperParticleSystem3D, a filename as string, and the wanted particle
properties as arguments. Calling the function SuperParticleSysVtuWriter.write(int timestep)
creates .vtu files of the particles positions for the given timestep. These files can be visualized
with Paraview.
Line 10 of the listing instantiates an interpolation functor for the fluid's velocity, which is used
in line 12 during the instantiation of StokesDragForce3D. Particles need boundary conditions as
well. In the listing, the simplest possible material boundary is presented. If a particle moves
into a lattice node with material number 2, 4 or 5, its velocity is set to 0 and it is neglected during
further computations; its state of activity is set to false. This MaterialBoundary3D is instantiated
in line 16. In lines 18 and 19 the force and boundary condition are added to and stored in the
respective lists in the SuperParticleSystem3D.
The actual number crunching is then performed in line 25, which is positioned in the main
loop of the program. The function supParticleSystem.simulate(T timeStep) integrates the particle
trajectories by timeStep. To this end, all stored particle forces are computed and summed up,
the particles are moved one step according to Newton's laws, and then all stored particle boundary
conditions are applied. Parallelization of the particles is achieved automatically.
Results of this simulation are published in Henn et al. [16].
1 // SuperParticleSystems3D
2 SuperParticleSystem3D<T,PARTICLE> supParticleSystem(superGeometry);

3 // define which properties are to be written in output data
4 SuperParticleSysVtuWriter<T,PARTICLE> supParticleWriter(supParticleSystem, "particles",
5 SuperParticleSysVtuWriter<T,PARTICLE>::particleProperties::velocity |
6 SuperParticleSysVtuWriter<T,PARTICLE>::particleProperties::mass |
7 SuperParticleSysVtuWriter<T,PARTICLE>::particleProperties::radius |
8 SuperParticleSysVtuWriter<T,PARTICLE>::particleProperties::active);
9
10 SuperLatticeInterpPhysVelocity3D<T,DESCRIPTOR> getVel(sLattice, converter);
11
12 auto stokesDragForce = make_shared<StokesDragForce3D<T,PARTICLE,DESCRIPTOR>> (getVel,
converter);
13
14 // material numbers where particles should be reflected
15 std::set<int> boundMaterial = { 2, 4, 5};
16 auto materialBoundary = make_shared<MaterialBoundary3D<T, PARTICLE>> (superGeometry,
boundMaterial);
17
18 supParticleSystem.addForce(stokesDragForce);
19 supParticleSystem.addBoundary(materialBoundary);
20 supParticleSystem.setOverlap(2. * converter.getPhysDeltaX());
21
22 /* ... */
23
24 main loop {
25 supParticleSystem.simulate(converter.getPhysDeltaT());
26 }

Listing 4.40: Usage of class SuperParticleSystem3D

4.6.9.1. Interpolation of Fluid Velocity

As the particle position X : I → Ω moves in the continuous domain Ω and information on the
fluid velocity can only be computed on lattice nodes x_i ∈ Ω_h, interpolation of the fluid velocity
is necessary every time fluid-particle forces are computed. Let u^F_i = u^F(x_i) be the computed
solution of the Navier–Stokes equation at the lattice nodes x_i. Let p ∈ P_n be the interpolating
polynomial of order n with p(x_i) = u^F_i and (x_0, . . . , x_n) the smallest interval containing all
points in the brackets. Furthermore, let C^n[a, b] be the vector space of continuous functions
that have continuous first n derivatives in [a, b]. Then the interpolation error of the polynomial
interpolation is stated by the following theorem.
Theorem 1 (Interpolation error). Let u ∈ C^{n+1}[a, b], a, b ∈ Ω. Then for every x ∈ [a, b] there
exists one \hat{x} ∈ (x_0, . . . , x_n, x), such that

u^F(x) - p_n(x) = \frac{d_x^{n+1} u^F(\hat{x})}{(n + 1)!} \prod_{j=0}^{n} (x - x_j)   (4.59)

holds.
Proof. See Rannacher [124, Satz 2.3].

Using linear (n = 1) interpolation for the fluid velocity between two neighbouring lattice
nodes a = x_0 ∈ Ω_h, b = x_1 ∈ Ω_h, ∥x_1 - x_0∥_2 = h, clearly the following holds:

u^F(x) - p_1(x) = \frac{1}{2} d_x^2 u^F(\hat{x}) (x - x_0)(x - x_1)   (4.60)
                \leq \frac{1}{2} d_x^2 u^F(\hat{x}) \, h^2   (4.61)

and the approximation error of the linear interpolation is of order O(h2 ). In the following we
give reason why this order of interpolation is sufficient.
Let us assume there exists an ideal error law of the form

∥u^F_h - u^{F*}∥_{L^2(Ω_h)} = c \, h^α ,

for the discrete solution u^F_h obtained by an LBM with lattice spacing h and the analytic solution
u^{F*}. Then α ∈ ℝ^+ is the order of convergence to be determined. We further define the relative
error

Err_h = \frac{∥u^F_h - u^{F*}∥_{L^2(Ω_h)}}{∥u^{F*}∥_{L^2(Ω_h)}} .
The ratio of the error laws of two distinct lattice spacings h_i and h_j forms the EOC as

EOC_{i,j} = \frac{\ln(Err_{h_i} / Err_{h_j})}{\ln(h_i / h_j)} .   (4.62)

With this, Krause [0, Chapter 2.3] determines an EOC ≈ 2 for the discrete solution towards
the analytic solution of a stationary flow in the unit cube governed by the incompressible NSE.
Therefore, the order of convergence of the fluid velocity obtained by an LBM can be assumed to be
O(h^2). This conclusion is backed up by the theoretical results obtained by [89]. This leads to the
assumption that any interpolation scheme of order higher than 2 would not be exhausted, as
the error of the incoming data is too large.
The interpolation is implemented as a trilinear interpolation using the eight nodes surrounding
the particle. Let the point of interpolation \hat{x} ∈ [x_{(0,0,0)}, x_{(1,1,1)}] be in the cube spanned by
the lattice nodes x_{(0,0,0)} and x_{(1,1,1)}, see Figure 4.1 for an illustration. We will denote by

d = (d_0, d_1, d_2)^T = \hat{x} - x_{(0,0,0)}   (4.63)

the distance of the particle to the next smaller lattice node. The fluid velocities at the eight
corners are named accordingly u_{(i,j,k)}, i, j, k ∈ {0, 1}. The trilinear interpolation is executed by
three consecutive linear interpolations in the three different space directions.

Figure 4.1.: Trilinear interpolation.

First, we interpolate along the x-axis

u(d,0,0) = u(0,0,0) (h − d0 ) + u(1,0,0) d0 (4.64)


u(d,1,0) = u(0,1,0) (h − d0 ) + u(1,1,0) d0 (4.65)
u(d,0,1) = u(0,0,1) (h − d0 ) + u(1,0,1) d0 (4.66)
u(d,1,1) = u(0,1,1) (h − d0 ) + u(1,1,1) d0 (4.67)

followed by interpolation along the y-axis

u(d,d,0) = u(d,0,0) (h − d1 ) + u(d,1,0) d1 (4.68)


u(d,d,1) = u(d,0,1) (h − d1 ) + u(d,1,1) d1 (4.69)

and finally in direction of the z-axis

u(\hat{x}) = u_{(d,d,d)} = u_{(d,d,0)} (h - d_2) + u_{(d,d,1)} d_2 .   (4.70)
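A self-contained sketch of this interpolation (not the OpenLB implementation) that follows Eqs. (4.64)-(4.70) literally could look as follows; note that the weights (h - d_i) and d_i are only normalised for h = 1, otherwise each linear step would additionally have to be divided by h:

#include <array>

// u[i][j][k] holds the fluid velocity at corner x_(i,j,k); d is the offset of the
// interpolation point from x_(0,0,0); h is the lattice spacing.
template <typename T>
std::array<T,3> trilinearVelocity(const std::array<T,3> u[2][2][2],
                                  const std::array<T,3>& d, T h)
{
  std::array<T,3> result;
  for (int c = 0; c < 3; ++c) {
    // Eqs. (4.64)-(4.67): interpolation along the x-axis
    T u00 = u[0][0][0][c]*(h - d[0]) + u[1][0][0][c]*d[0];
    T u10 = u[0][1][0][c]*(h - d[0]) + u[1][1][0][c]*d[0];
    T u01 = u[0][0][1][c]*(h - d[0]) + u[1][0][1][c]*d[0];
    T u11 = u[0][1][1][c]*(h - d[0]) + u[1][1][1][c]*d[0];
    // Eqs. (4.68)-(4.69): interpolation along the y-axis
    T u0 = u00*(h - d[1]) + u10*d[1];
    T u1 = u01*(h - d[1]) + u11*d[1];
    // Eq. (4.70): interpolation along the z-axis
    result[c] = u0*(h - d[2]) + u1*d[2];
  }
  return result;
}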

4.6.9.2. Class SuperParticleSystem3D

The implementation of the particle phase follows a hierarchical ansatz, similar to the Cell →
BlockLattice3D → SuperLattice ansatz used for the implementation of the LBM. The equiv-
alent classes in the context of Lagrangian particles are Particle3D → ParticleSystem3D →
SuperParticleSystem3D. The class Particle3D allocates memory for the variables of one single
particle, such as its position, velocity, mass, radius and the force acting on it. It also provides
the function bool getActive(), which returns the active state of the particle. Active particles’
positions are updated during the simulation, in contrast to non-active particles, which are only
used for particle-particle interaction. The class Particle3D is intended to be inherited from, in
order to provide additional properties, such as electric or magnetic charge. The particles in the

domain of a specific BlockLattice3D are combined in the class ParticleSystem3D. Finally the
class SuperParticleSystem3D combines all ParticleSystem3Ds, and handles the transfer of parti-
cles between them.
The concept of the class SuperParticleSystem3D is to provide an easily adaptable framework
for simulation of a large number of particles arranged in and interacting with a fluid. In
this context easily adaptable means that simulated forces and boundary conditions are imple-
mented in a modular manner, such that they are easily exchangeable. Development of new
forces and boundary conditions can be readily done by inheritance of provided base classes.
Particle-particle interaction can be activated if necessary and deactivated to decrease simu-
lation time. The contact detection algorithm is interchangeable. This section introduces the
SuperParticleSystem3D and the mentioned properties in more detail.
The class SuperParticleSystem3D is initialised by a call to the constructor simultaneously on
all PUs.

1 SuperParticleSystem3D(CuboidGeometry3D<T>& cuboidGeometry, LoadBalancer<T>&
       loadBalancer, SuperGeometry<T,3>& superGeometry);

During the construction each PU instantiates one ParticleSystem3D for each local cuboid. Sub-
sequently for each ParticleSystem3D a list of the ranks of PUs holding neighbouring cuboids is
created.
Particles can be added to the SuperParticleSystem3D by a call to one of the addParticle()
functions.

1 /// Add a Particle to SuperParticleSystem


2 void addParticle(PARTICLETYPE<T> &p);
3 /// Add a number of identical Particles equally distributed in a given IndicatorF3D
4 void addParticle(IndicatorF3D<T>& ind, T mas, T rad, int no=1, std::vector<T> vel
={0.,0.,0.});
5 /// Add a number of identical Particles equally distributed in a given Material
Number
6 void addParticle(std::set<int> material, T mas, T rad, int no=1, std::vector<T> vel
={0.,0.,0.});
7 /// Add Particles form a File. Save using saveToFile(std::string name)
8 void addParticlesFromFile(std::string name, T mass, T radius);

Currently there are four overloads of this function. The first adds single predefined par-
ticles, the second and third add a given number of equally distributed particles of the same
mass and radius in an area that can be defined by either a set of material numbers or an in-
dicator function. The initial particle velocity can be set optionally. Finally particles can be
added from an external file, containing their positions. In all cases the assignment to the correct
ParticleSystem3D is carried out internally.
Particle forces and boundaries are implemented by the base classes Force3D and Boundary3D.

1 template<typename T, template<typename U> class PARTICLETYPE>
2 class Force3D {
3 public:
4 Force3D();
5 virtual void applyForce(typename std::deque<PARTICLETYPE<T> >::iterator p, int pInt
, ParticleSystem3D<T, PARTICLETYPE>& psSys)=0;
6 }

Both classes are intended to be derived from in order to implement force and boundary
specialisations. The key functions in both classes are applyForce() and applyBoundary(), which
are called during each timestep of the main LBM loop. Force3D and Boundary3D specialisations
are added to the SuperParticleSystem3D by passing a pointer to a class instantiation via a call to
the respective function.

1 /// Add a force to system


2 void addForce(std::shared_ptr<Force3D<T, PARTICLETYPE> > f);
3 /// Add a boundary to system
4 void addBoundary(std::shared_ptr<Boundary3D<T, PARTICLETYPE> > b);

Both functions add the passed pointer to a list of forces and boundaries, which will be looped
over during the simulation step. If necessary a contact detection algorithm can be added.

1 /// Set contact detection algorithm for particle-particle contact.


2 void setContactDetection(ContactDetection<T, PARTICLETYPE>& contactDetection);

A force based on the contact between two particles is the contact force as described in the
theory of Hertz and others; it is named here HertzMindlinDeresiewicz3D.

1 auto hertz = make_shared < HertzMindlinDeresiewicz3D<T, PARTICLE, DESCRIPTOR>


2 > (0.0003e9, 0.0003e9, 0.499, 0.499);
3 spSys.addForce(hertz);

Finally one timestep is computed by a call to the function simulate().

1 template<typename T, template<typename U> class PARTICLETYPE>


2 void SuperParticleSystem3D<T, PARTICLETYPE>::simulate(T dT)
3 {
4 for (auto pS : _pSystems) {
5 pS->_contactDetection->sort();
6 pS->simulate(dT);
7 pS->computeBoundary();
8 }
9 updateParticleDistribution();
10 }

This function contains a loop over the local ParticleSystem3Ds calling the lo-
cal sorting algorithm and the functions ParticleSystem3D::simulate() and
ParticleSystem3D::computeBoundary(). The sorting algorithm determines po-
tential contact between particles according to the set ContactDetection.

1 inline void simulate(T dT) {
2 _pSys->computeForce();
3 _pSys->explicitEuler(dT);
4 }

The inline function ParticleSystem3D::simulate() first calls the local function ParticleSystem3D
::computeForce().

1 template<typename T, template<typename U> class PARTICLETYPE>


2 void ParticleSystem3D<T, PARTICLETYPE>::computeForce()
3 {
4 typename std::deque<PARTICLETYPE<T> >::iterator p;
5 int pInt = 0;
6 for (p = _particles.begin(); p != _particles.end(); ++p, ++pInt) {
7 if (p->getActive()) {
8 p->resetForce();
9 for (auto f : _forces) {
10 f->applyForce(p, pInt, *this);
11 }
12 }
13 }
14 }

This function consists of a loop over all particles stored by the calling
ParticleSystem3D. If the particle state is active, its force variable is reset to zero. Then
the value computed by each previously added particle force is added to the particle’s force
variable. Finally, the particle velocity and position is updated by one step of an integration
method.
Returning to the function SuperParticleSystem3D::simulate(T dT) the next command in
the loop is a call of the function ParticleSystem3D::computeBoundary(), which has the
same structure as the ParticleSystem3D::computeForce(). After executing the loop, the
function updateParticleDistribution() is called, which redistributes the particles over the
ParticleSystem3Ds according to their updated position. A detailed description of this function
is provided at the end of the next section.

4.6.9.3. Implementation of the Communication Optimal Strategy

The communication optimal strategy is implemented in the function SuperParticleSystem3D::


updateParticleDistribution() already mentioned above. The function has to be called after
every update of the particle positions, in order to check if the particle remained in its current
cuboid, as otherwise segmentation faults may occur during the computation of particle forces.
The transfer is implemented using nonblocking operations of the MPI library.

1 template<typename T, template<typename U> class PARTICLETYPE>


2 void SuperParticleSystem3D<T, PARTICLETYPE>::updateParticleDistribution()
3 {
4 /* Find particles on wrong cuboid, store in relocate and delete */
5 //maps particles to their new rank

6 _relocate.clear();
7 for (unsigned int pS = 0; pS < _pSystems.size(); ++pS) {
8 auto par = _pSystems[pS]->_particles.begin();
9 while (par != _pSystems[pS]->_particles.end()) {
10 //Check if particle is still in his cuboid
11 if (checkCuboid(*par, 0)) {
12 par++;
13 }
14 //If not --> find new cuboid
15 else {
16 findCuboid(*par, 0);
17 _relocate.insert(
18 std::make_pair(this->_loadBalancer.rank(par->getCuboid()), (*par)));
19 par = _pSystems[pS]->_particles.erase(par);
20 }
21 }
22 }

The function begins with two nested loops. The outer loop runs over all local
ParticleSystem3Ds, the inner loop over the Particle3Ds of the current ParticleSystem3D.
Each particle is checked by the function checkCuboid(*par, 0) to determine whether it remained
in its cuboid during the last update. The first parameter of checkCuboid(*par, 0) is the particle to
be tested and the second parameter is an optional spatial extension of the cuboid. If the
function returns true, the iterator is incremented and the next particle is tested. If the function
returns false, the particle together with the rank of its new cuboid is copied to the
std::multimap<int, PARTICLETYPE<T> > _relocate for future treatment and removed from the
std::deque<PARTICLETYPE<T> > _particles of particles.

1 /* Communicate number of Particles per cuboid*/


2 singleton::MpiNonBlockingHelper mpiNbHelper;
3
4 /* Serialise particles */
5 _send_buffer.clear();
6 T buffer[PARTICLETYPE<T>::serialPartSize];
7 for (auto rN : _relocate) {
8 rN.second.serialize(buffer);
9 _send_buffer[rN.first].insert(_send_buffer[rN.first].end(), buffer, buffer+
PARTICLETYPE<T>::serialPartSize);
10 }

The function continues by instantiating the class singleton::MpiNonBlockingHelper, which
handles memory for MPI_Request and MPI_Status messages. Then the particles buffered in
_relocate are serialized, meaning that their data is written consecutively into memory and stored
in the buffer std::map<int, std::vector<double> > _send_buffer in preparation for the transfer.
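Conceptually, serialization flattens the particle state into a buffer of length serialPartSize. The following sketch assumes a reduced particle consisting only of position, velocity and mass; the real Particle3D stores further fields:

#include <array>

// write position, velocity and mass consecutively into a flat buffer
// (serialPartSize would be 7 for this reduced particle)
template<typename T>
void serializeParticle(const std::array<T,3>& position, const std::array<T,3>& velocity,
                       T mass, T* buffer)
{
  for (int d = 0; d < 3; ++d) {
    buffer[d]     = position[d];
    buffer[3 + d] = velocity[d];
  }
  buffer[6] = mass;
}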

1 /*Send Particles */
2 int noSends = 0;
3 for (auto rN : _rankNeighbours) {
4 if (_send_buffer[rN].size() > 0) {
5 ++noSends;
6 }
7 }
8 mpiNbHelper.allocate(noSends);
9 for (auto rN : _rankNeighbours) {
10 if (_send_buffer[rN].size() > 0) {
11 singleton::mpi().iSend<double>(&_send_buffer[rN][0], _relocate.count(rN)*
PARTICLETYPE<T>::serialPartSize, rN, &mpiNbHelper.get_mpiRequest()[k++],
1);
12 }
13 }
14 singleton::mpi().barrier();

To find the number of send operations, a loop over the ranks of neighbouring cuboids is car-
ried out, increasing the counter noSends each time data for a specific rank is available. Then the
appropriate number of MPI_Requests is allocated. Finally, the data is sent to the respective PUs
via nonblocking MPI_Isend() calls and all PUs wait until the send process is finished on each PU.
1 /*Receive and add particles*/
2 int flag = 0;
3 MPI_Iprobe(MPI_ANY_SOURCE, 1, MPI_COMM_WORLD, &flag, MPI_STATUS_IGNORE);
4 if (flag) {
5 for (auto rN : _rankNeighbours) {
6 MPI_Status status;
7 int flag = 0;
8 MPI_Iprobe(rN, 1, MPI_COMM_WORLD, &flag, &status);
9 if (flag) {
10 int amount = 0;
11 MPI_Get_count(&status, MPI_DOUBLE, &amount);
12 T recv_buffer[amount];
13 singleton::mpi().receive(recv_buffer, amount, rN, 1);
14 for (int iPar=0; iPar<amount; iPar+=PARTICLETYPE<T>::serialPartSize) {
15 PARTICLETYPE<T> p;
16 p.unserialize(&recv_buffer[iPar]);
17 if (singleton::mpi().getRank() == this->_loadBalancer.rank(p.getCuboid()))
{
18 _pSystems[this->_loadBalancer.loc(p.getCuboid())]->addParticle(p);
19 }
20 }
21 }
22 }
23 }
24 if (noSends > 0) {
25 singleton::mpi().waitAll(mpiNbHelper);
26 }
27 }

On the receiving side, the nonblocking routine MPI_Iprobe() checks whether an incoming trans-
mission is available. The constant MPI_ANY_SOURCE indicates that messages from all ranks are
accepted. If a message is awaiting reception, the variable flag is set to a nonzero value and the
following branch is entered. This query is not strictly necessary, but it allows the following loop
to be skipped entirely if no particles are transferred, which is expected to be the case most of the time.
The subsequent loop tests for each neighbouring rank whether a message awaits reception. If
so, the number of sent MPI_DOUBLEs is read from the status variable via MPI_Get_count().
The appropriate memory is allocated, and the message is received by a wrapped call to MPI_Recv()
and written consecutively into the buffer. Then new Particle3Ds are instantiated, initialised with the re-
ceived data and assigned to the respective ParticleSystem3D on the receiving PU. Finally, a call
to MPI_Waitall() makes sure that all MPI_Isend()s have been processed by the recipients.

Figure 4.2.: Overlap of the particle domains. Particles within a distance of the sum of the two
largest radii to a neighbour cuboid have to be transferred to this specific neighbour
cuboid.

4.6.9.4. Shadow Particles

If particle collisions are considered, it may happen that particles Pm with centre Xm ∈ Ω̃j
collide with particles Pn with centre Xn ∈ Ω̃k in a different cuboid, as illustrated in Figure 4.2.
Therefore Pn has to be known on Ω̃j and so-called shadow particles are introduced.
Shadow particles are static particles whose positions and velocities are not explicitly computed
during the update step. Particle collisions across cuboid boundaries can only occur if the distance
d = ∥Xn − Xm∥2 between the participating particles is less than the sum of the two largest radii
of all particles in the system. Hence the width of the particle overlap has to be at least the sum
of the two largest particle radii, and all particles within this overlap have to be transferred to the
neighbour cuboid after each update of the particle positions by an additional communication
step similar to the one introduced above.
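A minimal sketch of this transfer criterion, assuming the distance of a particle centre to the cuboid boundary is already known (this is not the actual OpenLB code):

// a particle is copied as a shadow particle to a neighbouring cuboid if its centre
// lies within the overlap width, i.e. the sum of the two largest particle radii
template<typename T>
bool needsShadowCopy(T distanceToBoundary, T largestRadius, T secondLargestRadius)
{
  return distanceToBoundary < largestRadius + secondLargestRadius;
}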

5. Initial and Boundary Conditions
Each example located in examples/ is typically structured along several function definitions
and a main part. The function setBoundaryValues usually sets initial values for density and
velocity (if not yet done in prepareLattice). Boundary values can be refreshed at certain time
steps. In some applications, this function may be missing or empty, if there is no temporal
change in the boundary values. An exemplary implementation can be found in examples/
laminar/poiseuille2d, where a smooth start-up is used at the velocity inflow boundary as
follows:
1 // No of time steps for smooth start-up
2 int iTmaxStart = converter.getLatticeTime( maxPhysT*0.4 );
3 int iTupdate = 5;
4
5 if (iT%iTupdate == 0 && iT <= iTmaxStart) {
6
7 // Smooth start curve, polynomial
8 PolynomialStartScale<T,T> StartScale(iTmaxStart, T(1));
9 // Creates and sets the Poiseuille inflow profile using functors
10 T iTvec[1] = {T( iT )};
11 T frac[1] = {};
12 StartScale( frac,iTvec );
13 T maxVelocity = converter.getCharLatticeVelocity()*3./2.*frac[0];
14 T distance2Wall = L/2.;
15 Poiseuille2D<T>
16 poiseuilleU(superGeometry, 3,
17 maxVelocity, distance2Wall);
18 sLattice.defineU(superGeometry, 3,
19 poiseuilleU);
20 }

5.1. Define Boundary Method


There are two different types of boundaries, namely wet-node boundaries (on the nodes)
and link-wise boundaries (in between the nodes). Examples of wet-node boundaries are
LocalVelocity, LocalPressure, InterpolatedPressure, InterpolatedVelocity, ZouHePressure
, ZouHeVelocity and AdvectionDiffusion. Examples of link-wise boundaries are Bouzidi,
BouzidiVelocity, BouzidiZeroVelocity (+Interpolation) and YuPostProcessor (+Interpolation
). The boundary declarations are usually designed as in the following code snippets:
1 setBounceBackBoundary(superLattice, superGeometry, materialNumber);

2 boundary::set<boundary::LocalVelocity>(superLattice, superGeometry, materialNumber);
3 setBouzidiBoundary(superLattice, superGeometry, materialNumber, indicator);

OpenLB offers implementations of several methods for approximating macroscopic bound-
ary conditions. Some of the frequently used boundary methods are listed below and described
in terms of locality of operations and general applicability. This list does not claim to be
complete. OpenLB includes the following boundary methods, which are callable with set
...Boundary and the respective arguments.

1. LocalPressure, LocalVelocity, LocalConvection, etc.:


• local
• wet-node
• Application: boundary for fluid flows
• regularized boundary, re-computes all fi from local moments, reconstructing the off-
equilibrium parts
• stable in most regimes
• implemented according to Latt and Chopard [0]

2. InterpolatedPressure, InterpolatedVelocity, etc.:


• non-local
• wet-node
• Application: boundary for fluid flows (stable for higher Reynolds numbers)
• re-computes all fi using a finite-difference scheme over adjacent cells for the velocity
gradient
• implemented according to Skordos [129]

3. ZouHePressure, ZouHeVelocity, etc.:


• local
• wet-node
• Application: boundary for fluid flows (low Reynolds number)
• computes missing fi by applying symmetry conditions on the off-equilibrium part,
enforcing velocity and pressure on the equilibrium
• highly accurate, but less stable
• implemented according to Zou and He [138]

4. Bouzidi, BouzidiVelocity, etc.:


• non-local
• link-wise
• Application: boundary for fluid flows (for curved-boundaries)
• computes missing fi using a bounce-back rule which takes the distance between
node and boundary into account
• second order accuracy in space

• implemented according to Bouzidi et al. [85]

5. AdvectionDiffusionTemperature, etc.:
• local and non-local
• wet-node
• Application: boundary for advection-diffusion problems (temperature, particle, ..)
• various boundary conditions for advection diffusion problems
• type of implementation differs for each condition
• for details see e.g. Trunk et al. [48]

Typical macroscopic effects that can be obtained with the wet-node methods are, for example,
Dirichlet boundary conditions for velocity and pressure; the link-wise methods provide, for example,
Dirichlet boundary conditions for a prescribed velocity, zero velocity, convection and slip walls.
With the advection-diffusion methods, macroscopic boundary conditions in terms of Dirichlet
boundaries with respect to temperature, convection, zero distribution, or an external field are
recoverable.

5.1.1. Wet-node Method


With the wet-node approach, for example Dirichlet-type boundaries for the velocity can be
obtained. On a macroscopic level, this is used for example at inflow boundaries where the
values for the inflow velocity are given.
1 boundary::set<boundary::LocalVelocity>(superLattice, superGeometry, materialNumber);

Moreover, the wet-node approach is applied for macroscopic boundaries for the pressure. This
is used for example at outflow boundaries and fixes the values for the pressure in terms of a
Dirichlet condition.
1 setLocalPressureBoundary<T, DESCRIPTOR> (superLattice, omega, superGeometry,
materialNumber);

5.1.2. Link-wise Method


The link-wise method is applicable to recover macroscopic convection, which is used for out-
flow boundaries and approximates

\[ \frac{\partial u}{\partial t} + u_{\mathrm{average}} \frac{\partial u}{\partial n} = 0 , \qquad (5.1) \]

similar to a Neumann-type boundary for the velocity.


1 setInterpolatedConvectionBoundary<T, DESCRIPTOR> (superLattice, omega, superGeometry,
materialNumber, averageVelocity);

Additionally, a slip boundary can be constructed that is used for solid boundaries and reflects
outgoing fi . The latter has the effect of zero velocity normal to the boundary and free flow
tangential to the boundary.
1 setSlipBoundary<T, DESCRIPTOR> (superLattice, superGeometry, materialNumber);

Further, a macroscopic velocity with fixed values can be imposed on curved boundaries. It is
realized as a Dirichlet boundary for the velocity and considers the boundary shape
via an indicator functor. Multiple bulk materials are possible.
1 setBouzidiBoundary<T, DESCRIPTOR,BouzidiVelocityPostProcessor> (superLattice,
superGeometry, material , indicator, bulkMaterialsList);
2 AnalyticalConstF3D<T> u;
3 setBouzidiVelocity(superLattice, superGeometry, material, u);

As a special case, the zero-velocity boundary for curved boundaries, which fixes the velocity to
zero, is callable with a simplified syntax. It is a Dirichlet-type boundary for the velocity and considers
the shape via an indicator. Multiple bulk materials are possible here as well.
1 setBouzidiBoundary<T, DESCRIPTOR>(superLattice, superGeometry, material, indicator,
bulkMaterialsList);

5.1.3. AdvectionDiffusionBoundary
An exemplary macroscopic effect of the advectionDiffusion boundary methods is the Dirichlet
boundary condition for the temperature, which is used for inflow or wall boundaries and fixes
the values for the temperature.
1 setAdvectionDiffusionTemperatureBoundary<T, DESCRIPTOR> (superLattice, omega,
superGeometry, materialNumber);

In addition, convection can be modeled, which is used for outflow boundaries and interpolates
incoming fi from neighbors. It is similar to a Neumann-type temperature boundary.
1 setAdvectionDiffusionConvectionBoundary<T, DESCRIPTOR> (superLattice, superGeometry,
materialNumber);

Further, the zero-distribution boundary is used for "sticky" boundaries and sets the incoming fi
to zero, such that particles touching the boundary are trapped.
1 setZeroDistributionBoundary<T, DESCRIPTOR> (superLattice, superGeometry, materialNumber
);

Finally, macroscopic external fields, used for example in particle simulations, can also be supplied
with boundary conditions. The following boundary method provides data on the boundary for
particle calculations (velocity gradient).
1 setExtFieldBoundary<T, DESCRIPTOR, FIELD >(superLattice, superGeometry, materialNumber)
;

5.1.4. Robin-type boundary condition
The code for the boundary condition approach is almost identical for one, two or three missing
populations. The approach from [0] is implemented as a post-processor. Two boundary
schemes are implemented; a flag variable enables the selection of the desired
scheme. Certain values are prepared, including the velocity field, the coefficients of the Robin
boundary condition, the normal vector and the outgoing population indices. The next listing is
the main part of a procedure that updates a3 of the Robin boundary condition, as it depends on
the local concentration value of the other species.
1 { // Update for component C
2 T concP = cell.template get<names::Concentration1>().computeRho(); //get
concentration of P
3
4 auto a1_a2_a3 = cell.template get<names::Concentration0>().template getField<
descriptors::G>(); //pull out current field
5 a1_a2_a3[2] = backwardConstant*concP; //update a3 according to k_b*P = k*C_eq =
a3
6 cell.template get<names::Concentration0>().template setField<descriptors::G>(
a1_a2_a3); //put updated field back
7 }
8 { // Update for component P
9 T concC = cell.template get<names::Concentration0>().computeRho(); //get
concentration of C
10
11 auto a1_a2_a3 = cell.template get<names::Concentration1>().template getField<
descriptors::G>(); //pull out current field
12 a1_a2_a3[2] = forwardConstant*concC; //update a3 according to k_f*C = k*P_eq = a3
13 cell.template get<names::Concentration1>().template setField<descriptors::G>(
a1_a2_a3); //put updated field back
14 }

Listing 5.1: Code updating the coefficient a3

5.1.5. Additional Options


An additional option to supply boundary conditions in OpenLB is to choose LBM collision
dynamics also at the boundary. One can add a specific dynamics object to a boundary mate-
rial number, e.g.: BulkDynamics which executes the collision step as in a fluid node (e.g. with
BGKdynamics).

1 using BulkDynamics = BGKdynamics<T,DESCRIPTOR>;


2 [...]
3 boundary::set<boundary::LocalVelocity<T,DESCRIPTOR,BulkDynamics>>(superLattice,
superGeometry, materialNumber);

Another option is to choose an implementation via boundary dynamics. One can add boundary
methods as dynamics, e.g. BounceBack (zero velocity), BounceBackVelocity (prescribes a nonzero
velocity).

1 superLattice.defineDynamics<BounceBack> (superGeometry, materialNumber);
2 superLattice.defineDynamics<BounceBackVelocity>(superGeometry, materialNumber);

5.2. Define Initial Conditions


1 sLattice.defineRhoU(superGeometry, materialNumber, analyticalFunctor,
analyticalFunctor);

For each material number, the density (usually ρ = 1), the velocity (usually u = 0), and
the distribution functions should be initialized. The functions expect lattice values, therefore
physical values have to be converted first, e.g. via the function getLatticeDensity(density) of the
UnitConverter. Note that instead of the materialNumber argument, any discrete indicator func-
tion can be used. Exemplary initializations are given below.
1 // Initial conditions
2 AnalyticalConst2D<T,T> rhoF(1);
3 std::vector<T> velocity(dim,T(0));
4 AnalyticalConst2D<T,T> uF(velocity);
5
6 // Initialize density
7 superLattice.defineRho(superGeometry, materialNumber, rhoF);
8
9 // Initialize velocity
10 superLattice.defineU(superGeometry, materialNumber, uF);
11
12 // Initialize distribution functions
13 // to local equilibrium
14 superLattice.iniEquilibrium(superGeometry, materialNumber,
15 rhoF, uF);
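If the initial values are given in physical units, they can be converted to lattice units before being passed to the define...() methods. A minimal sketch using the getLatticeDensity() conversion mentioned above; physDensity is an assumed physical input value:

// convert an assumed physical density to lattice units before initialization
T physDensity = 1.225;
AnalyticalConst2D<T,T> rhoF( converter.getLatticeDensity( physDensity ) );
superLattice.defineRho(superGeometry, materialNumber, rhoF);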

5.3. Define Boundary Values


1 sLattice.defineU(superGeometry, materialNumber, analyticalFunctor);

Just like the initial conditions, boundary values can be set using defineRho(...) and defineU
(...). For a smooth start-up, the values can be scaled, e.g. according to a sine or
polynomial curve over a given start-up time.
1 // Smooth start curve, sinus
2 SinusStartScale<T,int>
3 StartScale(numberStartTimeSteps, maxValue);
4
5 // Smooth start curve, polynomial
6 PolynomialStartScale<T,T>
7 StartScale(numberStartTimeSteps, maxValue);
8
9 // compute scale-factor "frac"
10 T iTvec[1] = {T(timestep)};
11 T frac[1] = {};
12 StartScale( frac, iTvec );

To apply a flow profile, one first has to update the values every nth time step, then initialize a
functor, and then set the values using defineRho(...) and defineU(...). These are the same func-
tions as for the initial conditions; however, the point in time at which they are called is crucial.
1 if (timestep%updatePeriod==0 &&
2 timestep <= numberStartupTimesteps) {
3 Poiseuille2D<T> poiseuilleU(superGeometry,
4 materialNumber, maxVelocity*frac[0],
5 distance2Wall);
6 sLattice.defineU(superGeometry, materialNumber, poiseuilleU);
7 }

Further examples of 3D functors for this purpose are: rotating functors (linear for a rotating ve-
locity field and quadratic for a rotating pressure field), Circle-Poiseuille, Elliptic-Poiseuille, and
Rectangular-Poiseuille. The latter are only for axis-aligned boundaries and can be constructed from
points spanning a plane or from a material number. These functors are summarized with their arguments
as follows:
1 RotatingLinear3D(axisPoint,
2 axisDirection, angularVelocity);
3
4 RotatingQuadratic1D(axisPoint, axisDirection, angularVelocity);
5
6 CirclePoiseuille3D(axisPoint,
7 axisDirection, maxVelocity, radius);
8
9 EllipticPoiseuille3D(center, semiPrincipalAxis1, semiPrincipalAxis2, maxVelocity);
10
11 RectanglePoiseuille3D(point1, point2,
12 point3, maxVelocity);
13
14 RectanglePoiseuille3D(superGeometry, materialNumber, maxVelocity, offsetX, offsetY,
offsetZ);

6. Input and Output
During development or during actual simulations, it might be necessary to parameterize your
program. For this case, OpenLB provides an XML parser, which can read files produced by
OpenGPI [76], thereby providing a user-friendly GUI. Details on the XML format and functions
are given in Section 6.9.
The simulation data is stored in the VTK data format and can be further processed with Par-
aview. For output tasks that are performed only once during the simulation, it is recommended
to write the data sequentially. Commonly, the geometry or cuboid information is one of these
tasks. In contrast to the parallel version, it is easier to use and does not produce unnecessary
data overhead. However, if the output is performed regularly in a parallel simulation, the
performance may slow down when using the sequential output method. Therefore, OpenLB provides
a parallel data output functionality. At the lowest level, every thread writes .vti
files that contain the data. OpenLB writes a .vti file for every cuboid to provide parallel data
processing. These .vti files are collected at the next hierarchy level, the .vtm
file. A .vtm file corresponds to the entire domain at a certain time step. Finally,
the different time steps are gathered in a .pvd file, which is a collection of the .vtm files ordered by time
steps.
The technical aspects are presented in Section 6.1, whereas the usage is demonstrated with an
example in Section 6.2.

6.1. Output Data Structure


OpenLB simulation data is stored in the file system according to the VTK data format [77]. This
format has an XML structure and the data therein is written as binary Base64 code. Additionally,
the simulation data is compressed with zlib, which reduces the data size considerably.
On the top level, the parallel output hierarchy contains a .pvd file, which consists of links to
.vtm files. The .vtm files summarize the cuboids represented by .vti files.

<?xml version="1.0"?>
<VTKFile type="Collection" version="0.1" byte_order="LittleEndian">
  <Collection>
    <DataSet timestep="81920"  group="" part="" file="data/VTM_iT0081920.vtm"/>
    <DataSet timestep="163840" group="" part="" file="data/VTM_iT0163840.vtm"/>
    <DataSet timestep="245760" group="" part="" file="data/VTM_iT0245760.vtm"/>
    <DataSet timestep="327680" group="" part="" file="data/VTM_iT0327680.vtm"/>
    <DataSet timestep="409600" group="" part="" file="data/VTM_iT0409600.vtm"/>
  </Collection>
</VTKFile>
Listing 6.1: Example of a .pvd file that points for every time step to the corresponding .vtm file.
Every time step is associated with a .vtm file.

<?xml version="1.0"?>
<VTKFile type="vtkMultiBlockDataSet" version="1.0" byte_order="LittleEndian">
  <vtkMultiBlockDataSet>
    <Block index="0">
      <DataSet index="0" file="VTM_iT0081920iC00000.vti"></DataSet>
    </Block>
    <Block index="1">
      <DataSet index="0" file="VTM_iT0081920iC00001.vti"></DataSet>
    </Block>
    <Block index="2">
      <DataSet index="0" file="VTM_iT0081920iC00002.vti"></DataSet>
    </Block>
    <Block index="3">
      <DataSet index="0" file="VTM_iT0081920iC00003.vti"></DataSet>
    </Block>
  </vtkMultiBlockDataSet>
</VTKFile>
Listing 6.2: Example of a .vtm file that points to the .vti files holding the data of the cuboids.
Every cuboid writes its data to a .vti file; these are assembled by a .vtm file.

There is also a BlockVTKwriter that writes data sequentially. More details can be found in the
source code and its documentation.

6.2. Write Simulation Data to VTK File Format


VTK data files can be visualized and postprocessed with the free software Paraview [75], which
offers a graphical interface with extensive functionality. The following listing shows, on the one
hand, how to write VTK files sequentially for geometry and cuboid functors. On the other hand,
the usage of the parallel write routine for velocity and pressure functors is shown.
1 // create VTK writer object
2 SuperVTMwriter3D<T> vtmWriter( "FileNameGoesHere" );
3 // write only the first iteration step
4 if (iT==0) {
5 SuperLatticeGeometry3D<T,DESCRIPTOR> geometry(sLattice, superGeometry);
6 SuperLatticeCuboid3D<T,DESCRIPTOR> cuboid(sLattice);
7 // writes the geometry and cuboids to file system, sequentially
8 vtmWriter.write(geometry);
9 vtmWriter.write(cuboid);
10 // mandatory to call the following write()-method
11 vtmWriter.createMasterFile();
12 }
13 // write every 2 sec (physical time scale)
14 if (iT%converter.getLatticeTime(2.)==0) {
15 // create functors that process data from SuperLattice
16 SuperLatticePhysVelocity3D<T,DESCRIPTOR> velocity(sLattice,
17 converter);

18 SuperLatticePhysPressure3D<T,DESCRIPTOR> pressure(sLattice,
19 converter);
20 vtmWriter.addFunctor( velocity );
21 vtmWriter.addFunctor( pressure );
22 // writes the added functors to file system, parallel
23 vtmWriter.write(iT);
24 }

Listing 6.3: An exemplary code to write simulation data to file system.

Note that the function call createMasterFile() at iT == 0 is essential to write parallel VTK data.

6.3. CSV Writer


For some data analyses a CSV representation of the data is necessary. In this case the CSV
writer can be used to create these data files. The following lines show an application of the CSV
writer in the example advectionDiffusion1d (8.3.3). If one only wants to write to a single data file,
the file name can be given to the constructor of the CSV writer. The plotFileName
parameter of writeDataFile, however, provides the possibility to select a new data file with every call. The
precision parameter refers to the precision of the output data.
1 CSV<T> csv;
2 csv.writeDataFile(N, simulationAverage, "averageSimL2RelErr");

Listing 6.4: Exemplary application of the CSV Writer

6.4. Write Images Instantaneously


OpenLB is able to output image data directly. This is helpful to get a brief overview of how
the simulation is progressing without using external visualization tools. Note that only one-dimensional,
i.e. scalar-valued, data can be represented by images. Hence, for vector-valued data,
e.g. velocity, it is important to take an appropriate norm. This step transforms the vector into a
scalar and the data becomes one-dimensional as required.
For 2D applications it is straightforward to generate images, since every point of the com-
putational grid represents a pixel. However, for 3D applications this assignment fails. OpenLB
allows one to reduce the 3D grid to a 2D plane by parameterizing a hyperplane in 3D space.
The resulting 2D block lattice represents the image by assigning lattice points to pixels.
An example of how to take a norm and how to reduce a plane is shown below.
1 // get the pointwise l2 norm of velocity
2 SuperEuklidNorm3D<T> normVel( velocity );
3 // reduce a hyperplane parametrized by normal (0,0,1) and centered in the mother geometry
from the 3D data
4 BlockReduction3D2D<T> planeReduction( normVel, {0, 0, 1} );

Listing 6.5: An exemplary code reducing a plane in 3D

Note that internally the hyperplane is parameterized using the Hyperplane3D class. This exam-
ple uses one of the helper constructors of BlockReduction3D2D to hide this detail for the common
use case of parameterizing a hyperplane by a normal vector. There are further such helper con-
structors available if one wishes to for example define a hyperplane by two span vectors and
its origin. However for full control over the hyperplane a Hyperplane3D instance may also be
created by hand.
1 SuperEuklidNorm3D<T> normVel( velocity );
2 BlockReduction3D2D<T> planeReduction(
3 normVel,
4 // explicitly construct a 3D hyperplane
5 Hyperplane3D<T>()
6 .centeredIn(superGeometry.getCuboidGeometry().getMotherCuboid())
7 .spannedBy({1, 0, 0}, {0, 1, 0}));
8 BlockGifWriter<T> gifWriter;
9 gifWriter.write(planeReduction, iT, "vel");

Listing 6.6: Exemplary code to write images of an explicitly instantiated 3D hyperplane with

Both of these exemplary codes reduce a 3D hyperplane to a 2D lattice with 600 points on
its longest side. It is possible to change this resolution either by providing it as a constructor
argument to BlockReduction3D2D or by explicitly instantiating a HyperplaneLattice3D.
1 SuperEuklidNorm3D<T> normVel( velocity );
2 HyperplaneLattice3D<T> gifLattice(
3 superGeometry.getCuboidGeometry(),
4 Hyperplane3D<T>()
5 .centeredIn(superGeometry.getCuboidGeometry().getMotherCuboid())
6 .normalTo({0, -1, 0}),
7 // resolution (floating point values are used as grid spacing instead)
8 1000);

Listing 6.7: Exemplary code using an explicitly instantiated 3D hyperplane lattice

In 2D the reduction of velocity data to a block can be achieved as follows.


1 SuperEuklidNorm2D<T,DESCRIPTOR> normVel( velocity );
2 BlockReduction2D2D<T> planeReduction( normVel );

Listing 6.8: Exemplary code reducing data in 2D

The resolution of 600 points on the longest side of the object is set as default, but can be
altered similarly to the listings 6.5, 6.6 and 6.7. There are two options of generating images of
the processed values in 2D and 3D.

6.4.1. GifWriter
In this example, gifWriter automatically generates scaled images in the PPM data format,
scaled according to the minimum and maximum of the selected quantity at the respective
time step.
1 BlockReduction3D2D<T> planeReduction(normVel, gifLattice);
2 BlockGifWriter<T> gifWriter;
3 //gifWriter.write(planeReduction, 0, 0.7, iT, "vel"); //static scale
4 gifWriter.write( planeReduction, iT, "vel" ); // scaled

Listing 6.9: Exemplary code using gifWriter to create PPM files

With imagemagick’s command convert the PPM files generated by gifWriter can be combined
to an animated GIF file as follows.
convert tmp/imageData/*.ppm animation.gif

To reduce the GIF’s file size you can use the options fuzz and OptimizeFrame, for example:
convert -fuzz 3% -layers OptimizeFrame tmp/imageData/*.ppm animation.gif

Even smaller files are possible with ffmpeg and conversion to MP4 video file. This could be done
using a command like the following.
ffmpeg -pattern_type glob -i ’tmp2/imageData/*.ppm’ animation.mp4

6.4.2. Heatmap
Whereas the gifWriter only creates automatically scaled PPM images, the functor heatmap offers
more options to adjust the JPEG files. For this purpose the variable plotParam can be created and
the desired modifications, e.g. minimum and maximum values of the scale, can be passed on via
this optional variable.
1 SuperEuklidNorm3D<T> normVel( velocity );
2 BlockReduction3D2D<T> planeReduction( normVel, {0, 0, 1} );
3 // write output as JPEG and changing properties
4 heatmap::plotParam<T> jpeg_Param;
5 jpeg_Param.contourlevel = 5; //setting the number of contour lines
6 jpeg_Param.colour = "rainbow"; //colour combination "grey", "pm3d", "blackbody" and "rainbow" can be chosen
7 heatmap::write(planeReduction, iT, jpeg_Param);

Listing 6.10: Exemplary code using the functor heatmap with modified parameters

The exemplary code in listing 6.10 shows how to change the colour set and number of contour
lines in the generated images. All possible adjustments are listed and used in the example
venturi3D (see Section 8.11.5).

6.5. Gnuplot Interface
Often, the analysis of a simulation requires a plot of the data. OpenLB offers an interface
that uses Gnuplot to create such plots. Furthermore, it is possible to see the particular data that was
used for the plots in real time and to include comparison data directly in the plot. An
example of the usage from examples/cylinder2d is shown below.
1 // Gnuplot constructor (must be static!)
2 // for real-time plotting: gplot("name", true) // experimental!
3 static Gnuplot<T> gplot( "drag" );
4
5 ...
6
7 // set data for gnuplot: input={xValue, yValue(s),
8 // names (optional), position of key (optional)}
9 gplot.setData( converter.getPhysTime( iT ), {_drag[0], 5.58},
10 { "drag (openLB)", "drag (schaeferTurek)" }, "bottom right" );
11
12 // writes a png (or optional pdf) in one file for every timestep,
13 // if the png file is opened by an imageviewer it can be used as a "liveplot"
14 // optional for pdf output, use: gplot.writePDF()
15 gplot.writePNG();
16 }

Listing 6.11: An exemplary code to plot simulation data.

The data drag[0] is calculated in the example and compared with the value 5.58. This is then
plotted as shown in Fig. 6.1.

Figure 6.1.: Gnuplot output of drag calculation in cylinder2d.

In order to have plots for different times, the following usage is recommended.
1 ...
2
3 // every (iT%vtkIter) write an png of the plot
4 if ( iT%( vtkIter ) == 0 ) {
5 // writes pngs: input={name of the files (optional),
6 // x range for the plot (optional)}
7 gplot.writePNG( iT, maxPhysT );

Listing 6.12: Creating plots for different time steps.

6.5.1. Regression with Gnuplot


Moreover, Gnuplot can be used to create a linear regression of datasets. For instance, the analy-
sis of the experimental order of convergence in a simulation can be carried out as in the example
poiseuille2dEOC.
The possible options are a linear regression of the given data, where a loglog scaling can also be
used (LOGLOGINVERTED for inverting the x-axis). The setup is done via the
constructor of the plot in the .cpp file itself, as seen below.
1 static Gnuplot<T> gplot( "eoc", Gnuplot<T>::LOGLOG, Gnuplot<T>::LINREG);

The possible options for the scaling are: LINEAR (using the data as given), LOGLOG (using the log
of the x- and y-dataset) and LOGLOGINVERTED (using the log of the y-dataset and 1/log of the x-dataset). For
the regression type one can choose LINREG (linear regression) or OFF (no regression).

Figure 6.2.: Example of using regression to analyze polynomial errors (left: old, right: new im-
plementation)

6.6. Console Output
In OpenLB, there is an extension of the default ostream, which handles parallel output and
prefixes each line with the name of the class that produced the output. Listing 6.13 is the output
of the bstep2d example.
It is easy to determine which part of OpenLB has produced a specific message. This can be
very helpful in the debugging process, as well as for quickly postprocessing console output
or filtering out important information without any need to go into the code. Together with
OpenLB’s semi-CSV style output standard, it is possible to easily visualize any data imaginable
with diagrams, such as convergence rates, data errors, or simple average mass density.
1 void MyClass::print() {
2 OstreamManager clout(std::cout, "MyClass");
3 ...
4 clout << "step=" << step << "; avRho=" << avRho
5 << "; maxU=" << maxU << std::endl;
6 }

Using the OstreamManager is easy and consists of two parts. First, an instance of the class
OstreamManager is needed. The one created here in line 2 is called clout, like all the other in-
stances in OpenLB; the name is a contraction of the words class and output. Moreover, it is quite
similar to the standard cout. The constructor receives two arguments: one describing the ostream to
use, the other one setting the prefix text. Line 4 shows the usage of an instance of the OstreamManager.
There is not much difference in usage between a default std::cout and an instance
of OpenLB's OstreamManager. The only thing to consider is that a normal \n will not have the
expected effect, so use std::endl instead.
In classes with many output producing functions however, you wouldn’t like to instantiate
OstreamManager for every single function, so a central instantiation is preferred. This is done
by adding a mutable OstreamManager object as a private class member and initializing it in the
initialization list of each defined constructor. An example implementation of this method can
be found in src/utilities/timer.hh.
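A minimal sketch of this pattern; the class name and the message are assumptions, while the constructor arguments follow the example above:

#include <iostream>

class MyClass {
public:
  MyClass() : clout(std::cout, "MyClass") {}

  void print() const {
    // the mutable member is usable from const methods as well
    clout << "step=" << 42 << std::endl;
  }

private:
  mutable OstreamManager clout;
};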
Another great benefit of OstreamManager is the reduction of output in parallel. Running a pro-
gram using cout on multiple cores normally means getting one line of output for each process.
OstreamManager will avoid this by default and display only the output of the first processor. If
this behavior is unwanted in a specific case, it can be turned off for an instance named clout by
clout.setMultiOutput(true).
Further scenarios that are not yet implemented in OpenLB can make use of different streams
like the ostream std::cerr for separate error output, file streams, or something completely dif-
ferent. In doing so, every stream needs its own instance.

$ ./bstep2d
...
[prepareGeometry] Prepare Geometry ...
[SuperGeometry2D] cleaned 0 outer boundary voxel(s)
[SuperGeometry2D] cleaned 0 inner boundary voxel(s)
[SuperGeometry2D] the model is correct!
[SuperGeometryStatistics2D] materialNumber=0; count=13846; minPhysR=(0,0); maxPhysR=(5,0.75)
[SuperGeometryStatistics2D] materialNumber=1; count=92865; minPhysR=(0.0166667,0.0166667); maxPhysR
=(19.9833,1.48333)
[SuperGeometryStatistics2D] materialNumber=2; count=2448; minPhysR=(0,0); maxPhysR=(20,1.5)
[SuperGeometryStatistics2D] materialNumber=3; count=43; minPhysR=(0,0.783333); maxPhysR=(0,1.48333)
[SuperGeometryStatistics2D] materialNumber=4; count=89; minPhysR=(20,0.0166667); maxPhysR=(20,1.48333)
[prepareGeometry] Prepare Geometry ... OK
[prepareLattice] Prepare Lattice ...
[prepareLattice] Prepare Lattice ... OK

[main] starting simulation...


[SuperPlaneIntegralFluxVelocity2D] regionSize[m]=1.46667; flowRate[m^2/s]=0; meanVelocity[m/s]=0
[SuperPlaneIntegralFluxPressure2D] regionSize[m]=1.46667; force[N]=0; meanPressure[Pa]=0
[Timer] step=0; percent=0; passedTime=0.846; remTime=101519; MLUPs=0
[LatticeStatistics] step=0; t=0; uMax=1.49167e-154; avEnergy=0; avRho=1
[SuperPlaneIntegralFluxVelocity2D] regionSize[m]=1.46667; flowRate[m^2/s]=0; meanVelocity[m/s]=0
[SuperPlaneIntegralFluxPressure2D] regionSize[m]=1.46667; force[N]=0; meanPressure[Pa]=0
[Timer] step=300; percent=0.25; passedTime=2.503; remTime=998.697; MLUPs=17.2699
[LatticeStatistics] step=300; t=0.1; uMax=5.75006e-07; avEnergy=8.66459e-16; avRho=1
...

Listing 6.13: Terminal output of example bstep2d.


6.7. Console Input
It can be useful to accept some simulation parameters, such as the resolution N, as named
CLI arguments, e.g. ./case --resolution 42. For this purpose, OpenLB includes the CLIreader
class, which can be used to easily read named and typed arguments from the program's flags.
1 CLIreader args(argc, argv);
2 // Set resolution to value of "--resolution" or return default of 5 if not provided
3 const int resolution = args.getValueOrFallback("--resolution", 5);
4 // Get string from argument
5 const std::string model = args.getValueOrFallback<std::string>(
6 "--bulk-model", "Smagorinsky");

Listing 6.14: Example for reading CLI arguments

6.8. Read and Write STL Files


OpenLB offers the possibility to read and write geometry data in the Standard Triangulation
Language, STL for short. The OpenLB class STLreader provides the desired functionality. In the
case that the STL file you want to read is too large, you can use Paraview’s filter "Decimate" to
reduce the number of facets.
The constructor of the class STLreader takes two required and three optional arguments; a usage sketch is given after the following list.
1 STLreader(const std::string fName, T voxelSize, T stlSize=1,
2 unsigned short int method = 2, bool verbose = false);

• fName: The filename of the STL file to be read.

• voxelSize: The intended spatial step size for the simulation in SI units (m).

• stlSize: Conversion factor if the STL file is not given in SI units. E.g.: For an STL file in
cm, this factor is stlSize = 0.01.

• method: Switch between methods for determining inside and outside of geometry.
– default: fast, less stable
– 1: slow, more stable (for untight STLs)

• verbose: Switch to get more output.
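A minimal usage sketch of this constructor; the file name and the unit factor are assumptions chosen for illustration (an STL given in millimetres, read with a 1 mm voxel size):

// the STL is assumed to be given in mm, hence stlSize = 0.001; voxelSize = 0.001 m
STLreader<T> stlReader( "geometry.stl", 0.001, 0.001 );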

Functionality: The STL file is read and stored in the class STLmesh. A class Octree is instan-
tiated with side length rad = 2^(j−1) · voxelSize, j ∈ N, with j chosen such that a cube with diameter 2 · rad
covers the entire STL. Intersections of triangles with the nodes of the Octree are computed and an
index of the respective triangles is stored in each node. A node is a leaf if either rad = voxelSize
or if it does not contain any triangles.
In a second step, it is determined whether a leaf is inside the STL geometry by one of the fol-
lowing methods:

• (Default) One ray in Z-direction is defined for each voxel in the XY-layer. All nodes along it
are indicated on the fly (faster, less stable).

• Three rays (X-, Y-, Z-direction) are defined for each leaf and the intersections with the STL are
counted for each ray. An odd number of intersections means the leaf is inside. The final state is
decided by a majority vote (slower, more stable).

6.9. XML Parameter Files


In OpenLB, essential simulation parameters can be placed in an XML file. This is a useful feature:
once a program is compiled, the parameters can be changed through the XML file without
recompilation. Consequently, whenever parameter studies or general simula-
tions are desired, this approach helps, since only the XML file has to be edited. The
parsing is implemented in the header file io/xmlReader.h.
The general format for the XML files is:

<Param>
  <Output>
    <Log>
      <VerboseLog> true </VerboseLog>
    </Log>
  </Output>
  <VisualizationImages>
    <Filename> image </Filename>
  </VisualizationImages>
</Param>

All parameters need to be wrapped in a <Param> tag. To open a config file, you just pass a string
with the file name to the class constructor of XMLreader.
1 std::string fName("demo.xml");
2 XMLreader config(fName);
3
4 bool _verboseLog;
5 std::string imagename;
6 XMLreader outputConfig = config["Output"];
7
8 config.readOrWarn<bool>("Output", "Log", "VerboseLog", _verboseLog);
9 outputConfig.readOrWarn<bool>("Log", "VerboseLog", "", _verboseLog);
10 config.readOrWarn<std::string>("VisualizationImages", "Filename", "", imagename);

First, an XMLreader object config is created. There are multiple ways to access the configuration
data. To select the tag you would like to read, you just use an associative array like syntax as
shown above.
To get a specific value out of an XML parameter file, there are multiple methods. One is
to pass a predefined variable to the method readOrWarn, which reads the respective value and

prints a warning in case the data type is not matching or the value cannot be found. For large
subtrees with lots of parameters, you can also create a subobject. For this, you just have to
reassign your selected subtree to a new XMLreader-object as is done above for Output.

6.10. Visualization with Paraview


There are several data formats that can be used in Paraview. Use ‘File – Open’ and choose the
set of data you want to use. Regarding OpenLB, it is sufficient to open the file with the ending
.pvd, since it contains references to the .vti files. The chosen files should now be part of the
‘Pipeline Browser’, which should be on the left hand side (if any of the panels are missing you
can add them in the ‘View’ menu on the top). Click on ‘Apply’ in the ‘Properties’ panel (usually
located below the ‘Pipeline Browser’) after opening.
Your data should now be visible in the center window. From within the ‘Properties’ or in
one of the top tool bars, you can change the ‘Coloring’ properties, which selects what shall be
displayed (e.g. physical velocity, phys pressure), which part of this choice shall be displayed
(e.g. magnitude, x-value) and the way it is colored.
Make sure that ‘3D’ is part of the tool bar directly above the window where you can see your
objects. If you cannot find it click on ‘2D’ which should be written instead and change it to ‘3D’
by doing this. The commands for moving your whole set of visible objects and thus changing
the perspective are the following:
• Using the mouse wheel, you can zoom in and out.

• Using the right mouse button or ‘Ctrl + left mouse button’, you can move the object to
the background or the foreground. In comparison to zooming in and out, this changes the
level of the 3D-effect.

• Using the left mouse button allows you to turn the object.

• Clicking the mouse wheel allows you to move the object centre.
Of course you can also stick to ‘2D’, although in this case the mouse commands might change a
bit.
You can visualize the temporal development of your simulation using the ‘Play’ button and
the related buttons directly next to it. If you want to go to a certain time step, use the input field
‘Time’, which is also located here.
To manipulate your data in Paraview, numerous so-called ‘Filters’ are provided in the ‘Filters’
menu in the top bar.

6.10.1. Clip
With this filter, you can cut off parts of your objects, for example, to make it possible to look
inside the geometry. There are several tool options to determine which part is cut off. You can

choose between plane, box and sphere.
If the “wrong” side is cut off, check ‘inside out’ to make the other side visible.

6.10.2. Contour
Using ‘Contour’ you can show lines or planes of certain data values, which you can set.

6.10.3. Glyph
If you have a point data set, you can represent it as spheres using the filter ‘Glyph’ and choosing
‘Sphere’ as setting for ‘Glyph Type’. Using the resolution settings, you can smooth the surface
to make the sphere look more rounded.
There are alternative ways to represent the data. As an example, arrows can be used to show
the direction of a velocity. Check ‘Glyph Type’ for further possibilities.

6.10.4. Stream Tracer


The Stream Tracer filter is a powerful tool that allows users to visualize the flow as streamlines,
making for example turbulent flow more visible. By placing the seed of the stream tracer next
to the area of interest, users can apply this filter. There are two possible forms of the seed: it can
be set as a pointcloud or a line.
To use the Stream Tracer filter, simply select the seed type that best fits your needs and place
it in the desired location. In the case of a pointcloud seed, the flow that goes through the sphere
is transformed, while a line seed transforms the flow along the line. An example of the OpenLB
example nozzel3D visualized with streamlines using a pointcloud seed can be seen in Figure 6.3.

Figure 6.3.: OpenLB example nozzle3d visualized with streamlines. In this case, a point cloud
was used as seed.

6.10.5. Resample To Image
Another useful filter for visualization purposes is (Adaptive) Resample To Image. It applies a
volumetric ray-marching algorithm to the data. Figure 6.4 shows the result when this filter
is applied.
To use this filter, simply select the Resample To Image option and apply it to your data. The
resulting visualization is a 3D image that can be rotated and explored in real time. An
example of the OpenLB example nozzle3d visualized using the Resample To Image filter can be
seen in Figure 6.4.

Figure 6.4.: OpenLB example nozzle3d visualized with the help of the Resample To Image filter

This visualization provides a unique perspective on the data and can help users to better
understand the underlying patterns and structures in the data. It’s important to note that the
Resample To Image filter requires a significant amount of computational resources to generate
the 3D image. As such, it may not be suitable for use on larger datasets or on less powerful
hardware. Additionally, the parameters of the filter can be adjusted to fine-tune the resulting
visualization, but this requires some knowledge of the filter and its underlying algorithm.

6.11. Application of Functors


6.11.1. Extract Simulation Data
Velocity, pressure and other information can be extracted from the lattice using predefined func-
tors, see Listing 6.15. All they need to know is a SuperLatticeXD and a UnitConverter, if di-
mensional or physical units are wanted.

1 // Create functors
2 SuperLatticePhysVelocity3D<T,DESCRIPTOR> velocity(sLattice, converter);
3 SuperLatticePhysPressure3D<T,DESCRIPTOR> pressure(sLattice, converter);

Listing 6.15: Code example for calculating velocity and pressure using functors.

6.11.2. Define Analytic Functions


Often the inflow velocity has a Poiseuille profile, which is defined analytically by means of a
function. OpenLB provides analytical functors to define e.g. a Poiseuille velocity profile, random
values, linear and constant values.
1 Poiseuille2D<T> poiseuilleU(superGeometry, 3, maxVelocity, distance2Wall);

Listing 6.16: Define a poiseuille velocity profile for inflow boundary condition.

6.11.3. Interpolation
Another use case for interpolation functors is the conversion of a given analytical functor, such as an
analytical solution, to a SuperLattice functor. Afterwards, the difference can easily be calculated
with the help of the functor arithmetic, see Listing 6.18. Finally, specific norms implemented as
functors facilitate the analysis of convergence.
1 // define a analytic functor: R^3 -> R
2 AnalyticalConst3D<double,double> constAna(1.0);
3 // get analytic functor on the lattice: N^3 -> R
4 SuperLatticeFfromAnalyticalF3D<double,DESCRIPTOR> constLat(constAna,
5 lattice);

Listing 6.17: Transition from an analytical functor to a lattice functor.

Application of this is shown in the example poiseuille2d, which is discussed in Section 8.4.4.

6.11.4. Arithmetic and Advanced Functor Usage


Functors can be added, subtracted, etc. which is a very useful and elegant method to treat data.
Listing 6.18 shows how to compute the relative error over the whole three dimensional domain.
1 int input[1];
2 double normAnaSol[1], absErr[1], relErr[1];
3 // define analytical solution: R^3 -> R
4 // for the sake of simplicity it is a constant function,
5 // however it may be any specialization of AnalyticalF3D
6 AnalyticalConst3D<double,double> dSol(1.0);
7 // get analytical solution on the lattice: N^3 -> R
8 SuperLatticeFfromAnalyticalF3D<double,DESCRIPTOR> dSolLattice(dSol, lattice);

9 // get density out of simulation data
10 SuperLatticeDensity3D<T,DESCRIPTOR> d(lattice);
11 // compute absolute error
12 SuperL2Norm3D<double> dL2Norm(dSolLattice - d, superGeometry, 1);
13 // compute norm of solution
14 SuperL2Norm3D<double> dSolL2Norm(dSolLattice, superGeometry, 1);
15 dL2Norm(absErr, input); // access absolute error
16 dSolL2Norm(normAnaSol, input); // access norm of the solution
17 relErr[0] = absErr[0] / normAnaSol[0];
18 clout << "density-L2-error(abs)=" << absErr[0] << "; "
19 << "density-L2-error(rel)=" << relErr[0] << std::endl;

Listing 6.18: Computation of a relative error with respect to L2 -norm.

For more detail, see the source code of example 8.4.4.


Assemble geometry with geometric primitives of type IndicatorFXD.
1 Vector<double,2> extendChannel(lx0, ly0);
2 Vector<double,2> originChannel;
3 IndicatorCuboid2D<double> channel(extendChannel, originChannel);
4 // setup step
5 Vector<double,2> extendStep(lx1, ly1);
6 Vector<double,2> originStep;
7 IndicatorCuboid2D<double> step(extendStep, originStep);
8 // remove step from channel
9 IndicatorIdentity2D<double> channelIdent(channel-step);

Listing 6.19: Deploy functor arithmetic to build geometry data.

6.11.5. Setting Boundary Value


Boundary cells are marked by a certain material number in the SuperGeometryXD. Using a func-
tor, velocities can be set simultaneously on all cells of this material. First, a vector that character-
izes the maximum flow velocity and its direction is necessary. Then, a special functor uses this
vector to initialize a Poiseuille profile. In the case of axis-parallel inflow regions, the direction can be
extracted automatically from the SuperGeometryXD. In the last step, the SuperLatticeXD initial-
izes all cells of the given material with the velocities computed by
the functor.
1 // Creates and sets the Poiseuille inflow profile using functors
2 double maxVel = converter.getCharLatticeVelocity();
3 CirclePoiseuille3D<double> poiseuilleU(superGeometry, 3, maxVel, distance2Wall);
4 sLattice.defineU(superGeometry, 3, poiseuilleU);

Listing 6.20: Code example for setting a Poiseuille velocity profile and a constant pressure
boundary in cylinder3d.

6.11.6. Flux Functor
The flux of a quantity is defined as the rate at which this quantity passes through a fixed bound-
ary per unit time. As a mathematical concept, flux is represented by the surface integral of a vector
field
\[ \Phi = \int_A F \cdot \mathrm{d}A , \]
where F is a vector field and dA is an area element of the surface A, oriented in the direction of the
surface normal n.
Flux functors calculate the discrete flux
\[ \Phi_h = h^2 \sum_i f_i \cdot n , \]
with h as the grid length of the surface and f_i the vector of the quantity at grid point i.
As the grid of the area has to be independent of the lattice, the value of f_i is interpolated
from the surrounding lattice points. In the general case this discrete value is calcu-
lated by SuperPlaneIntegralF3D. Note that the reduction of the relevant surface is performed by
BlockReduction3D2D and that SuperPlaneIntegralF3D only adds the multiplication by the area
unit as well as by the normal vector for multidimensional f_i. In turn, specific flux functors such
as SuperPlaneIntegralFluxVelocity3D only add functor instantiation and print methods. So, for
the SuperPlaneIntegralF3D functor a surface needs to be defined. OpenLB currently supports
using subsets of hyperplanes as the surfaces on which to calculate a flux.
Such a hyperplane can be defined by an origin and two span vectors, an origin and a nor-
mal vector or a 3D circle indicator. BlockReduction3D2D interpolates the full intersection of hy-
perplane and mother geometry. Optionally this maximal plane may be further restricted by
arbitrary 2D indicators.
Note that SuperPlaneIntegralF3D as well as all specific flux functors provide a variety of con-
structors accepting various hyperplane parametrizations. For full control you may consider
explicitly constructing a Hyperplane3D instance.
The discretization of a hyperplane parametrization (given by Hyperplane3D) into a discrete lat-
tice is performed by HyperplaneLattice3D.

Step 1: Define the hyperplane by


a) origin and two span vectors
1 Vector<T,3> origin;
2 Vector<T,3> u, v;

b) origin and normal vector


1 Vector<T,3> origin;
2 Vector<T,3> normal;

c) normal vector (centered in mother cuboid)

1 Vector<T,3> normal;

d) circle indicator
1 IndicatorCircle3D<T> circleIndicator(center, normal, radius);

e) arbitrary hyperplane
1 // example parametrization of a hyperplane centered in the mother cuboid and normal to
the Z-axis
2 Hyperplane3D<T> hyperplane = Hyperplane3D<T>()
3 .centeredIn(cuboidGeometry.getMotherCuboid())
4 .normalTo({0, 0, 1});

Step 1.1 (optional): Define the hyperplane discretization by


a) grid length
1 T h = converter.getLatticeL();
2 HyperplaneLattice3D<T> hyperplaneLattice(
3 cuboidGeometry,
4 Hyperplane3D<T>().originAt(origin).spannedBy(u, v),
5 h);

b) grid resolution
1 HyperplaneLattice3D<T> hyperplaneLattice(
2 cuboidGeometry,
3 Hyperplane3D<T>().originAt(origin).spannedBy(u, v),
4 600); // resolution

Step 1.2 (optional): Define the flux-relevant lattice points by


a) list of material numbers
1 std::vector<int> materials = {1, 2, 3};

b) arbitrary indicator
1 SuperIndicatorF3D<T> integrationIndicator...

Step 1.3 (optional): Restrict the discretized intersection of hyperplane and geometry by
a) 2D circle indicator (relative to hyperplane origin)
1 T radius = 1.0;
2 IndicatorCircle2D<T> subplaneIndicator({0,0}, radius);

b) arbitrary 2D indicator (relative to hyperplane origin)


1 IndicatorF2D<T> subplaneIndicator...

Step 2: Create a SuperF3D functor for
a) velocity flow
1 SuperLatticePhysVelocity3D<T,DESCRIPTOR> f(sLattice, converter);

b) pressure
1 SuperLatticePhysPressure3D<T,DESCRIPTOR> f(sLattice, converter);

c) any other SuperF3D functor


1 SuperF3D<T> f...

Step 3: Instantiate SuperPlaneIntegralF3D functor depending on how the hyperplane was de-
fined and discretized.
a) using origin, two span vectors and materials list
1 SuperPlaneIntegralF3D<T> fluxF(
2 f, superGeometry, origin, u, v, materials);

b) using origin, normal vector and materials list


1 SuperPlaneIntegralF3D<T> fluxF(
2 f, superGeometry, origin, normal, materials);

c) using normal vector and materials list


1 SuperPlaneIntegralF3D<T> fluxF(f, superGeometry, normal, materials);

d) using 3D circle indicator and materials list


1 SuperPlaneIntegralF3D<T> fluxF(
2 f, superGeometry, circleIndicator, materials);

e) using arbitrary hyperplane and integration point indicator


1 SuperPlaneIntegralF3D<T> fluxF(
2 f, superGeometry, hyperplane, integrationIndicator);

f) using arbitrary hyperplane, integration point indicator and subplane indicator


1 SuperPlaneIntegralF3D<T> fluxF(f, superGeometry, hyperplane, integrationIndicator,
subplaneIndicator);

g) using arbitrary hyperplane lattice, integration point indicator and subplane indicator
1 SuperPlaneIntegralF3D<T> fluxF(f, superGeometry, hyperplaneLattice,
integrationIndicator, subplaneIndicator);

Step 4: Get results using operator()

1 int input[1]; // irrelevant
2 T output[5];
3 fluxF(output, input);

• output[0]: flow rate or plane integral (if quantity has dimension 1)

• output[1]: size of the area

• output[2..4]: flow vector (i.e. vector of summed quantities)
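For illustration, the mean velocity reported by the print() method can be recovered from these entries; this is a sketch based on the output layout listed above:

int input[1];  // irrelevant
T output[5];
fluxF(output, input);
T flowRate     = output[0]; // volumetric flow rate
T area         = output[1]; // size of the integration area
T meanVelocity = flowRate / area; // corresponds to the mean velocity reported by print()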

In many cases the functor argument is either the velocity or the pressure functor.
Thus, Step 2 and Step 3 may be combined using SuperPlaneIntegralFluxVelocity3D or
SuperPlaneIntegralFluxPressure3D, respectively. Their constructors are mostly identical to the ones provided
by SuperPlaneIntegralF3D. In fact, the only difference is that the first functor argument is re-
placed by references to the SuperLattice and the UnitConverter.

Step 2.1): Combined steps for velocity flux


1 SuperPlaneIntegralFluxVelocity3D<T> vFlux(superLattice, converter, ...);

Step 2.2): Combined steps for pressure flux


1 SuperPlaneIntegralFluxPressure3D<T> pFlux(superLattice, converter, ...);

Step 3.1): Output region size, volumetric flow rate and mean velocity
1 vFlux.print(std::string regionName,
2 std::string fluxSiScaleName, std::string meanSiScaleName);

• fluxSiScaleName: ’ml/s’ or ’l/s’ or ’ ’ (default = m³/s)

• meanSiScaleName: ’mm/s’ or ’ ’ (default = m/s)

Step 3.2): Output region size, force and mean pressure


1 pFlux.print(std::string regionName,
2 std::string fluxSiScaleName, std::string meanSiScaleName);

• fluxSiScaleName: ’MN’ or ’kN’ or ’ ’ (default = N)

• meanSiScaleName: ’mmHg’ or ’ ’ (default = Pa)

6.11.7. Discrete Flux Functor
If a hyperplane is axis-aligned, flux functors may optionally be used in discrete mode. Passing
BlockDataReductionMode::Discrete as the last argument to any plane integral or flux constructor
instructs the internal BlockReduction3D2D instance to reduce the hyperplane by evaluating the
underlying functor at the nearest lattice points instead of by interpolating physical positions.
Note that this imposes restrictions on the accepted hyperplane and its lattice:

• Hyperplane3D normal must be orthogonal to a pair of unit vectors

• HyperplaneLattice3D spacing must equal the distance between lattice nodes

The restriction on the hyperplane lattice spacing is fulfilled implicitly when automatic lattice
parameterization is used.
1 // discrete flux usage in examples/aorta3d
2 SuperPlaneIntegralFluxVelocity3D<T> vFluxInflow( sLattice, converter, superGeometry,
inflow, materials, BlockDataReductionMode::Discrete );

6.11.8. Atmospheric Boundary Layer Functor


The atmospheric boundary layer functor AnalyticalWindProfileF3D implements a logarithmic wind
profile that sets the velocity as a function of the height. According to Emeis et al. [95], the
velocity u relative to the ground and the friction velocity u∗ are defined as

u = (u∗/κ) ln((z − d + z0)/z0),   (6.1)

u∗ = (uref · κ) / ln((zref + z0)/z0),   (6.2)

where uref is the reference velocity, zref the reference height, d the displacement height, z0
the aerodynamic roughness length and κ the Kármán constant.
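
The following self-contained sketch only illustrates how (6.1) and (6.2) combine; the functor
AnalyticalWindProfileF3D evaluates this profile internally, and the Kármán constant value of
0.41 used below is an assumption for illustration:

1 #include <cmath>
2
3 // illustrative helpers for equations (6.1) and (6.2); not part of the OpenLB API
4 double frictionVelocity(double uRef, double zRef, double z0, double kappa = 0.41) {
5   return uRef * kappa / std::log((zRef + z0) / z0);      // eq. (6.2)
6 }
7
8 double windVelocity(double z, double uStar, double z0, double d, double kappa = 0.41) {
9   return uStar / kappa * std::log((z - d + z0) / z0);    // eq. (6.1)
10 }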

6.11.9. Porosity & Velocity Volume Functor


The analytical functors AnalyticalPorosityVolumeF and AnalyticalVelocityVolumeF are used to
import porosity or velocity values defined in an external file. Both the VDB and the VTK format
are supported. In order to use these functors, the respective libraries have to be installed on the
system, and the chosen library has to be activated under FEATURES in config.mk:

# Any entries are passed to the compiler as `-DFEATURE_*` declarations
# Used to enable some alternative code paths and dependencies
FEATURES := VTK VDB
Depending on the system configuration and the installation path of the library, some adjust-
ments to the compiler flags CXXFLAGS might be necessary to include and link the library.
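
For instance, if the headers are installed in a non-standard location, additional include paths
may be required; the paths below are placeholders that depend on the local installation:

# hedged sketch: extra include paths for an externally installed VTK/OpenVDB (placeholder paths)
CXXFLAGS += -I/path/to/vtk/include -I/path/to/openvdb/include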

6.11.10. Wall Shear Stress Functor


The Wall Shear Stress is defined as the parallel force per unit area exerted by a fluid on a wall. In
the context of macroscopic fluid mechanics the Wall Shear Stress of a Newtonian fluid is given
by
τW = µ ∂u/∂y |y=0 ,   (6.3)

where µ is the dynamic viscosity, u is the velocity field and y the coordinate perpendicular to
the wall. The Wall Shear Stress Functor calculates the discrete Wall Shear Stress

τW = σ · n − ((σ · n) · n) · n, (6.4)

where σ is the Cauchy stress tensor and n the local unit normal vector of the surface. Since the
lattice stress tensor Π is not defined on boundary cells, it is read out from an adjacent fluid cell
in a discrete velocity direction associated with each boundary cell. The unit normal vector is
obtained by a given IndicatorF3D instance, which is slightly increased in size. See examples/
poiseuille3d for usage details. Due to the staircase approximation of the boundary, the wall
shear stress calculation is first order accurate.

Figure 6.5.: Relative Lp error (L1, L2 and L∞ norms) of the wall shear stress in poiseuille3d
over the resolution, with EOC = 1 and EOC = 2 reference slopes.

6.11.11. Error Norm Functors
While relative and absolute error norms may be calculated manually using functor arithmetic
(see 6.11.4), they are also available as distinct functors. It is therefore preferable to use
SuperRelativeErrorLpNormXD and SuperAbsoluteErrorLpNormXD if one follows the common definition of
relative and absolute error norms. Let wantedF be the simulated solution functor and f the
analytical solution.

SuperRelativeErrorLpNormXD implements ∥wantedF − f∥p / ∥wantedF∥p,   (6.5)
SuperAbsoluteErrorLpNormXD implements ∥wantedF − f∥p.   (6.6)

An example of how to use these error norm functors in practice is given by the Poiseuille flow
example as described in section 8.4.4.
1 Poiseuille2D<T> uSol(axisPoint, axisDirection, maxVelocity, radius);
2 SuperLatticePhysVelocity2D<T,DESCRIPTOR> u(sLattice, converter);
3 auto indicatorF = superGeometry.getMaterialIndicator(1);
4
5 SuperAbsoluteErrorL1Norm2D<T> absVelErrorNormL1(u, uSol, indicatorF);
6 absVelErrorNormL1(result, tmp);
7 clout << "velocity-L1-error(abs)=" << result[0];
8 SuperRelativeErrorL1Norm2D<T> relVelErrorNormL1(u, uSol, indicatorF);
9 relVelErrorNormL1(result, tmp);
10 clout << "; velocity-L1-error(rel)=" << result[0] << std::endl;

Listing 6.21: L1 velocity error in poiseuille2d

Further implementation details are touched upon in section 2.6.2.1.

6.11.12. Grid Refinement Metric Functors


SuperLatticeRefinementMetricKnudsen(2,3)D implements an automatic block-level grid refine-
ment criterion as described by Lagrava et al. in “Automatic grid refinement criterion for lattice
Boltzmann method” [110]. This criterion uses the quality of the cell-local Knudsen number
approximation as measured by SuperLatticeKnudsen*D to judge the adequacy of the block reso-
lution.

7. Flow Control and Optimization
In almost every application, the optimization of a process regarding certain objectives is of
interest. In numerical investigations, this is often the natural step subsequent to a successful
set-up of a simulation. Examples are maximizing the efficiency of a mixer, achieving the maximal
conversion in a chemical reactor, or reducing the air drag of a parameterized shape. In this
context, the goal is to find the simulation parameters which yield the best results regarding the
chosen objective. A straightforward approach is to run through the parameter sets manually, which
can of course be very demanding in terms of resources. The limitations of this approach are clear:
as soon as the number of parameters to be optimized exceeds a certain amount, it is no longer
feasible. At this point, optimal flow control comes into play: we first formulate an optimization
problem, which we then solve using methods of optimization.

OpenLB provides a powerful framework to address these problems accounting for automatic
parameter adjustments and iterative execution of simulations. Currently, the following types of
optimization problems can be addressed by our framework:

• Classical optimization – Finding optimal flow simulation parameters regarding a defined


objective

• Shape optimization – Adjusting the shape of a parameterized surface model which interacts
with the fluid

• Topology optimization – Finding an optimal body topology (distributed control problems,


i.e. control variable is distributed in the flow domain)

• Inverse problem – Minimizing the deviation between simulation and measurement (e.g. of
velocity distributions) and find flow simulation parameters reversely

Besides that, the implemented methodologies to solve the above mentioned problems can be
also used to:

• Find the minimum of an analytic function J w.r.t. an argument vector α, e.g. as a postpro-
cessing step

• Perform automated parameter studies

• Conduct sensitivity analysis regarding flow simulation parameters (as gradients are com-
puted when gradient-based approach is chosen)

This chapter aims to give a brief theoretical introduction to the concepts we apply to solve
these kinds of problems and to how they are realized within OpenLB. It is structured as follows:
Section 7.1 is about what kind of problems we can address using the optimization framework,
Section 7.2 explains the methods which are currently available in OpenLB, and Section 7.3 deals
with their implementation in OpenLB. Furthermore, there are application examples available
which demonstrate the use of the optimization framework, e.g. showcaseRosenbrock.

7.1. Problem formulation and solution strategy


The optimization framework of OpenLB is dedicated to address the solution of optimization
problems of the general form

Find α, s.t. J(f, α) is minimized while G(f, α) = 0. (7.1)

In this context, α ∈ Rd is called the control variable, f is the state, J is the objective functional
and G is the side condition. The objective functional evaluates how ’optimal’ our solution is.
The control variables can be interpreted as the degrees of freedom in the optimization problem
which we adjust in order to find the optimal solution. The state relates to the physical entities,
such as flow velocity or particle distribution function, which we compute in a simulation. The
underlying physical conservation equations which are solved during simulations are the side
conditions mentioned in (7.1). For simplicity, we assume in the following that the side condi-
tion G(f, α) = 0 yields a unique state f for any admissible control α, s.t. we have f = f (α)
(unconstrained optimization approach).
The employed solution strategy in OpenLB in order to solve problems formulated in 7.1 is
an iterative approach based on the line search concept. Starting with an initial guess, the set
of control variables αk are updated where with each iteration the objective function should be
reduced, i.e.

αk+1 = αk + △αk , with J(αk+1 ) < J(αk ). (7.2)

That is, we basically have a loop where we perform iterative adjustments to the control parame-
ters and run the simulations as a function of those controls. After completion of one simulation
we obtain the resulting state, which we use to evaluate the objective function. Depending on
the chosen strategy (gradient-based, pattern search, etc.), the control variables are updated in
order to be used in the next simulation. This basic scheme is illustrated in Algorithm 1.

Algorithm 1 Basic optimization algorithm in OpenLB
1: procedure MINIMIZE(J(f, α))
2: Choose initial guess of controls α0
3: repeat ▷ For k = 0, 1, 2, ..., kmax
4: Run simulation for current set of αk to obtain fk
5: Evaluate objective Jk (fk , αk )
6: Update controls αk+1
7: until Terminating condition fulfilled
8: end procedure
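
The following library-independent sketch illustrates the structure of Algorithm 1 for a purely
analytical objective with a fixed-step steepest descent update; within OpenLB the loop is realized
by the Optimizer classes, while the evaluation of J and its gradient is provided by the OptiCase
classes (see Section 7.3):

1 #include <cmath>
2 #include <iostream>
3 #include <vector>
4
5 // hedged sketch of Algorithm 1, not part of the OpenLB API
6 template <typename Objective, typename Gradient>
7 std::vector<double> minimize(Objective J, Gradient dJ, std::vector<double> alpha,
8                              double stepSize, int maxIt, double tol) {
9   for (int k = 0; k < maxIt; ++k) {
10     std::cout << "step " << k << ": J = " << J(alpha) << std::endl;  // evaluate objective J_k
11     const std::vector<double> g = dJ(alpha);  // gradient, e.g. from a simulation-based evaluation
12     double norm = 0;
13     for (double gi : g) { norm += gi * gi; }
14     if (std::sqrt(norm) < tol) { break; }     // terminating condition on the derivative norm
15     for (std::size_t i = 0; i < alpha.size(); ++i) {
16       alpha[i] -= stepSize * g[i];            // steepest descent update of the controls
17     }
18   }
19   return alpha;
20 }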

7.2. Gradient-based approach


Currently, all available optimization methods implemented in OpenLB are gradient-based.
The next set of control variables αk+1 is computed by using the total derivative dJ/dα of the
objective with respect to the control variables and the step size s, i.e. △α = △α(dJ/dα, s).
For the termination condition of the optimization algorithm, usually the value of the objective
functional or of its total derivative is checked against a user-defined threshold. The remaining
questions at this point are listed together with the respective subsections answering them:

1. How to choose the (descent) direction? (7.2.1)

2. How to choose the step size? (7.2.2)

3. How can we compute the derivatives? (7.2.3)

7.2.1. Descent Algorithms


The descent algorithm computes the descent direction d for the next iteration step, i.e. the direc-
tion in which we expect to approach the optimal solution. One possible approach is to use the local
gradient of the objective function, ∂J/∂α, as the descent direction. This method is referred to as
the steepest descent approach. Since only first order derivatives are considered, this method is
relatively slow (it requires more iteration steps) compared to Newton or quasi-Newton methods,
while its main advantage is its stability. Quasi-Newton methods such as LBFGS achieve higher
convergence orders since they additionally include approximated second derivatives. In OpenLB,
the steepest descent, LBFGS and Barzilai-Borwein algorithms are provided for computing the
descent direction.

7.2.2. Step control


The choice of suitable step sizes is a nontrivial task: if the step sizes are too large, we may
overshoot and miss the optimum. On the other hand, if the steps are too small, many iterations are
needed until the optimal solution is reached, resulting in a larger computational time. To find a
suitable step size, step conditions like the Armijo rule or the normal and strong Wolfe-Powell
rules can be applied. This introduces an additional inner loop in which the step size is varied
until a valid step size is found regarding the step condition (or the maximal number of attempts is
exceeded). Note that during the step finding process multiple evaluations of the objective function
and/or its derivative can be required.
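
As an illustration, the Armijo (sufficient decrease) rule accepts a trial step size s along a
descent direction d only if the objective decreases at least proportionally to the directional
derivative; a minimal sketch with the common parameter choice c = 1e-4 (an assumption, not an
OpenLB default) reads:

1 // hedged sketch of the Armijo rule: accept step s if J(alpha + s*d) <= J(alpha) + c * s * g^T d
2 bool armijoAccepted(double Jtrial, double Jcurrent, double s,
3                     double directionalDerivative, double c = 1e-4) {
4   return Jtrial <= Jcurrent + c * s * directionalDerivative;  // directionalDerivative = g^T d < 0
5 }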

7.2.3. Derivative Computation


OpenLB provides several methods to compute the gradients, which have to be chosen depend-
ing on the optimization problem one wants to tackle:

• Usage of forward/central difference quotients: evaluate J for neighboring values of α and
compute the difference quotients. This is the simplest method and, for a small number of
control variables, it is fast. However, it is the least accurate method.

• Forward automatic differentiation: evaluate J for operator-overloaded variables of type
ADf<T,n>, where T is the underlying arithmetic type and n is the number of control vari-
ables. This method is a little slower than difference quotients, but in contrast it returns
derivatives at full machine precision.

• Adjoint LBM: adjoint lattice Boltzmann equations are used, cf. [21, 24]. This method is
well suited for distributed control problems since the computational expense remains con-
stant for any number of control variables. However, this method requires the (theoretical)
derivation of the adjoint formulation of the LB equations, which is problem-specific.
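
For illustration, a central difference quotient approximation of dJ/dα can be sketched as follows;
OptiCaseCDQ realizes this internally, so the helper below is not part of the OpenLB API:

1 #include <functional>
2 #include <vector>
3
4 // hedged sketch: central difference quotient gradient of a scalar objective J
5 std::vector<double> centralDifferenceGradient(
6     const std::function<double(const std::vector<double>&)>& J,
7     std::vector<double> alpha, double h = 1e-6) {
8   std::vector<double> grad(alpha.size());
9   for (std::size_t i = 0; i < alpha.size(); ++i) {
10     const double ai = alpha[i];
11     alpha[i] = ai + h; const double Jp = J(alpha);  // J(alpha + h e_i)
12     alpha[i] = ai - h; const double Jm = J(alpha);  // J(alpha - h e_i)
13     alpha[i] = ai;
14     grad[i] = (Jp - Jm) / (2 * h);                  // central difference quotient
15   }
16   return grad;
17 }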

7.3. Implementation
The implementation is separated into two key parts: the OptiCase classes define how the ob-
jective functional J and its gradient are computed as functions of α, while the Optimizer classes
implement the optimization methods such as steepest descent or step size algorithms (independently
of how the function and gradient evaluations are performed).
The optimization framework provided in OpenLB is primarily designed for large-scale applica-
tions like fluid flow simulations, where the time critical step (regarding computation time) is the
function evaluation and not the optimization routine.

7.3.1. Optimizer Classes: Optimization Methods


The common methods steepest descent, LBFGS and Barzilai-Borwein are implemented as child
classes of the Optimizer class. Their various free parameters (e.g. an upper limit for the
number of optimization steps) can be passed via the constructor or read from an xml file. In
many situations, the control variables have to remain in a certain range in order to guarantee
physical meaningfulness and a stable simulation. For example, for porosity optimization the
porosity has to lie between 0 and 1 in order to be physically meaningful. This can be achieved
by bounding the control with user-defined thresholds (typically used for parameter optimization)
or by using a smooth projection map p : R → [0, 1] (helpful for porosity optimization). Alterna-
tively, if there is more than one control variable, a vector with the upper and lower bounds for
each variable can be used to bound the control.

7.3.2. OptiCase Classes: Gradient Computation


For the numerical evaluation of the functional gradient dJ/dα, the following four options are im-
plemented:

• Forward/central difference quotients. The method is implemented in the classes
OptiCaseFDQ and OptiCaseCDQ, where the user (only) has to provide an expression for the
evaluation of J. This expression is passed as a callable function to the constructor of the
OptiCase.

• Forward automatic differentiation via operator overloading. For this, the full source
code has to be templatized w.r.t. the arithmetic type T and one has to take care that the
arithmetic operations are differentiable (both of which is satisfied by the OpenLB code base).
One then passes two instances of J (one with type T, one with type ADf<T,n>) to the class
OptiCaseAD, which then does the rest.

• Adjoint LBM. This method requires the (theoretical) calculation of adjoint LB equations,
which is a problem-specific task. Because of that, the implementing class OptiCaseDual
poses direct assumptions on the implementation of the LB simulation (e.g., it requires
usage of the Solver framework and has so far only been implemented for porosity and
force optimization, cf. the examples domainIdentification3d and testFlowOpti3d).

For difference quotients and forward automatic differentiation, any function with the signature
T(const std::vector<T>&) (with or without an LB simulation inside) can be passed. Flow simulations
have to be encapsulated by a suitable wrapper that accepts the control variables, runs the
simulation and computes the objective; in the Solver app structure, the getCallable method
fulfills this task. An introduction to forward automatic differentiation, whose application is
not limited to optimization, is provided in the example showcaseADf.
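
A minimal sketch of such a callable is the Rosenbrock function used in showcaseRosenbrock; an
instantiation for T can be handed to OptiCaseFDQ/OptiCaseCDQ, and together with an ADf<T,n>
instantiation to OptiCaseAD:

1 #include <vector>
2
3 // hedged sketch of an objective with the expected signature T(const std::vector<T>&)
4 template <typename T>
5 T rosenbrock(const std::vector<T>& alpha) {
6   const T a = T(1);
7   const T b = T(100);
8   // J(alpha) = (a - alpha_0)^2 + b * (alpha_1 - alpha_0^2)^2
9   return (a - alpha[0]) * (a - alpha[0])
10        + b * (alpha[1] - alpha[0] * alpha[0]) * (alpha[1] - alpha[0] * alpha[0]);
11 }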

7.3.3. Miscellaneous
7.3.3.1. Projection

In some situations, it is useful to link the numerical control variable and the controlled physical
quantity via a function, which is called a projection in this context. This is best illustrated by the
following example: for topology optimization, it is a common approach to use a scalar porosity
field d : D → [0, 1] as the controlled (physical) quantity, where D is the discretized design do-
main, consisting of the respective mesh points. Using the porosity d directly as the (numerical)
control variable in the optimization, i.e. α = d, causes difficulties: we would have to bound α
between zero and one. Handling these bounds in the optimization is possible, but it is easier and
more stable to employ a bijective projection B : (−∞, ∞) → (0, 1), d = B(α) component-wise, s.t.
we obtain an unbounded optimization problem. Various typical projection functions are provided
in the file src/optimization/projection.h. Some of them are dedicated to resolving resolution-
dependent effects of the porosity, cf. [106] for more details.
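
A minimal sketch of such a projection is the sigmoid function together with its inverse; the
projections actually shipped in src/optimization/projection.h may differ in detail:

1 #include <cmath>
2
3 // hedged sketch: bijective projection B : R -> (0,1) and its inverse
4 double project(double alpha) { return 1.0 / (1.0 + std::exp(-alpha)); }  // d = B(alpha)
5 double unproject(double d)   { return std::log(d / (1.0 - d)); }         // alpha = B^{-1}(d)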

7.3.3.2. Serialization

For distributed optimization problems, where the control is given by a spatial field, we
need a mapping between the (serial) control vector and the two- or three-dimensional field. We
call this mapping a serialization. Both its evaluation and its inversion are necessary in
practice. Two variants are implemented in the file src/optimization/serialization.h: The
SimpleGeometrySerializer uses a Cartesian mapping serial_index = cuboid_offset + NX*NY*z
+ NX*y + x. Here, cuboid_offset is the position of the cuboid in the control vector, NX and NY
are the extents of the cuboid, and x, y, z are the local coordinates of the mesh point. This
method is simple and fast, but the resulting number of control variables is independent of the
size of the design domain (the domain where the control variables are active). Hence, the length
of the control vector may be much larger than the size of the design domain. All in all, this is
well suited for adjoint optimization, where the computational effort does not depend on the number
of control variables.
This is different for direct optimization with difference quotients or forward AD. In that case,
the SparseGeometrySerializer is usually better, since it indexes only the cells within the design
domain and hence keeps the number of control variables at a minimum.
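
For illustration, the Cartesian mapping used by the SimpleGeometrySerializer corresponds to the
following index function; the actual class additionally provides the inverse mapping and handles
the cuboid decomposition:

1 // hedged sketch of serial_index = cuboid_offset + NX*NY*z + NX*y + x
2 int serialIndex(int cuboidOffset, int NX, int NY, int x, int y, int z) {
3   return cuboidOffset + NX * NY * z + NX * y + x;
4 }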

7.3.4. Parameter Explanation and Reading from XML


In this short overview the relevant parameters for an app with optimization are listed with the
respective names for using an xml-file for the input. The different parameters of each part in the
xml-file are explained via entries of the following form:

Parameter | Name (type) | [declaration & definition] (&& and other usages) | Default value | Explanation: (if available, all) possibilities

The explanation of each column is as follows:


• Parameter: Name of the parameter in the xml-file
• Name (type): Name of the parameter in the source code and its data type in brackets.
• [declaration & definition] (&& other usages): Location of declaration and definition (&& and
sometimes some other important usages) of the parameter, e.g.:
[solver.h & solver.hh] (&& example.cpp)
If the location of the declaration and definition is the same, only one location is indicated, e.g.
[solver.hh]
If there is more than one location for the declaration and definition, it is indicated with an and, e.g.
[solver.h & solver.hh] and [optiCaseDual.hh]
If there are different possibilities for the location of declaration and definition, it is indicated with
curved brackets () and an or, e.g.
([optimizerSteepestDescent.h] or [optimizerLBFGS.h] or [optimizerBarzilaiBorwein.h])
• Default value: Is this parameter essential or optional?
If a parameter is optional, it does not need to be defined. Then, the default value can be seen in this
column.
If a parameter is of such importance that without it the program has to exit, it is labeled as EXIT.
Some parameters are indicated with unused which means, that the parameter is read but not used
afterwards.
• Explanation: (if available, all) possibilities: Brief description and explanation of the parameter.
Some parameters have different possibilities for their definition. In this case, all available possibili-
ties are also offered in bold type letters, e.g.:
ad for OptiCaseAD or
dual for OptiCaseDual or
adTest for OptiCaseADTest
The arrangement of the parameters in the xml-file has the following structure:
example:

<Param>
<Optimization>
<MaxStepAttempts> 20 </MaxStepAttempts>
</Optimization>
</Param>

• ControlType: _controlType (std::string); [optiCaseDual.h & optiCaseDual.hh]. Defines the control type to optimize: force or porosity.
• ControlMaterial: _controlMaterial (int); [optiSolverParameters.h] (&& optiCaseDual.h and optiCaseDual.hh); default 0. Defines the number of the control material, here 6 for porosity optimization problems.
• FieldDimension: _fieldDim (int); [optiSolverParameters.h] (&& optiCaseDual.h and optiCaseDual.hh); default 0. Spatial dimension of the controlled field.
• DimensionControl: _dimCtrl (int); ([optiCaseDual.h & optiCaseDual.hh]) and ([optimizer.h & optimizer.hh]) and ([optimizerSteepestDescent.h] or [optimizerLBFGS.h] or [optimizerBarzilaiBorwein.h]) and [controller.h]; EXIT. Upper limit for the number of control variables, so far not read from xml.
• Lambda: _lambda (double); ([optimizerSteepestDescent.h] or [optimizerLBFGS.h] or [optimizerBarzilaiBorwein.h]) (&& optimizerLineSearch.h); default 1. Determines the initial step length lambda.
• MaxIter: _maxIt (int); [optimizer.h & (optimizerSteepestDescent.h or optimizerLBFGS.h or optimizerBarzilaiBorwein.h)]; default 100. Maximal number of iterations of the optimizer.
• MaxStepAttempts: _maxStepAttempts (int); [optimizer.h & (optimizerSteepestDescent.h or optimizerLBFGS.h or optimizerBarzilaiBorwein.h)]; default 20. Maximal number of attempts at each optimization step for finding a suitable step size.
• FailOnMaxIter: _failOnMaxIter (bool); [optimizer.h] (&& optimizer.hh or optimizerLBFGS.h or optimizerLineSearch.h); default 1 (true). If true, the optimization fails when reaching _maxIt and prints the warning: Optimization problem failed to converge within specified iteration limit of _maxIt iterations with tolerance of _eps.
• Tolerance: _eps (double); [optimizer.h] (&& optimizer.hh); default 1e-10. Tolerance of the optimizer: the optimizer stops if the norm of the vector of derivatives of the objective functional is smaller than this tolerance.
• L: _l (int); [optimizerLBFGS.h]; default 20. Maximal number of stored iteration steps for the LBFGS algorithm.
• Verbose: _verboseOn (bool); [optimizer.h & optimizer.hh] (&& optimizerLineSearch.h); default 1 (true). Print warnings and further output in the terminal, if true.
• InputFileName: fname (std::string); ([optimizerSteepestDescent.h] or [optimizerLBFGS.h] or [optimizerBarzilaiBorwein.h]) (&& optimizer.h and optimizerLineSearch.hh); default "control.dat". Name of the file that contains the initial guess for the control values.
• ControlTolerance: _controlEps (double); [optimizer.h & optimizerSteepestDescent.h and optimizerLBFGS.h and optimizerBarzilaiBorwein.h] (&& optimizer.hh); default 0. Optimization stops if the change of the control variables is less than this tolerance.
• StepCondition: stepCondition (std::string); ([optimizerSteepestDescent.h] or [optimizerLBFGS.h] or [optimizerBarzilaiBorwein.h]) (&& optimizerLineSearch.h); defaults "Armijo" (steepest descent), "StrongWolfe" (LBFGS), "None" (Barzilai-Borwein). Defines the step condition: None or Smaller or Armijo or Wolfe or StrongWolfe.
• VectorBounds: _vectorBounds (bool); [optimizer.h & optimizerLBFGS.h or optimizerBarzilaiBorwein.h] (&& optimizer.h and optimizerLineSearch.h); default 0 (false). Determines whether bounds on the control variables are applied component-wise.
• Projection: _projectionName (std::string); [optiCaseDual.h & optiCaseDual.hh]. Mapping method between the physical and computational control variables: Sigmoid or Rectifier or Softplus or Baron or Krause or Foerster or FoersterN or StasiusN.
• Reference: _computeReferenceSolution (bool); [optiCaseDual.h & optiCaseDual.hh]; default false. States if a reference solution is available.
• StartValue: startValue (double); (&& optimizer.h and optimizer.hh); default 0. Determines the initial guess for the optimization algorithm.
• UpperBound: _upperBound (double); [optimizer.h & optimizerSteepestDescent.h and optimizerLBFGS.h and optimizerBarzilaiBorwein.h] (&& optimizer.hh); default false in all three optimizers. Sets an upper bound for the control values during the optimization process.
• LowerBound: _lowerBound (double); [optimizer.h & optimizerSteepestDescent.h and optimizerLBFGS.h and optimizerBarzilaiBorwein.h] (&& optimizer.hh); default false in all three optimizers. Sets a lower bound for the control values during the optimization process.

VisualizationGnuplot:
• VisualizedParameters: gplotAnalysisString (std::string); [optimizerSteepestDescent.h] or [optimizerLBFGS.h] or [optimizerBarzilaiBorwein.h]. Lists the parameters that will be plotted by Gnuplot during the optimization process. Possibilities: VALUE, CONTROL, DERIVATIVE, ERROR, NORM_DERIVATIVE.
8. Examples

8.1. Example Overview


A list of the currently included examples, grouped by folder, is given below. Each example relates
to one or more of the following keywords: multi-component, multi-phase, porous media, turbulent,
thermal, particles, transient flow, STL geometry, geometry primitives, benchmark, showcase and
checkpointing; a subset of the examples is recommended for beginners.

• adsorption: adsorption3D, microMixer3D
• advectionDiffusionReaction: advectionDiffusionReaction2d, advectionDiffusionReaction2dSolver, convectedPlate3d, reactionFiniteDifferences2d, advectionDiffusion1d, advectionDiffusion2d, advectionDiffusion3d, advectionDiffusionPipe3d
• laminar: bstep2d, bstep3d, cavity2d, cavity2dSolver, cavity3d, cavity3dBenchmark, cylinder2d, cylinder3d, poiseuille2d(Eoc), poiseuille3d(Eoc), powerLaw2d, testFlow3dSolver
• multiComponent: airBubbleCoalescence3d, binaryShearFlow2d, contactAngle2d, contactAngle3d, fourRollMill2d, microFluidics2d, phaseSeperation2d, phaseSeperation3d, rayleighTaylor2d, rayleighTaylor3d, waterAirflatInterface2d, youngLaplace2d, youngLaplace3d
• optimization: domainIdentification3d, parameterIdentificationPoiseuille2d, showcaseADf, showcaseRosenbrock, testFlowOpti3d
• particles: bifurcation3d, dkt2d, magneticParticles3d, settlingCube3d
• porousMedia: city3d, porousPoiseuille2d, porousPoiseuille3d, resolvedRock3d
• thermal: galliumMelting2d, porousPlate2d, porousPlate3d, porousPlate3dSolver, rayleighBernard2d, rayleighBernard3d, squareCavity2d, squareCavity3d, stefanMelting2d
• turbulent: aorta3d, channel3d, nozzle3d, tgv3d, venturi3d
• freeSurface: breakingDam2d, breakingDam3d, deepFallingDrop2d, fallingDrop2d, fallingDrop3d, rayleighInstability3d
All the demo codes can be compiled with or without MPI, with or without OpenMP, and executed in
serial or parallel.

8.2. adsorption
8.2.1. adsorption3D
This example shows the adsorption in a batch reactor using an Euler–Euler approach. The model is based
on the linear driving force model and uses advection diffusion reaction lattices for particles, solute and
particle loading. Different isotherms and mass transfer models can be used. An analytical solution is
implemented when using the linear isotherm and surface diffusion.

8.2.2. microMixer3D
This example portrays the adsorption in a static mixing reactor using an Euler–Euler approach.
Analogous to the previous example, the model is based on the linear driving force model and uses
advection diffusion reaction lattices for particles, solute and particle loading. Different isotherms
and mass transfer models can be used.

8.3. advectionDiffusionReaction
8.3.1. advectionDiffusionReaction2d
This example illustrates a steady-state chemical reaction in a plug flow reactor. One can choose
between two types of reaction, A −→ C and A ←→ C. The concentration and the analytical solution
along the centerline of the rectangular domain are given in ./tmp/N<resolution>/gnuplotData,
as well as the error plot for the concentration along the centerline. The default configuration
executes three simulation runs and the average L2-error over the centerline is computed for each
resolution. A plot of the resulting experimental order of convergence is provided in
./tmp/gnuplotData/.

8.3.2. reactionFiniteDifferences2d
Similarly to the previous example, a simplified domain with no fluid motion and homogeneous species
concentrations is simulated, but here with finite differences. The chemical reaction |a|A −→ |b|B
is approximated, with a given reaction rate ν = A/t0, where t0 is a time conversion factor. The initial
conditions are set to A(t = 0) = 1 and B(t = 0) = 0 such that an analytical solution is possible via
 
A(t) = exp(−|a| t / t0),   (8.1)

B(t) = (b/a) [1 − exp(−|a| t / t0)].   (8.2)

By default, the executable produces a plot in ./tmp/gnuplotData/ which is given in Figure 8.1 below.

Figure 8.1.: Solutions to concentration profiles in reactionFiniteDifferences2d: analytical and
numerical curves for A and B over time.

8.3.3. advectionDiffusion1d
The advectionDiffusion1d example showcases second order mesh convergence of LBM for scalar
linear one-dimensional advection–diffusion equations [40] of the form

∂t χ + ∂x F (χ) − µ∂xx χ = 0, (8.3)

where χ : X × I → R is the conservative variable dependent on space x ∈ X ⊆ R and time t ∈ I ⊆ R_0^+,
F ≡ uχ is a linear function defined by the advection velocity u ∈ R, and µ > 0 denotes the diffusion coeffi-
cient. Hence, the LBM approximates the transport of the conservative variable χ along a one-dimensional
line with periodic boundary conditions on X = [−1, 1]. The initial pulse is defined by a sine profile, which
is subsequently diffused and advected. An analytical solution at point x and time t is given by

χ⋆(x, t) = sin[π(x − ut)] exp(−µπ²t).   (8.4)

In practice the simulation uses a two-dimensional square domain which is evaluated along a centerline to
obtain the desired one-dimensional result. The domain is initialized with χ⋆ (x, t = 0).
Diffusive scaling is applied which results in the input parameters listed in Table 8.1. In the de-
fault setting, advectionDiffusion1d executes three simulation runs with increasing resolutions N =
50, 100, 200, respectively. Each simulation recovers µ = 1.5 and a Péclet number of Pe = 40/3.

diffusive scaling △t = △x² for Pe = 40/3

N     uL    △x
50    0.4   0.04
100   0.2   0.02
200   0.1   0.01

Table 8.1.: Default simulation parameters of advectionDiffusion1d with µ = 1.5

The output of each simulation run is stored in the tmp/N<number> directory. At each simulation time
step the average L2 relative error over the centerline is computed. Said average is then stored within the
respective resolution directory ./gnuplotData/data/averageL2RelError.dat. Additionally, the
program averages the values in averageL2RelError.dat for each simulation run, which in turn is writ-
ten to the global error file tmp/gnuplotData/data/averageSimL2RelErr.dat. For post-processing,
a python3 script can be executed via

python3 advectionDiffusion1dPlot.py

The script requires the matplotlib python package which can be installed on any platform by issuing
the following commands in a terminal:

python3 -m pip install -U pip


python3 -m pip install -U matplotlib

The script generates basic error plots for every file with the file extension .dat in ./tmp. Finally, a global
log-log error plot with reference curves is extracted from the data contained in averageSimL2RelErr.
dat.

8.3.4. advectionDiffusion2d
The example advectionDiffusion2d acts as a mesh-convergence test for a solution to the scalar linear
two-dimensional advection–diffusion equation

∂t χ + ∇x F (χ) − µ∆x χ = 0, (8.5)

where χ : X × I → R is the conservative variable dependent on space x ∈ X ⊆ R² and time t ∈ I ⊆ R_0^+,
F ≡ uχ is a linear function defined by the advection velocity u = (ux, uy)^T ∈ R², and µ > 0 denotes the
diffusion coefficient. Similarly, the analytical solution is given for any point x = (x, y)T and time t as

χ⋆(x, y, t) = sin[π(x − ux t)] sin[π(y − uy t)] exp(−2µπ²t).   (8.6)

The simulation is executed on a square X = [−1, 1]2 which is periodically embedded in R2 . An error
norm over the domain measures the deviation from the analytical solution up to the time step at which
the initial pulse is diffused below 10%. For the default setting (µ = 0.05 and P e = 100), the outputs of
three subsequent simulation runs are stored in a subfolder structure in ./tmp and directly post-processed
for visualization. A sequence of contour plots is generated with the highest computed resolution N = 200
and contained in ./tmp/N200/imageData. Note that via issuing the command

python3 advectionDiffusion2dPlot.py

an error plot can be produced, which numerically validates the second order convergence in space.

8.3.5. advectionDiffusion3d
The example advectionDiffusion3d acts as a mesh-convergence test for a numerical solution to the
initial value problem

∂t χ(x, t) + ∇x F(χ(x, t)) − µ∆x χ(x, t) = 0   in X × I,
χ(x, 0) ≡ χ0(x)   in X,    (8.7)

where χ : X × I → R is the conservative variable dependent on space x ∈ X ⊆ R³ and time t ∈ I ⊆ R_0^+,
F ≡ uχ is a linear function defined by the advection velocity u = (ux, uy, uz)^T ∈ R³, and µ > 0 denotes
the diffusion coefficient. Note that the domain X = [−1, 1]³ is periodic.
The example implements a smooth initial profile χs0 (x) and an unsmooth version χu0 (x). The former is
a three-dimensional extrusion of the initial pulse in the advectionDiffusion2d example, such that the
equation admits the analytical solution [2, 40, 41]

χ⋆,s(x, y, z, t) = sin[π(x − ux t)] sin[π(y − uy t)] sin[π(z − uz t)] exp(−3µπ²t).   (8.8)

The latter comprises a Dirac delta at x0 as initial pulse, which induces a Dirac comb as
super-positioned analytical solution [2]

χ⋆,u(x, y, z, t) = 1/√(4πµt) · Σ_{k∈Z} exp( −(x − x0 − ux t + 2k)² / (4µt) ) + 1.   (8.9)

For each case several error norms over the domain measure the deviations from the analytical solution.
For the default setting the outputs of three subsequent simulation runs are stored in a subfolder structure
in ./tmp and directly post-processed for visualization. Via issuing the command

python3 advectionDiffusion3dPlot.py

an error plot is produced, which numerically validates the second order convergence for both
initializations, under the constraints on the grid Péclet number derived in [2].

8.3.6. advectionDiffusionPipe3d
This example implements a spreading Gaussian density package advecting within a square duct pipe
with velocity U . The precise description of the test-case can be found in [2]. Whereas the velocity is
computed via approximating the incompressible Navier–Stokes equations with a D3Q19 BGK LBM, the
advection–diffusion equation for the density package is solved with finite differences (FD). Four different
FD schemes can be employed within the example. The advantages of each of the schemes for a broad
range of P e are documented in [2].

8.3.7. convectedPlate3d
This example implements a convection and diffusion driven transport of a species from a plate where the
convective flow is a uniform plug flow streaming over the plate surface. The results are compared against

an analytic solution taken from [108]. It is given as

C(x, y) = Cp erfc( y / √(4Dx/u0) ),   (8.10)
where x is the coordinate along the plate, y the coordinate perpendicular to the plate, Cp the species
concentration on the plate surface, erfc(x) the complementary error function, D the diffusivity constant,
and u0 the convective velocity.

8.3.8. longitudinalMixing3d
This example models a transient longitudinal mixing process which is used to validate the implemented
Robin-type boundary condition. In a one-dimensional bed of length L, a solute concentration C(x, t)
behaves according to

∂t C + u∂x C = D∂x2 C, 0 ≤ x ≤ L, (8.11)

where u is the constant displacement velocity in x direction and D is a constant diffusion coefficient.
The initial condition is

C(x, 0) = 0, 0≤x≤L (8.12)

while the boundary conditions

uC(0, t) − D∂x C(0, t) =uCf , t≥0 (8.13a)

∂x C(L, t) =0, t≥0 (8.13b)

are imposed at the ends of the domain. Under these conditions the analytical solution C ∗ to the system is
found in [0] and is given by

C∗(x, t) = Cf [ (1/2) erfc( (x − ut) / (2√(Dt)) ) + √(u²t/(πD)) exp( −(x − ut)²/(4Dt) )
            − (1/2) (1 + ux/D + u²t/D) exp(ux/D) erfc( (x + ut) / (2√(Dt)) ) ].   (8.14)

This is an asymptotic solution that holds for small t. It is considered valid for t < 5 [0].

8.4. laminar
8.4.1. bstep2d and bstep3d
This example implements the fluid flow over a backward facing step. Furthermore, it is shown how
checkpointing is used to regularly save the state of the simulation. The 2D geometry corresponds to
Armaly et al. [82].

8.4.2. cavity2d, cavity2dSolver and cavity3d


This example illustrates a flow in a cuboid, lid-driven cavity. The 2D version also shows how to use the
XML parameter files and has an example description file for OpenGPI. This example is available in two
different versions for sequential and parallel use. The 2dSolver-version illustrates the use of the solver
class concept (cf. Section 10.11) together with the XML parameter interface.

8.4.3. cylinder2d and cylinder3d


This example examines a steady flow past a cylinder placed in a channel. The cylinder is offset somewhat
from the center of the flow to make the steady-state symmetrical flow unstable. At the inlet, a Poiseuille
profile is imposed on the velocity, whereas the outlet implements a Dirichlet pressure condition set by
p = 0, inspired by [133]. For high resolution, low latticeU, and enough time to converge, the results for
pressure drop, drag and lift lie within the estimated intervals for the exact results. An unsteady flow with
Karman vortex street can be created by changing the Reynolds number to Re = 100. The 3D version also
shows the usage of the STL-reader. The model was created using the open source CAD tool FreeCAD [78].

8.4.4. poiseuille2d and poiseuille3d


For basic tests of boundary conditions, a comparison with analytical solutions is the easiest and most
accurate approach. One of the fundamental applications of fluid dynamics is that of laminar flow of a
Newtonian fluid in a circular pipe. This is known as Poiseuille flow. The analytical solution is easily
found and is therefore a common benchmark case (see Figure 8.2). It is also one of the first examples in
most fluid dynamics text books for the application of the principles of fluid dynamics. The extension of
the Poiseuille flow in a round pipe from 2D to 3D is trivial, consequently it is also an ideal test case for
curved boundaries in 3D as well.

8.4.5. poiseuille2dEoc and poiseuille3dEoc


The examples poiseuille2dEoc and poiseuille3dEoc are extensions of the poiseuille2d and
poiseuille3d examples, respectively, focusing on the experimental order of convergence (EOC) of the
simulation. The implementations therein run the Poiseuille flow simulations multiple times to subse-
quently analyze the different error norms of the simulations with an automated gnuplot output (cf. Sec-
tion 6.5.1). As an example, the poiseuille3dEoc example is simulated for multiple configurations: the

Figure 8.2.: Geometry setup in example poiseuille3d with boundary patches (Γin, Γout, Γwall) and
velocity profile.

periodic pipe flow with forcing and several boundary methods at the walls (the keywords interpolated,
bouzidi, and bounce back are used to specify the implementations in Section 5.1), as well as a bounded
domain case where a velocity inlet and a pressure outlet are prescribed and combined with a bouzidi
boundary at the pipe wall. For all simulations, the Reynolds number of the flow problem is set to
Re = 10 and the relaxation time to τ = 0.8. The EOC is investigated by running four consecutive
simulations with increasing grid resolution (number of lattice cells along the pipe diameter:
ND = 21, 31, 41, 51). A residual below 1e-5 stops each simulation via the convergence criterion
(see Section 4.5.2 and Section 10.5). The absolute error of the simulation from the analytic solution
is evaluated on every lattice node, whereby different norms for the error calculation are used. The
L1-norm is defined as
Eabs,L1 = Σ_{i=1}^{N} Σ_{d=1}^{D} |ϕsim,i,d − ϕana,i,d| △x^D,   (8.15)

where N is the number of lattice nodes, D is the number of dimensions of the variable ϕ, ϕsim is the
solution of a flow variable obtained by the simulation, ϕana is the corresponding analytic solution, and △x
is the grid spacing. The L2 -norm is defined as
Eabs,L2 = ( Σ_{i=1}^{N} Σ_{d=1}^{D} |ϕsim,i,d − ϕana,i,d|² △x^D )^{1/2}.   (8.16)

Finally, the L∞ -norm is defined as

Eabs,L∞ = max_{i,d} |ϕsim,i,d − ϕana,i,d|.   (8.17)

The relative error according to the L1 -norm is defined as

Erel,L1 = Eabs,L1 / ( Σ_{i=1}^{N} Σ_{d=1}^{D} |ϕana,i,d| △x^D ),   (8.18)
the L2 -norm as

Erel,L2 = Eabs,L2 / ( Σ_{i=1}^{N} Σ_{d=1}^{D} |ϕana,i,d|² △x^D )^{1/2},   (8.19)

and the L∞ -norm as

Erel,L∞ = Eabs,L∞ / max_{i,d} |ϕana,i,d|.   (8.20)
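
As an illustration, the absolute norms (8.15) to (8.17) can be assembled from nodal errors as in
the following sketch for a scalar quantity; in the examples these norms are provided by the error
norm functors of Section 6.11.11:

1 #include <algorithm>
2 #include <cmath>
3 #include <vector>
4
5 struct Norms { double l1; double l2; double linf; };
6
7 // hedged sketch: discrete L1, L2 and Linf norms of nodal errors e_i = phi_sim,i - phi_ana,i
8 // for a scalar quantity; dx is the grid spacing and D the spatial dimension
9 Norms absoluteErrorNorms(const std::vector<double>& e, double dx, int D) {
10   const double cellVolume = std::pow(dx, D);
11   Norms n{0.0, 0.0, 0.0};
12   for (double ei : e) {
13     n.l1  += std::abs(ei) * cellVolume;
14     n.l2  += ei * ei * cellVolume;
15     n.linf = std::max(n.linf, std::abs(ei));
16   }
17   n.l2 = std::sqrt(n.l2);
18   return n;
19 }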

The results of the EOC tests are plotted in Figures 8.3, 8.4, 8.5, 8.6, and 8.7.

Figure 8.3.: EOC study for periodic pipe flow simulations (poiseuille3dEoc) with forcing
and interpolated boundary walls. The absolute and relative error of the velocity and
the strain-rate (SR) are plotted: (a) absolute error, (b) relative error.

8.4.6. powerLaw2d
This example describes a steady non-Newtonian flow in a channel. At the inlet, a Poiseuille profile is
imposed on the velocity, whereas the outlet implements a Dirichlet pressure condition set by p = 0.

8.4.7. testFlow3dSolver
This app implements a Navier–Stokes flow with an analytical solution [0]. The standard simulation as well
as an EOC computation for various error norms are shown. The implementation makes use of the Solver
framework, while simulation parameters are read from the corresponding parameter XML file. Among
others, the following options can be selected:
• The flow domain can be either a cube or restricted to a sphere (node [Application][Domain])
• Different LBM-boundary conditions (all of them implement a Dirichlet fixed velocity, node [Appli-
cation][BoundaryCondition])

Figure 8.4.: EOC study for periodic pipe flow simulations (poiseuille3dEoc) with forcing
and bouzidi boundary walls. The absolute and relative error of the velocity and the
strain-rate (SR) are plotted: (a) absolute error, (b) relative error.

Figure 8.5.: EOC study for periodic pipe flow simulations (poiseuille3dEoc) with forcing
and bounce-back boundary walls. The absolute and relative error of the velocity and
the strain-rate (SR) are plotted: (a) absolute error, (b) relative error.

In particular, the EOC computation can be used to benchmark various aspects of the LBM implementation,
e.g. the boundary conditions and the collision steps. The result of an example run is shown in Figure 8.8. The
class structure is as follows: TestFlowBase implements the basic simulation details (geometry, boundary
conditions etc.). Since it shall later be used for adjoint optimization, it inherits from AdjointLbSolverBase
. The specification TestFlowSolver then implements the standard simulation. The distinction between
TestFlowBase and TestFlowSolver is solely because TestFlowBase will also be used for optimization (cf.

Figure 8.6.: EOC study for pipe flow simulations (poiseuille3dEoc) with velocity inlet, pres-
sure outlet and bouzidi boundary walls. The absolute and relative error of the velocity
and the strain-rate (SR) are plotted: (a) absolute error, (b) relative error.

Figure 8.7.: EOC study for pipe flow simulations (poiseuille3dEoc) with velocity inlet, pres-
sure outlet and bouzidi boundary walls. The absolute and relative error of the pres-
sure are plotted: (a) absolute error, (b) relative error.

examples in Section 8.5.5).

Figure 8.8.: EOC study for the example testFlow3dSolver on a sphere, as it is returned via
the gnuplot interface.

8.5. optimization
8.5.1. domainIdentification3d
In this example, a domain identification problem is solved: a cubic obstacle in the middle of a cubic fluid
flow domain has to be identified given only the surrounding fluid flow, cf. [19]. Therefore, an optimization
problem is set up, where the porosity field α in the design domain has to be found such that the error J
between the resulting velocity field u and the original velocity field u∗ is minimized. Hence, the
objective is J := (1/2) ∫_Ω̃ |u − u∗|² dx. The example employs adjoint lattice Boltzmann methods for
gradient computation.
Results are shown in Figure 8.9.
The implementation makes use of the Solver framework and the XML interface for parameter reading,
cf. Section 10.11. In addition to the typical parameters for simulation and optimization, the following
geometric domains are read from the xml file:

• <ObjectiveDomain>: The domain Ω̃, where the objective functional is computed.

Figure 8.9.: Results of domain identification after 97 optimization steps. Left: identified object,
surrounding velocity field and streamlines. Right: relative decrease of the objective
functional and its derivative w.r.t. the design variables. We can see that the objective
hardly changes after ca. 30 optimization steps.

• <SimulationObject>: The object that is to be identified by the example; it is used for the computa-
tion of the reference solution u∗.

• <DesignDomain>: the domain, where the porosity field is computed by the optimization algo-
rithm.

The OpenLB software provides the createIndicator methods to create these domains or objects.
For example, a three-dimensional cuboid with edge length 1 at the origin can be created with
createIndicatorF3D<T> reading from the xml file, using the input
<IndicatorCuboid3D extend = "1 1 1" origin = "0 0 0"/>.
These domains depend on the specific optimization problem and have to be created or modified for
each new problem.
The following optimization parameters are specific in the context of domain identification and set in the
xml file:

• ReferenceSolution: decide whether the reference velocity u∗ is computed via a simulation

• StartValueType: select between Control, Permeability and Porosity

8.5.2. poiseuille2dOpti
This is a simple showcase for a two-dimensional fluid flow optimization/ parameter identification prob-
lem: It is based on the simulation of a planar channel flow similar to the example poiseuille2d. By an
optimization loop, the inlet pressure is determined s.t. a given mass flow rate is achieved. Hence, we
solve the following optimization problem:

argmin_{p0} (1/2) (m(u(p0), p(p0)) − m∗)²,   (8.21)
where (u(p0 ), p(p0 )) is the solution of the Navier-Stokes equations corresponding to the inlet pressure p0 ,
m is the mass flow rate corresponding to that solution, and m∗ is the wanted mass flow rate.
The optimization is performed with the LBFGS method; the derivative of the objective w.r.t. the argu-
ment (inlet pressure) is computed with automatic differentiation.

8.5.3. showcaseADf
This app gives an introduction into the usage of forward automatic differentiation in order to compute
derivatives of any numerical quantities in OpenLB. It is written in the style of literate programming as a
tutorial that should allow for being read sequentially.

8.5.4. showcaseRosenbrock
This app gives an introduction into the usage of optimization functionalities in OpenLB in order to com-
pute optima of functions or simulations. It is written in the style of literate programming as a tutorial that
should allow for being read sequentially.

8.5.5. testFlowOpti3d
This app is built on top of the example laminar/testFlow3dSolver and solves parameter identification
problems with an optimization approach [0]. It makes use of the Solver framework to create the following
variants of application:

• Computing sensitivity of flow quantities w.r.t. the force field with Difference Quotients (DQ) or
Automatic Differentiation (AD)

• Optimization with AD: scale the force field s.t. the velocity or dissipation error w.r.t. a given
flow is minimized. Three control variables scale the components of the force field. The objective is
J = (1/2) ∫_Ω |u − u∗|² dx, where u∗ is the reference solution field (for control = (1, 1, 1)) and u is the
simulated solution (velocity or dissipation) with "estimated" control parameters.

• Optimization with adjoint LBM: identify the force field s.t. the velocity or dissipation error w.r.t. a
given flow is minimized. The objective is as above, but the control is the (spatially) distributed field
– three components at each mesh point.

The basic simulation setup is implemented in the base class TestFlowBase. Further optimization-specific
implementations (which are used in all optimization variants, e.g. objective computation) are done in
TestFlowOptiBase. For DQ and AD, TestFlowSolverDirectOpti implements the case-specific features, e.g.
that the vector of control variables acts as a scaling factor for the force field. For adjoint LBM, case-specific
features are implemented in TestFlowSolverOptiAdjoint. The structure of parameter classes corresponds
to that of the solver classes.
Simulation and optimization parameters are read from the corresponding parameter xml file. The fol-
lowing optimization parameters are specific for this example and set in the xml file:

• ReferenceSolution: decide whether reference solution u∗ is computed via a simulation

• TestFlowOptiMode: decide, whether velocity or dissipation is compared in the objective functional

• OptiReferenceMode: use analytical or discrete (simulated) solution as u∗

• ControlType (force): type of control parameters in the context of adjoint optimization

• CuboidWiseControl: if true, then the domain is decomposed into an arbitrary number of cuboids
and one control variable is set as a scaling factor for each cuboid (this is meaningful if the
sensitivities are computed with finite difference quotients or automatic differentiation).

The results of example runs with AD and adjoint LBM are shown in Figure 8.10.

Figure 8.10.: Decrease of the objective for increasing optimization steps for different resolutions
(left: AD, right: adjoint LBM). The trends differ since the number of control vari-
ables is fixed in the AD case while it increases with the resolution in the adjoint
optimization case.

8.6. multiComponent
The examples in this folder demonstrate the use of the pseudo-potential model (section 4.4.1) and the
free-energy model (section 4.4.3) for multi-component and multi-phase flows.

8.6.1. airBubbleCoalescence3d
Two bubbles of air in quiescent water are initialized close to each other so that they start to coalesce
from the first time step under standard conditions. The mixture consists of water, oxygen and nitrogen
and uses the same multi-component multi-phase model and equation of state as the flat water-air interface
example, see Section 8.6.8 below. It is important to mention that for these thermodynamic models,
currently only periodic boundary conditions are available in OpenLB.

8.6.2. binaryShearFlow2d
A circular domain of one fluid phase is immersed in a rectangle filled with another fluid phase. The
top and bottom walls are moving in opposite directions, such that the droplet shaped phase is exposed

to shear flow and deforms accordingly. The default parameter setting is taken from [107] and injected
into the more general ternary free energy model from [125]. Both scenarios, breakup and steady state of
the initial droplet, are implemented and visualized as .vtk output. Reference simulations are provided
in [43].

8.6.3. contactAngle2d and contactAngle3d


In this example a semi-spherical droplet of fluid is initialized within a different fluid at a solid boundary.
The contact angle is measured as the droplet comes to equilibrium. This is compared with the analyti-
cal angle predicted by the parameters set for the boundary (100 degrees for preset values). This example
demonstrates how to use the solid wetting boundaries for the free-energy model with two fluid compo-
nents.

8.6.4. fourRollMill2d
Here, a spherical domain filled with one fluid phase is immersed in a square filled with another phase
of equal density and viscosity. Four circular structures which represent roller sections are equidistantly
distributed in the corners of the domain. The bottom left and top right cylinders spin in counterclockwise
direction, whereas the top left and bottom right cylinders spin in clockwise direction. A velocity
field of extensional type deforms the initial droplet accordingly. Depending on the non-dimensional
parameter setting in the example header, the droplet reaches a steady state or breaks up. Reference
simulations are provided in [43].

8.6.5. microFluidics2d
This example shows a microfluidic channel creating droplets of two fluid components. Poiseuille velocity
profiles are imposed at the various channel inlets, while a constant density outlet is imposed at the end of
the channel to allow the droplets to exit the simulation. This example demonstrates the use of three fluid
components with the free energy model. It also shows the use of open boundary conditions, specifically
velocity inlet and density outlet boundaries.

8.6.6. phaseSeparation2d and phaseSeparation3d


In these two examples, the simulation is initialized with a given density plus small, random variation
over the domain. This condition is unstable and leads to liquid-vapor phase separation. Boundaries are
assumed to be periodic. These examples illustrate the usage of multiphase flow models akin to the ones
proposed in [126] in OpenLB.

8.6.7. rayleighTaylor2d and rayleighTaylor3d


This example demonstrates Rayleigh–Taylor instability in 2D and 3D, generated by a heavy fluid penetrat-
ing a light one. The multicomponent fluid model by X. Shan and H. Chen is used [126]. These examples

show the usage of multicomponent flow models in periodic domains.

8.6.8. waterAirflatInterface2d
This example illustrates how to compute the thermodynamic equilibrium of a mixture described by an
empirical equation of state with complex mixing rules using a multi-component-multi-phase LBM. The
multi-phase description is based on the work of Czelusniak et al. [92], while the multi-component force
splitting is based on Peng et al. [121]. We focus on the case of a non-curved phase interface with periodic
boundary conditions.

8.6.9. youngLaplace2d and youngLaplace3d


In this example the two-component free energy model is used in its simplest configuration to perform
a Young–Laplace pressure test. A circular or spherical domain of a fluid with radius R is immersed in
another fluid. A diffusive interface forms and the pressure difference across the interface, ∆p, is calculated
and compared to that given by the Young–Laplace equation,

∆p = γ/R = α (κ1 + κ2) / (6R)     for 2D,                                     (8.22)
∆p = 2γ/R = α (κ1 + κ2) / (3R)    for 3D.                                     (8.23)

The parameters α and κi are input parameters to the simulation which define the interfacial width and
surface tension, γ, respectively. The pressure difference is calculated between a point in the middle of the
circular domain and a point furthest away from it in the computational domain.
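Note that (8.22) and (8.23) imply that the surface tension follows directly from the input parameters as
γ = α (κ1 + κ2) / 6, so the measured pressure difference can be compared with a value that is known a priori.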

8.7. particles
8.7.1. bifurcation3d
The bifurcation3d example simulates particulate flow through an exemplary bifurcation of the human
bronchial system. The geometry is a splitting pipe, with one inflow and two outflows. The fluid is trans-
porting micrometer scale particles, and the escape and capture rates are computed. There exist two imple-
mentations of the problem. The first one is an Euler–Euler ansatz, meaning that the fluid phase as well as the
particle phase are modelled as continua. The second is an Euler–Lagrange ansatz, where the particles are
modelled as discrete objects.

8.7.1.1. eulerEuler

In this example the particles are viewed as a continuum and described by an advection–diffusion equation.
This is done similarly to the thermal examples, where the temperature is the considered quantity. For par-
ticles, however, inertia has to be taken into account. This is achieved by applying the Stokes drag force to
the velocity field. Since this computation also requires the velocity of the previous time step, the
descriptor ParticleAdvectionDiffusionD3Q7Descriptor has to be used, which is capable of storing two
velocity fields. Besides an extra lattice for the advection–diffusion equation, a SuperExternal3D structure
is required to manage the communication for parallel execution.
1 SuperExternal3D<T,ADDESCRIPTOR,descriptors::VELOCITY> sExternal(
2 superGeometry,
3 sLatticeAD,
4 sLatticeAD.getOverlap());
5
6 ...
7
8 sExternal.communicate();

The function communicate() is called in the time loop and handles the communication analogously to the
lattices.
Furthermore, the dynamics object ParticleAdvectionDiffusionBGKdynamics is required to access
the saved velocity fields correctly and use them in an efficient way. For information on the coupling of the
lattices we refer to the section on the advection–diffusion equation for particle flow problems, Section 4.5.5.2. In
this example only the Stokes drag is applied by
1 advDiffDragForce3D<T, NSDESCRIPTOR> dragForce( converter,radius,partRho );

For the simulation of particles as a continuum, additional boundary conditions are required. Here,
setZeroDistributionBoundary represents a unidirectional outflow condition that removes particle
concentrations crossing a boundary. For the usual outflow at the bottom of the bifurcation, an
AdvectionDiffusionConvectionBoundary for advection–diffusion lattices can be applied, which approx-
imates a Neumann boundary condition; for further reference see [48]. Since non-local computations (a gra-
dient is required) are performed on the external field, a Neumann boundary condition is also required,
which is implemented here as setExtFieldBoundary.

8.7.1.2. eulerLagrange

The main task of this example is to show the use of Lagrangian particles with OpenLB. As this example is
used to show the application of the particle framework (Section 8.7), implementation specifics can be found there.

8.7.2. dkt2d
OpenLB provides an alternative approach to conventional resolved particle simulation methods, referred
to as the homogenised lattice Boltzmann method (HLBM). It was introduced in "Particle flow simulations
with homogenised lattice Boltzmann methods" by Krause et al. [22] and extended for the simulation of
3D particles in Trunk et al. [48]. It was eventually revisited in [50]. In this approach the porous media
model, introduced into LBM by Spaid and Phelan [130], is extended by enabling the simulation of moving
porous media. In order to avoid pressure fluctuations, the local porosity coefficient is used as a smoothing
parameter. For solid-solid interactions, a discrete contact model is used [31].
The example dkt2d employs said approach for the sedimentation of two particles under gravity in a
water-like fluid in 2D. The rectangular domain is limited by no-slip boundary conditions. This setup is
usually referred to as a drafting–kissing–tumbling (DKT) phenomenon and is widely used as a reference
setup for the simulation of particle dynamics submerged in a fluid. The benchmark case is described e.g.
in [96, 135]. For the calculation of forces a DNS approach is chosen which also leads to a back-coupling of
the particle on the fluid, inducing a flow. The example demonstrates the usage of HLBM in the OpenLB
framework as well as the utilization of the gnuplot-writer to print simulation results.
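A minimal sketch of how the gnuplot-writer is typically used is given below; the plot name and the logged
quantity velocityY are illustrative placeholders, and the exact calls can be found in the example source.
1 Gnuplot<T> gplot( "particleVelocity" );
2 // inside the time loop: log the current physical time and the quantity of interest
3 gplot.setData( converter.getPhysTime( iT ), velocityY, "settling velocity" );
4 // after the time loop: render the collected data points
5 gplot.writePNG();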

8.7.3. magneticParticles3d
Warning: This example can currently only be run sequentially! High-gradient magnetic separation is a
method to separate ferromagnetic particles from a suspension. The simulation shows the deposition of
magnetic particles on a single magnetized wire and models the magnetic separation step of the complete
process.

8.7.4. settlingCube3d
The case examines the settling of a cubical silica particle under gravity in a surrounding fluid. The rect-
angular domain is limited by no-slip boundary conditions. For the calculation of forces a DNS approach
is chosen which also leads to a back-coupling of the particle on the fluid, inducing a flow. The exam-
ple demonstrates the usage of HLBM in the OpenLB framework, the use of the particle decomposition
scheme [32], as well as the utilization of the gnuplot-writer to print simulation results (Section 8.7.2).

8.8. porousMedia
8.8.1. city3d
This example uses HLBM to simulate a city or a similar structure as a porous medium. An example geometry
of the KIT campus can be downloaded with the script download_large_geometry.sh. To use this script,
type ./download_large_geometry.sh in your Linux terminal. The example uses the atmBoundaryLayer from Sec-
tion 6.11.8 as inlet profile to change the initial velocity depending on the height. It comes with an XML file
to change parameters of the UnitConverter and atmBoundary without the need to recompile; a minimal
sketch of reading such parameters is given below.
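A minimal sketch of reading such parameters with the XMLreader is shown below; the file name and the
tag names are assumptions and must match the XML file shipped with the example.
1 XMLreader config( "input.xml" );
2 int N = 0;
3 T charPhysVelocity = 0;
4 // tag names are illustrative and have to match the XML structure of the example
5 config["Discretization"]["Resolution"].read( N );
6 config["PhysParameters"]["CharPhysVelocity"].read( charPhysVelocity );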

8.8.2. porousPoiseuille2d and porousPoiseuille3d


These examples simulate Poiseuille flows through porous media. The implementation reproduces the
benchmark Example A by Guo and Zhao [102]. The theoretical maximum velocity is calculated as in [102,
Equation (21)], and the velocity profile as in [102, Equation (23)]. For a schematic simulation setup see
Figure 8.12.

8.8.3. resolvedRock3d
This example illustrates the consideration of resolved porous media. It does this by importing a VTI file
representing the geometry of the rock, applying a constant pressure gradient in one direction and
periodic boundary conditions in all other directions. The no-slip boundary condition on the channel walls
is enforced using bounce back. Please refer to Section 5.1 for more details on boundary conditions.

Figure 8.11.: Showcase of the city3d example with the geometry of the Karlsruhe Institute of
Technology Campus South.

Figure 8.12.: Geometry used in the example porousPoiseuille2d with boundary patches
and velocity profile.
It is also possible to convert a TIFF series, which is for example provided on the Digital Rocks Portal
(https://www.digitalrocksportal.org), to a VTI file by following these steps:

1. Provide or Download a Medium: Obtain the TIFF series of the desired medium, for instance from
the Digital Rocks Portal (https://www.digitalrocksportal.org/projects/92 [119]).

2. Convert the TIFF Series to VTI using ParaView [75]:


a) Open ParaView [75] and import the TIFF series.
b) If the "cell data to point data" filter is available, apply it. (Note: If the filter is grayed out, it is
not necessary to apply it.)
c) Save the resulting data as a VTI file.
d) When saving, use ASCII formatting.
e) Export only one array from the data and make a note of its name for use as a command line
argument to start the simulation (see below).

To start the simulation, use the following parameters in the specified order:

• Filename: Name of the VTI file representing the porous medium.

• Array name: Name of the array exported from ParaView [75] containing the data.

• Scaling-factor: Factor to scale the dimensions of the simulation.

• Time-scaling-factor: Factor to scale the time dimension of the simulation.

• Resolution: Resolution of the y-length.

• Pressure Drop: Magnitude of the applied pressure drop in Pa.

For example:
mpirun -np 6 ./resolvedRock3d rock.vti "Tiff Scalars" 2.5e-6 1.0 200 1.0
Note that -np 6 specifies the number of processors, or cores, that will be used to run the simulation, hence
it may differ depending on the available resources. Ensure you replace the filename rock.vti with the
name of your VTI file, the array name "Tiff Scalars" with the name of the array you exported from
ParaView [75], and adjust the values for scaling factor, resolution, and pressure drop according to your
specific setup requirements.
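A minimal sketch of how these six arguments could be read at the beginning of main is shown below;
variable names are illustrative and error handling is omitted, see resolvedRock3d.cpp for the actual parsing.
1 int main( int argc, char* argv[] )
2 {
3   std::string fileName      = argv[1];               // VTI file representing the porous medium
4   std::string arrayName     = argv[2];               // array name exported from ParaView
5   double      scalingFactor = std::atof( argv[3] );  // scaling factor for the dimensions
6   double      timeScaling   = std::atof( argv[4] );  // time-scaling factor
7   int         resolution    = std::atoi( argv[5] );  // resolution of the y-length
8   double      pressureDrop  = std::atof( argv[6] );  // pressure drop in Pa
9   // ...
10 }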

8.9. reaction
8.9.1. advectionDiffusionReaction2d
This example illustrates a forward or a reversible reaction of a substance into another one (A→C or A↔C).
The concentration of each substance is modeled with a one-dimensional advection diffusion equation. We
consider a steady state in a plug flow reactor. This means that it is time independent and we have a
constant velocity field u. The chemical reaction is modeled with a linear but coupled source term. The
equations read

u ∂x cA = D ∂x² cA − kH cA + kR cC,
u ∂x cC = D ∂x² cC + kH cA − kR cC,                                           (8.24)

with the diffusion coefficient D > 0, forward reaction rate coefficient kH > 0 and backwards reaction rate
coefficient kR > 0 (= 0 in case of A → C). This LBM models the transport of the species concentration
along a one-dimensional line on [0, 10]. In practice the simulation uses a two-dimensional rectangular
domain which is evaluated along a centerline to obtain the desired one-dimensional result. The height of
the domain depends on the resolution which holds the number of voxels for the height constant.

On the bottom and the top of the rectangle, periodic boundaries are applied, and at the inlet and outlet the con-
centrations from the analytical solution are set. The solution is given by

cA(x) = cA,0 e^(λx) + cA,0 · kR/(kH + kR) · (1 − e^(λx)),                     (8.25)
cC(x) = cA,0 − cA(x),                                                         (8.26)

with λ = (u − √(u² + 4(kH + kR) D)) / (2D).
Diffusive scaling is applied, with a physical diffusivity of D = 0.1 and a flow rate of u = 0.5, which leads to
a Péclet number of Pe = 100.
Every species has its own lattice, which is stored in a vector. In the simulate method we iterate over every
element of the vector adlattices; a minimal sketch is given at the end of this subsection.
In the default setting, advectionDiffusionReaction2d executes three simulation runs with in-
creasing resolutions N = 200, 250, 300, respectively. The output of each simulation run is stored in
the tmp/N<number> directory. It contains a plot centerConcentrations.pdf of the concentra-
tions and the analytical solution along the centerline and an error plot of the numerical and analyt-
ical solution along the centerline ErrorConcentration.pdf. After each simulation has converged
the average L2 relative Error over the centerline is computed. Said average is then stored within
tmp/gnuplotData/data/averageL2RelError.dat. The order of convergence can be seen in the
log-log error plot in tmp/gnuplotData/concentration_eoc.png. The EOC plot is only done for one
species (A or C) and the species can be chosen in the return statement of the method errorOverLine.
One can select the reactionType a2c or a2cAndBack which automatically provides the data
for modelling the reaction. It contains the number of reactions, the reaction rate coefficients
physReactionCoeff[numReactions] (kH , kR ), the number of species numComponents and their names,
the stoichiometric coefficients stochCoeff, which are sorted according to the number of reactions and in-
side each reaction block according to the species number. Finally we assume that the reaction rate satisfies
a power law depending on the concentration of the species. The exponent is given by the reaction order
reactionOrders which is sorted in the same way as stochCoeff. In the example cases these exponents are
always 1. The chemical reaction itself is represented as a source term for each Advection Diffusion equa-
tion. This source term is calculated in the ConcentrationAdvectionDiffusionCouplingGenerator for
every species which can handle arbitrary number of species and reactions and stored in the field SOURCE.
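A minimal sketch of the per-species lattice handling mentioned above is given below; the names are
illustrative, each species owns one advection–diffusion lattice that is advanced in every time step.
1 std::vector<std::shared_ptr<SuperLattice<T,ADDESCRIPTOR>>> adlattices;
2 // ... one lattice per species is created and appended to the vector ...
3 for ( auto& adlattice : adlattices ) {
4   adlattice->collideAndStream();
5 }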

8.10. thermal
8.10.1. galliumMelting2d
The solution for the melting problem (solid-liquid phase change) coupled with natural convection is found
using the lattice Boltzmann method after Huang and Wu [104]. The equilibrium distribution function for
the temperature is modified in order to deal with the latent-heat source term. That way, iteration steps or
solving a group of linear equations is avoided, which results in enhanced efficiency. The phase interface is
located by the current total enthalpy, and its movement is considered by the immersed moving boundary
scheme after Noble and Torczynski [120]. This method was validated by comparison with experimental
values (e.g. Gau and Viskanta [98]).

8.10.2. porousPlate2d, porousPlate3d and porousPlate3dSolver


The porous plate problem is implemented as described in [101] and [122]. To test the coupled model’s
accuracy and to determine its EOC, we use a numerical simulation of the porous plate problem including
a temperature gradient and natural convection in a square cavity. The porous plate problem describes a
channel flow, where the upper cool plate moves with a constant velocity, and through the bottom warm
plate a constant normal flow is injected and withdrawn at the same rate from the upper plate. At the
left and right hand side of the domain a periodic boundary condition is applied and constant velocity and
temperature boundary conditions are applied to the top and bottom plates according to Figure 8.13.

Figure 8.13.: Schematic representation of the porous plate's simulation setup including the
boundary conditions.

An analytical solution for the given steady state problem is given for the velocity and temperature
distributions by

ux(y) = ux,0 (e^(Re·y/L) − 1) / (e^Re − 1),                                   (8.27)
T(y) = T0 + ∆T (e^(Pr·Re·y/L) − 1) / (e^(Pr·Re) − 1).                         (8.28)
Here ux,0 is the upper plate's velocity and Re = uy,0 L / ν the Reynolds number depending on the injected veloc-
ity uy,0, the fluid's viscosity ν and the channel length L. The temperature difference between the hot and
cold plate is given by ∆T = Th − Tc. First we implement a couple of simulations to scale the velocity and
temperature profiles for a range of Reynolds and Prandtl numbers. The relative global error is

computed via [101]

E = √( Σi |T(xi) − Ta(xi)|² ) / √( Σi |Ta(xi)|² ),                            (8.29)

where the summation is over the entire system and Ta is the analytical solution (8.28). The
porousPlate3dSolver example implements the same simulation, but additionally illustrates the ap-
plication of the solver class concept.
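A minimal sketch of evaluating (8.29) is given below, assuming the numerical and analytical temperature
values have already been sampled into two vectors of equal length; the variable names are illustrative.
1 T num = 0, den = 0;
2 for ( std::size_t i = 0; i < temperatureNumerical.size(); ++i ) {
3   const T diff = temperatureNumerical[i] - temperatureAnalytical[i];
4   num += diff * diff;
5   den += temperatureAnalytical[i] * temperatureAnalytical[i];
6 }
7 const T relativeError = std::sqrt( num ) / std::sqrt( den );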

8.10.3. rayleighBenard2d and rayleighBenard3d


The Rayleigh–Bénard convection is a typical case of natural convection, where the lower boundary is
heated and a regular pattern of convection cells is developed. This is a suitable test platform for thermal
algorithms, since the driving force is a coupling between momentum and energy equations by means of
a buoyancy force, which is function of the temperature, and the temperature varies spatially inside the
domain. This example demonstrates Rayleigh–Bénard convection rolls in 2D and 3D, simulated with the
thermal LB model by Guo et al. [100], between a hot plate at the bottom and a cold plate at the top.

8.10.3.1. Setup

The case considered has an aspect ratio (AR = Lx/Ly) of 2, which enhances the appearance of unstable
modes. The lower wall is heated with a constant temperature (T = 1), and the upper wall is isothermal
and cold (T = 0). The vertical walls are set to be periodic.
Among the example programs implemented in OpenLB, a demo code for the Rayleigh–Bénard convec-
tion in 2D and 3D is provided. This code is taken as a base for the development of most of the thermal
applications. For the simulation of the Rayleigh–Bénard convection only one modification is made to the
code regarding the initial conditions: to enhance the appearance of the convection cells, an instability
in the domain is introduced. The available code initializes a small area near the lower boundary with a
slightly higher temperature, introducing a perturbation in the system, whereas the rest of the domain is
initialized with the cold temperature. In the modified code there is no local perturbation, but the initial
temperature at the domain is dependent on the space coordinates. The domain is initialized with zero
velocity and a temperature field by using a functor according to

T(x, y, t = 0) = Tmax [ (1 − y/Ly) + 0.1 cos(2π x/Lx) ].                      (8.30)

The files created to help with the initialization of the temperature field are called tempField.h and
tempField2.h. Listing 8.1 shows the corresponding usage. The first file computes the temperature at
every point of the lattice, as a function of its macroscopic position, and then this value is applied on the
lattice as the density (Line 3). The second file calculates the equilibrium distribution functions for every
node corresponding to the given temperature and zero velocity. Next, the populations are defined for the
desired material number in Line 4.
1 TemperatureField2D<T,T> Initial( converter );

Figure 8.14.: Schematic diagram of the simulation domain for the example squareCavity2d.

2 TemperatureFieldPop2D<T,T> EqInitial( converter );


3 ADlattice.defineRho( superGeometry, 1, Initial );
4 ADlattice.definePopulations( superGeometry, 1, EqInitial );

Listing 8.1: Initialization of the temperature field

In (8.30), the y-dependent part of the equation matches the stationary solution of the problem, correspond-
ing to a case where there is no fluid movement and the heat transfer only occurs by conduction. The cosine
term introduces a disturbance in the system, which enhances the appearance of the convection cells.
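A possible sketch of such an initial temperature functor, implementing (8.30) as an AnalyticalF2D, is
given below; the actual implementation is provided in tempField.h and the names used here are illustrative.
1 template <typename T, typename S>
2 class InitialTemperature2D : public AnalyticalF2D<T,S> {
3 private:
4   T _Tmax, _lx, _ly;
5 public:
6   InitialTemperature2D( T Tmax, T lx, T ly )
7     : AnalyticalF2D<T,S>(1), _Tmax(Tmax), _lx(lx), _ly(ly) { }
8   bool operator()( T output[], const S input[] ) override
9   {
10     // Eq. (8.30): linear conductive profile in y plus a cosine perturbation in x
11     output[0] = _Tmax*( (1. - input[1]/_ly) + 0.1*std::cos( 2.*M_PI*input[0]/_lx ) );
12     return true;
13   }
14 };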

8.10.3.2. Simulation Parameters

Computations are run for a range of different Rayleigh numbers (3·10³, 6·10⁵) and Prandtl numbers (0.3, 1).
The spatial resolution was fixed to 100 cells in the y-direction, and the time discretization was switched
between 10⁻³ and 10⁻⁴, which gives lattice velocities of 0.1 and 0.01, respectively. The convergence criterion
is applied on the average energy, and it is set to a precision of 10⁻⁵.

8.10.4. squareCavity2d and squareCavity3d


A common application for the validation of thermal models is the numerical simulation of the natural
convection in a square cavity. For this configuration there is an extensive database in a wide range of
Rayleigh numbers, which allows to verify the accuracy of the thermal model.

8.10.4.1. Setup

The problem considered is shown schematically in Figure 8.14. The horizontal walls of the cavity are
adiabatic, while the vertical walls are kept isothermal, with the left wall at high temperature (Thot = 1) and
the right wall at low temperature (Tcold = 0).

The dynamics chosen for the velocity field is ForcedBGKdynamics, and for the temperature field
AdvectionDiffusionBGKdynamics.

8.10.4.2. Simulation Parameters

Taking air at 293 K as working fluid, the value of the Prandtl number is Pr = 0.71 and is kept constant. The
Rayleigh number ranges from 10³ to 10⁶. Different spatial resolutions are tested for each Rayleigh number,
in order to study the grid convergence. The time-step size is adjusted so that the lattice velocity stays at the
value 0.02. This ensures that the Mach number is kept at incompressible levels. The convergence criterion
is set by a standard deviation of 10⁻⁶ in the kinetic energy.
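A sketch of such a convergence check, as used in many OpenLB examples, is shown below; the check
interval of one physical second and the precision value are illustrative choices.
1 util::ValueTracer<T> converge( converter.getLatticeTime( 1.0 ), 1e-6 );
2 // inside the time loop: track the average energy and stop once it has converged
3 converge.takeValue( sLattice.getStatistics().getAverageEnergy(), true );
4 if ( converge.hasConverged() ) {
5   break;
6 }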

8.10.4.3. MRT

The newly implemented MRT model for thermal applications is first examined on the two-dimensional cavity.
The only setup differences to the BGK model are the lattice descriptors (ForcedMRTD2Q9Descriptor and
AdvectionDiffusionMRTD2Q5Descriptor) and the dynamics objects selected, which are now specialized
for the MRT dynamics (ForcedMRTdynamics and AdvectionDiffusionMRTdynamics). This simulation is
used as a test for different important aspects of the implementation. First, the formulation of the MRT
model, particularly the values of the transformation matrix, the relaxation times and the sound speed of
the lattice are based on [113]. With this MRT model, variations over 10% with respect to the BGK model are
observed. A second formulation [114] is selected, which shows much closer results to the BGK model. No
special treatment is required to make use of the available boundary conditions. The number of iterations
required to achieve the desired precision, that is, the number of time steps until the steady-state solution
is reached, is found to be usually higher for the MRT simulations. Furthermore, the execution time is
between 4 and 8 times longer when compared to the BGK simulations.
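An illustrative sketch of the corresponding type selections is given below; the constructor arguments are
omitted here and remain the same as in the BGK setup.
1 // descriptors specialized for MRT (names as given in the text above)
2 using NSDESCRIPTOR = ForcedMRTD2Q9Descriptor;
3 using TDESCRIPTOR  = AdvectionDiffusionMRTD2Q5Descriptor;
4 // dynamics objects specialized for MRT
5 ForcedMRTdynamics<T,NSDESCRIPTOR> bulkDynamics( /* ... */ );
6 AdvectionDiffusionMRTdynamics<T,TDESCRIPTOR> advectionDiffusionDynamics( /* ... */ );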

8.10.5. stefanMelting2d
The solution for the melting problem (solid-liquid phase change) is computed using the LBM from Huang
and Wu [104]. The equilibrium distribution function for the temperature is modified in order to deal with
the latent-heat source term. That way, iteration steps or solving a group of linear equations is avoided,
which results in enhanced efficiency. The phase interface is located by the current total enthalpy, and its
movement is considered by the immersed moving boundary scheme after Noble and Torczynski [120].
Huang and Wu validated this method by the problem of conduction-induced melting in a semi-infinite
space, comparing its results to analytical solutions.

8.11. turbulent
8.11.1. aorta3d
In this example, the fluid flow through a bifurcation is simulated. The geometry is obtained from a mesh
in STL format. With Bouzidi boundary conditions, the curved boundary is adequately mapped and ini-
tialized entirely automatically. A Smagorinsky LES BGK model is used for the dynamics to stabilize the
turbulent flow simulation for low resolutions. The output is the flux computed at the inflow and outflow
region. The results have been validated through comparison with other results obtained with FEM and
FVM.

8.11.2. channel3d
This example features the application of wall functions in a bi-periodic, fully developed turbulent channel
flow for friction Reynolds numbers of Reτ = 1000 and Reτ = 2000. For the published results and further
reference see [11].

8.11.3. nozzle3d
On the one hand this example describes building a cylindrical 3D geometry in OpenLB, on the other hand
it examines turbulent flow in a nozzle injection tube using different turbulence models and Reynolds
numbers.
For characterization different physical parameters have to be set. The resolution N defines most phys-
ical parameters such as the velocity charU, the kinematic viscosity ν and two characteristic lengths charL
and latticeL. Physical length charL is used to characterize the geometry and the Reynolds number.
Lattice length latticeL defines the mesh size and is calculated as latticeL = charL/N . More informa-
tion about the parameter definitions are located in the file units.h. Figure 8.15 illustrates the geometry

Table 8.2.: This table shows the preset simulation parameters.


parameter            value
charL                1 m
latticeL             1/3 m
charU                1 m/s
ν                    0.00002 m²/s
Reinlet              5000
turbulence model     Smagorinsky

and the nozzle’s size as a function of the characteristic length charL. The nozzle consists of two circular
cylinders. The inflow (red) is located left in the inletCylinder. The outflow (green) is at the right end of the
injectionTube. At the main inlet, either a block profile or a power 1/7 profile is imposed as a Dirichlet veloc-
ity boundary condition, whereas at the outlet a Dirichlet pressure condition is set by p = 0 (i.e. rho = 1).
Two vectors, origin and extend, describe the center and normal direction of the cylinder’s circular start
(origin) and end (extend) plane. The radius is defined in the function.
As mentioned before, this example simulates turbulent fluid flow. The flow behavior in the inlet is char-
acterized by the Reynolds number. The following turbulence models are based on large eddy simulation
(LES). The idea behind LES is to simulate only eddies larger than a certain grid filter length, while smaller
eddies are modeled. Several models are currently implemented, e.g.:

Figure 8.15.: Cross section of a 3D geometry of nozzle3d in dependency of characteristic
length charL.

• The Smagorinsky model reduces the turbulence to a so called eddy viscosity. This viscosity de-
pends on the Smagorinsky constant, which has to be defined. This model has certain disadvantages
at the wall.

• The Shear-improved Smagorinsky model (SISM) is based on the Smagorinsky model. Compared
to the original model, the SISM works at the wall very well. Similarly, a model specific constant has
to be defined.

The following code shows the model selection. A model is selected when the corresponding line is uncom-
mented. Below, the model-specific constants are defined. In this case the Smagorinsky model is selected and
the Smagorinsky constant is set to 0.15.
1 /// Choose your turbulent model of choice
2
3 #define Smagorinsky
4
5 ...
6
7 #elif defined(Smagorinsky)
8 bulkDynamics = new SmagorinskyBGKdynamics<T, DESCRIPTOR>(converter.getOmega(),
instances::getBulkMomenta<T, DESCRIPTOR>(),
9 0.04, converter.getLatticeL(), converter.physTime());

As an example, Figure 8.16 shows the results with preset parameters. The simulation strongly depends

Figure 8.16.: Physical velocity field after 200 seconds with preset parameters (Smagorinsky
Model, CS = 0.15, latticeL = 1/3 m, Reinlet = 5000).

on the Smagorinsky constant's value used in the turbulence model. However, the constant cannot be
calculated in general and is only valid for one model. It could be a function of the Reynolds number and/or another
dimensionless parameter. Thus, a physically useful value has to be found by trial and error, or chosen as
an educated guess in the beginning. Generally, if the constant’s value is chosen too small, the simulation
results will become unstable and/or unphysical. If the value is too large, the model will introduce too
much artificial viscosity and smooth the results.

8.11.4. tgv3d
The Taylor–Green vortex (TGV) is one of the simplest configurations to investigate the generation of small
scale structures and the resulting turbulence. The cubic domain Ω = (2π)³ with periodic boundaries and
the single mode initialization contribute to the model’s simplicity. In consequence, the TGV is a common
benchmark case for direct numerical simulation (DNS) as well as large eddy simulation (LES). This exam-
ple demonstrates the usage of different sub-grid models and visualizes their effects on global turbulence
quantities. The molecular dissipation rate, the eddy dissipation rate and the effective dissipation rate
are calculated and plotted over the simulation time. The results can be compared with a DNS solution
published by Brachet et al. [86].

8.11.5. venturi3d
This example examines a steady flow in a Venturi tube. A Venturi tube is a cylindrical tube, which has
a reduced cross-section in the middle part. At this constriction is an injection tube. As a result of the
accelerating fluid in the constriction, the static pressure decreases and the injection tube’s fluid is pumped

Figure 8.17.: Increasing Smagorinsky constant smoothens results and straightens the appearing
turbulence.

Figure 8.18.: Isosurface of vorticity for the Taylor–Green vortex at t = 12s [23].

in the main tube. The overall geometry is built by adding together single bodies. Each body's geometry
is defined by certain points (position vectors) in the coordinate system and their radii. A cone-shaped
cylinder needs the centers of the start and end circles as well as the radii. The following code builds the
geometry and shows the semantics.

1 /// Definition of the geometry of the venturi
2
3 //Definition of the cross-sections’ centers
4 Vector<T,3> C0(0,50,50);
5 Vector<T,3> C1(5,50,50);
6 Vector<T,3> C2(40,50,50);
7 Vector<T,3> C3(80,50,50);
8 Vector<T,3> C4(120,50,50);
9 Vector<T,3> C5(160,50,50);
10 Vector<T,3> C6(195,50,50);
11 Vector<T,3> C7(200,50,50);
12 Vector<T,3> C8(190,50,50);
13 Vector<T,3> C9(115,50,50);
14 Vector<T,3> C10(115,25,50);
15 Vector<T,3> C11(115,5,50);
16 Vector<T,3> C12(115,3,50);
17 Vector<T,3> C13(115,7,50);
18
19 //Definition of the radii
20 T radius1 = 10 ; // radius of the tightest part
21 T radius2 = 20 ; // radius of the widest part
22 T radius3 = 4 ; // radius of the small exit
23
24 //Building the cylinders and cones
25 IndicatorCylinder3D<T> inflow(C0, C1, radius2);
26 IndicatorCylinder3D<T> cyl1(C1, C2, radius2);
27 IndicatorCone3D<T> co1(C2, C3, radius2, radius1);
28 IndicatorCylinder3D<T> cyl2(C3, C4, radius1);
29 IndicatorCone3D<T> co2(C4, C5, radius1, radius2);
30 IndicatorCylinder3D<T> cyl3(C5, C6, radius2);
31 IndicatorCylinder3D<T> outflow0(C7, C8, radius2);
32 IndicatorCylinder3D<T> cyl4(C9, C10, radius3);
33 IndicatorCone3D<T> co3(C10, C11, radius3, radius1);
34 IndicatorCylinder3D<T> outflow1(C12, C13, radius1);
35
36 //Addition of the cylinders to overall geometry
37 IndicatorIdentity3D<T> venturi(cyl1 + cyl2 + cyl3 + cyl4 + co1 + co2 + co3);

Figure 8.19 visualizes the defined point positions and Figure 8.20 shows the computational geometry. At

Figure 8.19.: Schematic diagram visualizing the defined point positions for venturi3d.

Figure 8.20.: Built geometry as used in simulation of example venturi3d.

Figure 8.21.: Simulation of the example venturi3d after 200 simulated time steps.

the main inlet, a Poiseuille profile is imposed as a Dirichlet velocity boundary condition, whereas at the
outlet and the minor inlet, a Dirichlet pressure condition is set by p = 0 (i.e. ρ = 1). Figure 8.21 visualizes
the computed velocity magnitude in the Venturi tube geometry.

8.12. freeSurface
The free surface approach [132] is a numerical method for simulating two phases, where one of the phases does
not have to be simulated in full detail. In the provided examples of this category, these two phases are usually
water and air, with the air phase being the one that is handled in a simplified manner.

8.12.1. breakingDam2d and breakingDam3d


The breakingDam2d example is based on a physical experiment conducted by LaRocque et al. [111]. An
enclosed box contains an area of fluid in the lower left corner. With the start of the simulation, the fluid
spreads throughout the box, with a visible wave forming after the fluid reaches the right-hand side wall.

Figure 8.22.: Initial breaking dam setup of breakingDam2d with the fluid in grey.

The example breakingDam3d extends the 2D example to 3D. A full simulation run of the
breakingDam3d example, visualized in ParaView, can be found on the OpenLB YouTube page
(https://youtu.be/X8yeLCkUldQ).

8.12.2. fallingDrop2d and fallingDrop3d


This example type simulates a drop falling into a pool of the same liquid. Figure 8.23 shows an excerpt of
the first couple of time steps. On the right half of these steps, the forming of a so-called crown can be seen.

Figure 8.23.: Setup and some exemplary steps of the example fallingDrop2d.

8.12.3. deepFallingDrop2d
A variation of the fallingDrop2d example type with a deeper pool of liquid that a drop will fall into.
This adapted pool depth allows changes to the droplet properties, such as size, density or velocity, to be
more apparent in the simulation.

8.12.4. rayleighInstability3d
This example covers Plateau-Rayleigh instabilities. Figure 8.24 shows the initial setup as well as multiple
steps throughout a simulation of this example. Each of these steps after the initial setup was captured after
1800 additional simulation steps. The perturbation was made by setting the fill level on the border of the
main cylinder according to a sine wave with a wave length set to δ = 9R with R being the radius of the
cylinder. This initial sine wave can be seen in the setup step: Red cells are full, blue cells are empty.

Figure 8.24.: Initial setup and different steps of the Plateau-Rayleigh Instability in
rayleighInstability3d.

9. Building and Running
OpenLB is developed for high performance computing. As such, Linux-based systems are the first-class
target platform. The reason for this is that most HPC clusters and, in fact, all of the 500 fastest supercom-
puters in the world (cf. https://www.top500.org/) run some kind of Linux distribution.

9.1. Install Dependencies


GNU Make in addition to a reasonably current C++ compiler supporting C++20 is all that is needed
to compile and run non-parallelized OpenLB applications. OpenLB is able to utilize vectorization
(AVX2/AVX-512) on x86 CPUs and NVIDIA GPUs for block-local processing. CPU targets may addi-
tionally utilize OpenMP for shared memory parallelization while any communication between individual
processes is performed using MPI. It has been successfully employed for simulations on computers rang-
ing from low-end smartphones up to supercomputers.
The present release has been explicitly tested in the following environments:

• NixOS 22.11 and unstable (Nix Flake provided)

• Ubuntu 20.04, 22.04

• Red Hat Enterprise Linux 8.x (HoreKa, BwUniCluster2)

• Windows 10, 11 (WSL)

• MacOS 13

as well as compilers:

• GCC 9 and later

• Clang 13 and later

• Intel C++ 2021.4 and later

• NVIDIA CUDA 11.4 and later

• NVIDIA HPC SDK 21.3 and later

• MPI libraries OpenMPI 3.1, 4.1 (CUDA-awareness required for Multi-GPU); Intel MPI 2021.3.0 and
later

Other CPU targets are also supported, e.g. common smartphone ARM CPUs and Apple M1/M2.

9.1.1. Linux
It is recommended to work on a Linux-based machine. Please ensure to have the above-mentioned depen-
dencies installed (Section 9.1). Further description is provided below in Section 9.1.3.

9.1.2. Mac
Configuring OpenLB on MacOS is explained for the release 1.5 using MacOS 11.6 in the technical report
TR6: Configuring OpenLB on MacOS [79].

9.1.3. Windows
The preferable approach is to use the Windows Subsystem for Linux (WSL) introduced in Windows 10.
The installation procedure described here has been tested with OpenLB 1.7 and Windows 10 x64.

1. Install Windows Subsystem for Linux (WSL) with the Ubuntu distribution1 as described on
https://learn.microsoft.com/en-us/windows/wsl/install

2. Open your terminal (CMD) and type wsl to start the subsystem

3. Before installing the required libraries run:


sudo apt-get update

4. Next, install the g++ compiler, which you will need to compile C++ programs:
sudo apt-get install g++ make

5. To benefit from the efficient parallelization, you will probably want to run the program on more
than one core, so it is recommended to install Open-MPI:
sudo apt-get install openmpi-bin openmpi-doc libopenmpi-dev

6. Download the latest OpenLB release from


http://www.openlb.net/download/

7. Copy the downloaded file in your Linux user directory


a) Open your file explorer
b) Open your Linux file system (see Figure 9.1)
c) Go to Ubuntu/home/<Username> and copy the file

8. Type cd to get into your user directory

9. Type tar xvfz <filename> to unpack the folder 2

10. Finally, change to the root folder of OpenLB and type make to compile the software library and all
examples. If your system is set up correctly, you should see a lot of compiler messages but no errors.

1 If problems occur at this stage, it is not related to OpenLB. Please check resources on the web or post in the forum
2 <filename> needs to be replaced with the filename of the .tgz file

Figure 9.1.: File explorer in Windows 10 showing the Linux directory.

9.2. Compiling OpenLB Programs


OpenLB consists of generic, template-based code, which needs to be included in the code of application
programs, and dependency libraries that are to be linked with the program. The installation process is
light and does not require an explicit precompilation and installation of libraries. Instead, it is sufficient to
unpack the source code into an arbitrary directory. Compilation of libraries is handled on-demand by the
Makefile of an application/example program.
To get familiar with OpenLB, new users are encouraged to have a look at programs in the examples di-
rectory. Once inside one of the example directories, entering the command make will first produce libraries
and then the end-user example program. This close relationship between the production of libraries and
end-user programs reflects the fact that using OpenLB presently translates to writing a C++ program using
the OpenLB library functions.
The file config.mk in the root directory can be easily edited to modify the compilation process. Avail-
able options include the choice of the compiler (GNU g++ is the default), optimization flags, a switch be-
tween normal/debug mode, between sequential/openmp-parallel/mpi-parallel programs, and between
(un)vectorized CPU and GPU platforms.
Example configuration files for common build types and systems are included in the config/ directory
of the release tarball.
To compile your own OpenLB programs from an arbitrary directory, make a copy of a sample Makefile
contained in a default example folder. Edit the OLB_ROOT entry to indicate the location of the OpenLB
source, and the EXAMPLE entry to specify the name of your program, without file extension.
A minimal but perfectly sufficient development environment for OpenLB consists of a supported C++
compiler, a plain text editor of choice and the GNU core utilities including make. This means that many
Linux distributions either already include everything one needs to get started or at least include everything
in their default package repositories for convenient installation. One may of course also use more involved
text editors with e.g. debugger integration or even full integrated development environments.
For compiling OpenLB applications outside of the provided build system one only needs to make the
src folder available for inclusion and define the compiler and linker flags as they are visible in the printout
of the default build. However, simply calling the appropriate make target from inside the text editor/IDE
may be more convenient.

9.2.1. Using NVIDIA GPUs in OpenLB


The following section is a quick guide on how to install the CUDA functionality for Nvidia graphics
cards on either Windows or Linux. The first two sections describe how to install CUDA on Windows via
WSL or Linux, respectively. The third section discusses how to set up OpenMPI, a CUDA-aware MPI
implementation, and the fourth and final section explains how to configure OpenLB to make use of the
installed functionalities.

9.2.1.1. CUDA on Windows with WSL

As mentioned in the chapter about install dependencies (Section 9.1.3), the preferred approach for OpenLB
on Windows is to use the Windows Subsystem for Linux (WSL). The following was written with the
assumption that OpenLB has been successfully set up on WSL with Ubuntu.
The following specifications are needed to get CUDA running via WSL:

• Windows 10 version 21H2 or higher

• CUDA compatible Nvidia graphics card

• WSL 2 with a glibc-based distribution (e.g. Ubuntu)

To find out which Windows version exactly you’re using, open up the run dialog box in Windows and
type in the command winver, which will display a pop-up window similar to the one below:
In order to find out what graphics card you have and whether it is compatible with CUDA, open up
the Windows run dialog and type in the command dxdiag, which will open the DirectX Diagnostic
Tool. Under the tab Render, it will display the information regarding your graphics card. In the exam-
ple picture of the DirectX Diagnostic Tool below, the graphics card is a NVIDIA GeForce GTX 1650.
NVIDIA provides the information on which graphics card is compatible with CUDA on their website
(https://developer.nvidia.com/cuda-gpus).
CUDA is only supported on version 2 of the Windows Subsystem for Linux (WSL). To confirm which
version of WSL is installed, open the Windows PowerShell with administrator rights and type in the
command

wsl --list --verbose

This will display which Linux distribution and which version of WSL is currently installed. The output
should look similar to the following:

Figure 9.2.: Pop-up window displaying the exact version and build of Windows.

PS C:\Windows> wsl --list --verbose


NAME STATE VERSION
* Ubuntu Stopped 1

In this example the distribution that is installed is Ubuntu and the WSL version is 1. Upgrading to the
necessary version 2 can be done by typing

wsl --set-version Ubuntu 2

into the PowerShell terminal. Note that when using a different distribution for WSL, the command has
to be adjusted accordingly.
An error might occur claiming that a certain hard-link target does not exist. This means that there is
software installed on WSL that collides with the update. The error message will provide the path of the
non-existing hard-link, which will be a hint as to which package causes this error. In the WSL terminal,
the command

sudo apt list --installed

will give an overview over all the installed packages. The conflicting package can then be removed with

sudo apt-get remove [PACKAGE-NAME]

Once the package has been removed, WSL can be upgraded. On a successful upgrade, we should receive
a message that the conversion is complete and we can verify the version with the

wsl --list --verbose

Figure 9.3.: The Render tab of the DirectX Diagnostic Tool.

command. The conflicting package can then be reinstalled.


In order for WSL to have access to the GPU hardware, virtual GPU needs to be enabled on Windows.
This can be done by installing an appropriate driver on Windows. It should not be necessary to install any
device drivers on WSL itself. It is even highly suggested not to do so, since any installation of a driver
on WSL itself might override the functionality provided by the driver that is installed onto Windows. As
of the writing of this guide, the most recent NVIDIA drivers automatically support virtual GPU for WSL.
The newest driver can be directly downloaded from the NVIDIA website
(https://www.nvidia.com/download/index.aspx). The website offers drop down lists to specify what product type, device,
operating system, etc. the driver is needed for. Once the most recent driver is installed, we can install the
CUDA toolkit on WSL.
The following commands typed into the WSL terminal will install the Nvidia CUDA toolkit on WSL
(Ubuntu):

sudo apt-key del 7fa2af80

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin

sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/
cuda-repository-pin-600

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/3bf863cc.pub

sudo add-apt-repository 'deb https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/ /'

sudo apt-get update

sudo apt-get -y install cuda

If the NVIDIA CUDA compiler is correctly installed, the command

nvcc --version

will reply with a message similar to the following:


1 nvcc: NVIDIA (R) Cuda compiler driver
2 Copyright (c) 2005-2022 NVIDIA Corporation
3 Built on Mon_Oct_24_19:12:58_PDT_2022
4 Cuda compilation tools, release 12.0, V12.0.76
5 Build cuda_12.0.r12.0/compiler.31968024_0

Listing 9.1: Version details of an installed Cuda compiler

To check the versions of CUDA and the driver, the command

nvidia-smi

will respond with the NVIDIA System Management Interface, displaying various information about the
installed GPUs (see Figure 9.4). The CUDA toolkit should now be properly installed and working.

9.2.1.2. CUDA on Linux

Before installing the CUDA toolkit on Linux, typing the command

lspci | grep -i nvidia

can confirm that the GPU is CUDA-capable.


To install the CUDA toolkit on Linux, visit the NVIDIA website and choose the fitting operating
system, architecture, distribution, as well as the preferred installation type for your system
(https://developer.nvidia.com/cuda-toolkit). The website will then provide you with the correct com-
mands with which you can install the CUDA toolkit on your Linux system.
After the installation of the toolkit, the environment variables need to be set. Add the following line to
the .bashrc file which can be found in the home directory:

Figure 9.4.: The NVIDIA System Management Interface

export PATH=/usr/local/cuda-12.0/bin${PATH:+:${PATH}}

To reload .bashrc type

source .bashrc

If the installation was done with a run file, the LD_LIBRARY_PATH variable has to be set, as well. The
following command sets this variable on a 64-bit system. The command for 32-bit systems is almost
identical: lib64 has to be exchanged for lib:

export LD_LIBRARY_PATH=/usr/local/cuda-12.0/lib64\
${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

If a different install path or version of the CUDA toolkit has been chosen during the installation process,
both commands above have to be altered accordingly. After restarting your machine, you can confirm that
the installation has been successful by using the commands

nvcc --version

and

nvidia-smi

If the CUDA toolkit has been installed correctly, outputs similar to those shown in Listing 9.1 and Fig-
ure 9.4 respectively should be visible.

9.2.1.3. OpenMPI

To have the functionality of MPI in combination with CUDA, there are several CUDA-aware MPI imple-
mentations available. In this guide we will describe the installation of the open-source implementation
OpenMPI in four steps:

1. Download the desired OpenMPI version from the website (https://www.open-mpi.org/software/).
As of the writing of this section, the most current stable release version was openmpi-4.1.5.tar.bz2.

2. In your Linux (or WSL for Windows) terminal, move to the folder where the file was saved to and
extract the downloaded package via the command

tar -jxf openmpi-4.1.5.tar.bz2

3. Change into this new directory to configure, compile and install OpenMPI with the following com-
mands:

./configure --prefix=$HOME/opt/openmpi
--with-cuda=/usr/local/cuda
--with-cuda-libdir=/usr/local/cuda/lib64/stubs
make all
make install

Note that the path following --prefix= is the path we wish to install OpenMPI in and the path
following --with-cuda= is the location of your CUDA installation. These
paths might be different depending on the user's choices.

4. The environment variables have to be set by adding the following lines to the .bashrc file:

export PATH=$PATH:$HOME/opt/openmpi/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/opt/openmpi/lib

Once again the path for OpenMPI might be different, depending on where the software was in-
stalled.

To see whether the installation of OpenMPI was successful, we can enter the command

ompi_info --parsable -l 9
--all | grep mpi_built_with_cuda_support:value

If the installation was done successfully, the terminal should respond with the output true.

9.2.1.4. Utilizing CUDA in OpenLB

The root directory contains a folder named config, in which several build config examples can be found.
The config.mk makefile of the root directory can be replaced with the makefile that suits the current
needs (e.g. using only the GPU, using the GPU with MPI, using CPU with MPI, etc.). Each example
makefile also includes instructions. Make a backup of the current config.mk in the root directory and
replace it with a copy of the makefile gpu_only found in the config folder. After renaming gpu_only
to config.mk, we open the file and check the value of CUDA_ARCH: This value might have to be changed,
depending on your graphics card and its architecture. The file rules.mk in the root directory contains a
table that shows which architecture goes with which value:

## | CUDA Architecture | Version |


## |-------------------+------------|
## | Fermi | 20 |
## | Kepler | 30, 35, 37 |
## | Maxwell | 50, 52, 53 |
## | Pascal | 60, 61, 62 |
## | Volta | 70, 72 |
## | Turing | 75 |
## | Ampere | 80, 86, 87 |

Another table on the internet (https://en.wikipedia.org/wiki/CUDA) shows which graphics card


corresponds to which architecture. This guide used the GTX 1650 as an example for the graphics card.
Figure 9.5 is a snippet of this table and shows that the GTX 1650 corresponds to the Turing architecture,
so the value of CUDA_ARCH has to be set to 75 in both config.mk and rules.mk files. After saving

Figure 9.5.: Table containing Nvidia GPUs with the Turing Microarchitecture.

the changes of CUDA_ARCH in both config.mk and rules.mk, the application can be compiled via the
command make clean; make in your WSL (or Linux) terminal.
Depending on the hardware and compilers being used, further compiler flags might have to be changed.
Additional documentation found in the main config.mk and in the other template config files can pro-
vide you with further instructions with regards to setting the correct flags.

10. Step by Step: Using OpenLB for
Applications
The general workflow in OpenLB follows a generic path. The following structure is maintained
throughout every OpenLB example application, to provide a common structure and to guide beginners.

1st Step: Initialization The converter between physical and lattice units is set in this step. It is also de-
fined where the simulation data is stored and which lattice type is used.

2nd Step: Prepare geometry The geometry is acquired, either from another file (a .stl file) or from
defining indicator functions. Then, the mesh is created and initialized based on the given geometry.
This consists of classifying voxels with material numbers, according to the kind of voxels they are:
an inner voxel containing fluid ruled by the fluid dynamics will have a different number than a
voxel on the inflow with conditions on its velocity. The function prepareGeometry is called for
these tasks. Further, the mesh is distributed over the threads to establish good scaling properties.

3rd Step: Prepare lattice According to the material numbers of the geometry, the lattice dynamics are
set here. This step characterizes the collision model and boundary behavior. The choices depend
on whether a force is acting or not, the use of single relaxation time (BGK) or multiple relaxation
times (MRT), the simulation dimension (it can also be a 2D model), whether compressible or incom-
pressible fluid is considered, and the number of neighboring voxels chosen. By the creation of a
computing grid, the SuperLattice, the allocation of the required data is done as well.

4th Step: Main loop with timer The timer is initialized and started, then a loop over all time steps iT
starts the simulation, during which the functions setBoundaryValues, collideAndStream and
getResults (the 5th, 6th, and 7th step, respectively) are called repeatedly until a maximum of itera-
tions is reached, or the simulation has converged. At the end, the timer is stopped and the summary
is printed to the console.

5th Step: Definition of initial and boundary conditions The first of the three important functions
called during the loop, setBoundaryValues, sets the slowly increasing inflow boundary condition.
Since the boundary is time dependent, this happens in the main loop. In some applications, the
boundaries stay the same during the whole simulation and the function doesn’t need to do any-
thing after the very first iteration.

6th Step: Collide and stream execution Another function collideAndStream is called each iteration
step, to perform the collision and the streaming step. If more than one lattice is used, the func-
tion is called for each lattice separately.

7th Step: Computation and output of results At the end of each iteration step, the function
getResults is called, which creates console output, .ppm files or .vti files of the results at cer-
tain time steps. Ideally, the relevant simulation data is obtained with functors, which facilitates the
post-processing significantly; a minimal sketch is given after this list. By passing the converter and the
time step, the frequency of writing or displaying data can be chosen easily. In many applications, the
console output is required more often than the VTK data.
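A minimal sketch of a typical getResults body is given below; the functor and writer classes are those
used in many OpenLB examples, and the output interval of 0.1 s is an illustrative choice.
1 SuperVTMwriter2D<T> vtmWriter( "bstep2d" );
2 SuperLatticePhysVelocity2D<T,DESCRIPTOR> velocity( sLattice, converter );
3 SuperLatticePhysPressure2D<T,DESCRIPTOR> pressure( sLattice, converter );
4 vtmWriter.addFunctor( velocity );
5 vtmWriter.addFunctor( pressure );
6 if ( iT % converter.getLatticeTime( 0.1 ) == 0 ) {
7   vtmWriter.write( iT );
8 }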

10.1. Lesson 1: Getting Started - Sketch of Application


This section presents example bstep2d that can be found in the recent release of OpenLB. This example
simulates a flow over a backward-facing step and serves as an illustration of OpenLB and its features.
In order to execute the simulation and get some results, download and unpack OpenLB on (preferably)
a Linux system, see Section 9.1.1. Then, generate an executable file by compiling the program through
the command make. Finally, launch the simulation by ./bstep2d and observe the terminal output, see
Section 6.6.
A few lines are invariably the same for all OpenLB applications, see Listing 10.1.
1 #include "olb2D.h"
2 #include "olb2D.hh"
3
4 using namespace olb; // OpenLB namespaces
5 using namespace olb::descriptors; //
6
7 using T = FLOATING_POINT_TYPE;
8 using DESCRIPTOR = D2Q9<>;

Listing 10.1: Framework of an OpenLB program. Fundamental properties of the simulation are
defined here.

Line 1: The header file olb2D.h includes definitions for the whole 2D code present in the release. In the
same way, access to 3D code is obtained by including the file olb3D.h.

Line 2: Most OpenLB code depends on template parameters. Therefore, it cannot be compiled in ad-
vance, and needs to be integrated “as is” into your programs via the file olb2D.hh or olb3D.hh
respectively.

Line 4: All OpenLB code is contained in the namespace olb. The descriptors have an own namespace
and define the lattice arrangement, e.g. D2Q9 or D3Q19.

Line 7: Choice of precision for floating point arithmetic. The default type FLOATING_POINT_TYPE is de-
fined in the config.mk file and usually equals float or double. Any other floating point type
can be used, including built-in types and user-defined types which are implemented through a C++
class.

Line 8: Choice of a lattice descriptor. Lattice descriptors specify not only which lattice (see Figure 1.1 for
exemplary velocity sets) is employed, but are also used to compute the size of various dependent
fields such as force vectors.

The next code presents a brief overview of the structure of an OpenLB application, see Listing 10.2.
It aims to introduce and guide beginners rather than to explain the classes and methods in depth.
Details on the shown functions can be found in the source code, i.e. in the bstep2d.cpp file, as
well as in the following chapters.
1 SuperGeometry<T,2> prepareGeometry(LBconverter<T> const& converter)
2 {
3 // create Cuboids and assign them to threads
4 // create SuperGeometry
5 // set material numbers
6 return superGeometry;
7 }
8 void prepareLattice(...)
9 {
10 // set dynamics for fluid and boundary lattices
11 // set initial values, rho and u
12 }
13 void setBoundaryValues(...)
14 {
15 // set Poiseuille velocity profile at inflow
16 // increase inflow velocity slowly over time
17 }
18 void getResults(...)
19 {
20 // write simulation data do vtk files and terminal
21 }
22
23 int main(int argc, char* argv[])
24 {
25 // === 1st Step: Initialization ===
26 olbInit( &argc, &argv );
27 singleton::directories().setOutputDir( "./tmp/" ); // set output directory
28 OstreamManager clout( std::cout, "main" );
29
30 UnitConverterFromResolutionAndRelaxationTime<T, DESCRIPTOR> converter(
31   (T) N,              // resolution
32   (T) relaxationTime, // relaxation time
33   (T) charL,          // charPhysLength: reference length of simulation geometry
34   (T) 1.,             // charPhysVelocity: maximal/highest expected velocity during simulation in __m/s__
35   (T) 1./19230.76923, // physViscosity: physical kinematic viscosity in __m^2/s__
36   (T) 1.              // physDensity: physical density in __kg/m^3__
37 );
38
39 // Prints the converter log as console output
40 converter.print();
41 // Writes the converter log in a file
42 converter.write( "bstep2d" );
43
44 // === 2nd Step: Prepare Geometry ===
45 // Instantiation of a superGeometry
46 SuperGeometry<T,2> superGeometry( prepareGeometry(converter) );
47

48 // === 3rd Step: Prepare Lattice ===
49 SuperLattice<T,DESCRIPTOR> sLattice( superGeometry );
50 BGKdynamics<T,DESCRIPTOR> bulkDynamics (
51 converter.getLatticeRelaxationFrequency(),
52 instances::getBulkMomenta<T,DESCRIPTOR>()
53 );
54
55 //prepare Lattice and set boundaryConditions
56 prepareLattice( converter, sLattice, bulkDynamics, superGeometry );
57
58 // instantiate reusable functors
59 SuperPlaneIntegralFluxVelocity2D<T> velocityFlux( sLattice,
60 converter,
61 superGeometry,
62 {lengthStep/2., heightInlet / 2.},
63 {0., 1.} );
64
65 SuperPlaneIntegralFluxPressure2D<T> pressureFlux( sLattice,
66 converter,
67 superGeometry,
68 {lengthStep/2., heightInlet / 2. },
69 {0., 1.} );
70
71 // === 4th Step: Main Loop with Timer ===
72 clout << "starting simulation..." << std::endl;
73 Timer<T> timer( converter.getLatticeTime( maxPhysT ),
       superGeometry.getStatistics().getNvoxel() );
74 timer.start();
75
76 for ( std::size_t iT = 0; iT < converter.getLatticeTime( maxPhysT ); ++iT ) {
77 // === 5th Step: Definition of Initial and Boundary Conditions ===
78 setBoundaryValues( converter, sLattice, iT, superGeometry );
79 // === 6th Step: Collide and Stream Execution ===
80 sLattice.collideAndStream();
81 // === 7th Step: Computation and Output of the Results ===
82 getResults( sLattice, converter, iT, superGeometry, timer, velocityFlux,
pressureFlux );
83 }
84
85 timer.stop();
86 timer.printSummary();
87 }

Listing 10.2: A brief overview of a typical OpenLB application, bstep2d. Details on the specific
functions can be found in the following chapters.

10.2. Lesson 2: Define and Use Boundary Conditions


The current OpenLB release offers a wide range of boundary conditions for the implementation of pressure
and velocity boundaries. They support boundaries that are aligned with the numerical grid, and also
implement proper corner nodes in 2D and 3D, and edge nodes that connect two plane boundaries in
3D. The choice of a boundary condition is conceptually separated from the definition of the location of
boundary nodes. It is therefore possible to modify the choice of the boundary condition by changing a
single instruction in a program. An overview of the available boundary conditions is given by [23].
The new boundary condition system utilizes free-floating functions and does not require a class structure.
Consequently, the following classes are obsolete in the current release:
sOn/OffLatticeBoundaryConditionXD,
OnLatticeBoundaryConditionXD,
(Off)BoundaryConditionInstantiatorXD and
Regularized/InterpolationBoundaryManagerXD.
Key Features of the new system are:

• Free-floating design that allows general functions like setBoundary to be used in multiple
boundary conditions.

• Overall slimmer design with fewer function calls and fewer loops through the block domain.

• MomentaVector and dynamicsVector are stored in /src/core/blockLatticeStructure3D.h

• Uncluttered function call design, which makes it easier to create new boundary conditions

The new boundary functions are named similarly to the old ones, e.g. addSlipBoundary becomes setSlipBoundary
in the new system. For example, if you want to use a slip boundary condition:

Define dynamics Keep in mind that sLattice.defineDynamics is the same for the old and new bound-
ary system.

Define your boundary type In this case it is the slip boundary. To set the boundary, call the fitting
set...Boundary function in the following manner inside your prepareLattice function (see the sketch after this list):

• setSlipBoundary<T,DESCRIPTOR>(superLattice, superGeometry, materialNumber);

• The difference between the old and the new system is that every boundary condition needs
to have the superLattice as an argument. The latticeRelaxationFrequency omega is usually
passed as the second argument.

Define initial conditions


• on-lattice: sLattice.defineRhoU(..)
• off-lattice: sLattice.defineRho(..), sLattice.defineUBouzidi(..)

Define boundary values


• on-lattice: sLattice.defineU(..)
• off-lattice: sLattice.defineUBouzidi(..)
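As an illustration, a prepareLattice body following these steps might look like the sketch below. The
material numbers and the name of the velocity boundary helper (setInterpolatedVelocityBoundary) are
assumptions for this example; compare the shipped examples for the exact calls of your release.

const T omega = converter.getLatticeRelaxationFrequency();

// bulk dynamics on the fluid cells (material number 1)
sLattice.defineDynamics( superGeometry, 1, &bulkDynamics );

// slip condition on the walls (material number 2)
setSlipBoundary<T,DESCRIPTOR>( sLattice, superGeometry, 2 );

// velocity condition at the inflow (material number 3);
// the helper name is an assumption, omega is passed as second argument
setInterpolatedVelocityBoundary<T,DESCRIPTOR>( sLattice, omega, superGeometry, 3 );

// initial condition: fluid at rest with unit density on the bulk cells
AnalyticalConst2D<T,T> rho( 1. );
AnalyticalConst2D<T,T> u( 0., 0. );
sLattice.defineRhoU( superGeometry, 1, rho, u );
sLattice.iniEquilibrium( superGeometry, 1, rho, u );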

With the help of this system, one can treat local and non-local boundary conditions the same way.
Furthermore, they can be used both for sequential and parallel program execution, as it is shown in Les-
son 10. The mechanism behind this is explained in Lesson 7. The bottom line is that both local and
non-local boundary conditions instantiate a special dynamics object and assign it to boundary cells. Non-
local boundaries additionally instantiate post-processing objects which take care of non-local aspects of
the algorithm.

10.3. Lesson 3: UnitConverter - Lattice and Physical Units
Fluid flow problems are usually given in a system of metric units. For example, consider a cylinder of
diameter 3 cm in a fluid channel with an average inflow velocity of 4 m s−1. The fluid has a kinematic viscosity
of 0.001 m2 s−1. The value of interest is the pressure difference measured in Pa at the front and the back of
the cylinder (with respect to the flow direction). However, the variables used in an LB simulation live in a
system of lattice units, in which the distance between two lattice cells and the time interval between two
iteration steps are chosen to be unity. Therefore, when setting up a simulation, a conversion directive has
to be defined that takes care of scaling variables from physical units into lattice units and vice versa.
In OpenLB, all these conversions are handled by a class called UnitConverter, see Listing 10.3. An
instance of the UnitConverter is generated with desired discretization parameters and reference values in
SI units. It provides a set of conversion functions to enable a fast and easy way to convert between physical
and lattice units. In addition, it gives information about the parameters of the fluid flow simulation, such
as the Reynolds number or the relaxation parameter ω.
Let’s have a closer look at the input parameters: The reference values represent characteristic quantities
of the fluid flow problem. In this example, it is suitable to choose the cylinder’s diameter as characteristic
length and the average inflow speed as characteristic velocity. Furthermore, two discretization parameters,
namely the grid size ∆x in m and time step size ∆t in s are provided to the converter. From these reference
values and discretization parameters, all the conversion factors and the relaxation time τ are calculated.
Since there are stability bounds for the relaxation time and the maximum occurring
lattice velocity, one usually does not choose ∆x and ∆t directly, but sets stable and accurate values for any two out
of resolution, relaxation time and characteristic (maximum) lattice velocity. To make that easily available for
the user of OpenLB, there are different constructors for the UnitConverter class (one of them is sketched after the following list):
UnitConverterFromRelaxationTimeAndLatticeVelocity,
UnitConverterFromResolutionAndLatticeVelocity,
UnitConverterFromResolutionAndRelaxationTime.
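For instance, fixing the resolution and the characteristic lattice velocity could look like the following
sketch. The argument order shown here is an assumption; compare the constructor documentation in
unitConverter.h of your release.

// Sketch (argument order assumed): fix resolution and characteristic lattice
// velocity; the converter then derives the relaxation time internally
UnitConverterFromResolutionAndLatticeVelocity<T,DESCRIPTOR> const converter(
  int {N},   // resolution: number of voxels per charPhysL
  (T) 0.05,  // charLatticeVelocity: characteristic velocity in lattice units
  (T) 0.1,   // charPhysLength: reference length in __m__
  (T) 0.2,   // charPhysVelocity: characteristic velocity in __m/s__
  (T) 1e-3,  // physViscosity: physical kinematic viscosity in __m^2/s__
  (T) 1.0    // physDensity: physical density in __kg/m^3__
);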
Once the converter is initialized, its methods can be used to convert various quantities such as velocity,
time, force or pressure. The function for the latter helps us to evaluate the pressure drop in our example
problem, as shown in the following code snippet:
1 UnitConverterFromResolutionAndRelaxationTime<T, DESCRIPTOR> const converter(
2   int {N},            // resolution: number of voxels per charPhysL
3   (T) 0.53,           // latticeRelaxationTime: relaxation time, has to be greater than 0.5!
4   (T) 0.1,            // charPhysLength: reference length of simulation geometry
5   (T) 0.2,            // charPhysVelocity: maximal/highest expected velocity during simulation in __m/s__
6   (T) 0.2*2.*0.05/Re, // physViscosity: physical kinematic viscosity in __m^2/s__
7   (T) 1.0             // physDensity: physical density in __kg/m^3__
8 );
9 // Prints the converter log as console output
10 converter.print();
11 // Writes the converter log in a file
12 converter.write( "converterLogFile" );
13 // conversion from seconds to iteration steps and vice-versa
14 int iT = converter.getLatticeTime(maxPhysT);
15 T sec = converter.getPhysTime(iT);
16 <...> simulation
17 <...> evaluation of latticeRho at the back and the front of the cylinder
18 T latticePressureFront = latticeRhoFront / descriptors::invCs2<T,DESCRIPTOR>();
19 T latticePressureBack = latticeRhoBack / descriptors::invCs2<T,DESCRIPTOR>();
20 T pressureDrop = converter.getPhysPressure(latticePressureFront)
21 - converter.getPhysPressure(latticePressureBack);

Listing 10.3: Use of UnitConverter in a 3D problem.

Lines 1–8: Instantiate a UnitConverter object and specify discretization parameters as well as characteristic values.

Line 10: Write simulation parameters and conversion factors to the terminal.

Line 12: Write simulation parameters and conversion factors to a logfile.

Lines 14 and 15: The conversion from physical units (seconds) to discrete ones (time steps) is managed by
the converter.

Lines 18–21: The converter automatically calculates the pressure values from the local density.

10.4. Lesson 4: Extract Data From a Simulation


When the collision step is executed, the values of the density and the velocity are computed internally, in
order to evaluate the equilibrium distribution. Those macroscopic variables are however interesting for
the OpenLB end-user as well, and it would be a shame to simply neglect their value after use. These
values are accessed through the method getStatistics() of a BlockLattice:

T lattice.getStatistics().getAverageRho() Returns the average density evaluated during the previous collision step.

T lattice.getStatistics().getAverageEnergy() Returns half the average of the squared velocity norm evaluated during the previous collision step.

T lattice.getStatistics().getMaxU() Returns the maximum value of the velocity norm evaluated during the previous collision step.

Often, the information provided by the statistics of a lattice is not sufficient, and more general numerical
results are required. To do this, you can get data cell-by-cell from the BlockLatticeXD and
SuperLatticeXD through functors, see Chapter 2.6. Functors act on the underlying lattice and process
its data into relevant macroscopic units, e.g. density, velocity, stress, flux, pressure and drag. Functors
provide an operator() that, instead of accessing stored data, computes the data every time it is called. Since
OpenLB 0.8, the concept of functors is used not only for post-processing, but also for boundary conditions
and the generation of geometry, see Chapter 2.6. Listing 10.4 shows how to extract data out of a
SuperLattice named sLattice and a SuperGeometry3D named sGeometry. The data format is a valid
.vtk file that can be processed further with ParaView.

1 // generate the writer object
2 SuperVTMwriter3D<T> vtmWriter( " bstep3d " );
3 // write output once, at physical time 0.2 s
4 if (iT==converter.getLatticeTime(0.2)) {
5 // create functors
6 SuperLatticeGeometry3D<T,DESCRIPTOR> geometry(sLattice, sGeometry);
7 SuperLatticeCuboid3D<T,DESCRIPTOR> cuboid(sLattice);
8 SuperLatticeRank3D<T,DESCRIPTOR> rank(sLattice);
9 // write functors to file system, vtk format
10 vtmWriter.write(geometry);
11 vtmWriter.write(cuboid);
12 vtmWriter.write(rank);
13 }

Listing 10.4: Extract simulation data to vtk file format.

As mentioned before, OpenLB provides functors for a variety of data, see Listing 10.5. More details
about writing simulation data can be found in Chapter 6.
1 // Create the functors by only passing lattice and converter
2 SuperLatticePhysVelocity3D<T,DESCRIPTOR> velocity(sLattice, converter);
3 SuperLatticePhysPressure3D<T,DESCRIPTOR> pressure(sLattice, converter);
4 // Create functor that corresponds to material numbers
5 SuperLatticeGeometry3D<T,DESCRIPTOR> geometry(sLattice, superGeometry);

Listing 10.5: Code example for creating velocity, pressure and geometry functors.
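Functor data can also be evaluated point-wise at arbitrary physical positions. The following is a minimal
sketch, assuming the interpolation helper AnalyticalFfromSuperF3D is used as in the shipped examples:

// Sketch: interpolate the physical velocity at a physical position
// (the use of AnalyticalFfromSuperF3D here is an assumption for illustration)
SuperLatticePhysVelocity3D<T,DESCRIPTOR> velocity( sLattice, converter );
AnalyticalFfromSuperF3D<T> interpolatedVelocity( velocity, true );

T position[3] = {0.1, 0.05, 0.05}; // physical coordinates in m
T u[3] = {0., 0., 0.};
interpolatedVelocity( u, position ); // u now holds the interpolated velocity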

The most straightforward and convenient way of visualizing simulation data is to produce a 2D snapshot
of a scalar-valued functor. This is done through the BlockReduction3D2D, which places a plane into
an arbitrary 3D functor. Afterwards, this plane can easily be written to an image file. OpenLB creates images
in PPM format as shown in Listing 10.6.
1 // velocity is a mapping: R^3 -> R^3
2 // an image in its very basic sense is a mapping: R^2 -> R
3
4 // transformation of data is presented below
5 // get velocity functor
6 SuperLatticePhysVelocity3D<T,DESCRIPTOR> velocity(sLattice, converter);
7 // get scalar valued functor by applying the point wise l2 norm
8 SuperEuklidNorm3D<T> normVel( velocity );
9 // put a plane with normal (0,0,1) in the 3 dimensional data
10 BlockReduction3D2D<T> planeReduction( normVel, {0, 0, 1} );
11 BlockGifWriter<T> gifWriter;
12 // write ppm image to file system
13 gifWriter.write( planeReduction, iT, "vel" );

Listing 10.6: Create a PPM image out of a 3D velocity functor.

This image writer provides in situ visualization which, in contrast to the VTKwriter, produces smaller
data sets that can be interpreted immediately without requiring other software.

10.5. Lesson 5: Convergence Check
The class ValueTracer checks for time-convergence of a given scalar ϕ. The convergence is reached when
the standard deviation σ of the monitored value ϕ is smaller than a given residuum ϵ times the expected
value ϕ̄.

σ(ϕ) = sqrt( (1/(N+1)) · Σ_{i=0}^{N} (ϕ_i − ϕ̄)² ) < ϵ ϕ̄        (10.1)

The expected value ϕ̄ is the average over the last N time steps, with ϕ_i := ϕ(t∗ − i∆t), where t∗ is the
current time step and ∆t denotes the time step size.

ϕ̄ = (1/(N+1)) Σ_{i=0}^{N} ϕ_i        (10.2)

The value N should be chosen as a problem specific time period. As an example charT = charL/charU
and N = converter.getLatticeTime(charT). To initialize a ValueTracer object use:
1 util::ValueTracer<T> converge( numberTimeSteps, residuum );

Listing 10.7: Initialize a ValueTracer object.

For example, to check for convergence with a residuum of ϵ = 10−5 every physical second:
1 util::ValueTracer<T> converge( converter.getLatticeTime(1.0), 1e-5 );

Listing 10.8: Check for convergence every physical second with a residuum of ϵ = 10−5.

The monitored value has to be passed to the ValueTracer object every time step by:
1 for (iT = 0; iT < maxIter; ++iT) {
2 ...
3 converge.takeValue( monitoredValue, isVerbose );
4 ..
5 }

Listing 10.9: Pass the monitored value to the ValueTracer in the time loop.

If you would like to print the average value and its standard deviation every N time steps (as chosen during
initialization), set isVerbose to true, otherwise false. It is a good idea to choose the average energy as
monitored value:
1 converge.takeValue(SLattice.getStatistics().getAverageEnergy(),true);

Listing 10.10: Use the average energy as the monitored value.

Do something like the following in the time loop:


1 if (converge.hasConverged()) {
2 clout << "Simulation converged." << std::endl;
3 break;
4 }

Listing 10.11: Check whether the simulation has converged.

10.6. Lesson 6: Use an External Force


In simulations the dynamics of a fluid are often driven by some kind of externally imposed force field.
In order to optimize memory access and to minimize cache-misses, the value of this force can be stored
right alongside the cell’s population values. This is achieved by specifying additional fields in the lattice
descriptor (see Sections 2.1 and 4.4).
At this point we want to consider a time- and space-independent external force as a basic example.
Listing 10.12 shows how such an external force can be defined for all cells of a certain material number.
1 // Define constant force
2 AnalyticalConst2D<T,T> force(
3 8.0 * converter.getLatticeViscosity()
4 * converter.getCharLatticeVelocity()
5 / ( Ly*Ly ), // x-component of the force
6 0.0); // y-component of the force
7
8 // Initialize force for materials 1 and 2
9 superLattice.defineField<FORCE>(
10 superGeometry.getMaterialIndicator({1, 2}), force);

Listing 10.12: Define a constant external force

This code was adapted from examples/laminar/poiseuille2d where just such a constant force is
used to drive the channel flow. Note that the underlying D2Q9 descriptor’s field list must contain a FORCE
field in order for defineField<FORCE> to work.
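A minimal sketch of such a field-augmented descriptor is shown below; the use of ForcedBGKdynamics is
an assumption here, but some force-aware dynamics is required for the stored force to take effect during
collision:

// Descriptor whose field list contains FORCE, so defineField<FORCE> is valid
using DESCRIPTOR = D2Q9<FORCE>;
// A dynamics class that actually applies the stored force during collision
// (assumption for this sketch), e.g.:
// ForcedBGKdynamics<T,DESCRIPTOR> bulkDynamics( omega, instances::getBulkMomenta<T,DESCRIPTOR>() );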
The command SuperLattice::defineField provides just one of many template methods one can use
to work with descriptor fields. For example, we can read and write a cell’s force field as follows:
1 // Get a reference to the memory location of a cell’s force vector
2 FieldPtr<T,DESCRIPTOR,FORCE> forcePtr = cell.template getFieldPointer<FORCE>();
3 // Read a cell’s force vector as an OpenLB vector value
4 Vector<T,2> forceValue = cell.template getField<FORCE>();
5 // Set a cell’s force vector to zero
6 cell.template setField<FORCE>(Vector<T,2>(0.0, 0.0));

Note that these methods work the same way for any other field that might be declared by a cell’s specific
descriptor.

10.7. Lesson 7: Understand Genericity in OpenLB


OpenLB is a framework for the implementation of lattice Boltzmann algorithms. Although most of the
code shipped with the distribution is about fluid dynamics, it is open to various types of physical models.
Generally speaking, a model which makes use of OpenLB must be formulated in terms of the “local
collision followed by nearest-neighbor streaming” philosophy. A current restriction to OpenLB is that the
streaming step can only include nearest neighbors: there is no possibility to include larger neighborhoods
within the modular framework of the library, i.e. without tampering with OpenLB source code. Except for
this restriction, one is completely free to define the topology of the neighborhood of cells, to implement an
arbitrary local collision step, and to add non-local corrections for the implementation of, say, a boundary
condition.
To reach this level of genericity, OpenLB distinguishes between non-modifiable core components, which
you’ll always use as they are, and modular extensions. As far as these extensions are concerned, you
have the choice to use default implementations that are part of OpenLB or to write your own. As a
scientific developer, concentrating on these, usually quite short, extensions means that you can concentrate
on the physics of your model instead of technical implementation details. By respecting this concept of
modularity, you can automatically take advantage of all structural additions to OpenLB. In the current
release, the most important addition is parallelism: you can run your code in parallel without (or almost
without) having to care about parallelism and MPI.
The most important non-modifiable components are the lattice and the cell. You can configure their
behavior, but you are not expected to write a new class which inherits from or replaces the lattice
or the cell. Lattices are offered in different flavours, most of which inherit from a common interface
BlockStructureXD. The most common lattice is the regular BlockLatticeXD, which is replaced by the
SuperLatticeXD for parallel applications and for memory-saving applications when faced with irregular
domain boundaries. An alternative choice for parallelism and memory savings is the CuboidStructureXD,
which does not inherit from BlockStructureXD, but instead allows for more general constructs.
The modular extensions are classes that customize the behavior of core-components. An important ex-
tension of this kind is the lattice descriptor. This specifies the number of particle populations contained in
a cell, and defines the lattice constants and lattice velocities, which are used to specify the neighborhood
relation between a cell and its nearest neighbors. The lattice descriptor can also be used to require addi-
tional allocation of memory on a cell for external scalars, such as a force field. The integration of a lattice
descriptor in a lattice happens via a template mechanism of C++. This mechanism takes place statically,
i.e. before program execution, and avoids the potential efficiency loss of a dynamic, object-oriented ap-
proach. Furthermore, template specialization is used to optimize the OpenLB code specifically for some
types of lattices. Because of the template-based approach, a lattice descriptor need not inherit from any
interface. Instead, you are free to simply implement a new class, inspired by the default descriptors in
the files core/latticeDescriptors.h and core/latticeDescriptor.hh.
The dynamics executed by a cell are implemented through a mechanism of dynamic (run-time) gener-
icity. In this way, the dynamics can be different from one cell to another, and can change during program
execution. There are two mechanisms of this type in OpenLB, one to implement local dynamics, and one
for non-local dynamics. To implement local dynamics, one needs to write a new class which inherits the
interface of the abstract class Dynamics. The purpose of this class is to specify the nature of the collision
step, as well as other important information (for example, how to compute the velocity moments on a
cell). For non-local dynamics, a so-called post-processor needs to be implemented and integrated into a
BlockLatticeXD through a call to the method addPostProcessorXD. This terminology can be somewhat
confusing, because the term “post-processing” is used in the CFD community in the context of data anal-
ysis at the end of a simulation. In OpenLB, a post-processor is an operator which is applied to the lattice
after each streaming step. Thus, the time-evolution of an OpenLB lattice consists of three steps: (1) local
collision, (2) nearest-neighbor streaming, and (3) non-local postprocessing. Implementing the dynamics
of a cell through a postprocessor is usually less efficient than when the mechanism of the Dynamics classes
is used. It is therefore important to respect the spirit of the lattice Boltzmann method and to express the
collision as a local operation whenever possible.

10.8. Lesson 8: Use Checkpointing for Long Duration Simulations

All types of data in OpenLB can be stored in a file or loaded from a file. This includes the data of a
BlockLatticeXD and the data of a ScalarFieldXD or a TensorFieldXD. All these classes implement the
interface Serializable<T>. This guarantees that they can transform their content into a data stream of
type T, or read from such a stream. Serialization and unserialization of data is mainly used for file access,
but it can also be used for other purposes, such as copying data between two objects of different type. The
data is stored in the ASCII-based Base64 encoding. Although Base64-encoded data requires about 25% more
storage space than a pure binary format, this approach was chosen in OpenLB to enhance
compatibility of the code between platforms. Saving and loading data is invoked by calling the save
and load method on the object to be serialized. These methods take the filename as an optional (but
recommended) argument, as shown below:
1 int nx, ny;
2 <...> initialization of nx and ny
3 BlockLattice<T,DESCRIPTOR> lattice(nx, ny);
4 // load data from a previous simulation
5 lattice.load( "simulation.checkpoint" );
6 <...> run the simulation
7 // save data for security, to be able to take up
8 // the simulation at this point later
9 lattice.save( "simulation.checkpoint" );

Listing 10.13: Store and load the state of the simulation.

Checkpointing is also illustrated in the example programs bstep2d and bstep3d (Section 8.4.1).
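In an application, the same mechanism can be used inside the time loop, e.g. to write a checkpoint at
regular intervals. A small sketch (the interval and the file name are arbitrary choices for illustration):

// Sketch: write a checkpoint of the SuperLattice every 10 physical seconds
if ( iT % converter.getLatticeTime( 10.0 ) == 0 ) {
  sLattice.save( "bstep2d.checkpoint" );
}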

10.9. Lesson 9: Run Your Programs on a Parallel Machine


OpenLB programs can be executed on a parallel machine with distributed memory, based on MPI. To com-
pile an OpenLB program for parallel execution, modify the file named config.mk, found in the OpenLB
root directory, by removing the hashes before the lines: #CXX := mpic++, and #PARALLEL_MODE := MPI.
The modified lines are shown in Listing 10.14. Execute make clean and make cleanbuild within the
desired program directory to eliminate previously compiled libraries, and recompile the program by exe-
cuting the make command. To run the program in parallel, use the command mpirun -np 2 ./cavity2d.
Here -np 2 specifies the number of processors to be used.
1 #CXX := g++
2 #CXX := icpc -D__aligned__=ignored
3 #CXX := mpiCC
4 CXX := mpic++
5 ...
6 #PARALLEL_MODE := OFF
7 PARALLEL_MODE := MPI
8 #PARALLEL_MODE := OMP
9 #PARALLEL_MODE := HYBRID

Listing 10.14: Edited config.mk for MPI-parallel programs.

10.10. Lesson 10: Work with Indicators


Many of the methods covered up until this point accepted geometry and material number arguments to
define their working domain. This can lead to repetition and code that is harder to read than necessary.
An alternative to e.g. setting up bulk dynamics using raw material numbers is available in the form of
indicator functors.
In fact most of the material number accepting operations we have covered so far use generic lattice
indicators under the hood, specifically SuperIndicatorMaterial3D.
1 superLattice.defineDynamics(superGeometry, 1, &bulkDynamics);
2 superLattice.defineDynamics(superGeometry, 3, &bulkDynamics);
3 superLattice.defineDynamics(superGeometry, 4, &bulkDynamics);
4 // is equivalent to:
5 SuperIndicatorMaterial3D<T> bulkIndicator(superGeometry, {1, 3, 4});
6 superLattice.defineDynamics(bulkIndicator, &bulkDynamics);

Listing 10.15: Indicator usage example

This can be further abstracted using SuperGeometry3D’s indicator factory:


1 auto bulkIndicator = superGeometry.getMaterialIndicator({1, 3, 4});
2 superLattice.defineDynamics(bulkIndicator, &bulkDynamics);

Listing 10.16: Indicator usage in bstep3d

The advantage of this pattern is that we explicitly named materials 1, 3 and 4 as bulk materials and can
reuse the indicator whenever we operate on bulk cells:
1 superLattice.defineRhoU(bulkIndicator, rho, u);
2 superLattice.iniEquilibrium(bulkIndicator, rho, u);

Listing 10.17: Indicator reusage in bstep3d

This way the bulk material domain is defined in a central place, which will come in handy should we
need to change it in the future. Note that for one-off usage this can be written even more compactly:
1 superLattice.defineDynamics(
2 superGeometry.getMaterialIndicator({1, 3, 4}), &bulkDynamics);

Listing 10.18: Inline indicator usage

This pattern of using indicators instead of raw material numbers is available for all material number
accepting methods of SuperGeometryXD. The methods themselves support arbitrary SuperIndicatorFXD
instances and as such are not restricted to material indicators.
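Geometric indicators from the tables below can be combined with this workflow as well, e.g. to assign
material numbers in prepareGeometry. A small sketch (extent, origin and material numbers are
placeholders for this example):

// Sketch: mark an inflow region with material number 3 using a geometric indicator
Vector<T,3> extent( 0.1, 0.5, 0.5 );
Vector<T,3> origin( 0., 0., 0. );
IndicatorCuboid3D<T> inflow( extent, origin );
// rename all material-2 cells inside the indicator to material 3
superGeometry.rename( 2, 3, inflow );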
The following table 10.1 and table 10.2 give an overview over the indicator functions that are available
in OpenLB for 2D and 3D cases respectively:

Indicator                    | Description
IndicatorF2DfromIndicatorF3D | Creates a 2D version of a 3D cuboid-indicator.
IndicatorCuboid2D            | Creates a 2D rectangle.
IndicatorCircle2D            | Creates a 2D circle.
IndicatorTriangle2D          | Creates a 2D triangle.
IndicatorEquiTriangle2D      | Creates an equilateral triangle.
IndicatorBlockData2D         | Indicator function for information obtained from VTIreader.
IndicatorLayer2D             | Creates a layer around an input indicator or reduces an input indicator by a layer.
IndicatorSDF2D               | Converts signed distance function to a 2D indicator.

Table 10.1.: 2D indicators found in indicatorF2D.h.

Indicator                 | Description
IndicatorTranslate3D      | Moves the indicator to another position.
IndicatorCircle3D         | Returns true for all positions inside of a 3D cylinder with a length of five times the floating point epsilon.
IndicatorSphere3D         | Returns true for all positions inside of a 3D sphere.
IndicatorLayer3D          | Extends the layer of an indicator by a given thickness.
IndicatorCylinder3D       | Returns true for all positions inside of a 3D cylinder.
IndicatorCone3D           | Returns true for all positions inside of a 3D cone.
IndicatorEllipsoid3D      | Returns true for all positions inside of a 3D ellipsoid.
IndicatorSuperEllipsoid3D | Returns true for all positions inside of a 3D ellipsoid. The geometry can be more complex than IndicatorEllipsoid3D.
IndicatorCuboid3D         | Returns true for all positions inside of a 3D cuboid.
IndicatorCuboidRotate3D   | Rotates a cuboid indicator around an axis.
IndicatorSDF3D            | Converts signed distance function to an indicator.

Table 10.2.: 3D indicators found in indicatorF3D.h.

10.11. Alternative Approach: Using a Solver Class


Quite a lot of program components are similar for every OpenLB application: e.g. the collide-and-stream
loop is part of every simulation. The concept of a solver class is meant to perform such steps automatically,
such that the user only has to define those steps which are specific to their application. Moreover, a generic
interface shall be given for other programs (e.g. launching from Python scripts or execution of optimization
routines). For both purposes, this is work in progress and more improvements and functionalities are
under development. In the following, the parts of an OpenLB program in solver style are explained.
These steps are also illustrated by the examples cavity2dSolver and porousPlate3dSolver.

10.11.1. Structure of an OpenLB Simulation in Solver Style


10.11.1.1. Parameter handling

In order to allow flexible interfaces to other programs, all parameters which are needed for simulation
and interface are stored publicly in structs. For different groups of parameters (e.g. simulation/ output/
stationarity), different structs are used.
For Simulation, Output and Stationarity, basic versions containing the essential parameters are
given by SimulationBase, OutputBase, StationarityBase, respectively. These can be supplemented
by inheritance.
More parameter structs could be added for individualization. For instance, a Results struct could be
used to save simulation results.

10.11.1.2. List parameter structs and lattices

A map of parameter structs with corresponding names is defined as a meta::map. Similarly, a map of
lattice names and descriptors is defined. Some typical names are provided at src/solver/names.h; a
list which is intended to be extended for individualization. The two maps are then given to the solver
class as template parameters.

10.11.1.3. Definition of a solver class

Many standard routines for simulation are implemented in the existing class LbSolver. It is templatized
w.r.t. maps of parameters and lattices and should therefore fit to most application cases. However, some
steps (like the definition of the geometry) depend on the application and have to be defined for each appli-
cation. Therefore, an application-specific solver class is created as a child class of LbSolver. It has to im-
plement the methods prepareGeometry, prepareLattices, setInitialValues and setBoundaryValues,
similar to the classical app structure. Moreover, methods getResults, computeResults, writeImages,
writeVTK and writeGnuplot can be defined if such output is desired. They are all called automatically
during construction/simulation. The access to the parameter structs works with the tags defined above:
e.g. this->parameters(Simulation()).maxTime gives the maximal simulation time (which is a member
of the struct SimulationBase). Similarly, we access the super geometry and the super lattices via
this->geometry() and this->lattice(LatticeName()), respectively. An automatic check, whether
the simulation became stationary, is executed if a parameter struct with tag Stationarity is available
(and the corresponding struct inherits from StationarityBase).

10.11.1.4. Main method

First, instances of the parameter structs and the solver class are constructed. This can be done classically,
using the constructors (cf. example porousPlate3dSolver), or, if XML reading has been implemented
for all parameter structs, with the create-from-XML interface (cf. example cavity2dSolver).
Secondly, the solve() method of the solver instance is called in order to run the simulation.

10.11.2. Set up an Application in Solver Style


In order to set up your own OpenLB application in solver style, the following steps should be followed:

• Select parameter structs. You can use existing ones, inherit from them, or define completely new
ones. The simulation parameters are expected to inherit from SimulationParameters and provide
a unit converter. The output parameters should inherit from OutputParameters. You are free to
add more parameter structs (e.g. for simulation results) yourself.

• Define a solver class. It should inherit publicly from the LbSolver class and implement the missing
virtual methods like prepareGeometry (a minimal sketch is given after this list).

• Define the main method. Either construct instances of the parameter structs and the solver class or
use the create-from-xml interface. Then call the solve() method and possibly perform postprocess-
ing.
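A minimal sketch of such a solver class is shown below. Only LbSolver, SimulationBase, OutputBase,
meta::map, solve() and the four required method names are taken from the description above; the map
keys, the template-argument layout and the exact method signatures are assumptions and may differ
between releases, so compare the shipped solver examples before use.

// Sketch only: the two meta::map template arguments collect parameter structs
// and lattices under name tags (tag names are assumptions, cf. src/solver/names.h)
using Parameters = meta::map<
  names::Simulation, SimulationBase<T>,
  names::Output,     OutputBase<T>
>;
using Lattices = meta::map<
  names::NavierStokes, descriptors::D2Q9<>
>;

class MySolver : public LbSolver<T,Parameters,Lattices> {
public:
  using LbSolver<T,Parameters,Lattices>::LbSolver; // reuse the base constructor (assumed)

protected:
  // application-specific steps required by LbSolver
  // (exact signatures may differ between releases)
  void prepareGeometry() override;
  void prepareLattices() override;
  void setInitialValues() override;
  void setBoundaryValues(std::size_t iT) override;
};

In main(), the parameter structs and a MySolver instance are then constructed (directly or from XML)
and the simulation is started with solver.solve().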

10.11.3. Parameter Explanation and Reading from XML


Introduction

In this short overview, the relevant parameters for an app in solver style are listed with the respective
names used in the XML input file. The documentation is divided into two subsections, corresponding to
the two main parts of the XML file:

10.11.3.1. Application

10.11.3.2. Output

In each subsection, the different parameters are explained via a table of the following form:

Parameter | Name (type) | Class (file) | Default value | Explanation: (if available, all) possibilities

The explanation of each column is as follows:


• Parameter: Name of the parameter in the xml-file
• Name (type): Name of the parameter in the source code and its data type in brackets. Besides the
common data types the abbreviations S and T are used for template parameters.
• Class (file): name of the class in which the parameter is stored and name of the file in which the
class is defined.
• Default value: Is this parameter essential or optional or unused?
If a parameter is optional, it is not needed to be defined. Then, the default value can be seen in this
column.
If a parameter is so important that without it the program has to exit, it is labeled as EXIT.
Some parameters are indicated with unused which means, that the parameter is read but not used
afterwards.
• Explanation: (if available, all) possibilities: Brief description and explanation of the parameter.
Some parameters have different possibilities for their definition. In this case, all available possibili-
ties are also offered in bold type letters, e.g.:
ad for OptiCaseAD or
dual for OptiCaseDual or
adTest for OptiCaseADTest
The arrangement of the parameters in the XML file has the following structure:

<Param>
  <name of the subsection in this documentation>
    <superordinate term of the parameter>
      <name of the parameter> value of the parameter </name of the parameter>
    </superordinate term of the parameter>
  </name of the subsection in this documentation>
</Param>

example:

<Param>
  <Application>
    <Discretization>
      <Resolution> 128 </Resolution>
    </Discretization>
  </Application>
</Param>

10.11.3.1. Application

In the following, all parameters for the general setup of the application are explained.

Parameter | Name (type) | Class (file) | Default value | Explanation

Name | name (std::string) | OutputGeneral (solverParameters.h) | "unnamed" | Output name Name.dat for information of the UnitConverter in the tmp folder
OlbDir | olbDir (std::string) | OutputGeneral (solverParameters.h) | "../../../" | Defines the trail for the OpenLB directory
PressureFilter | pressureFilterOn (bool) | SimulationBase (solverParameters.h) | false | Weights for moments computing; if (this->pressureFilterOn): _lattice->stripeOffDensityOffset(_lattice->getStatistics().getAverageRho() - (T) 1) in lbSolver.hh

Discretization:
Resolution | _resolution (int) | UnitConverter (unitConverter.hh) | - | Defines the resolution of the simulation area
LatticeRelaxationTime | _latticeRelaxationTime (T) | UnitConverter (unitConverter.hh) | - | Defines the lattice relaxation time and thereby the physicalDeltaT, _conversionTime respectively

PhysParameters:
CharPhysLength | _charPhysLength (T) | UnitConverter (unitConverter.hh) | - | Defines the characteristic physical length
CharPhysVelocity | _charPhysVelocity (T) | UnitConverter (unitConverter.hh) | - | Defines the characteristic physical velocity
PhysDensity | _physDensity (T) | UnitConverter (unitConverter.hh) | - | Defines the physical density
CharPhysPressure | _charPhysPressure (T) | UnitConverter (unitConverter.hh) | - | Defines the characteristic physical pressure
PhysViscosity | _physViscosity (T) | UnitConverter (unitConverter.hh) | - | Defines the physical viscosity
PhysMaxTime | maxTime (S) | SimulationBase (solverParameters.h) | - | Defines the maximal simulation time in seconds
StartUpTime | startUpTime (S) | SimulationBase (solverParameters.h) | - | Defines the start time until which the simulation is started up. From then on, the convergence criterion is checked (in solver3D.hh) and determines the boundaries for defineU in primalMode.
BoundaryValueUpdateTime | _physBoundaryValueUpdateTime (S) | SimulationBase (solverParameters.h) | PhysMaxTime/100 | Defines the boundary value update time

mesh:
noCuboidsPerProcess | noC (int) | SimulationBase (solverParameters.h) | 1, unused | Defines the number of cuboids per process

ConvergenceCheck:
Type | convergenceType[0] (std::string) | Stationarity (solverParameters.h) | "MaxLatticeVelocity" | Defines which quantity is regarded when checking the convergence. Alternative values: AverageEnergy, AverageRho
Interval | physInterval[0] (BaseType<T>) | Stationarity (solverParameters.h) | - | Time interval in physical units in which the values are checked for convergence
Residuum | epsilon[0] (BaseType<T>) | Stationarity (solverParameters.h) | - | Sensitivity for the convergence check

10.11.3.2. Output

In the following, all parameters for the output produced by the application are explained.

Parameter | Name (type) | Class (file) | Default value | Explanation

MultiOutput | used as (bool) | testDomain3d.cpp | - | Choose 0 or 1 for clout.setMultiOutput, output for all processors. Needs to be manually added to the olbInit call
OutputDir | outputDir (std::string) | OutputGeneral (solverParameters.h) | "./temp/" | Choose your output directory
PrintLogConverter | printLogConverter (bool) | OutputGeneral (solverParameters.h) | true | If true, call writeLogConverter(), i.e. print the converter

Log:
SaveTime | logT (BaseType<T>) | OutputGeneral (solverParameters.h) | EXIT | Choose the physical time; the data is written to the terminal every x seconds until PhysMaxTime; printLog(iT) in lbSolver.hh
VerboseLog | verbose (bool) | OutputGeneral (solverParameters.h) | 1 (true) | Show statements in the terminal, if true

VisualizationVTK, VisualizationImages and VisualizationGnuplot:
Output | out (bool) | OutputPlot (solverParameters.h) | false | States whether output is produced or not
FileName | filename (std::string) | OutputPlot (solverParameters.h) | "unnamed" | Name of the output file
SaveTime | saveTime (S) | OutputPlot (solverParameters.h) | EXIT | Choose the physical time; the data is written every x seconds until PhysMaxTime

Timer:
PrintMode | timerPrintMode (int) | OutputGeneral (solverParameters.h) | 2 | Mode of the display style passed to printStep() of the Timer instance: 0 for single-line layout, usable for data extraction as csv; 1 for single-line layout (not conform with output rules); 2 for double-line layout (not conform with output rules); 3 for performance output only

11. For Developers

11.1. Coding rules


• Indentation: 2 whitespaces, no tabs
• Line length: 80 characters recommended, comments may be longer (might be exceeded in excep-
tional cases for a maintainable layout)
• Naming conventions
• When in doubt use lower camel case (e.g. a function performing a collision and a streaming step is
called collideAndStream() )
• Template arguments in all caps, separated by underscores where required (e.g. template
<typename T, typename DESCRIPTOR>)

• Descriptor field types in all caps, separated by underscores where required (e.g. FORCE)
• Meta type template members in lower case, separated by underscores (e.g. include_fields)
– Variable names are in lower camel case
– Function names are in lower camel case
– Class names are in upper camel case
– Class instances beginning with a small letter
– Functions to set/access class internal variables are called setVariableName() and getVariable-
Name() respectively
• Output with OstreamManager (instead of printf(), for usage details see Section 6.6)
– Status information in semi-csv-style: variableName + ‘=’ + value + ‘; ’ + space
step=1700; avEnergy=5.55583e-05; avRho=1.00005; uMax=0.0321654

– Error messages: ‘Error:’ + space + small text


Error: file not found

• Documentation with doxygen


– Functions are commented by a preceding doxygen-comment in the corresponding header file:
///comment...
– Classes have a descriptive text, containing at best an example usage (also doxygen-style): /**
comment */

• Comments
– Comments should be as descriptive as possible. As such, try to write complete sentences and
mind a proper capitalization, spelling and punctuation. This will greatly improve readability!
• Files and names
– Only one public class per file (exceptions apply)
– File names are class names beginning with a lower case character

The OpenLB source code is written as generic high-performance code to benefit from user-defined precision
or AD techniques like in [0]. For genericity, it is based on the template parameter T instead of one fixed data
type like double. For high performance it is preferred to pass variables by reference, where appropriate,
to avoid copy overhead (exceptions apply). Therefore, we use three different file extensions:

*.h    is a normal (often template-based) header file
*.hh   contains the actual source code if templates have been used
*.cpp  contains the actual source code if templates have not been used; otherwise, the template
       parameters for the precompiled targets are explicitly instantiated (defined) here

Further details concerning this template based programming can be found in [134]. Furthermore, there
are some points we recommend considering for readable code:
• The equal sign ‘=’ should be surrounded by whitespaces
• Rule of ‘multiplication and division before addition and subtraction’ using whitespaces:
instead of z = a * x + b * y write z = a*x + b*y
• Order of parameters: function(input_params, output_params)
• Order of sections: public, protected, private
• Order of includes: className.h, C/C++ libraries, foreign .h, OpenLB’s .h
• Don’t use relative paths like ‘../src/core/units.h’. This is unix specific and unnecessary
• template<typename T> is written in OpenLB one line above the actual function definition for read-
ability
Finally, keep in mind that even OpenLB’s programmers are not perfect. So the code might not follow
these rules 100%. In case of doubt look around and stick to the most important rule:

Be consistent with the surrounding code!

To apply the above defined coding rules, OpenLB provides a ClangFormat configuration in
.clang-format, which is disabled in some files due to bad results. Formatting a file src/io/base64.h
is achieved by the following command:

clang-format -i --style=file src/io/base64.h

The required package can be installed by

sudo apt-get install clang-format

11.2. Compile Flags
The gcc compiler from the GNU compiler collection provides additional compiler flags to warn develop-
ers.
-Wfloat-equal warns if floating-point values are used in equality comparisons
-Wfloat-conversion warns for implicit conversions that reduce the precision of a real value
-Wshadow warns whenever variable or type declaration shadows another variable
-Wnon-virtual-dtor warns when a class has virtual functions and an accessible non-virtual destructor
itself or in an accessible polymorphic base class

11.3. GIT Repository


The source code of the OpenLB project is available in a private GIT repository
(https://en.wikipedia.org/wiki/Git). The repository enables a controlled way of writing code together. Any
change is tracked and will be visible and available to the others, which avoids a lot of trouble and helps to
find bugs quickly. There are a few very important rules you should follow. This will make life for
every programmer, including yourself(!), much easier:
• Write your name at the top of any file you create. This will make you the person who needs to be
asked if someone wants to change an interface or if someone reports a bug.
• Never change interfaces without contacting one of the programmers mentioned at the top of the file.
• Before pushing your changes, make sure that you have respected all coding rules (cf. Sec. 11.1) and
that all examples are compiling fine. You can use the command

make samples

from the root directory. Also check that the examples compile in both compile modes,
precompiled, generic, and in sequential as well as in parallel mode.

Access can be provided for every core developer. Please send an email to [email protected] with
your full name and private address.
Already an expert? Copy the URL https://gitlab.com/openlb/olb.git into your Git client and start
cloning. Alternatively, navigate to the project page or, for beginners, keep reading.
A detailed overview about GIT can be found in [80]. A short, helpful overview can be downloaded at
http://www.cheat-sheets.org/saved-copy/git-cheat-sheet.pdf and a complete introduction is given by
http://git-scm.com/book/en.
OpenLB automatically installs a git hook that checks a number of common code style issues during
git commit. If the script detects one of the following problems, git commit is aborted and a list with all
detected problems is printed. Currently the script detects:
• trailing whitespaces: Spaces and tabs at the end of a line.
• DOS line endings: Please use Unix line endings and configure your editor or git to use Unix line
endings.
• Tabs: The OpenLB style guide requires 2 spaces instead of tabs in the source code. Configure your
editor to automatically replace tabs with 2 spaces.
• Git merge markers: Merge markers resulting from merge conflicts should be resolved before com-
mitting.

Getting started in five steps:


1. Install a Git client: We recommend a Git client version of 1.7.x and above.
• MacOS: Download git installer
• Windows: Download SmartGit
• Linux: Install from source or via your package management utility. e.g.:
yum install git-core
or
apt-get install git-core
2. Clone your Git Repository by

cd ~
mkdir gitSources
cd gitSources
git clone git@gitlab.com:openlb/olb.git

Now, the files are copied to a folder called olb.


3. Add some files and directories: Now you’ve referenced your on-line repository, it’s time to add
something. Create or edit a file, for example:

cd openlb
echo "first line of my first file... yay!" > helloworld.txt

Now that you have changed something you can add these changes to the index using

git add helloworld.txt

The next step is to commit the changes. Before you commit for the first time, configure your user
email and user name. You will only have to do this once, not every time you commit. The string
after -m is the commit message.

git config --global user.email "[email protected]"
git config --global user.name "Your Name"
git commit -m "test: first commit"

At any time during this process, you can look up what you have already done using

git status

4. Push your changes to the master repository:

git push

5. Pulling changes from the remote repository: Before pushing your new changes you always need to
get the latest changes from the remote repository. This can be done through

git pull -r

The -r option is important to ensure a clean git history. It can be enforced that each git pull automat-
ically uses the correct option through

git config --global pull.rebase true

The provider of the service Gitlab also has a web interface, which is available at https://gitlab.com.

Useful git commands


Sometimes there are a lot of irrelevant files in your git folder, i.e. executables, temporary files and the like.
It is tedious to delete them one by one, but it can be done with a git command.

git clean -n // lists all files which are placed in git folder,
// but are not added to the repository

git clean -f // delete all files in git folder that are not
// added to git repository

If you are lost and only want to return to the current state of the master branch:

git reset --hard master // reset the workspace to master by
                        // rewinding added files, ...

Bothered by entering your password to push and pull to GitLab? Upload your public SSH key to
https://gitlab.com/ and clone the git repository via SSH:

git clone git@gitlab.com:openlb/olb.git

11.4. Creating a Merge Request in GitLab


To merge new developments into OpenLB, we use merge requests in GitLab. This is mandatory and the
only way to add changes. For a successful merge, follow these steps:

1. Go to the GitLab merge request page by visiting the following URL: https://gitlab.com/openlb/olb/-/merge_requests.
2. Click on the "New merge request" button.
3. Set the source branch to the feature branch where the changes were made.
4. Set the target branch to the master branch, where the changes should be added to.
5. It’s advisable to merge the master branch into your feature branch first to ensure that your changes
are compatible with the latest code in the master branch.
6. Provide a descriptive title and, if necessary, a description for your merge request. Note that it is
possible to add "Draft:" to the beginning of the title to indicate that work on the branch is still in
progress, and to avoid accidental merges.
7. Assign the merge request to yourself. Optionally, you can add a reviewer to review the code.
8. Several tests are automatically run to ensure correctness. The results of these tests are displayed in
the merge request. All tests must pass for the merge to proceed.
9. Once the merge request is approved and any feedback is addressed, you can merge it into the master
branch through the GitLab web interface.

By following these steps, you can ensure that code changes are reviewed and tested before being inte-
grated into the master branch, maintaining code quality and collaboration within the development team.
To promote continuous improvement, we recommend using the agile approach to merge small, incremen-
tal changes into OpenLB, fostering a more adaptive and efficient development process.

11.5. Debugging
1. Address sanitizer (https://en.wikipedia.org/wiki/Code_sanitizer): compile with
-g -fsanitize=address
It finds memory corruption bugs such as buffer overflows or accesses to a dangling pointer and is very convenient.
Works with clang (starting with v3.1) and g++ (starting with v4.8).
Never run it as superuser.
2. Valgrind (http://valgrind.org/)
Framework for analysis tools. Comes with (amongst others):
a) Memcheck: finds memory problems with malloc/new/free/delete
b) Cachegrind: cache profiler
c) Massif: heap profiler
3. scan-build (https://de.wikipedia.org/wiki/Statische_Code-Analyse)
(https://en.wikipedia.org/wiki/Static_program_analysis)
Static analyser, comes with clang. Executed by
scan-build make

References

References Involving OpenLB (Articles)


[1] F. Bukreev, F. Raichle, H. Nirschl, and M. J. Krause. “Simulation of adsorption processes
on moving particles based on an Euler–Euler description using a lattice Boltzmann dis-
cretization”. In: Chemical Engineering Science 270 (2023), p. 118485. DOI: https://doi.
org/10.1016/j.ces.2023.118485.
[2] D. Dapelo, S. Simonis, M. J. Krause, and J. Bridgeman. “Lattice–Boltzmann coupled mod-
els for advection–diffusion flow on a wide range of Peclet numbers”. In: Journal of Com-
putational Science 51 (2021), p. 101363. DOI: https://doi.org/10.1016/j.jocs.
2021.101363.
[3] R. Ditscherlein, O. Furat, E. Löwer, R. Mehnert, R. Trunk, T. Leissner, M. J. Krause, V.
Schmidt, and U. Peuker. “PARROT: A Pilot Study on the Open Access Provision of
Particle-Discrete Tomographic Datasets”. In: Microscopy and Microanalysis 28.2 (2022),
pp. 350–360. DOI: 10.1017/S143192762101391X.
[4] M. Gaedtke, T. Hoffmann, V. Reinhardt, G. Thäter, H. Nirschl, and M. J. Krause. “Flow
and heat transfer simulation with a thermal large eddy lattice Boltzmann method in an
annular gap with an inner rotating cylinder”. In: International Journal of Modern Physics C
30.02n03 (2019), p. 1950013. DOI: 10.1142/S012918311950013X.
[5] M. Gaedtke, S. Wachter, S. Kunkel, S. Sonnick, M. Rädle, H. Nirschl, and M. J. Krause.
“Numerical study on the application of vacuum insulation panels and a latent heat stor-
age for refrigerated vehicles with a large Eddy lattice Boltzmann method”. In: Heat and
Mass Transfer (2019), pp. 1–13. DOI: 10.1007/s00231-019-02753-4.
[6] M. Gaedtke, S. Wachter, M. Rädle, H. Nirschl, and M. J. Krause. “Application of a lat-
tice Boltzmann method combined with a Smagorinsky turbulence model to spatially
resolved heat flux inside a refrigerated vehicle”. In: Computers & Mathematics with Ap-
plications 76.10 (Nov. 2018), pp. 2315–2329. DOI: https : / / doi . org / 10 . 1016 / j .
camwa.2018.08.018.
[7] N. Hafen, A. Dittler, and M. J. Krause. “Simulation of particulate matter structure de-
tachment from surfaces of wall-flow filters applying lattice Boltzmann methods”. In:
Computers & Fluids 239 (2022), p. 105381. DOI: https : / / doi . org / 10 . 1016 / j .
compfluid.2022.105381.
[8] N. Hafen, J. E. Marquardt, A. Dittler, and M. J. Krause. “Simulation of Particulate Matter
Structure Detachment from Surfaces of Wall-Flow Filters for Elevated Velocities Apply-
ing Lattice Boltzmann Methods”. In: Fluids 8.3 (2023). DOI: 10.3390/fluids8030099.
[9] N. Hafen, J. Thieringer, J. Meyer, M. J. Krause, and A. Dittler. “Numerical investigation
of detachment and transport of particulate structures in wall-flow filters using lattice
Boltzmann methods”. In: Journal of Fluid Mechanics 956 (2023), A30. DOI: 10 . 1017 /
jfm.2023.35.

[10] N. Hafen, J. E. Marquardt, A. Dittler, and M. J. Krause. “Simulation of Dynamic Rear-
rangement Events in Wall-Flow Filters Applying Lattice Boltzmann Methods”. en. In:
Fluids 8.7 (July 2023), p. 213. DOI: 10.3390/fluids8070213.
[11] M. Haussmann, A. Claro Berreta, G. Lipeme Kouyi, N. Riviere, H. Nirschl, and M. J.
Krause. “Large–eddy simulation coupled with wall models for turbulent channel flows
at high Reynolds numbers with a lattice Boltzmann method –Application to Coriolis
mass flowmeter”. In: Computers & Mathematics with Applications 78.10 (2019), pp. 3285–
3302. DOI: https://doi.org/10.1016/j.camwa.2019.04.033.
[12] M. Haussmann, N. Hafen, F. Raichle, R. Trunk, H. Nirschl, and M. J. Krause. “Galilean
invariance study on different lattice Boltzmann fluid–solid interface approaches for
vortex-induced vibrations”. In: Computers & Mathematics with Applications 80.5 (2020),
pp. 671–691. DOI: https://doi.org/10.1016/j.camwa.2020.04.022.
[13] M. Haussmann, P. Reinshaus, S. Simonis, H. Nirschl, and M. J. Krause. “Fluid–
Structure Interaction Simulation of a Coriolis Mass Flowmeter Using a Lattice Boltz-
mann Method”. In: Fluids 6.4 (2021). DOI: 10.3390/fluids6040167.
[14] M. Haussmann, F. Ries, J. Jeppener–Haltenhoff, Y. Li, M. Schmidt, C. Welch, L. Illmann,
B. Böhm, H. Nirschl, M. J. Krause, and A. Sadiki. “Evaluation of a Near–Wall–Modeled
Large Eddy Lattice Boltzmann Method for the Analysis of Complex Flows Relevant to
IC Engines”. In: Computation 8.43 (2020). DOI: 10.3390/computation8020043.
[15] M. Haussmann, S. Simonis, H. Nirschl, and M. J. Krause. “Direct numerical simulation of
decaying homogeneous isotropic turbulence – numerical experiments on stability, con-
sistency and accuracy of distinct lattice Boltzmann methods”. In: International Journal of
Modern Physics C 30.09 (2019), p. 1950074. DOI: 10.1142/S0129183119500748.
[16] T. Henn, G. Thäter, W. Dörfler, H. Nirschl, and M. J. Krause. “Parallel dilute particulate
flow simulations in the human nasal cavity”. In: Computers & Fluids 124 (2016), pp. 197–
207. DOI: http://dx.doi.org/10.1016/j.compfluid.2015.08.002.
[17] S. Höcker, R. Trunk, W. Dörfler, and M. J. Krause. “Towards the simulations of inertial
dense particulate flows with a volume-averaged lattice Boltzmann method”. In: Com-
puters & Fluids 166 (2018), pp. 152–162. DOI: https://doi.org/10.1016/j.compfluid.2018.02.011.
[18] J. Jeßberger, J. E. Marquardt, L. Heim, J. Mangold, F. Bukreev, and M. J. Krause. “Op-
timization of a Micromixer with Automatic Differentiation”. In: Fluids 7.5 (2022). DOI:
10.3390/fluids7050144.
[19] F. Klemens, S. Schuhmann, R. Balbierer, G. Guthausen, H. Nirschl, G. Thäter, and M. J.
Krause. “Noise reduction of flow MRI measurements using a lattice Boltzmann based
topology optimisation approach”. In: Computers & Fluids 197 (2020), p. 104391. DOI: https://doi.org/10.1016/j.compfluid.2019.104391.
[20] F. Klemens, S. Schuhmann, G. Guthausen, G. Thäter, and M. J. Krause. “CFD-MRI: A
coupled measurement and simulation approach for accurate fluid flow characterisation
and domain identification”. In: Computers & Fluids 166 (2018), pp. 218–224. DOI: https:
//doi.org/10.1016/j.compfluid.2018.02.022.
[21] F. Klemens, B. Förster, M. Dorn, G. Thäter, and M. J. Krause. “Solving fluid flow domain
identification problems with adjoint lattice Boltzmann methods”. In: Computers & Math-
ematics with Applications 79.1 (2020), pp. 17–33. DOI: https://doi.org/10.1016/j.
camwa.2018.07.010.

[22] M. J. Krause, F. Klemens, T. Henn, R. Trunk, and H. Nirschl. “Particle flow simulations
with homogenised lattice Boltzmann methods”. In: Particuology 34 (2017), pp. 1–13. DOI:
http://doi.org/10.1016/j.partic.2016.11.001.
[23] M. J. Krause, A. Kummerländer, S. Avis, H. Kusumaatmaja, D. Dapelo, F. Klemens, M.
Gaedtke, N. Hafen, A. Mink, R. Trunk, J. Marquardt, M. Maier, M. Haussmann, and S.
Simonis. “OpenLB–Open source lattice Boltzmann code”. In: Computers & Mathematics
with Applications 81 (2021), pp. 258–288. DOI: https://doi.org/10.1016/j.camwa.
2020.04.033.
[24] M. J. Krause, G. Thäter, and V. Heuveline. “Adjoint-based Fluid Flow Control and Op-
timisation with Lattice Boltzmann Methods”. In: Computers & Mathematics with Applica-
tions 65.6 (2013), pp. 945–960. DOI: 10.1016/j.camwa.2012.08.007.
[25] A. Kummerländer, M. Dorn, M. Frank, and M. J. Krause. “Implicit propagation of di-
rectly addressed grids in lattice Boltzmann methods”. In: Concurrency and Computation:
Practice and Experience 35.8 (Mar. 2023), e7509. DOI: https://doi.org/10.1002/
cpe.7509.
[26] L. de Luca Xavier Augusto, J. Ross-Jones, G. Cantarelli Lopes, P. Tronville, J. Silveira
Gonçalves, M. Rädle, and M. J. Krause. “Microfiber Filter Performance Prediction using
a Lattice-Boltzmann Method”. In: Communications in Computational Physics 23.4 (2018),
pp. 910–931. DOI: 10.4208/cicp.OA-2016-0180.
[27] M.-L. Maier, T. Henn, G. Thaeter, H. Nirschl, and M. J. Krause. “Towards Validated Mul-
tiscale Simulation with a Two-Way Coupled LBM and DEM”. In: Chemical Engineering &
Technology 40.9 (2017), pp. 1591–1598. DOI: 10.1002/ceat.201600547.
[28] M.-L. Maier, S. Milles, S. Schuhmann, G. Guthausen, H. Nirschl, and M. J. Krause. “Fluid
flow simulations verified by measurements to investigate adsorption processes in a static
mixer”. In: Computers & Mathematics with Applications 76.11 (2018), pp. 2744–2757. DOI:
https://doi.org/10.1016/j.camwa.2018.08.066.
[29] M.-L. Maier, R. A. Patel, N. I. Prasianakis, S. V. Churakov, H. Nirschl, and M. J. Krause.
“Coupling of multiscale lattice Boltzmann discrete–element method for reactive particle
fluid flows”. In: Phys. Rev. E 103 (3 Mar. 2021), p. 033306. DOI: 10.1103/PhysRevE.
103.033306.
[30] J. E. Marquardt, C. Arlt, R. Trunk, M. Franzreb, and M. J. Krause. “Numerical and exper-
imental examination of the retention of magnetic nanoparticles in magnetic chromatog-
raphy”. In: Computers & Mathematics with Applications 89 (2021), pp. 34–43. DOI: https:
//doi.org/10.1016/j.camwa.2021.02.010.
[31] J. E. Marquardt, U. J. Römer, H. Nirschl, and M. J. Krause. “A discrete contact model for
complex arbitrary-shaped convex geometries”. In: Particuology 80 (2023), pp. 180–191.
DOI: https://doi.org/10.1016/j.partic.2022.12.005.

[32] J. E. Marquardt, N. Hafen, and M. J. Krause. “A novel particle decomposition scheme to
improve parallel performance of fully resolved particulate flow simulations”. In: Journal
of Computational Science 78 (2024), p. 102263. DOI: 10.1016/j.jocs.2024.102263.
[33] A. Mink, K. Schediwy, C. Posten, H. Nirschl, S. Simonis, and M. J. Krause. “Comprehen-
sive Computational Model for Coupled Fluid Flow, Mass Transfer, and Light Supply in
Tubular Photobioreactors Equipped with Glass Sponges”. In: Energies 15.20 (2022). DOI:
10.3390/en15207671.

[34] A. Mink, G. Thäter, H. Nirschl, and M. J. Krause. “A 3D Lattice Boltzmann Method for
Light Simulation in Participating Media”. In: Journal of Computational Science 17, Part 2
(2016), pp. 431–437. DOI: http://dx.doi.org/10.1016/j.jocs.2016.03.014.
[35] H. Mirzaee, T. Henn, M. J. Krause, L. Goubergrits, C. Schumann, M. Neugebauer, T.
Kuehne, T. Preusser, and A. Hennemuth. “MRI-based computational hemodynamics in
patients with aortic coarctation using the lattice Boltzmann methods: Clinical validation
study”. In: Journal of Magnetic Resonance Imaging 45.1 (2016), pp. 139–146. DOI: 10.1002/
jmri.25366.
[36] M. Mohrhard, G. Thäter, J. Bludau, B. Horvat, and M. J. Krause. “An Auto-Vectorization
Friendly Parallel Lattice Boltzmann Streaming Scheme for Direct Addressing”. In: Com-
puters & Fluids 181 (2019), pp. 1–7. DOI: https://doi.org/10.1016/j.compfluid.
2019.01.001.
[37] P. Nathen, D. Gaudlitz, M. J. Krause, and N. Adams. “On the Stability and Accuracy of
the BGK, MRT and RLB Boltzmann Schemes for the Simulation of Turbulent Flows”. In:
Communications in Computational Physics 23.3 (2018), pp. 846–876. DOI: 10.4208/cicp.
OA-2016-0229.
[38] F. Reinke, N. Hafen, M. Haussmann, M. Novosel, M. J. Krause, and A. Dittler. “Applied
Geometry Optimization of an Innovative 3D-Printed Wet-Scrubber Nozzle with a Lattice
Boltzmann Method”. In: Chemie Ingenieur Technik 94.3 (2022), pp. 348–355. DOI: https:
//doi.org/10.1002/cite.202100151.
[39] J. Ross-Jones, M. Gaedtke, S. Sonnick, M. Rädle, H. Nirschl, and M. J. Krause. “Con-
jugate heat transfer through nano scale porous media to optimize vacuum insulation
panels with lattice Boltzmann methods”. In: Computers & Mathematics with Applications
77 (2019), pp. 209–221. DOI: https://doi.org/10.1016/j.camwa.2018.09.023.
[40] S. Simonis, M. Frank, and M. J. Krause. “On relaxation systems and their relation to
discrete velocity Boltzmann models for scalar advection–diffusion equations”. In: Philo-
sophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
378.2175 (2020), p. 20190400. DOI: 10.1098/rsta.2019.0400.
[41] S. Simonis, M. Frank, and M. J. Krause. “Constructing relaxation systems for lattice
Boltzmann methods”. In: Applied Mathematics Letters 137 (2023), p. 108484. DOI: https:
//doi.org/10.1016/j.aml.2022.108484.
[42] S. Simonis, M. Haussmann, L. Kronberg, W. Dörfler, and M. J. Krause. “Linear and
brute force stability of orthogonal moment multiple–relaxation–time lattice Boltzmann
methods applied to homogeneous isotropic turbulence”. In: Philosophical Transactions
of the Royal Society A: Mathematical, Physical and Engineering Sciences 379.2208 (2021),
p. 20200405. DOI: 10.1098/rsta.2020.0405.
[43] S. Simonis, J. Nguyen, S. J. Avis, W. Dörfler, and M. J. Krause. “Binary fluid flow simula-
tions with free energy lattice Boltzmann methods”. In: Discrete and Continuous Dynamical
Systems - S (2023). DOI: 10.3934/dcdss.2023069.
[44] S. Simonis, D. Oberle, M. Gaedtke, P. Jenny, and M. J. Krause. “Temporal large eddy sim-
ulation with lattice Boltzmann methods”. In: Journal of Computational Physics 454 (2022),
p. 110991. DOI: https://doi.org/10.1016/j.jcp.2022.110991.
[45] M. Siodlaczek, M. Gaedtke, S. Simonis, M. Schweiker, M. Homma, and M. J. Krause.
“Numerical evaluation of thermal comfort using a large eddy lattice Boltzmann
method”. In: Building and Environment 192 (2021), p. 107618. DOI: https://doi.org/
10.1016/j.buildenv.2021.107618.

[46] J. Thieringer, N. Hafen, J. Meyer, M. J. Krause, and A. Dittler. “Investigation of the Re-
arrangement of Reactive–Inert Particulate Structures in a Single Channel of a Wall–Flow
Filter”. In: Separations 9.8 (2022). DOI: 10.3390/separations9080195.
[47] R. Trunk, C. Bretl, G. Thäter, H. Nirschl, M. Dorn, and M. J. Krause. “A Study on Shape–
Dependent Settling of Single Particles with Equal Volume Using Surface Resolved Sim-
ulations”. In: Computation 9.4 (2021). DOI: 10.3390/computation9040040.
[48] R. Trunk, T. Henn, W. Dörfler, H. Nirschl, and M. J. Krause. “Inertial Dilute Particulate
Fluid Flow Simulations with an Euler-Euler Lattice Boltzmann Method”. In: Journal of
Computational Science 17, Part 2 (2016), pp. 438–445. DOI: http://dx.doi.org/10.
1016/j.jocs.2016.03.013.
[49] R. Trunk, J. Marquardt, G. Thäter, H. Nirschl, and M. J. Krause. “Towards the Simulation
of arbitrarily shaped 3D particles using a homogenised lattice Boltzmann method”. In:
Computers & Fluids 172 (2018), pp. 621–631. DOI: https://doi.org/10.1016/j.
compfluid.2018.02.027.
[50] R. Trunk, T. Weckerle, N. Hafen, G. Thäter, H. Nirschl, and M. J. Krause. “Revisiting the
Homogenized Lattice Boltzmann Method with Applications on Particulate Flows”. In:
Computation 9.2 (2021). DOI: 10.3390/computation9020011.
[51] A. Zarth, F. Klemens, G. Thäter, and M. J. Krause. “Towards shape optimisation of fluid
flows using lattice Boltzmann methods and automatic differentiation”. In: Computers &
Mathematics with Applications 90 (2021), pp. 46–54. DOI: https://doi.org/10.1016/
j.camwa.2021.02.016.

References Involving OpenLB (Proceedings and Preprints)


[52] C. Bretl, R. Trunk, H. Nirschl, G. Thäter, M. Dorn, and M. J. Krause. “Preliminary Study
of Particle Settling Behaviour by Shape Parameters via Lattice Boltzmann Simulations”.
In: High Performance Computing in Science and Engineering 20. Ed. by W. E. Nagel, D. H.
Kröner, and M. M. Resch. Cham: Springer International Publishing, 2021, pp. 245–259.
[53] A. Kummerländer, F. Bukreev, S. F. R. Berg, M. Dorn, and M. J. Krause. “Advances in
Computational Process Engineering using Lattice Boltzmann Methods on High Perfor-
mance Computers”. In: High Performance Computing in Science and Engineering ’22. Ed. by
W. E. Nagel, D. H. Kröner, and M. M. Resch. Cham: Springer Nature Switzerland, 2024,
pp. 233–247. DOI: https://doi.org/10.1007/978-3-031-46870-4_16.
[54] A. Kummerländer and M. J. Krause. “Research Software Engineering in OpenLB: Refac-
toring a Legacy Code to State-Of-The-Art Performance”. In: deRSE23 Conference for Re-
search Software Engineering in Germany. Zenodo, Feb. 2023. DOI: 10.5281/zenodo.7662082.
[55] J. E. Marquardt, N. Hafen, and M. J. Krause. A novel model for direct numerical simulation of
suspension dynamics with arbitrarily shaped convex particles. 2024. DOI: 10.48550/arXiv.
2401.10878.
[56] N. Nadim, T. Chandratilleke, and M. J. Krause. “LBM-LES Modelling of Low Reynolds
Number Turbulent Flow Over NACA0012 Aerofoil”. English. In: Fluid-Structure-Sound
Interactions and Control. Ed. by Y. Zhou, A. Lucey, Y. Liu, and L. Huang. Lecture Notes in
Mechanical Engineering. Springer Berlin Heidelberg, 2016, pp. 205–210. DOI: 10.1007/
978-3-662-48868-3_33.

[57] P. Nathen, D. Gaudlitz, M. J. Krause, and J. Kratzke. “An extension of the Lattice Boltz-
mann Method for simulating turbulent flows around rotating geometries of arbitrary
shape”. In: 21st AIAA Computational Fluid Dynamics Conference, San Diego. American In-
stitute of Aeronautics and Astronautics. 2013. DOI: 10.2514/6.2013-2573.
[58] S. Simonis and M. J. Krause. “Forschungsnahe Lehre unter Pandemiebedingungen”. In:
Mitteilungen der Deutschen Mathematiker-Vereinigung 30.1 (2022), pp. 43–45. DOI: 10.1515/dmvm-2022-0015.

References Involving OpenLB (Software)


[59] M. J. Krause, S. Avis, D. Dapelo, N. Hafen, M. Haußmann, M. Gaedtke, F. Klemens, A.
Kummerländer, M.-L. Maier, A. Mink, J. Ross-Jones, S. Simonis, and R. Trunk. OpenLB
Release 1.3: Open Source Lattice Boltzmann Code. Version 1.3. May 2019. DOI: 10.5281/
zenodo.3625967.
[60] M. J. Krause, T. Henn, L. Baron, J. Kratzke, J. Fietz, and T. Dornieden. OpenLB Release
0.7: Open Source Lattice Boltzmann Code. Version 0.7. Feb. 2012. DOI: 10.5281/zenodo.
3625936.
[61] M. J. Krause, T. Henn, L. Baron, A. Mink, P. Weisbrod, P. Nathen, and G. Zahnd. OpenLB
Release 0.8: Open Source Lattice Boltzmann Code. Version 0.8. Nov. 2013. DOI: 10.5281/
zenodo.3625938.
[62] M. J. Krause, T. Henn, A. Mink, R. Trunk, P. Nathen, F. Klemens, M.-L. Maier, M.
Mohrhard, A. Claro Barreto, M. Haußmann, M. Gaedtke, and J. Ross-Jones. OpenLB Re-
lease 1.1: Open Source Lattice Boltzmann Code. Version 1.1. Apr. 2017. DOI: 10 . 5281 /
zenodo.3625955.
[63] M. J. Krause, T. Henn, A. Mink, R. Trunk, P. Weisbrod, P. Nathen, F. Klemens, and M.-L.
Maier. OpenLB Release 0.9: Open Source Lattice Boltzmann Code. Version 0.9. Mar. 2015.
DOI: 10.5281/zenodo.3625941.
[64] M. J. Krause, T. Henn, A. Mink, R. Trunk, P. Weisbrod, P. Nathen, F. Klemens, and M.-L.
Maier. OpenLB Release 1.0: Open Source Lattice Boltzmann Code. Version 1.0. Mar. 2016.
DOI: 10.5281/zenodo.3625943.
[65] M. J. Krause, A. Mink, R. Trunk, F. Klemens, M.-L. Maier, M. Mohrhard, A. Claro Barreto,
M. Haußmann, M. Gaedtke, and J. Ross-Jones. OpenLB Release 1.2: Open Source Lattice
Boltzmann Code. Version 1.2. Feb. 2018. DOI: 10.5281/zenodo.3625960.
[66] M. J. Krause, S. Zimny, T. Henn, and J. Fietz. OpenLB Release 0.6: Open Source Lattice
Boltzmann Code. Version 0.6. May 2011. DOI: 10.5281/zenodo.3625929.
[67] M. J. Krause et al. OpenLB Release 1.4: Open Source Lattice Boltzmann Code. Version 1.4.
Nov. 2020. DOI: 10.5281/zenodo.4279263.
[68] A. Kummerländer et al. OpenLB Release 1.5: Open Source Lattice Boltzmann Code. Ver-
sion 1.5. Apr. 2022. DOI: 10.5281/zenodo.6469606.
[69] A. Kummerländer et al. OpenLB Release 1.6: Open Source Lattice Boltzmann Code. Ver-
sion 1.6. Apr. 2023. DOI: 10.5281/zenodo.7773497.
[70] A. Kummerländer et al. OpenLB Release 1.7: Open Source Lattice Boltzmann Code. Ver-
sion 1.7.0. Feb. 2024. DOI: 10.5281/zenodo.10684609.
[71] J. Latt and M. J. Krause. OpenLB Release 0.3: Open Source Lattice Boltzmann Code. Ver-
sion 0.3. July 2007. DOI: 10.5281/zenodo.3625765.

[72] J. Latt and M. J. Krause. OpenLB Release 0.4: Open Source Lattice Boltzmann Code. Ver-
sion 0.4. Jan. 2008. DOI: 10.5281/zenodo.3625909.
[73] J. Latt, M. J. Krause, O. Malaspinas, and B. Stahl. OpenLB Release 0.5: Open Source Lattice
Boltzmann Code. Version 0.5. May 2008. DOI: 10.5281/zenodo.3625925.

Other References
[74] LB model with adjustable speed of sound. Technical report. http://www.openlb.
net/tech-reports.
[75] The Paraview project. http://www.paraview.org.
[76] The OpenGPI project. http://www.opengpi.org.
[77] The VTK data format documentation. http://www.vtk.org/VTK/img/file-formats.pdf.
[78] FreeCAD: An Open Source parametric 3D CAD modeler. https://www.freecadweb
.org/.
[79] Configuring OpenLB on MacOS. Technical Report. http://www.openlb.net/tech-
reports.
[80] Git Verteilte Versionskontrolle für Code und Dokumente. http://gitbu.ch/.
[81] S. Ansumali. “Minimal kinetic modeling of hydrodynamics”. PhD thesis. Swiss Federal
Institute of Technology Zurich, 2004.
[82] B. F. Armaly, F. Durst, J. C. F. Pereira, and B. Schönung. “Experimental and theoretical
investigation of backward-facing step flow”. In: Journal of Fluid Mechanics 127 (1983),
pp. 473–496. DOI: 10.1017/S0022112083002839.
[83] P. L. Bhatnagar, E. P. Gross, and M. Krook. “A model for collision processes in gases. I.
Small amplitude processes in charged and neutral one-component systems”. In: Physical
Review 94.3 (1954), pp. 511–525. DOI: 10.1103/PhysRev.94.511.
[84] T. Borrvall and J. Petersson. “Topology optimization of fluids in Stokes flow”. In: Inter-
national Journal for Numerical Methods in Fluids 41.1 (2003), pp. 77–107. DOI: 10.1002/
fld.426.
[85] M. Bouzidi, M. Firdaouss, and P. Lallemand. “Momentum transfer of a Boltzmann-lattice
fluid with boundaries”. In: Physics of Fluids 13.11 (2001), pp. 3452–3459. DOI: 10.1063/
1.1399290.
[86] M. E. Brachet, D. I. Meiron, S. A. Orszag, B. Nickel, R. H. Morf, and U. Frisch. “Small-
scale structure of the Taylor–Green vortex”. In: Journal of Fluid Mechanics 130 (1983),
pp. 411–452.
[87] H. Brinkman. “A calculation of the viscous force exerted by a flowing fluid on a dense
swarm of particles”. English. In: Applied Scientific Research 1.1 (1949), pp. 27–34. DOI:
10.1007/BF02120313.
[88] H. Brinkman. “On the permeability of media consisting of closely packed porous par-
ticles”. English. In: Applied Scientific Research 1.1 (1949), pp. 81–86. DOI: 10 . 1007 /
BF02120318.
[89] A. Caiazzo and M. Junk. “Asymptotic analysis of lattice Boltzmann methods for flow-
rigid body interaction”. In: Progress in Computational Physics 3 (2013), p. 91.

[90] S. Chen and G. D. Doolen. “Lattice Boltzmann Method for Fluid Flows”. In: Ann. Rev.
Fluid Mech. 30 (1998), pp. 329–364.
[91] B. Chopard, A. Dupuis, A. Masselot, and P. Luthi. “Cellular Automata and Lattice Boltz-
mann techniques: an approach to model and simulate complex systems”. In: Adv. Compl.
Sys. 5 (2002), pp. 103–246. DOI: 10.1142/S0219525902000602.
[92] L. E. Czelusniak, V. P. Mapelli, M. S. Guzella, L. Cabezas-Gómez, and A. J. Wagner.
“Force approach for the pseudopotential lattice Boltzmann method”. In: Phys. Rev. E
102 (3 2020), p. 033307. DOI: 10.1103/PhysRevE.102.033307.
[93] D. d’Humières, I. Ginzburg, M. Krafczyk, P. Lallemand, and L.-S. Luo. “Multiple-relaxa-
tion-time lattice Boltzmann models in three dimensions”. In: Phil. Trans. R. Soc. Lond. A
360 (2002), pp. 437–451.
[94] T. Dornieden. “Optimierung von Strömungsgebieten mit adjungierten Lattice Boltz-
mann Methoden”. Diplomarbeit. Karlsruhe Institute of Technology (KIT), 2013.
[95] S. Emeis. Wind Energy Meteorology - Second Edition. Apr. 2018. DOI: 10.1007/978-3-319-72859-9.
[96] Z.-G. Feng and E. E. Michaelides. “The immersed boundary-lattice Boltzmann method for solving
fluid–particles interaction problems”. In: Journal of Computational Physics 195.2 (2004),
pp. 602–628. DOI: 10.1016/j.jcp.2003.10.013.
[97] A. Fakhari and M. H. Rahimian. “Phase-field modeling by the method of lattice Boltz-
mann equations”. In: Physical Review E 81.3 (2010), p. 036707.
[98] C. Gau and R. Viskanta. “Melting and Solidification of a Pure Metal on a Vertical Wall”.
In: Journal of Heat Transfer 108.1 (1986), pp. 174–181.
[99] Z. Guo, C. Zheng, and B. Shi. “Discrete lattice effects on the forcing term in the lattice
Boltzmann method”. In: Phys. Rev. E 65 (2002), p. 046308.
[100] Z. Guo, B. Shi, and C. Zheng. “A coupled lattice BGK model for the Boussinesq equa-
tions”. In: Int. J. Num. Meth. Fluids 39 (2002), pp. 325–342. DOI: 10.1002/fld.337.
[101] Z. Guo, B. Shi, and C. Zheng. “A coupled lattice BGK model for the Boussinesq equa-
tions”. In: International Journal for Numerical Methods in Fluids 39.4 (2002), pp. 325–342.
[102] Z. Guo and T. S. Zhao. “Lattice Boltzmann model for incompressible flows through
porous media”. In: Phys. Rev. E 66 (3 2002), p. 036304. DOI: 10.1103/PhysRevE.66.
036304.
[103] H.-B. Huang, X.-Y. Lu, and M. C. Sukop. “Numerical study of lattice Boltzmann methods
for a convection-diffusion equation coupled with Navier-Stokes equations”. In: J. Phys.
A: Math. Theor. 44.5 (2011).
[104] R. Huang and H. Wu. “Phase interface effects in the total enthalpy-based lattice Boltz-
mann model for solid–liquid phase change”. In: Journal of Computational Physics 294
(2015), pp. 346–362.
[105] G. Kałuża. “The numerical solution of the transient heat conduction problem using the
lattice Boltzmann method”. In: Scientific Research of the Institute of Mathematics and Com-
puter Science 11.1 (2012), pp. 23–30.
[106] F. Klemens. “Combining computational fluid dynamics and magnetic resonance imag-
ing data using lattice Boltzmann based topology optimisation”. Doctoral thesis. Karl-
sruher Institut für Technologie (KIT), 2020. 131 pp. DOI: 10.5445/IR/1000125499.

[107] A. Komrakova, O. Shardt, D. Eskin, and J. Derksen. “Lattice Boltzmann simulations of
drop deformation and breakup in shear flow”. In: International Journal of Multiphase Flow
59 (2014), pp. 24–43. DOI: https://doi.org/10.1016/j.ijmultiphaseflow.
2013.10.009.
[108] T. Krüger, H. Kusumaatmaja, A. Kuzmin, O. Shardt, G. Silva, and E. M. Viggen. The
Lattice Boltzmann Method. Springer, 2017.
[109] A. Ladd and R. Verberg. “Lattice-Boltzmann simulations of particle-fluid suspensions”.
In: Journal of Statistical Physics 104.5-6 (2001), pp. 1191–1251.
[110] D. Lagrava, O. Malaspinas, J. Latt, and B. Chopard. “Automatic grid refinement criterion
for lattice Boltzmann method”. In: ArXiv e-prints (July 2015).
[111] L. Larocque, J. Imran, and M. Chaudhry. “Experimental and Numerical Investigations
of Two-Dimensional Dam-Break Flows”. In: Journal of Hydraulic Engineering 139 (June
2013), pp. 569–579. DOI: 10.1061/(ASCE)HY.1943-7900.0000705.
[112] J. Latt and B. Chopard. “Lattice Boltzmann Method with regularized non-equilibrium
distribution functions”. In: Math. Comp. Sim. 72 (2006), pp. 165–168.
[113] L. Li, R. Mei, and J. F. Klausner. “Boundary conditions for thermal lattice Boltzmann
equation method”. In: Journal of Computational Physics 237 (2013), pp. 366–395.
[114] Q. Liu and Y.-L. He. “Double multiple-relaxation-time lattice Boltzmann model for
solid–liquid phase change with natural convection in porous media”. In: Physica A: Sta-
tistical Mechanics and its Applications 438 (2015), pp. 94–106.
[115] A. Mezrhab, M. A. Moussaoui, M. Jami, H. Naji, and M. Bouzidi. “Double MRT thermal
lattice Boltzmann method for simulating convective flows”. In: Physics Letters A 374.34
(2010), pp. 3499–3507.
[116] S. C. Mishra and H. K. Roy. “Solving transient conduction and radiation heat transfer
problems using the lattice Boltzmann method and the finite volume method”. In: Journal
of Computational Physics 223.1 (2007), pp. 89–107.
[117] A. A. Mohamad. Lattice Boltzmann Method - Fundamentals and Engineering Applications
with Computer Codes. Springer-Verlag, 2011.
[118] A. Mohamad and A. Kuzmin. “A critical evaluation of force term in lattice Boltzmann
method, natural convection problem”. In: International Journal of Heat and Mass Transfer
53.5 (2010), pp. 990–996.
[119] P. Mohammadmoradi. A Multiscale Sandy Microstructure. https://www.digitalroc
ksportal.org/projects/92. 2017. DOI: 10.17612/P7PC7C.
[120] D. Noble and J. Torczynski. “A lattice-Boltzmann method for partially saturated compu-
tational cells”. In: Int. J. Modern Phys. C 9.8 (1998), pp. 1189–1202.
[121] C. Peng, L. F. Ayala, and O. M. Ayala. “A thermodynamically consistent pseudo-
potential lattice Boltzmann model for multi-component, multiphase, partially miscible
mixtures”. In: Journal of Computational Physics 429 (2021), p. 110018. DOI: https://doi.
org/10.1016/j.jcp.2020.110018.
[122] Y. Peng, C. Shu, and Y. Chew. “Simplified thermal lattice Boltzmann model for incom-
pressible thermal flows”. In: Physical Review E 68.2 (2003), p. 026701.
[123] G. Pingen, A. Evgrafov, and K. Maute. “Topology optimization of flow domains using
the lattice Boltzmann method”. English. In: Structural and Multidisciplinary Optimization
34.6 (2007), pp. 507–524. DOI: 10.1007/s00158-007-0105-7.

[124] R. Rannacher. Einfuehrung in die Numerische Mathematik (Numerik 0). Vorlessungsskrip-
tum SS 2005. Universitaet Heidelberg, 2006.
[125] C. Semprebon, T. Krüger, and H. Kusumaatmaja. “A Ternary Free Energy Lattice Boltz-
mann Model with Tunable Surface Tensions and Contact Angles”. In: Physical Review E
93.3 (2016), p. 033305.
[126] X. Shan and H. Chen. “Lattice Boltzmann model for simulating flows with multiple
phases and components”. In: Phys. Rev. E 47 (1993), pp. 1815–1819. DOI: 10 . 1103 /
PhysRevE.47.1815.
[127] X. Shan and G. Doolen. “Multicomponent lattice-Boltzmann model with interparticle
interaction”. In: Journal of Statistical Physics 81 (1995), pp. 379–393.
[128] S. Simonis. “Lattice Boltzmann methods for Partial Differential Equations”. Doctoral the-
sis. Karlsruhe Institute of Technology (KIT), 2023. DOI: 10.5445/IR/1000161726.
[129] P. A. Skordos. “Initial and boundary conditions for the lattice Boltzmann method”. In:
Phys. Rev. E 48 (1993), pp. 4824–4842.
[130] M. A. A. Spaid and F. R. Phelan. “Lattice Boltzmann methods for modeling microscale
flow in fibrous porous media”. In: Physics of Fluids 9.9 (1997), pp. 2468–2474. DOI: 10.
1063/1.869392.
[131] S. Stasius. “Identifikation von Strömungsgebieten mit adjungierten Lattice Boltzmann
Methoden (ALBM)”. Diplomarbeit. Karlsruhe Institute for Technology (KIT), 2014.
[132] N. Thurey. “Physically based animation of free surface flows with the lattice Boltzmann
method”. In: Ph. D. Thesis, University of Erlangen (2007).
[133] S. Turek and M. Schäfer. “Benchmark computations of laminar flow around cylinder”.
In: Flow Simulation with High-Performance Computers II. Vol. 52. Notes on Numerical Fluid
Mechanics. Vieweg, Jan. 1996, pp. 547–566.
[134] D. Vandevoorde and N. M. Josuttis. C++ Templates: The Complete Guide. Addison-Wesley
Professional, 2003.
[135] L. Wang, Z. L. Guo, and J. C. Mi. “Drafting, kissing and tumbling process of two particles
with different sizes”. In: Computers & Fluids 96 (2014), pp. 20–34. DOI: 10 . 1016 / j .
compfluid.2014.03.005.
[136] D. Yu, R. Mei, L.-S. Luo, and W. Shyy. “Viscous flow computations with the method of
lattice Boltzmann equation”. In: Progress in Aerospace Sciences 39.5 (2003), pp. 329–367.
[137] H. Zheng, C. Shu, and Y.-T. Chew. “A lattice Boltzmann model for multiphase flows
with large density ratio”. In: Journal of Computational Physics 218.1 (2006), pp. 353–371.
[138] Q. Zou and X. He. “On pressure and velocity boundary conditions for the lattice Boltz-
mann BGK model”. In: Phys. Fluids 9 (1997), pp. 1592–1598.

A. Appendix

A.1. Q&A
In this Q&A part, some potential questions concerning the code are answered:

What do I need the unit converter for?


The unit converter (Listing A.1) is used in every simulation done with OpenLB. In this class, physical
units, like length or mass, are converted to lattice units and vice versa. This step is necessary to obtain
results in the correct physical dimensions and units.
UnitConverterFromResolutionAndLatticeVelocity<T,DESCRIPTOR> converter(
  (int) res,                 // resolution
  (T)   charLatticeVelocity, // charLatticeVelocity
  (T)   charPhysLength,      // charPhysLength
  (T)   charPhysVelocity,    // charPhysVelocity
  (T)   physViscosity,       // physViscosity
  (T)   physDensity          // physDensity
);
converter.print();

Listing A.1: UnitConverter

For a closer look, also check out the respective example in Section 10.3.
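Once constructed, the converter can be queried in both directions. The following lines are a minimal sketch, assuming the converter from Listing A.1; the variable iT stands for the current time step of the simulation loop and the numerical values are only illustrative:

// number of lattice time steps corresponding to 10 s of physical time
std::size_t iTmax = converter.getLatticeTime( 10.0 );
// physical time in seconds corresponding to the lattice time step iT
T physTime = converter.getPhysTime( iT );
// grid spacing and time step size in physical units
T dx = converter.getPhysDeltaX();
T dt = converter.getPhysDeltaT();
// convert a characteristic physical velocity into lattice units
T uLattice = converter.getLatticeVelocity( 0.1 );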

How can I write my own dynamics?


Dynamics are the classes that model the cell-specific computation of momenta, equilibria and collision
operator. Ideally, new dynamics are constructed using OpenLB’s flexible dynamics tuple system which
allows the composition of dynamics as a tuple of momenta, equilibria, and collision operator in addition to
an optional combination rule for declaring e.g. forcing schemes. This is the approach used for most dynamics
in OpenLB (also see Section 2.3).
template <typename T, typename DESCRIPTOR>
using ForcedTRTdynamics = dynamics::Tuple<
  T, DESCRIPTOR,
  momenta::BulkTuple,
  equilibria::SecondOrder,
  collision::TRT,
  forcing::Guo
>;

Modifying this example to use the BGK collision operator without forcing is as simple as writing:

template <typename T, typename DESCRIPTOR>
using BGKdynamics = dynamics::Tuple<
  T, DESCRIPTOR,
  momenta::BulkTuple,
  equilibria::SecondOrder,
  collision::BGK
>;
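Such a dynamics tuple is assigned to lattice cells in the same way as any other dynamics, typically by material number. A minimal sketch, assuming a super lattice sLattice, a super geometry superGeometry with bulk material 1 and a unit converter as introduced above:

// assign the BGK dynamics tuple to all cells of material 1
sLattice.defineDynamics<BGKdynamics>( superGeometry, 1 );
// provide the relaxation frequency as a parameter
sLattice.setParameter<descriptors::OMEGA>( converter.getLatticeRelaxationFrequency() );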

In most cases that are not yet covered by the extensive library of dynamics tuples, it should be sufficient to
write e.g. a new collision operator to be plugged into this framework. As a fallback, fully custom dynamics
can be implemented using dynamics::CustomCollision following this basic scaffold:
template <typename T, typename DESCRIPTOR, typename MOMENTA=momenta::BulkTuple>
struct MyCustomDynamics final : public dynamics::CustomCollision<T,DESCRIPTOR,MOMENTA> {
  using MomentaF = typename MOMENTA::template type<DESCRIPTOR>;

  // Declare list of parameter fields (can be empty)
  using parameters = meta::list<...>;

  // Allow exchanging the momenta, used for example to construct boundary dynamics
  template <typename M>
  using exchange_momenta = MyCustomDynamics<T,DESCRIPTOR,M>;

  template <typename CELL, typename PARAMETERS, typename V=typename CELL::value_t>
  CellStatistic<V> apply(CELL& cell, PARAMETERS& parameters) any_platform {
    // Implement custom collision here
  };

  T computeEquilibrium(int iPop, T rho, const T u[DESCRIPTOR::d]) const override any_platform {
    // Implement custom equilibrium computation here
  };

  std::type_index id() override {
    return typeid(MyCustomDynamics);
  };

  AbstractParameters<T,DESCRIPTOR>& getParameters(BlockLattice<T,DESCRIPTOR>& block) override {
    return block.template getData<OperatorParameters<MyCustomDynamics>>();
  }

  // Return human readable name
  std::string getName() const override {
    return "MyCustomDynamics<" + MomentaF().getName() + ">";
  };
};

How can I write my own post processor?


A non-local operator, also referred to as a post processor in OpenLB, is any class that provides a scope, a
priority and an apply method template (also see Section 2.4). The scope declares how the apply method is
to be called, the priority is used to sort the execution sequence of multiple operators assigned to the same
stage and the apply method template implements the actual instructions to be performed.
struct MyCustomPostProcessor {
  // One of OperatorScope::(PerCell,PerCellWithParameters,PerBlock)
  static constexpr OperatorScope scope = OperatorScope::PerCell;

  int getPriority() const {
    return 0;
  }

  template <typename CELL>
  void apply(CELL& cell) any_platform {
    // custom non-local code here
    // access neighbors via `cell.neighbor(c_i)`
  }
};

Listing A.2: Simple post processor implementation

This new post processor can be assigned to cells of the lattice using the various overloads of
SuperLattice::addPostProcessor:

// Assign MyCustomPostProcessor to all cells
sLattice.addPostProcessor(meta::id<MyCustomPostProcessor>{});
// Assign MyCustomPostProcessor to indicated cells
sLattice.addPostProcessor(indicatorF,
                          meta::id<MyCustomPostProcessor>{});

Listing A.3: Simple post processor assignment

If the operator depends on non-cell-specific parameters, they can be declared by changing the scope to
OperatorScope::PerCellWithParameters and modifying the apply template arguments.

struct MyCustomPostProcessor {
  static constexpr OperatorScope scope = OperatorScope::PerCellWithParameters;

  using parameters = meta::list<OMEGA>;

  int getPriority() const {
    return 0;
  }

  template <typename CELL, typename PARAMETERS>
  void apply(CELL& cell, PARAMETERS& parameters) any_platform {
    // access parameter via `parameters.template get<OMEGA>()`
  }
};

Listing A.4: Simple post processor implementation with parameters

The parameters are set in the same way as dynamics parameters.


// Set OMEGA parameter of all dynamics and post processors to 0.6
sLattice.setParameter<OMEGA>(0.6);
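Putting these pieces together, a complete post processor could look like the following sketch. It averages the density of the direct neighbors and stores the result in a custom field; the field SMOOTHED_RHO is purely illustrative and would have to be added to the lattice descriptor for this to work:

// illustrative custom field; it must be part of the DESCRIPTOR of the lattice
struct SMOOTHED_RHO : public descriptors::FIELD_BASE<1> { };

struct RhoSmoothingPostProcessor {
  static constexpr OperatorScope scope = OperatorScope::PerCell;

  int getPriority() const {
    return 0;
  }

  template <typename CELL, typename V = typename CELL::value_t>
  void apply(CELL& cell) any_platform {
    using DESCRIPTOR = typename CELL::descriptor_t;
    V sum{};
    // sum the densities of all direct neighbors (iPop = 0 is the rest population)
    for (int iPop = 1; iPop < DESCRIPTOR::q; ++iPop) {
      sum += cell.neighbor(descriptors::c<DESCRIPTOR>(iPop)).computeRho();
    }
    auto smoothed = cell.template getFieldPointer<SMOOTHED_RHO>();
    smoothed[0] = sum / (DESCRIPTOR::q - 1);
  }
};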

How can I write my own coupling operator?
A coupling operator is any class that provides a scope and an apply method template. Different from
single-lattice non-local operators, the apply method receives not a single cell but a named tuple of cells.
Consider for illustration a basic coupling between NSE and ADE lattices in Listing A.5.
struct NavierStokesAdvectionDiffusionCoupling {
  // Declare that we want cell-wise coupling with some global parameters
  static constexpr OperatorScope scope = OperatorScope::PerCellWithParameters;

  // Declare the two parameters custom to this coupling operator
  struct FORCE_PREFACTOR : public descriptors::FIELD_BASE<0,1> { };
  struct T0 : public descriptors::FIELD_BASE<1> { };

  // Declare which parameters are required
  using parameters = meta::list<FORCE_PREFACTOR,T0>;

  template <typename CELLS, typename PARAMETERS>
  void apply(CELLS& cells, PARAMETERS& parameters) any_platform
  {
    using V = typename CELLS::template value_t<names::NavierStokes>::value_t;
    using DESCRIPTOR = typename CELLS::template value_t<names::NavierStokes>::descriptor_t;

    // Get the cell of the NavierStokes lattice
    auto& cellNSE = cells.template get<names::NavierStokes>();
    // Get the cell of the Temperature lattice
    auto& cellADE = cells.template get<names::Temperature>();

    // Computation of the Boussinesq force
    auto force = cellNSE.template getFieldPointer<descriptors::FORCE>();
    auto forcePrefactor = parameters.template get<FORCE_PREFACTOR>();
    V temperatureDifference = cellADE.computeRho() - parameters.template get<T0>();
    for (unsigned iD = 0; iD < DESCRIPTOR::d; ++iD) {
      force[iD] = forcePrefactor[iD] * temperatureDifference;
    }
    // Velocity coupling
    V u[DESCRIPTOR::d] { };
    cellNSE.computeU(u);
    cellADE.template setField<descriptors::VELOCITY>(u);
  }
};
Listing A.5: Full implementation of basic NSE-ADE coupling

Coupling operators are instantiated using the SuperLatticeCoupling template by providing a list
of names and assigned lattices. For the NSE-ADE coupling this will look similar to (see e.g.
thermal/rayleighBenard(2,3)d for a practical example):

SuperLatticeCoupling coupling(
  NavierStokesAdvectionDiffusionCoupling{},
  names::NavierStokes{}, NSlattice,
  names::Temperature{}, ADlattice);
coupling.setParameter<NavierStokesAdvectionDiffusionCoupling::T0>(...);
coupling.setParameter<NavierStokesAdvectionDiffusionCoupling::FORCE_PREFACTOR>(...);

As we can see, the parameters are set for the specific coupling instance.
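In the time loop, the coupling is then typically executed once per step. A short sketch, assuming the lattices are named as above:

// advance both lattices and apply the coupling afterwards
NSlattice.collideAndStream();
ADlattice.collideAndStream();
coupling.execute();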

What are aliases used for?


Aliases are used to keep the code simpler for users and to give them a quicker overview of the code. They
are not used in all parts of the program: especially when the code is adapted to a special problem, an alias
may not be available and either has to be created by the user or the underlying functions can be used
directly. To sum up, aliases are not necessary for a simulation to work, but they simplify the code and
improve its readability. A small sketch is given below.
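For example, a typical alias from the examples gives the lattice descriptor a short, descriptive name that is then reused instead of the full template type; the chosen descriptor is only an illustration:

// alias for a D2Q9 descriptor carrying an additional FORCE field
using DESCRIPTOR = descriptors::D2Q9<descriptors::FORCE>;
// the alias is used wherever the descriptor type is needed
SuperLattice<T,DESCRIPTOR> sLattice( superGeometry );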

Why do we use "constexpr" to define some functions?


constexpr allows a function to be evaluated at compile time when it is called with constant arguments. The
returned value is then a constant expression, which can reduce the runtime, since the result does not have
to be recomputed. If such a value is used in a condition, for example with if constexpr, the branch is
resolved once at compile time instead of being checked at every call. In the example below the rotation of
a 2D particle is computed; if all arguments are known at compile time, the result is a compile-time constant.
static constexpr Vector<T,2> execute( Vector<T,2> input, Vector<T,4> rotationMatrix,
                                      Vector<T,2> rotationCenter = Vector<T,2>(0.,0.) )
{
  Vector<T,2> dist = input - rotationCenter;
  return Vector<T,2>(
    dist[0]*rotationMatrix[0] +
    dist[1]*rotationMatrix[2],
    dist[0]*rotationMatrix[1] +
    dist[1]*rotationMatrix[3] );
}

Listing A.6: constexpr
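The effect can also be seen in a minimal, OpenLB-independent sketch: a constexpr function called with constant arguments can be evaluated entirely by the compiler, while the same function remains usable with runtime values.

// plain C++ sketch, not specific to OpenLB
constexpr int square( int x ) {
  return x * x;
}

static_assert( square(4) == 16, "evaluated at compile time" );

int n = 5;           // runtime value
int m = square( n ); // the same function can still be called at run time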

What is the difference between files ending with .h and ending with .hh?

Files ending with .h, like particleDynamics.h, are header files in which classes and functions are only
declared. The definition happens in the corresponding .hh files. The following two code snippets of the
same class illustrate this separation of declaration and definition.
template<typename T, typename PARTICLETYPE>
class VerletParticleDynamics : public ParticleDynamics<T,PARTICLETYPE> {
public:
  /// Constructor
  VerletParticleDynamics( T timeStepSize );
  /// Processing step
  void process (Particle<T,PARTICLETYPE>& particle) override;
private:
  T _timeStepSize;
};

Listing A.7: particleDynamics.h

template<typename T, typename PARTICLETYPE>
VerletParticleDynamics<T,PARTICLETYPE>::VerletParticleDynamics( T timeStepSize )
  : _timeStepSize( timeStepSize )
{
  this->getName() = "VerletParticleDynamics";
}

template<typename T, typename PARTICLETYPE>
void VerletParticleDynamics<T,PARTICLETYPE>::process(
  Particle<T,PARTICLETYPE>& particle )
{
  // Calculate acceleration
  auto acceleration = getAcceleration<T,PARTICLETYPE>( particle );
  // Check for angular components
  if constexpr ( providesAngle<PARTICLETYPE>() ) {
    // Calculate angular acceleration
    auto angularAcceleration = getAngAcceleration<T,PARTICLETYPE>( particle );
    // Verlet algorithm
    particles::dynamics::velocityVerletIntegration<T,PARTICLETYPE>(
      particle, _timeStepSize, acceleration, angularAcceleration );
    // Update rotation matrix
    updateRotationMatrix<T,PARTICLETYPE>( particle );
  } else {
    // Verlet algorithm without rotation
    particles::dynamics::velocityVerletIntegration<T,PARTICLETYPE>(
      particle, _timeStepSize, acceleration );
  }
}

Listing A.8: particleDynamics.hh

A.2. List of Project Participants


Since 2006 the following persons have contributed source code to OpenLB:
Armani Arfaoui: core: performance improvements for D3Q19 BGK collision operator
Sam Avis: dynamics: multicomponent free energy model
Saada Badie: core: performance improvements for D3Q19 BGK collision operator
Lukas Baron: utilities: (parallel) console output, time and performance measurement, dynamics:
porous media model, functors: concept, div. functors implementation
Tim Bingert: multi-phase-multi-component: MCMP Shan-Chen models and equation of state UnitCon-
verter, examples: air-water equilibrium examples, organization: testing
Fedor Bukreev: reaction: adsorption and reaction models, examples: adsorption examples, organiza-
tion: testing

Vojtech Cvrcek: dynamics: power law, examples: power law, updates, functors: 2D adaptation
Davide Dapelo: core: power-law unit converter, dynamics: Guo-Zhao porous, contributions on power-
law, contributions on HLBM, examples: reactionFiniteDifferences2d, advectionDiffusion3d, advec-
tionDiffusionPipe3d, functors: contributions on indicator and smooth indicator
Tim Dornieden: functors: smooth start scaling, io: vti writer
Simon Englert: documentation: user guide
Jonas Fietz: io: configure file parsing based on XML, octree STL reader interface to CVMLCPP, commu-
nication: heuristic load balancer
Benjamin Förster: core: super data implementation, io: new serializer and serializable implementation,
vti writer, new vti reader, functors: new discrete indicator
Max Gaedtke: core: unit converter, dynamics: thermal, examples: thermal
Simon Großmann: example: poiseuille2dEOC, io: csv and gnuplot interface, postprocessing: eoc anal-
ysis
Nicolas Hafen: dynamics: moving porous media (HLBM), examples: surface resolved particle simu-
lations, bifurcation3d, particles: core framework, surface resolved particles, coupling, dynamics,
creator-functions, particle framework refactoring, sub-grid scale refactoring, particle decomposi-
tion, documentation: user guide
Marc Haussmann: dynamics: turbulence modelling, examples: tgv3d, io: gnuplot heatmap
Marc Heinzelmann: postProcessor: surface reaction models using source term and robin-type boundary
conditions, examples: longitudinalMixing3d
Thomas Henn: io: voxelizer interface based on STL, particles: particulate flows
Claudius Holeksa: postProcessor: free surface, example: free surface
Anna Husfeldt: functors: signed distance surface framework
Shota Ito: core: solver, boundary conditions, optimization, reaction: adsorption and reaction models,
examples: poiseuille3dEoc, convectedPlate3d, documentation: user-guide
Jonathan Jeppener-Haltenhoff: functors: wall shear stress, examples: channel3d, poiseuille3d, core:
contributions to define field, documentation: user guide
Julius Jeßberger: core: solver, template momenta concept, optimization, examples: poiseuille2d, cav-
ity2dSolver, porousPlate3dSolver, testFlow3dSolver, optimization examples, postprocessing: error
analysis, utilities: algorithmic differentiation
Florian Kaiser: examples: solidPlate2d, holeyPlate2d, dynamics: LBM for solids, documentation: user
guide
Fabian Klemens: functors: flux, indicator-based functors, io: gnuplot interface
Jonas Kratzke: core: unit converter, io: GUI interface based on description files and OpenGPI, bound-
aries: Bouzidi boundary condition
Mathias J. Krause: core: hybrid-parallelization approach, super structure, communication: OpenMP
parallelization, cuboid data structure for MPI parallelization, load balancing, general: makefile en-
vironment for compilation, integration and maintenance of added components (since 2008), bound-
aries: Bouzidi boundary condition, convection, geometry: concept, parallelization, statistics, io:
new serializer and serializable concept, functors: concept, div. functors implementation, examples:
venturi3d, aorta3d, optimization, organization: integration and maintenance of added components
(2008-2017), project management (2006-)
Louis Kronberg: core: ade unit converter, dynamics: KBC, entropic LB, Cumulant, examples: advec-
tionDiffusion1d, advectionDiffusion2d, bstep2d
Eliane Kummer: documentation: user guide
Adrian Kummerländer: core: SIMD CPU support, CUDA GPU support, population and field data struc-
ture, propagation pattern, vector hierarchy, cell interface, field data interface, meta descriptors, auto-
matic code generation, dynamics: new dynamics concept, dynamics tuple, momenta concept, com-
munication: block propagation, communication layer, functors: lp-norm, flux, reduction, lattice
indicator, error norms, refinement quality criterion, composition, boundaries: new post processor
concept, water-tightness testing and post-processor priority, documentation: metadata, user guide,
release organization, general: CI maintenance, Nix environment, Core development, Operator-style
model refactoring
Jonas Latt: core: basic block structure, communication: basic parallel block lattice approach (< release
0.9), io: vti writer, general: integration and maintenance of added components (2006-2008), bound-
aries: basic boundary structure, dynamics: basic dynamics structure, examples: numerous exam-
ples, which have been further developed in recent years, organization: integration and maintenance
of added components (2006-2008), project management (2006-2008)
Marie-Luise Maier: particles: particulate flows, frame change
Orestis Malaspinas: boundaries: alternative boundary conditions (Inamuro, Zou/He, Nonlinear FD),
dynamics: alternative LB models (Entropic LB, MRT)
Jan E. Marquardt: examples: surface resolved particle simulations, resolvedRock3d, particles: core
framework, surface resolved particles, coupling, dynamics, creator-functions, particle decompo-
sition, dynamics: Euler-Euler particle dynamics, functors: signed distance surface framework, util-
ities: algorithmic differentiation, documentation: user guide, general: CI maintenance
Cyril Masquelier: functors: indicator, smooth indicator
Albert Mink: functors: arithmetic, io: parallel VTK interface3, zLib compression for VTK data, GifWriter,
dynamics: radiative transport, boundary: diffuse reflective boundary
Markus Mohrhard: general: makefile environment for parallel compilation, organization: integration
and maintenance of added components
Johanna Mödl: core: convection diffusion reaction dynamics, examples: advectionDiffusionReaction2d
Patrick Nathen: dynamics: turbulence modeling (advanced subgrid-scale models), examples: nozzle3d
Johannes Nguyen: examples: four roll mill, binary shear flow
Aleksandra Pachalieva: dynamics: thermal (MRT model), examples: thermal (MRT model)
Martin Sadric: particles: core framework, creator-functions, documentation: user guide
Maximilian Schecher: postProcessor: free surface, example: free surface, general: Adaptation of non-
particle examples to the operator-style / GPU support

Stephan Simonis: core: ade unit converter, examples: advectionDiffusion1d, advectionDiffusion2d, ad-
vectionDiffusion3d, advectionDiffusionPipe3d, binaryShearFlow2d, fourRollMill2d, documenta-
tion: user guide, dynamics: MRT, KBC, Cumulant, entropic LB, free energy model
Lukas Springmann: particles: user-guide, unit tests
Bernd Stahl: communication: 3D extension to MultiBlock structure for MPI parallelization (< release
0.9), core: parallel version of (scalar or tensor-valued) data fields (< release 0.9), io: VTK output of
data (< release 0.9)
Dennis Teutscher: core: porting slipBoundary3d to GPU, examples: porous city3d, functors: atmo-
spheric boundary layer, porous geometry importer (vtk/vdb files), organization: testing, docu-
mentation: user guide
Robin Trunk: dynamics: parallel thermal, advection diffusion models, 3D HLBM, Euler-Euler particle,
multicomponent free energy model
Peter Weisbrod: dynamics: parallel multi phase/component, examples: structure and showcases, phas-
eSeparationXd
Gilles Zahnd: functors: rotating frame functors
Asher Zarth: core: vector implementation
Mingliang Zhong: dynamics: uncertainty quantification, stochastic Galerkin LBM, stochastic collocation
LBM, Monte Carlo methods, external: features
Simon Zimny: io: pre-processing: automated setting of boundary conditions

A.3. GNU Free Documentation License
GNU Free Documentation License
Version 1.2, November 2002
Copyright © 2000,2001,2002 Free Software Foundation, Inc.

51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is
not allowed.

Preamble
The purpose of this License is to make a manual, textbook, or other functional and useful document
“free” in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with
or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for
the author and publisher a way to get credit for their work, while not being considered responsible for
modifications made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must them-
selves be free in the same sense. It complements the GNU General Public License, which is a copyleft
license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software
needs free documentation: a free program should come with manuals providing the same freedoms that
the software does. But this License is not limited to software manuals; it can be used for any textual work,
regardless of subject matter or whether it is published as a printed book. We recommend this License
principally for works whose purpose is instruction or reference.

1. Applicability and definitions


This License applies to any manual or other work, in any medium, that contains a notice placed by the
copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-
wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The
“Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is
addressed as “you”. You accept the license if you copy, modify or distribute the work in a way requiring
permission under copyright law.
A “Modified Version” of the Document means any work containing the Document or a portion of it,
either copied verbatim, or with modifications and/or translated into another language.
A “Secondary Section” is a named appendix or a front-matter section of the Document that deals exclu-
sively with the relationship of the publishers or authors of the Document to the Document’s overall subject
(or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the
Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.)
The relationship could be a matter of historical connection with the subject or with related matters, or of
legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being those of
Invariant Sections, in the notice that says that the Document is released under this License. If a section
does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The
Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections
then there are none.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover
Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be
at most 5 words, and a Back-Cover Text may be at most 25 words.
A “Transparent” copy of the Document means a machine-readable copy, represented in a format whose
specification is available to the general public, that is suitable for revising the document straightforwardly
with generic text editors or (for images composed of pixels) generic paint programs or (for drawings)
some widely available drawing editor, and that is suitable for input to text formatters or for automatic
translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise
Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage
subsequent modification by readers is not Transparent. An image format is not Transparent if used for any
substantial amount of text. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input
format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming sim-
ple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats
include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only
by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not gener-
ally available, and the machine-generated HTML, PostScript or PDF produced by some word processors
for output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages as are needed
to hold, legibly, the material this License requires to appear in the title page. For works in formats which
do not have any title page as such, “Title Page” means the text near the most prominent appearance of the
work’s title, preceding the beginning of the body of the text.
A section “Entitled XYZ” means a named subunit of the Document whose title either is precisely
XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here
XYZ stands for a specific section name mentioned below, such as “Acknowledgements”, “Dedications”,
“Endorsements”, or “History”.) To “Preserve the Title” of such a section when you modify the Document
means that it remains a section “Entitled XYZ” according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that this License ap-
plies to the Document. These Warranty Disclaimers are considered to be included by reference in this Li-
cense, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers
may have is void and has no effect on the meaning of this License.

2. Verbatim Copying
You may copy and distribute the Document in any medium, either commercially or noncommercially,
provided that this License, the copyright notices, and the license notice saying this License applies to the
Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this
License. You may not use technical measures to obstruct or control the reading or further copying of the
copies you make or distribute. However, you may accept compensation in exchange for copies. If you
distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.

3. Copying in quantity
If you publish printed copies (or copies in media that commonly have printed covers) of the Document,
numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the
copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover,
and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the
publisher of these copies. The front cover must present the full title with all words of the title equally
prominent and visible. You may add other material on the covers in addition. Copying with changes
limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can
be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed
(as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must ei-
ther include a machine-readable Transparent copy along with each Opaque copy, or state in or with each
Opaque copy a computer-network location from which the general network-using public has access to
download using public-standard network protocols a complete Transparent copy of the Document, free
of added material. If you use the latter option, you must take reasonably prudent steps, when you begin
distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible
at the stated location until at least one year after the last time you distribute an Opaque copy (directly or
through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistribut-
ing any large number of copies, to give them a chance to provide you with an updated version of the
Document.

4. Modifications
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and
3 above, provided that you release the Modified Version under precisely this License, with the Modified
Version filling the role of the Document, thus licensing distribution and modification of the Modified
Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from
those of previous versions (which should, if there were any, be listed in the History section of the
Document). You may use the same title as a previous version if the original publisher of that version
gives permission.
B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the
modifications in the Modified Version, together with at least five of the principal authors of the
Document (all of its principal authors, if it has fewer than five), unless they release you from this
requirement.
C. State on the Title page the name of the publisher of the Modified Version, as the publisher.
D. Preserve all the copyright notices of the Document.
E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
F. Include, immediately after the copyright notices, a license notice giving the public permission to use
the Modified Version under the terms of this License, in the form shown in the Addendum below.

217
G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in
the Document’s license notice.
H. Include an unaltered copy of this License.
I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating at least the
title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there
is no section Entitled “History” in the Document, create one stating the title, year, authors, and
publisher of the Document as given on its Title Page, then add an item describing the Modified
Version as stated in the previous sentence.
J. Preserve the network location, if any, given in the Document for public access to a Transparent copy
of the Document, and likewise the network locations given in the Document for previous versions
it was based on. These may be placed in the “History” section. You may omit a network location for
a work that was published at least four years before the Document itself, or if the original publisher
of the version it refers to gives permission.
K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title of the section,
and preserve in the section all the substance and tone of each of the contributor acknowledgements
and/or dedications given therein.
L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section
numbers or the equivalent are not considered part of the section titles.
M. Delete any section Entitled “Endorsements”. Such a section may not be included in the Modified
Version.
N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title with any
Invariant Section.
O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary
Sections and contain no material copied from the Document, you may at your option designate some or
all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified
Version’s license notice. These titles must be distinct from any other section titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of
your Modified Version by various parties–for example, statements of peer review or that the text has been
approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words
as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of
Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any
one entity. If the Document already includes a cover text for the same cover, previously added by you or
by arrangement made by the same entity you are acting on behalf of, you may not add another; but you
may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their
names for publicity for or to assert or imply endorsement of any Modified Version.

5. Combining documents

218
You may combine the Document with other documents released under this License, under the terms
defined in section 4 above for modified versions, provided that you include in the combination all of the
Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of
your combined work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sec-
tions may be replaced with a single copy. If there are multiple Invariant Sections with the same name but
different contents, make the title of each such section unique by adding at the end of it, in parentheses, the
name of the original author or publisher of that section if known, or else a unique number. Make the same
adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled “History” in the various original docu-
ments, forming one section Entitled “History”; likewise combine any sections Entitled “Acknowledge-
ments”, and any sections Entitled “Dedications”. You must delete all sections Entitled “Endorsements”.

6. Collections of documents
You may make a collection consisting of the Document and other documents released under this Li-
cense, and replace the individual copies of this License in the various documents with a single copy that is
included in the collection, provided that you follow the rules of this License for verbatim copying of each
of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this
License, provided you insert a copy of this License into the extracted document, and follow this License
in all other respects regarding verbatim copying of that document.

7. Aggregation with independent works


A compilation of the Document or its derivatives with other separate and independent documents or
works, in or on a volume of a storage or distribution medium, is called an “aggregate” if the copyright
resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what
the individual works permit. When the Document is included in an aggregate, this License does not apply
to the other works in the aggregate which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Doc-
ument is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers
that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is
in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.

8. Translation
Translation is considered a kind of modification, so you may distribute translations of the Document
under the terms of section 4. Replacing Invariant Sections with translations requires special permission
from their copyright holders, but you may include translations of some or all Invariant Sections in addition
to the original versions of these Invariant Sections. You may include a translation of this License, and all
the license notices in the Document, and any Warranty Disclaimers, provided that you also include the
original English version of this License and the original versions of those notices and disclaimers. In case
of a disagreement between the translation and the original version of this License or a notice or disclaimer,
the original version will prevail.

219
If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the require-
ment (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

9. Termination
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for
under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and
will automatically terminate your rights under this License. However, parties who have received copies,
or rights, from you under this License will not have their licenses terminated so long as such parties remain
in full compliance.

10. Future revisions of this license


The Free Software Foundation may publish new, revised versions of the GNU Free Documentation
License from time to time. Such new versions will be similar in spirit to the present version, but may differ
in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that
a particular numbered version of this License “or any later version” applies to it, you have the option of
following the terms and conditions either of that specified version or of any later version that has been
published (not as a draft) by the Free Software Foundation. If the Document does not specify a version
number of this License, you may choose any version ever published (not as a draft) by the Free Software
Foundation.

220

You might also like