High Performance Computing of Fluid Dynamics
1 Introduction
1.1 Lattice Boltzmann Equation
1.2 Shear Wave Decay
1.3 Couette Flow
1.4 Poiseuille Flow
1.5 The Sliding Lid
1.6 Goal of this Report
2 Methods
2.1 Lattice Boltzmann Equation
2.2 Shear Wave Decay
2.3 Couette Flow
2.4 Poiseuille Flow
2.5 The Sliding Lid
3 Implementation
4 Results
4.1 Shear Wave Decay
4.2 Couette Flow
4.3 Poiseuille Flow
4.4 The Sliding Lid
5 Conclusions
1 Introduction
channel drives the flow, resulting in a parabolic velocity profile across the
channel. The Poiseuille flow experiment is widely used to study fluid flow
in confined geometries and is an important benchmark for validating fluid
flow simulations.
2 Methods
\[
\frac{\partial f(r, v, t)}{\partial t} + v \cdot \nabla_r f(r, v, t) + a \cdot \nabla_v f(r, v, t) = C(f(r, v, t)) \tag{1}
\]
where c_i is the velocity set. The streaming operator is given by:
The equilibrium distribution function is given by:
\[
f_i^{eq}(\rho(r), u(r)) = w_i \, \rho(r) \left[ 1 + 3 \, c_i \cdot u(r) + \frac{9}{2} (c_i \cdot u(r))^2 - \frac{3}{2} |u(r)|^2 \right] \tag{2.2}
\]
where
\[
\rho(x, y) = \int dv \, f(x, y, v) \tag{2.3}
\]
is the density,
\[
v(x, y) = \frac{1}{\rho(x, y)} \int dv \, f(x, y, v) \cdot c(v) \tag{2.4}
\]
is the velocity, and
\[
w_i = (4/9, 1/9, 1/9, 1/9, 1/9, 1/36, 1/36, 1/36, 1/36) \tag{2.5}
\]
is the weighting.
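The equilibrium distribution and its moments can be sketched in a few lines of NumPy. This is a minimal illustration rather than the report's implementation: the D2Q9 channel ordering in `c_ai` is an assumption, the function name `equilibrium` is hypothetical, and the integrals of Eqs. (2.3) and (2.4) are replaced by sums over the nine discrete channels.

```python
import numpy as np

# D2Q9 velocity set (assumed channel ordering: rest, +x, +y, -x, -y, diagonals)
c_ai = np.array([[0, 1, 0, -1, 0, 1, -1, -1, 1],
                 [0, 0, 1, 0, -1, 1, 1, -1, -1]])
# Weights from Eq. (2.5)
w_i = np.array([4/9, 1/9, 1/9, 1/9, 1/9, 1/36, 1/36, 1/36, 1/36])

def equilibrium(rho_nm, u_anm):
    """Evaluate Eq. (2.2) on every lattice node (hypothetical helper)."""
    cu_inm = np.einsum('ai,anm->inm', c_ai, u_anm)    # c_i . u
    usq_nm = np.einsum('anm,anm->nm', u_anm, u_anm)   # |u|^2
    return w_i[:, None, None] * rho_nm[None] * (
        1 + 3 * cu_inm + 4.5 * cu_inm**2 - 1.5 * usq_nm[None])

# Small test field: uniform density, uniform velocity in x
rho_nm = np.full((4, 3), 1.2)
u_anm = np.zeros((2, 4, 3))
u_anm[0] = 0.05

feq_inm = equilibrium(rho_nm, u_anm)
# Discrete counterparts of Eqs. (2.3) and (2.4)
rho_check = feq_inm.sum(axis=0)
u_check = np.einsum('ai,inm->anm', c_ai, feq_inm) / rho_check
```

Summing the equilibrium over all channels returns the density it was built from, and the first moment returns the velocity, which is a quick sanity check for any implementation of Eq. (2.2).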
We apply periodic boundary conditions in the x direction, as shown here:
We will then observe how the velocity field evolves over time.
2.5 The Sliding Lid
We will apply bounce-back boundary conditions on the left, bottom, and
right walls. On the top wall we will apply a moving-wall boundary condition.
We will set up a square of 300x300 lattice nodes as our simulation
domain and set the initial conditions to ρ(t = 0) = 1, ux (t = 0) = 0
and uy (t = 0) = 0. We will analyse the evolution of the velocity field with
a streamplot. We expect different flow patterns for different Reynolds
numbers. The Reynolds number is given by:
\[
Re = \frac{L u}{\nu} \tag{2.8}
\]
where L is the side length of the domain, u is the velocity of the moving
wall, and ν is the kinematic viscosity. Finally, we will parallelize the code
and measure the Million Lattice Updates Per Second (MLUPS).
The code is parallelized using domain decomposition and the
Message Passing Interface (MPI), along with libraries such as mpi4py and
ipyparallel.
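As a worked example of Eq. (2.8), the Reynolds number can be computed directly from the simulation parameters. The sketch below assumes the usual BGK relation between the relaxation parameter ω and the kinematic viscosity in lattice units, ν = (1/3)(1/ω − 1/2); the function names and the wall velocity of 0.1 are illustrative choices, not values taken from the experiments.

```python
def kinematic_viscosity(omega):
    """Kinematic viscosity in lattice units for relaxation parameter omega."""
    return (1.0 / 3.0) * (1.0 / omega - 0.5)

def reynolds_number(length, wall_velocity, omega):
    """Reynolds number Re = L * u / nu from Eq. (2.8)."""
    return length * wall_velocity / kinematic_viscosity(omega)

# 300x300 domain, hypothetical lid velocity 0.1, omega = 1
re = reynolds_number(300, 0.1, 1.0)
```

With these values ν = 1/6, so Re = 300 · 0.1 · 6 = 180. Raising ω towards 2 lowers the viscosity and therefore raises the Reynolds number for the same lid velocity.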
3 Implementation
    import numpy as np

    # first shear wave experiment: fluid at rest, sinusoidal density
    u_anm = np.zeros((2, NX, NY))
    rho_nm = np.zeros((NX, NY))
    rho_nm = 1 + (0.01 * np.sin((2 * np.pi * X.T) / NX))
The second shear wave experiment is initialized as follows:
    # second shear wave experiment: sinusoidal x-velocity, uniform density
    u_anm = np.zeros((2, NX, NY))
    u_anm[0] = 0.01 * np.sin((2 * np.pi * Y.T) / NY)
    rho_nm = np.ones((NX, NY))
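The viscosity is calculated analytically by:
    def analvisc(omega):
        # kinematic viscosity in lattice units for relaxation parameter omega
        return 1 / 3 * (1 / omega - 1 / 2)
The viscosity is estimated from the simulation by:
    def clcvisc(u1, u2):
        # amplitude of the x-velocity wave at two points in time
        m1 = max(u1[0, NX // 2, :])
        m2 = max(u2[0, NX // 2, :])
        # decay rate divided by k^2 * dt with k = 2*pi/NY; the hard-coded
        # 750 is presumably the number of time steps between u1 and u2
        return (np.log(m1) - np.log(m2)) / ((2 * np.pi / NY) ** 2 * 750)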
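The estimate above rests on the fact that a sinusoidal shear wave with wavenumber k = 2π/NY decays as a(t) = a(0) exp(−ν k² t). A minimal sketch of that relation, using a synthetic amplitude instead of simulation output, is given below; `viscosity_from_decay` and all numerical values are illustrative assumptions.

```python
import numpy as np

def viscosity_from_decay(amp_start, amp_end, wavelength, n_steps):
    """Invert a(t) = a(0) * exp(-nu * k^2 * t) for nu, with k = 2*pi/wavelength."""
    k = 2 * np.pi / wavelength
    return (np.log(amp_start) - np.log(amp_end)) / (k**2 * n_steps)

# Synthetic check: generate the decayed amplitude for a known viscosity
nu_true = 1 / 3 * (1 / 1.0 - 1 / 2)   # analytical value for omega = 1
NY, n_steps = 100, 750                # hypothetical grid height and step count
k = 2 * np.pi / NY
a0 = 0.01
a1 = a0 * np.exp(-nu_true * k**2 * n_steps)

nu_est = viscosity_from_decay(a0, a1, NY, n_steps)
```

Because the synthetic amplitude follows the exponential law exactly, the estimate reproduces the input viscosity. With real simulation data the agreement depends on ω and on how well the maximum of the velocity field represents the wave amplitude.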
Streaming is done by:
    def streaming(f_inm, c_ai):
        # shift every channel i by its lattice velocity c_i (periodic)
        for i in range(0, 9):
            f_inm[i] = np.roll(f_inm[i], c_ai.T[i], axis=(0, 1))
        return f_inm
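The effect of the streaming step is easiest to see on a tiny lattice. The following sketch places one unit of each population on the central node and streams once; the channel ordering in `c_ai` is an assumption, since the velocity set is not listed in this section.

```python
import numpy as np

# D2Q9 velocity set (assumed ordering: rest, +x, +y, -x, -y, diagonals)
c_ai = np.array([[0, 1, 0, -1, 0, 1, -1, -1, 1],
                 [0, 0, 1, 0, -1, 1, 1, -1, -1]])

def streaming(f_inm, c_ai):
    # shift every channel i by its lattice velocity c_i (periodic wrap-around)
    for i in range(9):
        f_inm[i] = np.roll(f_inm[i], c_ai[:, i], axis=(0, 1))
    return f_inm

# One unit of every population on the central node of a 5x5 lattice
f_inm = np.zeros((9, 5, 5))
f_inm[:, 2, 2] = 1.0
mass_before = f_inm.sum()

f_inm = streaming(f_inm, c_ai)
mass_after = f_inm.sum()
```

After one step, each population sits on the neighbour its velocity points to, while the rest population stays in place. The total mass is unchanged because `np.roll` only moves values around.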
The boundary conditions for the Couette flow are implemented by:
    def couetteflow(f_inm):
        un = 0.001  # velocity of the moving top wall
        # moving top boundary
        f_inm[4, :, -2] = f_inm[2, :, -1]
        f_inm[7, :, -2] = np.roll(f_inm[5, :, -1], -1) - un / 6
        f_inm[8, :, -2] = np.roll(f_inm[6, :, -1], 1) + un / 6
        # bounce-back on bottom
        f_inm[2, :, 1] = f_inm[4, :, 0]
        f_inm[5, :, 1] = np.roll(f_inm[7, :, 0], 1)
        f_inm[6, :, 1] = np.roll(f_inm[8, :, 0], -1)
        return f_inm
The equilibrium distribution function for the inlet and the outlet of the
Poiseuille flow is implemented as follows:
    def feq1d(c_ai, rho_n, w_i, u_an):
        wr = np.einsum('i,j->ij', w_i, rho_n)
        cu = np.einsum('ia,an->in', c_ai.T, u_an)
        cusquare = cu ** 2
        # |u|^2 term of Eq. (2.2); the original listing reused cusquare here
        usquare = np.einsum('an,an->n', u_an, u_an)
        return wr * (1 + (3 * cu) + ((9 / 2) * cusquare) - ((3 / 2) * usquare))
The boundary conditions for the Poiseuille flow are implemented by:
    def Poiseuilleflow(f_inm, feq_inm, deltap, u_anm):
        c = 1 / 3  # squared lattice speed of sound
        rhoin = np.ones(NY + 2)  # density at the inlet
        rhoout = ((c * rhoin) - deltap) / c  # density at the outlet
        uin = u_anm[:, 1, :]  # velocity at the inlet
        uout = u_anm[:, -2, :]  # velocity at the outlet
        # periodic boundary condition with pressure variation along the x axis
        f_inm[:, 0, :] = (feq1d(c_ai, rhoin, w_i, uout) +
                          (f_inm[:, -1, :] - feq_inm[:, -1, :]))
        f_inm[:, -1, :] = (feq1d(c_ai, rhoout, w_i, uin) +
                           (f_inm[:, 0, :] - feq_inm[:, 0, :]))
        # stream
        for i in range(0, 9):
            f_inm[i] = np.roll(f_inm[i], c_ai.T[i], axis=(0, 1))
        # bounce-back on bottom
        f_inm[2, :, 1] = f_inm[4, :, 0]
        f_inm[5, :, 1] = np.roll(f_inm[7, :, 0], 1)
        f_inm[6, :, 1] = np.roll(f_inm[8, :, 0], -1)
        # bounce-back on top
        f_inm[4, :, -2] = f_inm[2, :, -1]
        f_inm[7, :, -2] = np.roll(f_inm[5, :, -1], -1)
        f_inm[8, :, -2] = np.roll(f_inm[6, :, -1], 1)
        return f_inm
The boundary conditions for the sliding lid are implemented as follows:
    def slidinglid(f_inm, up, down, left, right):
        if boundary_k[0]:  # left
            f_inm[[1, 8, 5], 1] = left
        if boundary_k[1]:  # right
            f_inm[[3, 7, 6], -2] = right
        if boundary_k[2]:  # down
            f_inm[[2, 5, 6], :, 1] = down
        if boundary_k[3]:  # top (moving with v_wall)
            f_inm[4, :, -2] = up[0]
            f_inm[7, :, -2] = up[1] - vn / 6
            f_inm[8, :, -2] = up[2] + vn / 6
        return f_inm
The communication and calculation for the sliding lid experiment are
implemented as follows:
    def clc_plt_slidinglid():
        f_inm = np.ones((9, nxsub, nysub)) / 9
        for i in range(steps):
            # communication: exchange ghost columns with left/right neighbours
            right = f_inm[[1, 5, 8], -2].copy()
            left = f_inm[[3, 6, 7], 1].copy()
            recvbuf = np.zeros((3, nysub))
            cartcomm.Sendrecv(right, dR, recvbuf=recvbuf, source=sR)
            f_inm[[1, 5, 8], 0] = recvbuf
            cartcomm.Sendrecv(left, dL, recvbuf=recvbuf, source=sL)
            f_inm[[3, 6, 7], -1] = recvbuf
            # communication: exchange ghost rows with upper/lower neighbours
            recvbuf = np.zeros((3, nxsub))
            up = f_inm[[2, 5, 6], :, -2].copy()
            down = f_inm[[4, 7, 8], :, 1].copy()
            cartcomm.Sendrecv(up, dU, recvbuf=recvbuf, source=sU)
            f_inm[[2, 5, 6], :, 0] = recvbuf
            cartcomm.Sendrecv(down, dD, recvbuf=recvbuf, source=sD)
            f_inm[[4, 7, 8], :, -1] = recvbuf
            # calculation
            f_inm = streaming(f_inm, c_ai)
            f_inm = slidinglid(f_inm, up, down, left, right)
            rho_nm = clc_rho(f_inm)
            u_anm = clc_u(c_ai, f_inm, rho_nm)
            feq_inm = feq(c_ai, rho_nm, w_i, u_anm)
            f_inm = fbte(f_inm, feq_inm, omega)
        return f_inm
The MLUPS are calculated as follows:
    def measure_performance(numCpu, NX, NY, time, timesteps):
        # lattice updates per second, scaled to millions
        mlups = ((NX * NY) * timesteps) / time / 1e6
        dat = np.array(([int(numCpu)], [mlups]))
        np.save(folderpath + str(numCpu), dat)
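To make the measurement concrete, the sketch below times a stand-in workload and converts the result to MLUPS. The helper name `mlups` and the workload (a periodic shift of a small array in place of a full lattice Boltzmann step) are assumptions for illustration only; the resulting rate says nothing about the actual solver.

```python
import time
import numpy as np

def mlups(nx, ny, timesteps, seconds):
    """Million lattice updates per second for an nx x ny grid."""
    return nx * ny * timesteps / seconds / 1e6

# Stand-in workload: a periodic shift instead of a full collision/streaming step
nx, ny, timesteps = 64, 64, 20
f = np.ones((9, nx, ny))
start = time.perf_counter()
for _ in range(timesteps):
    f = np.roll(f, 1, axis=1)
elapsed = time.perf_counter() - start

rate = mlups(nx, ny, timesteps, elapsed)
```

For example, a 300x300 grid advanced by 1000 time steps in 9 seconds corresponds to 10 MLUPS. In a parallel run the wall-clock time of the slowest process should be used.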
The domain decomposition was achieved using the following code:
    # choose the number of subdomains per axis according to the aspect ratio
    if NX < NY:
        sectsX = int(np.floor(np.sqrt(size * NX / NY)))
        sectsY = int(np.floor(size / sectsX))
    elif NX > NY:
        sectsY = int(np.floor(np.sqrt(size * NY / NX)))
        sectsX = int(np.floor(size / sectsY))
    elif NX == NY:
        sectsY = int(np.floor(np.sqrt(size)))
        sectsX = int(size / sectsY)
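The same logic can be wrapped in a function to see which process grids it produces. This is a sketch under two assumptions: the function name `decompose` is hypothetical, and a lower bound of one section per axis is added so that very elongated domains on few processes do not lead to a division by zero.

```python
import math

def decompose(size, nx, ny):
    """Return (sects_x, sects_y) for `size` processes on an nx x ny grid."""
    if nx < ny:
        # max(1, ...) is an added guard, not part of the listing above
        sects_x = max(1, math.floor(math.sqrt(size * nx / ny)))
        sects_y = size // sects_x
    elif nx > ny:
        sects_y = max(1, math.floor(math.sqrt(size * ny / nx)))
        sects_x = size // sects_y
    else:
        sects_y = math.floor(math.sqrt(size))
        sects_x = size // sects_y
    return sects_x, sects_y

grid_36 = decompose(36, 300, 300)   # square domain, square process count
grid_8 = decompose(8, 300, 300)     # square domain, non-square process count
grid_7 = decompose(7, 300, 300)     # prime process count
```

For 36 processes on a square domain the result is a 6x6 grid. For 7 processes the result is 3x2, so one process stays idle; process counts with balanced factors use the available cores fully.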
4 Results
In this chapter, we will present and analyze the results obtained from the
Lattice Boltzmann simulations for each experiment. We will showcase the
temporal evolution of the fluid velocity field using graphical representations
such as streamplots or velocity vectors. Additionally, we will calculate and
compare various flow parameters to gain insights into the fluid behavior in
different scenarios.
The plot shows that the initial sinusoidal oscillation is gradually damped
over time. From time step 9950 onward the curve is a straight line, which
corresponds to a density of 1 throughout the domain. The plot of the second
shear wave decay experiment is shown below; ω is set to 1.
Here, too, the initial sinusoidal oscillation is gradually damped over time.
From time step 10000 onward the curve is a straight line, which corresponds
to a velocity of zero throughout the domain. The plot of the analytical and
the measured viscosity is shown below. ω was varied between 0 and 2 in steps
of 0.01. The velocity u was initialized with a sine wave, and ρ was
initialized to one.
The analytical viscosity increases as ω decreases and decreases as ω
increases: as ω approaches zero the viscosity tends towards infinity, and as
ω approaches 2 it approaches zero. The kinematic viscosity measured from the
simulation, in contrast, tends towards zero as ω approaches zero. From
ω = 0.25 upward the two curves coincide. That both the density and the
velocity perturbations shrink with every iteration, going from the
sinusoidal initial state to a flat profile, indicates that the code works
correctly. That the analytical and the measured viscosity agree from
ω = 0.25 upward indicates that the code produces correct results for
ω ≥ 0.25.
After 100 time steps, the x-velocity is zero across most of the channel and
rises steeply only close to the upper wall, where it matches the wall
velocity. After 1000 time steps, the region at rest near the lower wall
(velocity = 0) has shrunk, and the increase towards the upper wall's
velocity starts earlier and is more gradual. From time step 35000 onward
the profile is linear, rising directly from 0 at the lower wall to the
velocity of the upper wall. The profile no longer changes after time step
35000, which suggests that the steady state has been reached and that runs
of at least 35000 time steps are needed for this setup.
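The linear steady state can be checked against the analytical Couette solution u_x(y) = U · y / H. The sketch below shows one way to quantify the deviation. The channel height of 50 and the perturbed test profile are made up for illustration, the wall velocity 0.001 is taken from the Couette boundary code, and both function names are hypothetical.

```python
import numpy as np

def couette_profile(y, height, wall_velocity):
    """Analytical steady-state x-velocity between a fixed and a moving wall."""
    return wall_velocity * y / height

def max_deviation(u_sim, u_ref):
    """Largest absolute difference between a simulated and a reference profile."""
    return np.max(np.abs(u_sim - u_ref))

height, wall_velocity = 50.0, 0.001
y = np.linspace(0.0, height, 51)
u_ref = couette_profile(y, height, wall_velocity)

# Synthetic stand-in for a simulated profile that has not fully converged
u_test = u_ref + 1e-6 * np.sin(np.pi * y / height)
err = max_deviation(u_test, u_ref)
```

Tracking such a deviation over the time steps gives a quantitative criterion for when the steady state is reached, instead of judging the plot by eye.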
4.3 Poiseuille Flow
In the Poiseuille flow experiment, we will investigate the parabolic velocity
profile and the pressure variation along the channel. We will compare the
numerical results with analytical solutions to validate the accuracy of the
Lattice Boltzmann simulations. The plot of the x-velocity at the inlet
during the Poiseuille flow experiment is shown below; ω is set to 1.
Plot of the velocity at the inlet during the Poiseuille Flow experiment
At the entrance of the Poiseuille flow channel, the velocities at the top
and bottom walls remain zero. Towards the middle of the channel they rise
from zero to the flow velocity. Initially the profile is almost linear.
From the 7700th time step onward, a parabolic shape becomes clearly visible
and grows over time. After the 11000th iteration, the simulation matches
the analytical solution.
The plot of the x-velocity at the centerline during the Poiseuille flow
experiment is shown below.
The x-velocity profile is the same at the inlet and in the middle of the
channel; the parabolic shape is preserved.
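For reference, the analytical profile that the simulation is compared against can be written as u_x(y) = Δp / (2 ρ ν L) · y (H − y), where Δp is the pressure drop from inlet to outlet, L the channel length, and H the channel height. The sketch below evaluates it; the function name and every numerical value are illustrative assumptions, not the parameters of the experiment.

```python
import numpy as np

def poiseuille_profile(y, height, length, delta_p, rho, nu):
    """Analytical x-velocity of pressure-driven flow between two fixed walls."""
    return delta_p / (2.0 * rho * nu * length) * y * (height - y)

# Hypothetical channel: 100 long, 50 high, pressure drop 0.001, omega = 1
height, length = 50.0, 100.0
delta_p, rho, nu = 0.001, 1.0, 1.0 / 6.0
y = np.linspace(0.0, height, 51)

u = poiseuille_profile(y, height, length, delta_p, rho, nu)
u_max = u.max()
```

The profile vanishes at both walls, is symmetric about the centerline, and peaks at Δp H² / (8 ρ ν L), which is 0.01875 for the values chosen here.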
The plot of the density at the centerline during the Poiseuille flow
experiment is shown below; ω is set to 1.
Plot of the velocity profile during the Sliding Lid experiment with dimensions of 300x300
After the 1000th iteration, we can observe the formation of a vortex at the
upper-right corner.
Plot of the velocity profile during the Sliding Lid experiment with dimensions of 300x300
After the 200000th iteration, it becomes apparent that this vortex has shifted
towards the center, forming a whirlpool there. Small eddies can be observed
at the corners. The steady state has been reached.
The following two plots use a domain of 1200x1200.
Plot of the velocity profile during the Sliding Lid experiment with dimensions of 1200x1200
After the 100000th iteration, it becomes apparent that this vortex has
shifted slightly towards the center, forming a whirlpool there. One small
eddy can be observed at the right corner.
Plot of the velocity profile during the Sliding Lid experiment with dimensions of 1200x1200
After the 100000th iteration, it becomes apparent that this vortex has
shifted slightly towards the center, forming a whirlpool there. Comparing
the last two plots shows that a lower Reynolds number leads to an earlier
establishment of the steady state.
The following plot shows the MLUPS (Million Lattice Updates Per Second) for
different numbers of CPUs.
Adding CPU cores makes a larger difference when the number of CPUs is low
than when it is already high. The reason is that the computation time per
process decreases as the number of CPUs increases, whereas the
communication time needed for data exchange increases. Eventually the gain
from faster computation is cancelled out by the communication overhead, and
adding further CPUs even increases the total run time.
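One rough way to see why communication gains weight is to compare the number of boundary nodes a subdomain has to exchange with the number of nodes it has to compute. The sketch below is a simplified model, not a measurement: it assumes equally sized subdomains with one ghost row or column per side, and the function name `halo_fraction` is hypothetical.

```python
def halo_fraction(nx, ny, sects_x, sects_y):
    """Exchanged boundary nodes per computed node for one subdomain."""
    nxsub = nx // sects_x
    nysub = ny // sects_y
    halo_nodes = 2 * (nxsub + nysub)   # left + right columns, top + bottom rows
    return halo_nodes / (nxsub * nysub)

single = halo_fraction(300, 300, 1, 1)   # whole 300x300 domain on one process
split = halo_fraction(300, 300, 6, 6)    # 36 processes, 50x50 nodes each
```

Splitting the 300x300 domain into 6x6 subdomains raises the ratio from about 0.013 to 0.08. Each process computes 36 times fewer nodes but exchanges only 6 times fewer boundary nodes, so the relative cost of communication grows with the process count.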
5 Conclusions
In this final chapter, we summarize the key findings and conclusions
drawn from the Lattice Boltzmann simulations.
If the density is initialized with a sinusoidal wave added to ρ0, the
density averaged over the domain equals ρ0. At the beginning, there are
points where the density is greater or smaller than ρ0. After a certain
time, which depends on the initial density difference and the dimensions of
the domain, the density should become ρ0 everywhere. Since this occurred
after 9950 steps, this experiment can be considered successful. When the
velocity is initialized with a sinusoidal wave, it should similarly decay
to zero after some time. Since this also happened, the code works for this
experiment as well. The measured viscosity matches the analytical value
from ω = 0.25 upward. This means that the viscosity of the simulation is
correct for ω ≥ 0.25, assuming the analytical viscosity is correct.
During the Couette flow experiment, the velocity profile between the
plates should form a linear gradient: the velocity should increase linearly
from the stationary plate to the moving plate. The rate of change of
velocity with respect to the distance between the plates is known as the
velocity gradient. Since the velocity gradient behaved as expected, this
experiment can be considered successful.
Because the channel is periodic along the x-axis and bounce-back conditions
are applied at the upper and lower walls in the Poiseuille flow experiment,
the analytical velocity profile takes the form of a parabola. Owing to this
periodicity, the profile is the same at every cross-section along the
x-axis. The fact that we observed the same profile as the analytical
solution at the inlet and in the middle of the channel after a certain time
step indicates that the code is functioning correctly.
The sliding lid experiment demonstrates the development of a shear layer
between a moving lid and stationary walls. This shear layer gives rise to
velocity gradients and complex flow patterns, which can be influenced by
factors like Reynolds number, domain size, and initial conditions. The
velocity behavior is driven by the interaction between the moving boundary
and the bounce-back conditions at the stationary walls.
In summary, the lattice Boltzmann method can be used effectively under
specific conditions:
An ω value between 0.25 and 1.7 should be used. The simulation matches the
expected behavior from ω = 0.25 upward, and ω values greater than 1.7 lead
to supersonic velocities in the results.
A simulation duration of more than 35000 time steps should be chosen, as
the simulation values before that point do not align with theoretical
expectations.
For a grid size of 300x300, it is recommended to use 36 CPUs to minimize
the time to obtain results.
Within these restrictions, the lattice Boltzmann method can be used to
simulate the behavior of fluids effectively.