Algorithms of Scientific Computing II: 3. Algebraic Multigrid Methods
mesh width $h = (h_x, h_y, h_z)$
• replace derivatives with difference quotients:
• first derivatives: forward-, backward-, or central differences
$$\frac{\partial u}{\partial x}(\xi) \doteq \frac{u(\xi + h_x) - u(\xi)}{h_x}, \quad \frac{u(\xi) - u(\xi - h_x)}{h_x}, \quad \frac{u(\xi + h_x) - u(\xi - h_x)}{2 h_x}$$
• second derivatives: standard 3-point stencil
$$\frac{\partial^2 u}{\partial x^2}(\xi) \doteq \frac{u(\xi + h_x) - 2u(\xi) + u(\xi - h_x)}{h_x^2}$$
• Laplace operator in 2D or 3D: 5-point or 7-point stencil
• There are broader stencils, too (involving more neighbours).
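As an illustration, a minimal sketch of how the 2D 5-point stencil turns into a sparse matrix (assuming SciPy; laplacian_2d is an illustrative helper, not part of the lecture):

import scipy.sparse as sp

def laplacian_2d(n, h):
    # 1D 3-point stencil (-1, 2, -1) / h^2 for -u''
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
    I = sp.identity(n)
    # 5-point stencil: the 1D operator applied in x- and in y-direction
    return sp.kron(I, T) + sp.kron(T, I)

A = laplacian_2d(31, 1.0 / 32)   # unit square, h = 1/32, 31 x 31 inner points
print(A.shape, A.nnz)            # (961, 961), about 5 non-zeros per row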
• curse of dimensionality:
  • O(N^d) points are needed in d dimensions (N points per coordinate direction)
• starting points for improvements:
• stencils of higher order (cubic, quartic, ...):
• take more than two neighbouring points into account
• problem: the matrix gets denser (more non-zeros per row)
• being more economic with grid points:
• use locally refined grids (adaptive grids)
• problem: what to do where two resolutions meet – which values?
• consider iterative methods: start from $x^{(0)} \in \mathbb{R}^n$ and end (hopefully) close to the solution $x$ of $Ax = b$:
$$x^{(0)} \to x^{(1)} \to \dots \to x^{(i+1)} \to \dots \to \lim_{i \to \infty} x^{(i)} = x$$
• speed of convergence: governed by the spectral radius ρ of the iteration matrix (see below)
Relaxation Methods
• Richardson iteration:
for i = 0,1,...
    for k = 1,...,n:  x_k^(i+1) := x_k^(i) + r_k^(i)
Here, the residual r^(i) is simply taken as the correction to the current approximation x^(i) (component-wise).
• Jacobi iteration:
for i = 0,1,...
    for k = 1,...,n:  y_k := r_k^(i) / a_kk
    for k = 1,...,n:  x_k^(i+1) := x_k^(i) + y_k
• Gauß-Seidel iteration:
for i = 0,1,...
    for k = 1,...,n:
        r_k^(i) := b_k − Σ_{j=1}^{k−1} a_kj x_j^(i+1) − Σ_{j=k}^{n} a_kj x_j^(i)
        x_k^(i+1) := x_k^(i) + r_k^(i) / a_kk
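The three relaxations translate almost literally into code. A minimal NumPy sketch (the function names are illustrative, not from the lecture; each function performs one sweep):

import numpy as np

def richardson_step(A, b, x):
    return x + (b - A @ x)                 # correction = the residual itself

def jacobi_step(A, b, x):
    r = b - A @ x
    return x + r / np.diag(A)              # residual scaled by 1/a_kk

def gauss_seidel_step(A, b, x):
    x = x.copy()
    for k in range(len(b)):                # uses already-updated x_j for j < k
        r_k = b[k] - A[k, :] @ x
        x[k] += r_k / A[k, k]
    return x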
• All three methods fit the general scheme
$$M x^{(i+1)} + (A - M) x^{(i)} = b$$
with $M = I$ (Richardson), $M = D_A$ (Jacobi), or $M = D_A + L_A$ (Gauß-Seidel).
• If all eigenvalues of the iteration matrix have a modulus smaller than 1 and if, hence, ρ < 1, then all error components are reduced in each step of the iteration. If ρ > 1, at least one error component will, in general, grow.
• When constructing iterative schemes, our goal must, of course, be a
spectral radius that is as small as possible (as close to zero as possible).
• Obviously, ρ is crucial not only for whether the iteration scheme converges at all, but also for its quality, i.e. its speed of convergence: the smaller ρ is, the faster all components of the error e^(i) are reduced in each iterative step.
• In practice, unfortunately, the above convergence results are of mostly theoretical value, since ρ is frequently so close to 1 that – despite convergence – the number of steps needed is far too large.
• An important scenario is the discretization of partial differential equations:
  • Typically, ρ depends on the problem size n and, hence, on the resolution h of the underlying grid, for example
$$\rho = O(1 - h_l^2) = O\left(1 - \frac{1}{4^l}\right)$$
  with mesh width $h_l = 2^{-l}$.
• This is a huge drawback: the finer and more accurate our grid, the poorer the convergence behaviour of our relaxation methods becomes. Hence, better iterative solvers are a must!
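This deterioration of ρ under grid refinement can be observed directly. A small NumPy sketch (illustrative; 1D Poisson model problem, Jacobi iteration matrix G = I − D_A^{-1} A):

import numpy as np

for l in range(3, 8):
    n = 2**l - 1                            # inner points, mesh width 2^-l
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    G = np.eye(n) - np.linalg.inv(np.diag(np.diag(A))) @ A   # Jacobi: M = D_A
    print(l, max(abs(np.linalg.eigvals(G))))   # spectral radius tends to 1 as h -> 0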
3.4 AMG
• Attention: though AMG does not need any geometric context, it is frequently used for partial differential equations (which, of course, do have a geometric context). That is why we will meet geometric analogies again and again.
• This becomes apparent when we consider the adjacency graph of A:
  • nodes represent unknowns, i.e. components x_i of the solution vector x;
  • edges are present if $a_{i,j} \neq 0$.
  • If A stems from a 2D PDE grid, then grid and graph frequently look the same.
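A minimal sketch of this correspondence (assuming SciPy; adjacency is an illustrative helper, not part of the lecture):

import scipy.sparse as sp

def adjacency(A, tol=0.0):
    # one edge {i, j} wherever a_ij != 0 (tiny entries may be dropped via tol)
    C = sp.coo_matrix(A)
    return {(i, j) for i, j, v in zip(C.row, C.col, C.data)
            if i != j and abs(v) > tol}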
Algebraic Smoothness
• What does smooth mean if there is no relation to geometry or frequency?
  • geometrically smooth: the usual idea of smoothness (not oscillating, etc.)
  • algebraically smooth: simply whatever the smoother produces (in an AMG context, typically Gauß-Seidel or related methods) – which might also be something geometrically non-smooth
• An algebraically smooth error typically consists mainly of components of eigenvectors belonging to small eigenvalues (so-called small eigenmodes):
  • This means that smoothing especially damps the largest eigenmodes.
  • The smallest eigenmodes – the "almost-kernel" of A – thus require the most attention in the coarse-grid correction.
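This behaviour can be checked numerically. A sketch (NumPy; 1D Poisson as an illustrative model problem, smoothing the homogeneous system Ax = 0 so that the iterate equals the error):

import numpy as np

n = 63
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Poisson (scaled)
lam, V = np.linalg.eigh(A)                             # eigenvalues ascending

e = np.random.default_rng(0).standard_normal(n)        # iterate = error, since x* = 0
for sweep in range(7):                                 # 7 Gauss-Seidel sweeps on Ax = 0
    for k in range(n):
        e[k] -= (A[k, :] @ e) / A[k, k]

c = V.T @ e                   # error expanded in the eigenbasis of A
print(np.abs(c[:3]))          # small eigenmodes: hardly damped
print(np.abs(c[-3:]))         # large eigenmodes: almost eliminated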
[Figure – Algebraic smoothness: (a) error for our sample problem after seven Gauß-Seidel sweeps; (b) the error is geometrically smooth in x-direction, (c) but oscillates in y-direction in the right half; (d) in this example, AMG coarsens the grid in the direction of geometric smoothness; bigger squares correspond to coarser grids.]
(a) each node gets the number of its off-diagonal connections as its weight (the number of off-diagonal non-zeros, after small, irrelevant entries have been removed)
(b) a point of maximum weight is selected as a C-point
(c) all neighbours of the new C-point that have not yet been assigned to either F or C become F-points
(d) each new F-point adds a weight of 1 to all of its neighbours that have not yet been assigned
(e) (bottom row of the figure) continue until all points are either C-points or F-points – a code sketch of this greedy splitting follows below
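A minimal sketch of the greedy C/F splitting described in steps (a)–(e) (illustrative; it assumes the strong-connection sets have already been extracted from A):

def coarsen(strong):
    # strong: dict node -> set of strongly connected neighbours
    weight = {i: len(nbrs) for i, nbrs in strong.items()}    # step (a)
    C, F, unassigned = set(), set(), set(strong)
    while unassigned:                                        # step (e)
        i = max(unassigned, key=weight.get)                  # step (b)
        C.add(i); unassigned.discard(i)
        for j in strong[i] & unassigned:                     # step (c)
            F.add(j); unassigned.discard(j)
            for k in strong[j] & unassigned:                 # step (d)
                weight[k] += 1
    return C, F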
Interpolation
• starting point again: algebraically smooth error means dominance of
small eigenmodes v
• because of $r^T r = e^T A^2 e \approx v^T A^2 v = \lambda^2 \cdot 1$ (length of $v$ normed to 1!), this also implies small residuals
• To understand interpolation, assume $r = Av = 0$ or $r = Ae = 0$, respectively (with an error e essentially consisting of small eigenmodes). For an F-point i, we get
$$a_{i,i} e_i = -\sum_{j \in C_i} a_{i,j} e_j - \sum_{j \in F_i^s} a_{i,j} e_j - \sum_{j \in F_i^w} a_{i,j} e_j,$$
where
  • $C_i$ denotes the C-points with a strong connection to i (interpolation in i will be based on these only),
  • $F_i^s$ denotes the F-points with a strong connection to i, and
  • $F_i^w$ denotes all points with a weak connection to i.
• The crucial point is now to redistribute the error components e_j from the second and third partial sum above either to the points from C_i or directly to the point i itself.
• two examples (a code sketch of the resulting interpolation weights follows after the second example):
(a) standard 9-point finite element stencil (or the corresponding matrix relations)
(b) strongly connected F-points are interpolated from the neighbouring C-points (weights according to the matrix entries)
(c) then, the strong connections to the F-points are added to the relations to the C-points (according to the interpolation weights)
(d) from that, the collapsed stencil results ...
(e) ... and from that the interpolation rule
(a) anisotropic 9-point finite element stencil (or the corresponding matrix relations)
(b) weak relations are included into the diagonal element
(c) from that, the collapsed stencil results ...
(d) ... and from that the interpolation rule
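Both examples follow the same recipe: weak connections go into the diagonal, strong F-connections are distributed over C_i. A sketch of the resulting weights (illustrative; classical Ruge-Stüben-style direct interpolation, assuming every strong F-neighbour k is connected to at least one point of C_i):

def interpolation_weights(A, i, Ci, Fi_strong, Fi_weak):
    # weights w_ij, j in Ci, such that e_i ≈ sum_j w_ij e_j
    diag = A[i, i] + sum(A[i, k] for k in Fi_weak)          # weak -> diagonal
    w = {}
    for j in Ci:
        a_ij = A[i, j]
        for k in Fi_strong:                                 # distribute strong
            a_ij += A[i, k] * A[k, j] / sum(A[k, m] for m in Ci)   # F-connections
        w[j] = -a_ij / diag
    return w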
Performance Features