Scientific Computing EE
Martijn van Beurden
Neil Budko
Wil Schilders Editors
Scientific
Computing
in Electrical
Engineering
SCEE 2020, Eindhoven, The Netherlands,
February 2020
MATHEMATICS IN INDUSTRY 36
Series Editors
Hans-Georg Bock, Interdisciplinary Center for Scientific Computing IWR,
Heidelberg University, Heidelberg, Germany
Frank de Hoog, CSIRO, Canberra, Australia
Avner Friedman, Ohio State University, Columbus, OH, USA
Arvind Gupta, University of British Columbia, Vancouver, BC, Canada
André Nachbin, IMPA, Rio de Janeiro, RJ, Brazil
Tohru Ozawa, Waseda University, Tokyo, Japan
William R. Pulleyblank, United States Military Academy, West Point, NY,
USA
Torgeir Rusten, Det Norske Veritas, Hoevik, Norway
Fadil Santosa, University of Minnesota, Minneapolis, MN, USA
Jin Keun Seo, Yonsei University, Seoul, Korea (Republic of)
Anna-Karin Tornberg, Royal Institute of Technology (KTH), Stockholm,
Sweden
SUBSERIES
Managing Editor
Michael Günther, University of Wuppertal, Wuppertal, Germany
Series Editors
Luis L. Bonilla, University Carlos III Madrid, Escuela, Leganes, Spain
Otmar Scherzer, University of Vienna, Vienna, Austria
Wil Schilders, Eindhoven University of Technology, Eindhoven,
The Netherlands
The ECMI subseries of the Mathematics in Industry series is a project of The
European Consortium for Mathematics in Industry. Mathematics in Industry
focuses on the research and educational aspects of mathematics used in industry and
other business enterprises. Books for Mathematics in Industry are in the following
categories: research monographs, problem-oriented multi-author collections,
textbooks with a problem-oriented approach, conference proceedings. Relevance to
the actual practical use of mathematics in industry is the distinguishing feature of
the books in the Mathematics in Industry series.
Editors
Martijn van Beurden
Electrical Engineering
Eindhoven University of Technology
Eindhoven, The Netherlands
Neil Budko
Department of Applied Mathematics
Delft University of Technology
Delft, The Netherlands
Wil Schilders
Eindhoven University of Technology
Eindhoven, The Netherlands
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
From February 16 until February 20, 2020, the 13th International Conference on
“Scientific Computing in Electrical Engineering” (SCEE) was held in Eindhoven,
The Netherlands. It was jointly organized by the Centre for Analysis, Scientific
Computing and Applications (CASA) and the Electromagnetics group of Eindhoven
University of Technology, and the group Numerical Analysis of Delft University of
Technology.
Even though 13 is a number often associated with bad luck, this edition was
actually very fortunate. Already prior to and during the conference, the world was
discussing the SARS-CoV-2 virus and associated problems and measures, and not
long after the conference ended, there was a lockdown in many countries. We are
very happy that SCEE 2020 took place, not in a virtual way, but with many face-
to-face contacts, meeting our esteemed colleagues once again, having lunches and
dinner together in an excellent location, the “Academisch Genootschap Eindhoven.”
Participants enjoyed the setting and the surroundings, as well as the opportunity to
sit in the garden and discuss.
The thirteenth edition of the SCEE conference brought together some 85 partici-
pants from the fields of applied mathematics, electrical and electronic engineering,
and the computer sciences as well as scientists from industry. Again, it created an
excellent working atmosphere, especially due to its unique workshop character,
where all talks and poster introductions were presented in plenary sessions. In
addition, we had very clear and high-quality talks and poster presentations, lively
and fruitful discussions, and a great deal of personal networking.
The Scientific Program Committee invited four experts to give keynote presen-
tations on the main topics in the regular program. Keynote speakers at SCEE 2020
were (in alphabetical order):
• Liliana Borcea (University of Michigan, USA), “Reduced order model
approach for inverse scattering”
• Romanus Dyczij-Edlinger (University of Saarland, Germany), “Reduced-order
finite-element modeling and optimization of antennas”
We would like to thank Eindhoven University of Technology, viz. the Centre for
Analysis, Scientific Computing and Applications (CASA) within the Department of
Mathematics and Computer Science and the Electromagnetics (EM) group within
the Department of Electrical Engineering, and Delft University of Technology,
Department of Applied Mathematics (DIAM), for their help and support in the
organization of the SCEE 2020 Conference.
We are also grateful for the financial support by the Applied Mathematics
Institute of the four Universities of Technology in The Netherlands (4TU.AMI), the
mathematics cluster NDNS+ (Nonlinear Dynamics of Natural Systems), the Dutch
National Organisation for Research (NWO), the European Marie Skłodowska-Curie
Industrial Doctorate Project ROMSOC (Reduced Order Modelling, Simulation and
Optimization of Coupled Systems), and CST—Computer Simulation Technology
AG in Darmstadt, part of Dassault Systèmes.
Last but not least, we would like to thank all the members of the Local Organizing
Committee and the Scientific Committee who helped us very much in preparing and
running the conference. The careful reviewing process was only possible with the
help of the members of the Scientific Committee who were handling the reviewing
process. The anonymous referees did a wonderful job that helped the authors to
improve the quality of their contributions.
Finally, we express our gratitude to our colleagues from Springer Heidelberg for
continued support and patience during the preparation of this volume.
Organization
Program Committee
Standing Committee
Sponsors
TU Eindhoven
TU Delft
NWO
CST
4TU.AMI
NDNS+
ECMI
Contents
3 Results
3.1 Experimental Setup
4 Conclusion
References
Waveform Relaxation for Low Frequency Coupled Field/Circuit Differential-Algebraic Models of Index 2
Idoia Cortes Garcia, Jonas Pade, Sebastian Schöps, and Caren Tischendorf
1 Introduction
2 Field/Circuit Model
3 Waveform Relaxation and Convergence
4 Numerical Examples
5 Conclusions
References
Splitting Methods for Linear Circuit DAEs of Index 1 in port-Hamiltonian Form
Malak Diab and Caren Tischendorf
1 Introduction
2 Circuit Modeling
3 Operator Splitting for Index-1 Circuit DAEs
3.1 Subsystem Properties
3.2 Convergence Analysis
4 Numerical Simulation
5 Conclusions and Outlook
References
Reduced Order Modelling for Wafer Heating with the Method of Freezing
E. J. I. Hoeijmakers, H. Bansal, T. M. van Opstal, and P. A. Bobbert
1 Introduction
2 Theory
2.1 Model Introduction
2.2 Model Reformulation: Method of Freezing
3 Reduced Order Modelling
3.1 Standard Reduced Order Modelling Approach
3.2 Proposed Reduced Order Modelling Approach
4 Numerical Results
5 Conclusion and Future Outlook
References
Multirate DAE-Simulation and Its Application in System Simulation Software for the Development of Electric Vehicles
Michael Kolmbauer, Günter Offner, Ralf Uwe Pfau, and Bernhard Pöchtrager
1 Background and Introduction
2 Problem Formulation
1 Introduction
A myelinated axon (Fig. 1) consists of myelinated sections through which the signal
dissipates, which alternate with Ranvier nodes where the signal is regenerated
(“saltatory conduction”). To model the transmission of signals through this chain,
the phenomena occurring in the Ranvier nodes have to be coupled with those in the
myelinated sections (internodes). The underlying mechanisms of a Ranvier node are
well described by the Hodgkin & Huxley model [10], or its reduced versions [19].
Fig. 1 Simplified geometrical model of a myelinated axon, as a chain of internode myelinated compartments and Ranvier nodes
Fig. 2 The companion circuit of an internode. The network of RC (in our case identical) cells is generated by the spatial discretization with centered differences of the transmission line equation. The outer dotted line outlines the system and the inner dashed line denotes one individual RC cell
points in the complex plane. This formulation is preferred to the direct reduction of
the number of cells of the segmented numerical model, which already is a reduced
model.
3 System Reduction
Rr = T QRQ Br = T QB
and they are used to construct the reduced system in the port-Hamiltonian form:
$$\dot{x}_r = (J_r - R_r) Q_r x_r + B_r u(t), \qquad y = B_r^T Q_r x_r. \qquad (3)$$
Such a reduced-order system matches the zeroth-order moments of the original system at the chosen interpolation points [3]. The reduction procedure is structure-preserving in the sense that the reduced system is still in port-Hamiltonian form, but the matrices lose some of their properties: $Q_r$ is no longer diagonal, $R_r$ is no longer tridiagonal (although it remains symmetric and positive definite), and $B_r$ is now likely full.
Each impedance $Z_{T0}$, $Z_{T1}$ and $Z_{T2}$ can be realized through a pole-residue decomposition as the sum of the impedances of $r$ cells connected in series, each composed of a capacitor in parallel with a conductance: $Z_{xx} = \sum_{k=1}^{r} 1/(C_k s + G_k)$.
The reduced system (3) can be viewed in the form of a standard description of an RC circuit
$$C \dot{x}_r = -G x_r + B u(t), \qquad y = E x_r.$$
In theory the capacitance Ck and the conductance Gk of a cell may have any sign.
But the CG pair signs differ only by the signs of the diagonal values of the matrices
C and G. Here C is the identity matrix, so clearly positive definite. G is a diagonal
matrix that comes from the original system matrix RQ, which is positive definite.
Since the reduction procedure guarantees passivity, it will preserve the definiteness
of the system matrix. Hence G has only positive values on the diagonal. This means
that for every cell, Ck and Gk are either both positive or both negative.
Consider the synthesized circuit of Zxx as in Fig. 4 (left), where the first two cells
have positive values and the third has negative values. In Fig. 4 (right) the circuit
is split into the “positive” and the “negative” contributions [18]. For the negative
subcircuit the signs for both Ck and Gk are reversed and the same excitation is used
Efficient Model Reduction of Myelinated Compartments as Port-Hamiltonian Systems 9
Fig. 4 Synthesis of a component Zxx of order 3. (Left): The circuit with positive and negative
CG pairs, y = V(1). (Right): The circuit split into the “positive” and “negative” subcircuits, y =
V(1+) − V(1−) extracted as the voltage of a null current source that connects the two subcircuits
for both subcircuits. The initial circuit has the same output y = V(1) as the circuit
after splitting, computed as the difference of two voltages y = V(1+) − V(1−).
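The splitting described above can be checked numerically. The following is a minimal sketch, assuming hypothetical cell values and a hypothetical evaluation frequency (none of them are taken from the text); the function names are illustrative only. It evaluates the series impedance of the cells directly and then as the difference between a "positive" and a "negative" subcircuit whose elements all have positive values.

```python
import math

def cell_impedance(C, G, s):
    """Impedance 1/(C*s + G) of one cell: capacitor C in parallel with conductance G."""
    return 1.0 / (C * s + G)

def series_impedance(cells, s):
    """Impedance of cells connected in series: the sum of the cell impedances."""
    return sum(cell_impedance(C, G, s) for C, G in cells)

def split_by_sign(cells):
    """Split cells into a positive subcircuit and a sign-reversed negative subcircuit.

    Assumes, as stated in the text, that C and G of a cell have the same sign.
    """
    positive = [(C, G) for C, G in cells if C > 0 and G > 0]
    negative = [(-C, -G) for C, G in cells if C < 0 and G < 0]
    return positive, negative

# Hypothetical order-3 realization: two positive cells and one negative cell.
cells = [(1e-9, 2e-3), (3e-9, 1e-3), (-2e-9, -4e-3)]
s = 2j * math.pi * 1e5  # hypothetical evaluation point on the imaginary axis

z_direct = series_impedance(cells, s)
pos, neg = split_by_sign(cells)
# Output of the split circuit: difference of the two subcircuit voltages per unit current.
z_split = series_impedance(pos, s) - series_impedance(neg, s)
```

Both subcircuits contain only positive capacitances and conductances, and their difference reproduces the impedance of the original circuit because $1/(-C s - G) = -1/(C s + G)$.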
5 Results
Figure 5 shows the frequency responses of the two components of the original (50
cells) and reduced (order 5) systems. The response of the transfer component (1,2)
is very far from that of the original system, but the values are so small that this graph is
in fact not relevant accuracy-wise, because the reduction procedure implicitly
minimizes the H2 norm. This is confirmed by the almost identical step responses.
The relative error is under 2% even for order 1 and is comparable with the one
obtained with vector fitting (Fig. 6 left) with the adaptive frequency sampling (AFS)
procedure described in [6]. In the interest of fairness, the errors are computed for
Fig. 5 The frequency responses of the original (50 cells) and reduced (order 5) systems (note the diminutive vertical scale on the graph to the right). Both panels are plotted against frequency in Hz; the curves show the original model (order 50) and the reduced model (order 5) at iterations 1 and 2
10 R. Barbulescu et al.
Fig. 6 (Left): The relative error vs. the order of the reduced model for the pH (moment matching, $s \in \{0, [10^1, 10^3]\}$, linearly spaced) and VF (vector fitting with AFS, input $[10^0, 10^7]$ Hz, logarithmically spaced) methods; $\mathrm{err}_{rel} = \int_{f_m}^{f_M} w(f)\, \|Z_{orig}(f) - Z_{red}(f)\|_2 \, df / Z_0$, frequency $f \in [f_m, f_M] = [10^0, 10^7]$ Hz, logarithmically spaced, $w(f)$ is the weight function, $Z_0$ is the d.c. impedance of the line [11]. (Right): The reproduced output of the reduced circuit (order 3) built in NEURON, plotted against time in ms
Fig. 7 The ECi+ circuit extracted from the reduced system of order 3 and reproduced in
NEURON. The output is [V(l1p) − V(l1m); V(r3p) − V(r2m)]
the best set of conditions for each of the two methods. The new method is not meant
to improve accuracy but to generate a model without controlled sources and with
positive elements, which is a requirement for the inclusion in NEURON.
The reduced circuit of order 3 is built in NEURON (Fig. 7) and the output is
reproduced in Fig. 6 (right). The input is a rectangular pulse in the left (i1 = I1p =
I1m) and open-circuit in the right (i2 = I2p = I2m = 0). The corresponding outputs
copy the shape of the input and the relative difference between the corresponding
peaks of the original circuit (50 cells) and the reduced one is between 2% and 3%.
6 Conclusions
Acknowledgments This work was partially supported by Portuguese national funds through FCT,
Fundação para a Ciência e a Tecnologia, under project UIDB/50021/2020 as well as project
PTDC/EEI-EEE/31140/2017.
References
1 Introduction
Another approach to obtaining the steady state is based on the solution of the
joint space- and time-discrete time-periodic system formulated on the whole period
[8]. There the initial and final values are coupled through the prescribed periodicity
condition. Solving the periodic problem in the time domain is hampered by the large
size of the system matrix and by its special block structure, which stems from the
interdependence of the solution vectors over the period. To deal
with this difficulty a frequency domain approach was proposed in [2]. In case of
linear problems, the method takes advantage of the block-cyclic matrix structure
by applying the discrete Fourier transform. It fully decouples the variables, thereby
allowing for the separate solution of each harmonic coefficient. This approach was
further extended and incorporated into the Parareal framework by the authors in
[10]. There, a simplified Newton-based iterative algorithm was presented together
with its convergence analysis for the efficient treatment of nonlinear problems.
Solutions of time-periodic problems become much more challenging when the
period is not given. Such a situation occurs, e.g., when dealing with an autonomous
system [3]. In contrast to a non-autonomous problem, the periodicity cannot be
determined from the applied excitation. This paper proposes a numerical algorithm
capable of determining an appropriate period automatically using parallelization in
the time domain. Extending the idea of the multiple shooting method we include the
unknown period together with multiple initial values as the sought parameters into
the iterative procedure. Verification of the presented approach is illustrated through
its application to the Colpitts oscillator model [9].
The paper is organized as follows. Section 2 describes the basis of the multiple
shooting approach including the unknown period as an additional variable. This is
further expanded to the family of the Parareal-based methods in Sect. 3. Section 4
applies the proposed parallel-in-time approach to the Colpitts oscillator model using
a particular linearization on the coarse level. The paper is finally summarized in
Sect. 5.
where the period T > 0 and the vector ũ : [0, T ] → Rd , d ≥ 1 are sought. M is a
given non-singular mass matrix, f is a bounded and Lipschitz continuous right-hand
side (RHS) function. Following [3] we incorporate the period T as an unknown
parameter by performing the change of variables
$$[0, T] \ni t \mapsto \tau := t/T \in [0, 1]. \qquad (2)$$
Parallel-in-Time Calculation of Time-Periodic Solutions with Unknown Period 15
The problem (1) is thereby transformed into the equivalent one: find T > 0 and
u : [0, 1] → Rd such that
$$M u'(\tau) = T f(u(\tau)), \quad \tau \in (0, 1), \qquad u(0) = u(1). \qquad (3)$$
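The equivalence with the original problem follows from the chain rule. The short derivation below is a sketch under the assumption that problem (1) reads $M \tilde{u}'(t) = f(\tilde{u}(t))$ on $(0, T)$ with $\tilde{u}(0) = \tilde{u}(T)$; this form is inferred from (3) and is not quoted from the text.

```latex
\begin{align*}
u(\tau) &:= \tilde{u}(T\tau), \qquad \tau \in [0, 1], \\
M u'(\tau) &= T\, M \tilde{u}'(T\tau) = T\, f(\tilde{u}(T\tau)) = T\, f(u(\tau)), \\
u(0) &= \tilde{u}(0) = \tilde{u}(T) = u(1).
\end{align*}
```

The unknown period therefore enters only as a scalar factor on the right-hand side, while the computational domain becomes the fixed interval $[0, 1]$.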
The unit interval [0, 1] is then partitioned into N windows by the nodes 0 = τ0 <
τ1 < · · · < τN = 1. The n-th subinterval has length Δτn = τn − τn−1 , for n =
1, . . . , N.
For a given discrete variable Un−1 , we consider an initial value problem (IVP)
on the window (τn−1 , τn ]
and let F (τn , τn−1 , Un−1 , T ) denote the solution operator of (4) for n = 1, . . . , N.
A sketch of the piecewise-defined solution due to the interval splitting is shown
in Fig. 1. In order to eliminate the jumps at the synchronization points $\tau_n$, $n = 1, \dots, N-1$, as well as the difference between the initial value at $\tau_0$ and the final one at $\tau_N$, the matching conditions
$$\Phi(z) := \begin{cases} \mathcal{F}(\tau_N, \tau_{N-1}, U_{N-1}, T) - U_0 = 0, \\ \mathcal{F}(\tau_n, \tau_{n-1}, U_{n-1}, T) - U_n = 0, \quad n = 1, \dots, N-1 \end{cases} \qquad (5)$$
have to be satisfied, where $z = [U_0^\top, \dots, U_{N-1}^\top, T]^\top$. System (5) represents the root-finding problem for the mapping $\Phi : \mathbb{R}^{Nd+1} \to \mathbb{R}^{Nd}$. The Jacobian of $\Phi$ is given by
$$J(z) = \begin{bmatrix} -I & & & G_N & g_N \\ G_1 & -I & & & g_1 \\ & \ddots & \ddots & & \vdots \\ & & G_{N-1} & -I & g_{N-1} \end{bmatrix}, \qquad (6)$$
Fig. 1 Example of the interval splitting within the multiple shooting for N = 5. The mismatches
at the synchronization points τn , n = 1, . . . , N − 1 together with the periodicity jump between
the solution at τ0 and τN are eliminated (up to a prescribed tolerance) by solving the root-finding
problem
16 I. Kulchytska-Ruchka and S. Schöps
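where
$$G_n = \frac{\partial \mathcal{F}}{\partial U_{n-1}}(\tau_n, \tau_{n-1}, U_{n-1}, T), \qquad g_n = \frac{\partial \mathcal{F}}{\partial T}(\tau_n, \tau_{n-1}, U_{n-1}, T), \qquad (7)$$
$n = 1, \dots, N$, and $I$ is the identity matrix. The root $z$ of (5) can then be calculated using the Newton method, i.e., for a given $z^{(0)}$ and $k = 0, 1, \dots$ the solution $z^{(k+1)}$ at iteration $k + 1$ is updated through
$$J(z^{(k)})\, \Delta z^{(k)} = -\Phi(z^{(k)}), \qquad (8)$$
$$z^{(k+1)} = \Delta z^{(k)} + z^{(k)}. \qquad (9)$$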
Note that due to the introduction of the additional variable T the system of
Eq. (8) is underdetermined and can be solved, e.g., by calculating the Moore-
Penrose pseudoinverse. A generalized eigenvalue-based gauging as well as the
corresponding theory for the Moore-Penrose pseudoinversion was presented in [12].
We note that in case when the size of (8) is large it can be condensed to a d-
dimensional system with d + 1 unknowns by block Gaussian elimination [3].
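The Newton iteration with an unknown period can be sketched in a few lines. The example below is an illustration under several assumptions that are not part of the text: the test system is a hypothetical planar oscillator whose limit cycle is the unit circle with period $2\pi$, the fine propagator is a classical Runge-Kutta scheme, the Jacobian is approximated by forward finite differences instead of the derivatives in (7), and the underdetermined update is solved with NumPy's `numpy.linalg.pinv`. All function names are illustrative.

```python
import numpy as np

def f(u):
    # Hypothetical autonomous test system; its limit cycle is the unit circle, period 2*pi.
    x, y = u
    r2 = x * x + y * y
    return np.array([x - y - x * r2, x + y - y * r2])

def propagate(u0, T, dtau, steps=50):
    # Fine propagator: classical RK4 for u'(tau) = T f(u) on a window of length dtau (M = I).
    h = dtau / steps
    u = np.array(u0, dtype=float)
    for _ in range(steps):
        k1 = T * f(u)
        k2 = T * f(u + 0.5 * h * k1)
        k3 = T * f(u + 0.5 * h * k2)
        k4 = T * f(u + h * k3)
        u = u + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

def residual(z, N, d):
    # Matching conditions: the periodicity jump first, then the N-1 interior jumps.
    U, T = z[:-1].reshape(N, d), z[-1]
    ends = np.array([propagate(U[n], T, 1.0 / N) for n in range(N)])
    res = np.empty((N, d))
    res[0] = ends[N - 1] - U[0]
    res[1:] = ends[: N - 1] - U[1:]
    return res.ravel()

def jacobian(z, N, d, eps=1e-6):
    # Forward finite-difference approximation of the (N*d) x (N*d + 1) Jacobian.
    r0 = residual(z, N, d)
    J = np.empty((r0.size, z.size))
    for j in range(z.size):
        zp = z.copy()
        zp[j] += eps
        J[:, j] = (residual(zp, N, d) - r0) / eps
    return J

def solve_periodic(N=4, d=2, T0=6.0, iters=20, tol=1e-10):
    # Initial guess: points on a circle of radius 1.2 and a slightly wrong period.
    taus = np.arange(N) / N
    U0 = 1.2 * np.column_stack([np.cos(2 * np.pi * taus), np.sin(2 * np.pi * taus)])
    z = np.append(U0.ravel(), T0)
    for _ in range(iters):
        r = residual(z, N, d)
        if np.linalg.norm(r) < tol:
            break
        # Minimum-norm Newton update of the underdetermined system via the pseudoinverse.
        z = z - np.linalg.pinv(jacobian(z, N, d)) @ r
    return z[:-1].reshape(N, d), z[-1]

U, T = solve_periodic()
```

The pseudoinverse selects the minimum-norm update among all solutions of the underdetermined linear system, so no explicit phase condition is imposed in this sketch.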
Inheriting the idea of the Parareal algorithm [7, 11] we approximate the derivative in
Gn (7) in a finite difference way using a coarse propagator G , i.e., for the iteration
k and n = 1, . . . , N
Similar to the fine propagator F , the operator G solves the IVP (4) on each
time window. However, in contrast to the fine solver the coarse propagator has a
considerably lower precision, e.g., it uses a lower-order time integrator or bigger
time steps. Substituting (10) into (8) we obtain the periodic Parareal with periodic
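coarse problem [6] with unknown period (PP-PC-UP):
$$\begin{bmatrix} -I & & & \mathcal{G}_N(\cdot, T^{(k)}) & g_N^{(k)} \\ \mathcal{G}_1(\cdot, T^{(k)}) & -I & & & g_1^{(k)} \\ & \ddots & \ddots & & \vdots \\ & & \mathcal{G}_{N-1}(\cdot, T^{(k)}) & -I & g_{N-1}^{(k)} \end{bmatrix} \begin{bmatrix} U_0^{(k+1)} \\ U_1^{(k+1)} \\ \vdots \\ U_{N-1}^{(k+1)} \\ T^{(k+1)} \end{bmatrix} = \begin{bmatrix} b_N^{(k)} \\ b_1^{(k)} \\ \vdots \\ b_{N-1}^{(k)} \end{bmatrix}, \qquad (11)$$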
$$= M^{-1} \int_{\tau_{n-1}}^{\tau_n} f(u(\tau))\, d\tau \approx M^{-1} \Delta\tau_n\, f\big(\mathcal{F}(\tau_n, \tau_{n-1}, U_{n-1}^{(k)}, T^{(k)})\big), \qquad (12)$$
for n = 1, . . . , N. In the general case, the system of Eq. (11) is nonlinear and
implicit, which requires an additional linearization.
Building upon the ideas presented in [10], which dealt with the time-periodic
problem for a given period $T$, we incorporate an additive splitting of the
system matrix in (11). To this end, let us introduce a modified coarse propagator $\bar{\mathcal{G}}$,
which instead of (4) solves an approximate model with a linearized function $\bar{f}(u) = Au + c$
on the RHS, i.e.,
with a given Jacobi-matrix An and a vector c. Having the linear coarse model we
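construct a fixed point iteration: for $s = 0, 1, \dots$
$$\begin{bmatrix} -I & & & \bar{\mathcal{G}}_N(\cdot, T^{(k)}) & g_N^{(k)} \\ \bar{\mathcal{G}}_1(\cdot, T^{(k)}) & -I & & & g_1^{(k)} \\ & \ddots & \ddots & & \vdots \\ & & \bar{\mathcal{G}}_{N-1}(\cdot, T^{(k)}) & -I & g_{N-1}^{(k)} \end{bmatrix} \begin{bmatrix} U_0^{(k+1,s+1)} \\ U_1^{(k+1,s+1)} \\ \vdots \\ U_{N-1}^{(k+1,s+1)} \\ T^{(k+1)} \end{bmatrix} = \begin{bmatrix} h_N^{(k+1,s)} \\ h_1^{(k+1,s)} \\ \vdots \\ h_{N-1}^{(k+1,s)} \end{bmatrix}, \qquad (14)$$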
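where $h_n^{(k+1,s)} := b_n^{(k)} + \bar{\mathcal{G}}(\tau_n, \tau_{n-1}, U_{n-1}^{(k+1,s)}, T^{(k)}) - \mathcal{G}(\tau_n, \tau_{n-1}, U_{n-1}^{(k+1,s)}, T^{(k)})$ and $\bar{\mathcal{G}}_n(\cdot, T^{(k)}) := \bar{\mathcal{G}}(\tau_n, \tau_{n-1}, \cdot, T^{(k)})$ for $n = 1, \dots, N$. Assuming that $\bar{\mathcal{G}}$ solves (13) with the implicit Euler method using a single step on $(\tau_{n-1}, \tau_n]$ and that all the windows have the same length $\Delta\tau$, we have an explicit representation for the coarse solution
$$\big[1/\Delta\tau \cdot M - T^{(k)} A\big]\, \bar{\mathcal{G}}(\tau_n, \tau_{n-1}, U_{n-1}^{(k+1,s)}, T^{(k)}) = 1/\Delta\tau \cdot M U_{n-1}^{(k+1,s)} + T^{(k)} c_n, \qquad (15)$$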
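Relation (15) amounts to one linear solve per window. The sketch below illustrates it with NumPy; the function name `coarse_step` and all matrix and vector values are hypothetical and chosen only to make the example self-contained.

```python
import numpy as np

def coarse_step(U_prev, T, dtau, M, A, c):
    """One implicit Euler step for M u' = T (A u + c) over a window of length dtau."""
    lhs = M / dtau - T * A            # matrix on the left-hand side of the coarse relation
    rhs = M @ U_prev / dtau + T * c   # right-hand side built from the window's initial value
    return np.linalg.solve(lhs, rhs)

# Hypothetical data for a two-dimensional linear model.
M = np.eye(2)
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
c = np.array([0.0, 0.5])
U_prev = np.array([1.0, 0.0])
T, dtau = 2.0, 0.1

U_next = coarse_step(U_prev, T, dtau, M, A, c)
# Implicit Euler residual: M (U_next - U_prev) / dtau - T (A U_next + c) should vanish.
euler_residual = M @ (U_next - U_prev) / dtau - T * (A @ U_next + c)
```

Because the current period estimate enters the left-hand side matrix, that matrix has to be reassembled whenever the period estimate changes.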
Remark 1 We note that when the period T is given within the problem setting (1),
the corresponding block-cyclic matrix (system matrix of (3) without the last
column) can be transformed into a block-diagonal using the frequency domain
transformation [2]. A detailed description of the approach as well as a Newton-like
linearization of the periodic system within the parallel-in-time setting is presented
in [10].
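The decoupling mentioned in Remark 1 can be illustrated on a scalar model problem. The sketch below is an assumption-laden toy example, not the method of [2] or [10]: it discretizes the hypothetical equation $u' = a u + g(t)$ with implicit Euler and periodic coupling, which gives a circulant matrix, and it solves the system with NumPy's FFT. In the block case the same transform would be applied blockwise.

```python
import numpy as np

# Hypothetical scalar problem: N time steps of size dt, coefficient a, periodic forcing g.
N, dt, a = 8, 0.125, -2.0
g = np.sin(2 * np.pi * np.arange(N) / N)

# Periodic implicit Euler: (u_n - u_{n-1}) / dt - a * u_n = g_n with u_{-1} = u_{N-1}.
# The system matrix is circulant and fully described by its first column.
first_col = np.zeros(N)
first_col[0] = 1.0 / dt - a
first_col[1] = -1.0 / dt

# The DFT diagonalizes a circulant matrix: each harmonic is solved independently.
eigvals = np.fft.fft(first_col)
u = np.real(np.fft.ifft(np.fft.fft(g) / eigvals))

# Dense reference matrix, used only to check the frequency-domain solution.
C = np.zeros((N, N))
for n in range(N):
    C[n, n] = first_col[0]
    C[n, (n - 1) % N] = first_col[1]
```

Each division by an eigenvalue corresponds to the separate solution of one harmonic coefficient.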
4 Numerical Example
[Figure: circuit diagram of the Colpitts oscillator with nodes 1 to 4.] Parameters: C1 = 50 pF, C2 = 1 nF, C3 = 50 nF, C4 = 100 nF, R1 = 12 kΩ, R2 = 3 Ω, R3 = 8.2 kΩ, R4 = 1.5 kΩ, L = 10 mH, Uop = 10 V.
Fig. 3 Left: Transient behavior of the Colpitts oscillator until the steady state (voltages U1 to U4 in V versus time in ms). Right: Convergence of the PP-PC approach with a linearized coarse grid problem for the case when the period T is given [10] and of PP-PC-UP when T is unknown (3) (relative error ||U(k) − U(k−1)|| / ||U(k)|| versus iteration k)
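level is performed using a surrogate linear model, i.e., $\bar{\mathcal{G}}$ solves the problem (13) with $\bar{f}(U) = AU + c$ given by
$$A = \begin{bmatrix} -R_2/L & R_2/L & & \\ -1/R_2 & -x_C/\bar{U}_T & I_S/\bar{U}_T & (x_C - I_S)/\bar{U}_T \\ 0 & I_S/\bar{U}_T & -1/R_4 - x_E/\bar{U}_T & (x_E - I_S)/\bar{U}_T \\ 0 & y_C/\bar{U}_T & y_E/\bar{U}_T & -1/R_3 - 1/R_1 - y_E/\bar{U}_T - y_C/\bar{U}_T \end{bmatrix}, \qquad (17)$$
$$c = [0,\; U_{op}/R_2,\; 0,\; U_{op}/R_1]^\top, \qquad (18)$$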
with ŪT = 0.2585 V. The unit interval [0, 1] is split into N = 10 windows. The
coarse time step Δτ = 0.1 and the fine step δτ = 10−4 were chosen within the
time integration. The calculated period with the fixed point iteration (14) is T =
0.1125 ms. The right-hand part of Fig. 3 shows convergence of the PP-PC iteration
with the linearization from [10] for a given period T as well as for an unknown
period (PP-PC-UP). Both results are obtained up to the relative tolerance of 10−3 .
One can see that when T is known the method requires fewer iterations, as
one would expect. In terms of computational cost, measured by the number of
linear system solves, PP-PC and PP-PC-UP delivered the
periodic solution effectively 4 and 3 times faster than sequential time stepping,
respectively.
5 Conclusions
Acknowledgments The authors thank Roland Pulch from Universität Greifswald for his assis-
tance with implementation and for the fruitful discussions on the Colpitts oscillator model.
This research was supported by the Graduate School CE within the Centre for Computational
Engineering at Technische Universität Darmstadt, as well as by DFG grant SCHO1562/1-2 and
BMBF grant 05M2018RDA (PASIROM).
References
Abstract We discuss the general form of the transfer functions of linear lumped
circuits. We show that an arbitrary transfer function defined on such circuits has a
functional dependence on individual circuit parameters that is rational, with multi-
linear numerator and denominator. This result demonstrates that rational polynomial
chaos expansions provide more suitable models than standard polynomial chaos for
the uncertainty quantification of this class of circuits.
1 Introduction
The polynomial chaos expansion (PCE) method [6] has emerged in the macromodeling
and model-order reduction communities because of its remarkable accuracy
and efficiency in the uncertainty quantification of stochastic systems, including
electric and electronic circuits [3]. Stochastic output variables of interest are
approximated with a suitable polynomial model w.r.t. random input parameters,
from which statistical information is inexpensively extracted. While the method was
demonstrated to provide very high accuracy with a very limited expansion order
in many application scenarios, the modeling of resonant and/or distributed circuits
may require large orders, and the accuracy of the calculated PCE coefficients may
deteriorate because of the large variability of the outputs.
A rational polynomial chaos (RPC) model with tensor-product truncation was
recently introduced [4] and was shown to provide better performance, compared
to the conventional single PCE with total-degree truncation that is used in most
engineering applications, specifically in electrical engineering [3]. In this work,
we show that the general form of any transfer function defined for a linear
lumped circuit is rational w.r.t. both frequency and element values. Specifically,
both numerator and denominator are multi-linear functions of element values.
where s is the Laplace variable (complex frequency). In (1), the basis functions ϕ
are multivariate orthogonal polynomials in the uncertain variables ξ, and the
coefficients N and D are computed using a linearized and iteratively re-weighted
regression. It was empirically shown [4] that, for the uncertainty quantification
of electric circuits, the RPC (1) is more accurate than the standard PCE [3]. The
purpose of this work is to provide a rigorous justification.
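A small worked example may help to fix ideas. The one-port below, a resistor $R$ in series with a capacitor $C$, is a hypothetical illustration and is not taken from the text:

```latex
\begin{equation*}
Z(s; R, C) = R + \frac{1}{sC} = \frac{1 + sRC}{sC}.
\end{equation*}
```

The numerator contains the product $RC$ but no higher powers of $R$ or $C$, and the denominator is linear in $C$, so both are multi-linear in the element values. Regarded as a function of $C$ alone, $Z$ is rational and not polynomial, which suggests why a ratio of two low-order expansions can represent such a response with few terms.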
We review the basic modified nodal analysis (MNA) formulation [2] of lumped
linear time-invariant (LTI) circuits with RGLC components. The main objective
of this derivation is to reveal in explicit form the functional dependence on the
individual circuit parameters of any transfer function that can be defined on such
circuits.
Let us consider a lumped LTI P -port circuit with n nodes and b branches (one-
port elements). The branches are split into bR resistors with resistance Rk , bG
resistors with conductance Gk , bL inductors with inductance Lk , and bC capacitors
with capacitance Ck , where k is an index identifying individual components. We
distinguish between resistance-defined and conductance-defined resistors to allow
additive variations of either parameter. In addition, the last bJ = P branches are
Exactness of Rational Polynomial Chaos Formulation for the Uncertainty Quantification 25
assumed to represent the P ports of the structure. We place ideal current sources Jk
providing an excitation to the circuit, with the objective of characterizing the P × P
impedance matrix Z(s) in the Laplace domain by computing the corresponding port
voltages as outputs.
The branch voltage and current vectors $v, i \in \mathbb{R}^b$ are split according to element types as
\[ v = \left(v_R^T, v_G^T, v_L^T, v_C^T, v_J^T\right)^T, \qquad i = \left(i_R^T, i_G^T, i_L^T, i_C^T, i_J^T\right)^T, \]
where $v_\nu, i_\nu \in \mathbb{R}^{b_\nu}$ for $\nu \in \{R, G, L, C, J\}$, and where the passive sign convention is used for each branch, including sources. The branch characteristic equations are collectively written for each class of components as
Note that the current Jk of each source is incident into its positive node.
Circuit connectivity is described by the (reduced) incidence matrix $A \in \mathbb{R}^{(n-1) \times b}$, with the $n$-th node serving as reference for the definition of the set of nodal voltages $e \in \mathbb{R}^{n-1}$. The incidence matrix columns are partitioned according to the branch classes as
\[ A = \left(A_R, A_G, A_L, A_C, A_J\right). \tag{3} \]
\[ G x + C \frac{d}{dt} x = B u, \tag{4a} \]
\[ y = B^T x, \tag{4b} \]
which represents the standard MNA formulation. In (4), $u = J$ denotes the port currents, considered as inputs, $y = v_J$ denotes the corresponding port voltages,
26 P. Manfredi and S. Grivet-Talocia
Throughout this work, we denote with 0 an all-zero matrix or vector, whose size is
inferred from the context.
The so-called stamps of the individual circuit elements in the MNA system (4)
are now easily characterized. A straightforward derivation shows that
\[ G(\theta) = G_0 + \sum_{k=1}^{b_a} (p_k p_k^T)\, \theta_k, \qquad C(\zeta) = \sum_{k=1}^{b_d} (q_k q_k^T)\, \zeta_k, \tag{6} \]
where
• $b_a = b_R + b_G$ is the number of adynamic components with values collected in vector $\theta \in \mathbb{R}^{b_a}$, having elements $\{\theta_k\}_{k=1}^{b_a} = \{R_k\}_{k=1}^{b_R} \cup \{G_k\}_{k=1}^{b_G}$;
• $b_d = b_L + b_C$ is the number of dynamic components with values collected in vector $\zeta \in \mathbb{R}^{b_d}$, having elements $\{\zeta_k\}_{k=1}^{b_d} = \{L_k\}_{k=1}^{b_L} \cup \{C_k\}_{k=1}^{b_C}$;
• the constant vectors $p_k \in \mathbb{R}^m$ collect the sets $\{p_k\}_{k=1}^{b_a} = \{r_k\}_{k=1}^{b_R} \cup \{g_k\}_{k=1}^{b_G}$, individually defined as $r_k = \left(0, 1_{b_R,k}^T, 0\right)^T$ and $g_k = \left(a_{G,k}^T, 0, 0\right)^T$, where $1_{b_\nu,k}$ denotes the Euclidean basis vector in $\mathbb{R}^{b_\nu}$ with all vanishing elements except the $k$-th component equal to 1, and $a_{G,k}$ is the $k$-th column of $A_G$;
• the constant vectors $q_k \in \mathbb{R}^m$ collect the sets $\{q_k\}_{k=1}^{b_d} = \{l_k\}_{k=1}^{b_L} \cup \{c_k\}_{k=1}^{b_C}$, individually defined as $l_k = \left(0, 0, 1_{b_L,k}^T\right)^T$ and $c_k = \left(a_{C,k}^T, 0, 0\right)^T$, where $a_{C,k}$ is the $k$-th column of $A_C$;
• the constant matrix $G_0$ is defined as
\[ G_0 = \begin{pmatrix} 0 & A_R & A_L \\ -A_R^T & 0 & 0 \\ -A_L^T & 0 & 0 \end{pmatrix}. \tag{7} \]
For the uncertainty quantification problem to be well posed, we assume that the
circuit is well defined and uniquely solvable for all parameter configurations, i.e.,
$\exists\, s \in \mathbb{C}$ for which $\det(G(\theta) + sC(\zeta)) \neq 0$. Equivalently, the pencil $(G, C)$ is regular for any $\theta$, $\zeta$. We further consider a nominal parameter configuration $\theta = \bar{\theta}$ and
θ = θ̄ + ε, ζ = ζ̄ + δ, (8)
\[ G = G(\varepsilon) = \bar{G} + \sum_{k=1}^{b_a} (p_k p_k^T)\, \varepsilon_k, \qquad C = C(\delta) = \bar{C} + \sum_{k=1}^{b_d} (q_k q_k^T)\, \delta_k. \tag{9} \]
We see that both the static (G) and the dynamic (C) MNA matrices are expressed
as a finite sum of rank-one updates with respect to the nominal circuit formulation.
Each rank-one update pertains to a single individual stochastic circuit element. The
corresponding constant rank-one matrices $p_k p_k^T$ and $q_k q_k^T$ are recognized as the
standard MNA stamps of the various circuit elements.
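As a minimal sketch of how the MNA matrices are assembled from rank-one stamps, the following Python fragment builds the static and dynamic matrices for a hypothetical two-node circuit that contains only conductance-defined resistors and capacitors, so that the unknown vector reduces to the nodal voltages. The topology, the element values, and the helper names `stamp` and `assemble` are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical circuit (assumption): G1 between nodes 1 and 2, G2 from node 2
# to ground, C1 from node 2 to ground, C2 from node 1 to ground.

def stamp(a):
    """Rank-one stamp a a^T for a single incidence column a."""
    return [[ai * aj for aj in a] for ai in a]

def assemble(columns, values):
    """Sum of rank-one stamps, each scaled by its element value."""
    n = len(columns[0])
    M = [[0.0] * n for _ in range(n)]
    for a, value in zip(columns, values):
        S = stamp(a)
        for i in range(n):
            for j in range(n):
                M[i][j] += S[i][j] * value
    return M

# Incidence columns of the reduced incidence matrix (ground node removed).
A_G = [[1, -1], [0, 1]]   # G1: node 1 -> node 2, G2: node 2 -> ground
A_C = [[0, 1], [1, 0]]    # C1: node 2 -> ground, C2: node 1 -> ground

G_vals = [2.0, 0.5]       # conductances in siemens (illustrative)
C_vals = [1e-9, 3e-9]     # capacitances in farads (illustrative)

G = assemble(A_G, G_vals)
C = assemble(A_C, C_vals)
print(G)  # [[2.0, -2.0], [-2.0, 2.5]]
print(C)  # [[3e-09, 0.0], [0.0, 1e-09]]
```

Perturbing a single element value changes the assembled matrix only through the corresponding stamp, which is the rank-one update structure described above.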
Let us now consider the Laplace-domain solution of (4), which in the present
case corresponds to the impedance matrix of the considered P -port element and
reads
\[ Z(s; \xi) = B^T \left[ G(\varepsilon) + s\, C(\delta) \right]^{-1} B = \frac{N(s; \xi)}{D(s; \xi)}, \tag{10} \]
\[ \Xi = \mathrm{diag}(\xi_1, \dots, \xi_d), \tag{11} \]
which we use to cast the MNA matrix in the compact form, by restating (9) as
where the constant matrices P and Q collect as columns all the vectors pk and q k ,
respectively. From now on, we will omit the dependence on the Laplace variable s,
since we are interested in the dependence on the stochastic variables ξ .
We introduce two useful lemmas:
Lemma 1 Given a square invertible matrix X and two matrices U , V of compatible
size, we have
The above Lemma 1 is known as matrix determinant lemma, see [1] for a proof.
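Lemma 2 Let a matrix $W \in \mathbb{R}^{n \times n}$ have elements in the form $W_{ij} = F_{ij} + \xi_i B_{ij}$, where $F_{ij}$, $B_{ij}$ are constants for $i, j = 1, \dots, n$, and $\xi_i$ are independent parameters. Then,
\[ \det(W) = \sum_{k} \beta_k \prod_{\ell=1}^{n} \xi_\ell^{\alpha_{k\ell}}, \tag{14} \]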
Expanding det(W ) using Laplace’s formula along the first row, we get
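\[ \det(W) = \sum_{j=1}^{n} (-1)^{1+j} \left(F_{1j} + \xi_1 B_{1j}\right) M_{1j}, \tag{15} \]
\[ M_{1j} = \sum_{k} \beta_k \prod_{\ell=2}^{n} \xi_\ell^{\alpha_{k\ell}}, \qquad \alpha_{k\ell} \in \{0, 1\} \;\; \forall k, \ell. \tag{16} \]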
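\[ \begin{aligned} \det(W) &= \sum_{j=1}^{n} (-1)^{1+j} \left(F_{1j} + \xi_1 B_{1j}\right) \sum_{k} \beta_k \prod_{\ell=2}^{n} \xi_\ell^{\alpha_{k\ell}} \\ &= \sum_{k} \sum_{j=1}^{n} (-1)^{1+j} \left[ F_{1j}\, \beta_k \prod_{\ell=2}^{n} \xi_\ell^{\alpha_{k\ell}} + B_{1j}\, \beta_k\, \xi_1 \prod_{\ell=2}^{n} \xi_\ell^{\alpha_{k\ell}} \right] = \sum_{k} \hat{\beta}_k \prod_{\ell=1}^{n} \xi_\ell^{\alpha_{k\ell}}, \end{aligned} \]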
where both $B = S \bar{Y}^{-1} U$ and $\det \bar{Y}$ depend only on $s$ and are thus constant with respect to the stochastic parameters $\xi$. We have
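\[ I + \Xi B = \begin{pmatrix} 1 + \xi_1 B_{11} & \xi_1 B_{12} & \cdots & \xi_1 B_{1n} \\ \xi_2 B_{21} & 1 + \xi_2 B_{22} & \cdots & \xi_2 B_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \xi_n B_{n1} & \xi_n B_{n2} & \cdots & 1 + \xi_n B_{nn} \end{pmatrix}. \]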
\[ \det\left(I + \Xi B\right) = \sum_{k} \beta_k \prod_{\ell=1}^{n} \xi_\ell^{\alpha_{k\ell}} \tag{19} \]
\[ D(s; \xi) = \sum_{k} d_k(s) \prod_{\ell=1}^{n} \xi_\ell^{\alpha_{k\ell}}, \qquad \alpha_{k\ell} \in \{0, 1\} \;\; \forall k, \ell. \tag{20} \]
Due to the lumped nature of the system under consideration, the coefficients dk (s)
are polynomials in s of degree up to the dynamic order N of the circuit.
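The multi-linear structure of the determinant can be checked numerically. The sketch below is illustrative and not part of the original derivation: it evaluates det(I + ΞB) exactly with rational arithmetic for an arbitrary constant matrix B and confirms that the second finite difference with respect to each individual ξ_i vanishes, as expected for a function that is affine in every single variable. The matrix entries, the sample point, and the helper names are assumptions chosen for the example.

```python
from fractions import Fraction

def det(M):
    """Exact determinant via Gaussian elimination with row pivoting."""
    A = [[Fraction(v) for v in row] for row in M]
    n = len(A)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result          # a row swap flips the sign
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return result

def det_I_plus_XiB(xi, B):
    """det(I + Xi B) with Xi = diag(xi): row i of B is scaled by xi[i]."""
    n = len(B)
    W = [[(1 if i == j else 0) + xi[i] * B[i][j] for j in range(n)]
         for i in range(n)]
    return det(W)

B = [[2, -1, 3], [4, 5, -2], [1, 7, 6]]                # arbitrary constants
xi0 = [Fraction(1, 3), Fraction(-2), Fraction(5, 7)]   # arbitrary sample point

def second_difference(i, step=Fraction(1)):
    """f(x + h) - 2 f(x) + f(x - h) along coordinate i; zero if f is affine in xi_i."""
    plus, minus = list(xi0), list(xi0)
    plus[i] += step
    minus[i] -= step
    return (det_I_plus_XiB(plus, B) - 2 * det_I_plus_XiB(xi0, B)
            + det_I_plus_XiB(minus, B))

second_differences = [second_difference(i) for i in range(len(B))]
print(second_differences)
```

Exact rational arithmetic is used so that the vanishing second differences are not blurred by floating-point rounding.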
The same arguments used for the denominator D(s; ξ ) can be seamlessly adopted
to show that also the elements of the numerator matrix N(s; ξ ) in (10) have the same
4 An Illustrative Example
We consider the filter of Fig. 1 (left), which is designed to exhibit both a band-
and a high-pass behavior. All 9 circuit elements are uncertain, with inductances and
capacitances having independent Gaussian variations with a 20% standard deviation
around the nominal values indicated in the schematic.
The right panel in Fig. 1 shows the variability of the insertion loss of the filter.
The gray lines are a subset of random samples from a reference Monte Carlo (MC)
simulation with 10,000 runs, whereas the solid blue line is the standard deviation
of the MC samples. The dashed red and green lines are the standard deviations
obtained with a conventional PCE having a maximum total degree of three, and
Fig. 1 Left: filter schematic (element values readable in the schematic: 92 nH, 116.6 pF, 250 pF, 73.6 pF twice, 146 nH twice). Right: variability of the insertion loss of the filter. Gray lines: MC samples; solid blue, dashed red, and dashed green lines: standard deviation obtained with MC, conventional PCE, and proposed RPC methods, respectively
5 Conclusions
This work presented a formal derivation that any frequency-domain transfer function defined on linear lumped circuits is a rational function with multi-linear dependence on the circuit element values. This result provides a rigorous motivation for using a Rational Polynomial Chaos (RPC) model for the uncertainty
quantification of the frequency-domain responses of electrical circuits, and more
generally of electromagnetic systems. Our findings are illustrated based on a lumped
filter example.
While a first-order tensor-product truncation provides an exact model for lumped
circuits, a more compact total-degree truncation (possibly of higher order) can be
used to improve the efficiency, especially for applications in which the exactness
no longer holds. This is the case, for example, of distributed, electromagnetic,
and/or photonic systems. We are also currently investigating a compression strategy,
based on principal component analysis, that avoids having to optimize the model
coefficients separately for each frequency.
References
1. D.A. Harville, Matrix Algebra From a Statistician’s Perspective (Springer, New York, 1997)
2. C.-W. Ho, A. Ruehli, P. Brennan, The modified nodal approach to network analysis. IEEE Trans.
Circ. Syst. 22(6), 504–509 (1975)
3. A. Kaintura, T. Dhaene, D. Spina, Review of polynomial chaos-based methods for uncertainty
quantification in modern integrated circuits. Electronics 7(3), 1–21 (2018)
4. P. Manfredi, S. Grivet-Talocia, Rational polynomial chaos expansions for the stochastic
macromodeling of network responses. IEEE Trans. Circ. Syst. I Reg. Papers 67(1), 225–234
(2020)
5. J. Vlach, K. Singhal, Computer Methods for Circuit Analysis and Design (Wiley, New York,
1983)
6. D. Xiu, Fast numerical methods for stochastic computations: a review. Commun. Comput. Phys.
5(2–4), 242–272 (2009)
Parallel-in-Time Simulation of Power
Converters Using Multirate PDEs
Abstract This paper presents a numerical algorithm for the simulation of pulse-
width modulated power converters via parallelization in time domain. The method
applies the multirate partial differential equation approach on the coarse grid of
the (two-grid) parallel-in-time algorithm Parareal. Performance of the proposed
approach is illustrated via its application to a DC-DC converter.
1 Introduction
Fig. 1 Power converter model with pulsed voltage source: (a) Circuit of a simplified buck converter. Transistor switching is modeled as pulsed voltage source. (b) PWM generated pulsed voltage $v_i$ (V) versus time (ms)
Fig. 2 Exemplary solution of the buck converter depicted in Fig. 1a: current $i_L$ (A) and voltage $v_C$ (V) versus time (ms). Switching frequency $f_s = 1/T_s = 5$ kHz
The paper is organized as follows: first we introduce our model problem with
pulsed excitation in Sect. 2, then in Sect. 3 the Parareal method is summarized,
Sect. 4 proposes the usage of MPDEs as coarse propagators for Parareal that can
deal with pulsed right-hand sides and finally Sect. 5 discusses a numerical example
before concluding the paper.
lower output voltage. It consists of a part that generates a pulsed voltage vi and
a filter circuit. The latter is shown in Fig. 1a. The pulsed voltage, see Fig. 1b, is
often generated using PWM. Important quantities defining the pulsed signal are the
switching period Ts and the duty cycle D which is the relation between the “on”-
time and the switching period. Given a reference signal r(t) and a carrier signal s(t)
the pulsed voltage is generated by
\[ v_i(t) = \frac{V_i}{2} \Big( \mathrm{sgn}\big(r(t) - s(t)\big) + 1 \Big), \tag{1} \]
where sgn denotes the sign function and Vi is the amplitude. The converter circuit
is mathematically described by a system of ordinary differential or differential-
algebraic equations, e.g.,
\[ A \frac{d}{dt} x(t) + B\, x(t) = c(t), \quad t \in (t_0, T], \tag{2} \]
with given initial value $x(t_0) = x_0$, where $x(t) \in \mathbb{R}^{N_s}$ is the unknown solution vector consisting, for example, of currents and voltages, $A, B \in \mathbb{R}^{N_s \times N_s}$ are matrices, and $c(t) \in \mathbb{R}^{N_s}$ is the right-hand side containing current and voltage sources, e.g., the pulsed voltage $v_i(t)$. The system may be assembled from lumped element descriptions based on loop or (modified) nodal analysis as described in [2]. Note that we focus on the linear case, but the approach can be straightforwardly generalized, e.g., by considering $B = B(x)$.
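The PWM signal defined in (1) can be sketched in a few lines of Python. The fragment below assumes a sawtooth carrier rising from 0 to 1 in each switching period and a constant reference equal to the duty cycle; the amplitude and switching frequency are chosen to resemble the figures, while the duty cycle of 0.7 and all function names are hypothetical choices for illustration only.

```python
def sgn(x):
    """Sign function: -1, 0, or +1."""
    return (x > 0) - (x < 0)

def sawtooth_carrier(t, Ts):
    """Sawtooth carrier s(t) rising from 0 to 1 in each period (assumed shape)."""
    return (t / Ts) % 1.0

def pwm_voltage(t, Vi, Ts, duty):
    """v_i(t) = Vi/2 * (sgn(r(t) - s(t)) + 1) with constant reference r(t) = duty."""
    return 0.5 * Vi * (sgn(duty - sawtooth_carrier(t, Ts)) + 1)

Vi, Ts, duty = 100.0, 1.0 / 5000.0, 0.7   # duty cycle 0.7 is a hypothetical value
samples = 1000

# Midpoint sampling over one switching period; the mean approaches duty * Vi.
mean_voltage = sum(pwm_voltage((k + 0.5) * Ts / samples, Vi, Ts, duty)
                   for k in range(samples)) / samples
print(mean_voltage)
```

Under these assumptions the average of the pulsed voltage over one period equals the duty cycle times the amplitude, which is the slowly varying (DC) part that a coarse time integrator can still resolve.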
The solution of power converters, as for example the one shown in Fig. 2,
exhibits the multirate phenomenon: slow variations in the solution require large time
intervals, i.e., a large end time point T , while the fast dynamics due to the switching
enforce small time steps. This is the motivation to turn to (parallel) methods that can
exploit this multirate behavior. In the following, we focus on the settling process
until the steady state is reached. If one is interested only in the latter, then other
methods may also be used, for example the application of Parareal for time-periodic
problems is a natural generalization of this work, see, e.g., [5].
3 Parareal Algorithm
The solution operator F is assumed to deliver a very accurate solution (e.g., using a
numerical time-integration method with small time steps δT ) and can be executed in
parallel, while G gives rough information about the solution using a cheap method
(e.g., using a numerical method with large time steps ΔTi = Ti+1 − Ti ) and has to
be calculated sequentially, cf. (4).
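For orientation, the interplay of the two propagators can be sketched on a scalar test problem. The fragment below assumes the standard Parareal update of [7], in which the new value at the end of a time window is the fine propagation of the old iterate plus the coarse propagation of the new iterate minus the coarse propagation of the old iterate. The test equation dx/dt = -x, the implicit Euler propagators, and all parameter values are illustrative assumptions rather than the setup used in this paper.

```python
def implicit_euler(x, t_start, t_end, steps, lam=-1.0):
    """Integrate dx/dt = lam * x from t_start to t_end with implicit Euler."""
    dt = (t_end - t_start) / steps
    for _ in range(steps):
        x = x / (1.0 - lam * dt)
    return x

def fine(x, t0, t1):
    return implicit_euler(x, t0, t1, steps=100)   # accurate, parallel over windows

def coarse(x, t0, t1):
    return implicit_euler(x, t0, t1, steps=1)     # cheap, sequential

def parareal(x0, T, windows, iterations):
    """Window-boundary values after the given number of Parareal iterations."""
    t = [T * i / windows for i in range(windows + 1)]
    U = [x0]
    for i in range(windows):                      # initial sequential coarse sweep
        U.append(coarse(U[i], t[i], t[i + 1]))
    for _ in range(iterations):
        # Propagations of the previous iterate (independent across windows).
        F_old = [fine(U[i], t[i], t[i + 1]) for i in range(windows)]
        G_old = [coarse(U[i], t[i], t[i + 1]) for i in range(windows)]
        U_new = [x0]
        for i in range(windows):                  # sequential correction sweep
            G_new = coarse(U_new[i], t[i], t[i + 1])
            U_new.append(F_old[i] + G_new - G_old[i])
        U = U_new
    return U

def sequential_fine(x0, T, windows):
    """Reference: fine propagator applied window after window."""
    t = [T * i / windows for i in range(windows + 1)]
    U = [x0]
    for i in range(windows):
        U.append(fine(U[i], t[i], t[i + 1]))
    return U

windows = 10
reference = sequential_fine(1.0, 2.0, windows)
errors = []
for k in range(windows + 1):
    U = parareal(1.0, 2.0, windows, k)
    errors.append(max(abs(u - r) for u, r in zip(U, reference)))
print(errors)
```

In this sketch the error with respect to the sequential fine solution shrinks with the iteration count and reaches rounding level once the number of iterations equals the number of windows, at which point no speed-up remains; a useful coarse propagator therefore has to give convergence in far fewer iterations.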
A difficulty in applying Parareal to solve problems with PWM input is that a
naive implementation of a coarse propagator using a time-integrator with large
time steps will not capture the high-frequency dynamics and may also fail to
propagate low-frequency components. A modified Parareal algorithm which still
approximately captures the high-frequency behavior was introduced in [4]. The
idea is to separate the high-frequency (pulsed) components from the low-frequency
components, i.e.,
\[ A \frac{d}{dt} x(t) + B\, x(t) = \underbrace{\bar{c}(t) + \tilde{c}(t)}_{=c(t)}, \tag{5} \]
where $\bar{c}$ can be given as the sum of a few low-frequency sinusoids from a (fast) Fourier transform and $\tilde{c}(t) := c(t) - \bar{c}(t)$ is the remainder. This allows us to define a reduced coarse propagator $\bar{G}_{\mathrm{fft}}$ which solves
\[ A \frac{d}{dt} x(t) + B\, x(t) = \bar{c}(t) \tag{6} \]
and gives rise to a modified Parareal update formula with coarse propagator $\bar{G}_{\mathrm{fft}}$ in (3)–(4). This modified method converges reliably but possibly with reduced order [4]. In this paper we propose an alternative method to perform time integration by using the MPDE approach as the coarse propagator.
4 Multirate PDEs
The coarse solution in Parareal is obtained with the multirate partial differential equation (MPDE) approach [1]. For the given problem the solution can be conveniently decomposed into a slowly varying envelope and fast periodically varying ripples using the solution expansion [8]
\[ \widehat{x}_j(t_1, t_2) = \sum_{k=1}^{N_p} y_{j,k}(t_1)\, w_k(\tau(t_2)) = w^T(\tau(t_2))\, y_j(t_1), \tag{7} \]
where $y_{j,k}(t_1)$ are slowly varying coefficients and $w_k(\tau(t_2))$ are a finite set of basis functions ($k = 1, \dots, N_p$) whose periodicity is accounted for by the relative time $\tau(t_2) = \frac{t_2}{T_s} \bmod 1$. Its application to (2) yields
\[ A \left( \frac{\partial \widehat{x}(t_1, t_2)}{\partial t_1} + \frac{\partial \widehat{x}(t_1, t_2)}{\partial t_2} \right) + B\, \widehat{x}(t_1, t_2) = \widehat{c}(t_1, t_2), \tag{8} \]
where the relations between the original (2) and the MPDE (8) solution and right-hand side are given by
\[ \widehat{x}(t, t) = x(t), \qquad \widehat{c}(t, t) = c(t). \tag{9} \]
This implies that if a solution to (8) is found, the solution of (2) can be extracted from it. A suitable multitime source $\widehat{c}(t_1, t_2)$ has to be supplied by the user. The method converges to the correct solution for any choice that fulfills (9), but an unsuitable choice may not be more efficient than conventional time stepping. A suitable choice for PWM excitations is discussed in Sect. 5. Now, applying a Galerkin approach along the fast time scale $t_2$ leads to the enlarged equation system
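\[ \mathcal{A} \frac{dy}{dt_1} + \mathcal{B}\, y(t_1) = \mathcal{C}(t_1), \tag{10} \]
where
\[ \mathcal{A} = A \otimes J, \quad \text{with} \quad J = T_s \int_0^1 w(\tau)\, w^T(\tau)\, d\tau, \]
\[ \mathcal{B} = B \otimes J + A \otimes Q, \quad \text{with} \quad Q = -\int_0^1 \frac{\partial w(\tau)}{\partial \tau}\, w^T(\tau)\, d\tau, \]
\[ \mathcal{C} = \int_0^{T_s} \widehat{c}(t_1, t_2) \otimes w(\tau(t_2))\, dt_2. \]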
Suitable basis functions, which can well represent the ripples in the power converter
solution, are, e.g., B-Splines with suitable continuity or the PWM basis functions
[6]. The latter are global polynomial ansatz functions with $w_1(\tau, D) = 1$, $w_2(\tau, D)$ piecewise linear, and $w_k(\tau, D)$ obtained recursively by integrating $w_{k-1}(\tau, D)$ and orthonormalizing for $3 \le k \le N_p$, see Fig. 3. It has been shown in [8] that they are capable of very effectively representing the ripples in linear problems.
Finally, Eq. (10) can be time-stepped along t1 by using much larger time steps
than are needed to solve (2) since the fast variations are taken into account by the
basis functions. The accuracy of the solution (reconstructed using (7)) increases with
Np . However increasing Np also makes each time step of an implicit method more
38 A. Pels et al.
Fig. 3 Construction of basis functions with cusp at relative switching time $D$: (a) PWM basis functions $w_1, \dots, w_4$ on relative time interval and (b) right-hand side
costly since an enlarged linear equation system has to be solved. Nevertheless, even with very few basis functions the reconstructed solution can be expected to capture the main features of the exact solution. This motivates the introduction of another coarse propagator $\bar{G}_{\mathrm{mpde}}$ in Parareal, which solves (10) and afterwards extracts the single-time solution according to (7).
5 Numerical Experiments
The proposed approach is applied to the academic example of the buck converter
(see Fig. 1a). Its circuit is described by the IVP (2) given by
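\[ A = \begin{pmatrix} L & 0 \\ 0 & C \end{pmatrix}; \quad B = \begin{pmatrix} R_L & 1 \\ -1 & 1/R \end{pmatrix} \quad \text{and} \quad c(t) = \begin{pmatrix} v_i(t) \\ 0 \end{pmatrix}, \tag{11} \]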
Fig. 4 Convergence of Parareal towards the sequential (reference) solution using different coarse propagators (PWM, DC, MPDE 1, MPDE 3) for the given example: maximal relative $l_2$ error versus number of iterations $k$
Parareal [4]); 2. $\bar{G}_{\mathrm{mpde}}$, which solves (2) using the MPDE approach with $N_p = 1$ and $N_p = 3$ with the right-hand side $\widehat{c}(t_1, t_2) = c(t_2)$.
The maximal relative $l_2$ error w.r.t. the sequential (reference) solution $x_{\mathrm{seq}}(t)$ is depicted in Fig. 4 for all the considered approaches (PWM, DC, MPDE 1, MPDE 3) at iteration $k$. The conventional Parareal converges for our test case (11) up to a relative error of $10^{-6}$ in 9 iterations, which is remarkable since the time step of the coarse propagator does not resolve the dynamics of the PWM input and also violates the smoothness assumptions, see [4] for details.
This method requires 2 700 and 360 sequential solutions of linear algebraic
systems of size Ns = 2 on the fine and the coarse levels, respectively, or 3 060
linear systems in total. By the number of sequential solves we mean the number
of solver calls which cannot be carried out in parallel (communication costs are
neglected). The approaches using the DC component and the MPDE approach with
Np = 1 both required 8 iterations (2 400 fine and 320 coarse solves, or in total
2 720 solutions of linear systems in 2 variables). Finally, the MPDE approach with
Np = 3 basis functions on the coarse level converged after 7 iterations, thereby
solving 2 100 linear systems of size Ns = 2 on the fine level and 280 linear systems
of size $N_s \times N_p = 6$ on the coarse level. One observes that the conventional coarse propagator consistently requires roughly one to two more Parareal iterations than the MPDE approach with $N_p = 3$ to obtain the same accuracy, e.g., MPDE 3 needs 3 instead of 5 iterations for an error of $4 \times 10^{-4}$.
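The solve counts quoted above follow from simple bookkeeping. The sketch below assumes, as implied by the reported totals, 300 sequential fine solves and 40 sequential coarse solves per Parareal iteration; the function name and the labels are illustrative.

```python
FINE_PER_ITERATION = 300    # inferred from 2 700 fine solves in 9 iterations
COARSE_PER_ITERATION = 40   # inferred from 360 coarse solves in 9 iterations

def sequential_solves(iterations):
    """Return (fine, coarse, total) numbers of sequential linear solves."""
    fine = FINE_PER_ITERATION * iterations
    coarse = COARSE_PER_ITERATION * iterations
    return fine, coarse, fine + coarse

for label, iterations in [("PWM", 9), ("DC / MPDE 1", 8), ("MPDE 3", 7)]:
    print(label, sequential_solves(iterations))
```

Note that the totals count solver calls only: the coarse systems of the MPDE approach with three basis functions have six unknowns instead of two, so each of its coarse solves is somewhat more expensive.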
From Fig. 4 we see that Parareal with coarse propagator $\bar{G}_{\mathrm{mpde}}$ using a constant basis function, i.e., $N_p = 1$, and the modified Parareal with $\bar{G}_{\mathrm{fft}}$ using only the DC excitation perform very similarly (if not identically). This resemblance is not surprising since the MPDE approach with $N_p = 1$ computes only the envelope of the solution, which is conceptually similar to the modified Parareal with a smooth (in this case constant) coarse input. Finally, further tests show that using more basis functions ($N_p > 3$) does not improve the convergence of Parareal; the results are similar to the case $N_p = 3$.
6 Conclusions
Acknowledgments The authors thank Ruth Vazquez Sabariego from KU Leuven for many fruitful
discussions on the MPDE approach. This research was supported by the Graduate School CE
within the Centre for Computational Engineering at Technische Universität Darmstadt, as well as
by DFG grant SCHO1562/1-2 and BMBF grant 05M2018RDA.
References
6. J. Gyselinck, C. Martis, R.V. Sabariego, Using dedicated time-domain basis functions for the
simulation of pulse-width-modulation controlled devices – application to the steady-state regime
of a buck converter, in Electromotion 2013, 2013
7. J.L. Lions, Y. Maday, G. Turinici, A parareal in time discretization of PDEs. Comp. Rend. de
l’Académie des Sci. – Ser. I – Math. 332(7), 661–668 (2001)
8. A. Pels, J. Gyselinck, R.V. Sabariego, S. Schöps, Efficient simulation of DC-DC switch-
mode power converters by multirate partial differential equations. IEEE J. Multiscale Multiphys.
Comput. Tech. 4(1), 64–75 (2019)
Part II
Device Simulation
A Maximum Principle for Drift-Diffusion
Equations and the Scharfetter-Gummel
Discretization
Abstract The solution of the drift-diffusion equation does not satisfy a maximum
principle in general. Here it is shown that a maximum principle can be established
for the so-called Slotboom variable, which permits statements on uniqueness,
stability, and positivity. This maximum principle is preserved for the discretized
system obtained by the Scharfetter-Gummel scheme.
1 Introduction
A maximum principle states that the solution of certain partial differential equations attains its maximum on the boundary of the solution domain. Usually, if a maximum principle holds, a corresponding minimum principle also follows by straightforward arguments (such as changing signs). The maximum principle implies several properties, such as uniqueness and stability of solutions. Furthermore, it often ensures positivity of physical quantities, e.g., electron and hole densities in semiconductors, where negative values would be non-physical.
A drift-diffusion equation, used e.g. to model the transport of electrons and holes in a semiconductor, does not exhibit a maximum principle for the densities in general. The densities in semiconductor devices may vary by orders of magnitude due to huge differences in doping concentrations. On the other hand, the drift-diffusion equation ensures the positivity of densities. In a numerical scheme this positivity should still be guaranteed, which is indeed the case for the Scharfetter-Gummel discretization [1]. However, the situation looks different if one considers the Slotboom variable instead of the density.
A maximum principle for Fermi potentials, which implies the maximum principle for the Slotboom variables, was shown in [2] for static drift-diffusion equations without recombination term. An estimate for the Slotboom variables in the stationary semiconductor equations (van Roosbroeck system) was established by Markowich [3, Theorem 3.2.1]. There, the bound depends on the Dirichlet boundary values, but does not yield a maximum principle as presented by us in Theorem 1. A discretized version of this estimate for a Scharfetter-Gummel finite volume scheme can be found in [4]. An estimate for the time-dependent van Roosbroeck system is given in [5], where it is shown that the solution of a Scharfetter-Gummel finite volume scheme is bounded by constants depending only on boundary and initial data.
The results in this paper were motivated by the investigation of extended drift-diffusion equations (see e.g. [6]) containing a time derivative of the flux and the convective derivative. These equations are hyperbolic and therefore do not satisfy a maximum principle. Thus it is a challenge to ensure positivity of densities for the original equation as well as for discretizations. However, the study of the drift-diffusion equation and the Scharfetter-Gummel discretization led to a maximum principle with improved bounds for continuous solutions. This maximum principle and its implications are presented in Sect. 2. In Sect. 3 we show how this maximum principle is preserved by the Scharfetter-Gummel scheme.
where $n$ is the unknown particle density, $\phi$ is a potential, and $d$ is a source term (e.g. for carrier generation and recombination in semiconductors). Here we have used, without loss of generality, a scaling of quantities and equations which simplifies the formulation. In the sequel $\Omega \subset \mathbb{R}^n$ will be an open domain, $\bar{\Omega}$ denotes its closure and $\partial\Omega = \bar{\Omega} \setminus \Omega$ is the boundary. We split the boundary into the closed Dirichlet boundary $\Gamma_D \subset \partial\Omega$, with $\Gamma_D \neq \emptyset$, and the Neumann boundary $\Gamma_N = \partial\Omega \setminus \Gamma_D$. We consider the Dirichlet boundary conditions
takes into account changes of the potential. For the equilibrium one has e.g. a
constant Slotboom variable. In particular, in terms of the Slotboom variable the flux
is written as
such that (1) becomes an elliptic problem in terms of ñ(x), which allows us to
formulate a maximum principle for the Slotboom variable. One formulation could
be obtained from [8, Theorem A.1], however we will present sharper bounds here.
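Theorem 1 Let $n : \Omega \to \mathbb{R}$ be a continuous solution of (1) on the open domain $\Omega \subset \mathbb{R}^n$. For the Neumann boundary conditions (3) we require $h(x) \le 0$, $x \in \Gamma_N$. If $d(n, x) \ge 0$ for $n \ge n_0(x)$, $x \in \Omega$, then the Slotboom variable $\tilde{n}(x)$ satisfies
\[ \tilde{n}(x) \le \max\left\{ n_0(x)\, e^{-\phi(x)},\; \max_{y \in \Gamma_D} \tilde{n}(y) \right\}, \quad x \in \Omega. \tag{6} \]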
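Proof Let us first assume that the maximum is not attained in $\Gamma_D$ but in $x^* \in \Omega$ and $n(x^*) > n_0(x)$. For any sufficiently small $\varepsilon > 0$ one has $n(x) > n_0(x)$ for $x \in B_\varepsilon(x^*) \subset \Omega$, where $B_\varepsilon(x^*) := \{y \in \mathbb{R}^n : \|y - x^*\| < \varepsilon\}$ is the $\varepsilon$-ball around $x^*$. Furthermore, from $\tilde{n}(x) \le \tilde{n}(x^*)$ it follows that $\nu^T \nabla \tilde{n}(x) \le 0$ for $x \in \partial B_\varepsilon(x^*)$ with $\varepsilon > 0$ sufficiently small. Without loss of generality we can further assume that $\nu^T \nabla \tilde{n}(x) < 0$ on a subdomain of $\partial B_\varepsilon(x^*) \subset \Omega$. Then we obtain from the divergence theorem that
\[ 0 \le \int_{B_\varepsilon(x^*)} d(n(x), x)\, dx = \int_{B_\varepsilon(x^*)} \nabla^T \left( e^{\phi(x)} \nabla \tilde{n}(x) \right) dx = \oint_{\partial B_\varepsilon(x^*)} e^{\phi(x)}\, \nu^T \nabla \tilde{n}(x)\, dS < 0, \]
which is a contradiction.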
where $p$ is the hole density, $N_{\mathrm{intr}}$ the intrinsic density, and $r(n, p)$ is a model-dependent positive factor. That is, the assumptions of Theorem 1 are satisfied with $n_0(x) = N_{\mathrm{intr}}^2 / p(x)$.
In the complete semiconductor model $n$, $p$, and $\phi$ are solutions of the van Roosbroeck system, consisting of two drift-diffusion equations for electrons and holes as well as the Poisson equation for the potential. However, the assumption of arbitrary $p$ and $\phi$ might be justified in a numerical scheme, e.g., the Gummel iteration.
A stability result follows immediately from the above theorem.
Corollary 1 Let $n_i : \Omega \to \mathbb{R}$, $i = 1, 2$, be continuous solutions of (1), which both fulfill Neumann boundary conditions (3) for the same arbitrary $h(x)$. If $d(n, x)$ is monotonically increasing with respect to $n$, then the Slotboom variables satisfy
\[ \left| \tilde{n}_1(x) - \tilde{n}_2(x) \right| \le \max_{y \in \Gamma_D} \left| \tilde{n}_1(y) - \tilde{n}_2(y) \right|, \quad x \in \Omega. \]
Proof Let us first assume that the maximum is attained in $x^* \in \Omega$. Without loss of generality we assume that $\tilde{n}_1(x^*) > \tilde{n}_2(x^*)$. Analogously to the proof of Theorem 1 we conclude
\[ 0 \le \int_{B_\varepsilon(x^*)} \big( d(n_1(x), x) - d(n_2(x), x) \big)\, dx = \oint_{\partial B_\varepsilon(x^*)} e^{\phi(x)}\, \nu^T \nabla \big( \tilde{n}_1(x) - \tilde{n}_2(x) \big)\, dS < 0, \]
which is a contradiction.
If we assume that the maximum is attained in $x^* \in \Gamma_N$ the proof is analogous using that
\[ \nu^T \nabla \big( \tilde{n}_1(x) - \tilde{n}_2(x) \big) = 0, \quad x \in \Gamma_N. \]
The above corollary states that small distortions of the Dirichlet boundary
conditions will result only in small distortions of the solution. Although it is a
statement for the Slotboom variable it provides also a stability statement for the
densities based on relation (4). As a particular result we obtain here the uniqueness
of the boundary value problem for the drift-diffusion equation.
For the time-dependent drift-diffusion equation
\[ \frac{\partial}{\partial t} n(x, t) = \nabla^T \big( \nabla n(x, t) - n(x, t) \nabla \phi(x, t) \big) - d\big( n(x, t), x, t \big), \tag{7} \]
the situation is more involved. However, we can still show the positivity of the
densities, under suitable assumptions.
Theorem 2 Let n : Ω × (0, T ) → R be a continuous solution of (7) on the open
domain Ω × (0, T ). We require positive Dirichlet (2) and nonnegative Neumann (3)
boundary conditions, i.e., g(x, t) > 0 and h(x, t) ≥ 0, as well as positive initial
conditions n(x, 0) > 0, x ∈ Ω. If there is a δ > 0 with d(n, x, t) ≤ 0, x ∈ Ω ∪ ΓN ,
t ∈ (0, T ), n ≤ δ, then
Proof We assume there is a $(x^*, t^*)$ such that $n(x^*, t^*) = 0$, while $n(x, t) > 0$ for $t < t^*$, $x \in \Omega$. As in the proof of Theorem 1 we conclude
\[ \int_{B_\varepsilon(x^*) \cap \Omega} \Big( \nabla^T \big( e^{\phi(x, t^*)} \nabla \tilde{n}(x, t^*) \big) - d\big( n(x, t^*), x, t^* \big) \Big)\, dx > 0 \]
for any sufficiently small $\varepsilon > 0$. Here we have assumed without loss of generality that $\tilde{n}(x, t^*)$ attains positive values in any neighborhood of $x^*$, which is admissible due to $g(x, t^*) > 0$. This implies that $\frac{\partial}{\partial t} n(x^*, t^*) > 0$, i.e., there is a $t < t^*$ with $n(x^*, t) < 0$, which is a contradiction.
50 K. Bittner et al.
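where $\phi_{ij} = \phi(x_j) - \phi(x_i)$ and $B(x) := \frac{x}{e^x - 1}$ is the Bernoulli function. That is, we have indeed obtained the Scharfetter-Gummel discretization (with an error of order $h_{ij}^2$). In terms of the Slotboom variables this becomes
\[ J_{ij} = \frac{1}{h_{ij}} B(-\phi_{ij}) \left( n(x_j)\, e^{-\phi_{ij}} - n(x_i) \right) = \underbrace{\frac{1}{h_{ij}} B(-\phi_{ij})\, e^{\phi(x_i)}}_{w_{ij}} \big( \tilde{n}(x_j) - \tilde{n}(x_i) \big), \]
with
\[ w_{ij} = w_{ji} = \frac{1}{h_{ij}}\, \frac{\phi(x_j) - \phi(x_i)}{e^{-\phi(x_i)} - e^{-\phi(x_j)}} > 0 \]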
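contains all pairs of adjacent nodes, with intersecting Voronoi cells $\omega_i$ and $\omega_j$. The finite volume approach leads to
\[ \int_{\omega_i} \nabla^T J(x)\, dx = \oint_{\partial\omega_i} \nu^T J(x)\, dS \approx \sum_{j : x_{ij} \in L} A_{ij} J_{ij} + A_i^N h(x_i), \quad i \in I_I \cup I_N, \]
where $A_{ij}$ and $A_i^N$ are the sizes of $\omega_i \cap \omega_j$ and $\omega_i \cap \Gamma_N$, respectively. Note that for interior nodes $A_i^N = 0$. That is, we obtain the equations
\[ \sum_{j : x_{ij} \in L} \underbrace{A_{ij} w_{ij}}_{W_{ij}} \big( \tilde{n}_j - \tilde{n}_i \big) = d(n_i, x_i) - A_i^N h(x_i), \quad i \in I_I \cup I_N, \tag{8} \]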
Proof We use a similar argument as for the proof of Theorem 1. Let us assume that the maximum is attained for $i \in I_I \cup I_N$ and $n_i > n_0(x_i)$. Then
\[ 0 \le d(n_i, x_i) - A_i^N h(x_i) = \sum_{j : x_{ij} \in L} W_{ij} \underbrace{\big( \tilde{n}_j - \tilde{n}_i \big)}_{<0} < 0, \]
which is a contradiction.
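The discrete maximum principle can be illustrated with a small one-dimensional experiment. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses a uniform grid with unit interface areas, a vanishing source term, Dirichlet values of the Slotboom variable at both ends, and an arbitrary potential. It computes the edge weights from the Bernoulli function, solves the resulting tridiagonal system with the Thomas algorithm, and recovers the densities by multiplying the Slotboom variable with the exponential of the potential. All function names and numerical values are hypothetical.

```python
import math

def bernoulli(x):
    """Bernoulli function B(x) = x / (exp(x) - 1), continued by B(0) = 1."""
    if abs(x) < 1e-12:
        return 1.0
    return x / math.expm1(x)

def weight(phi_i, phi_j, h):
    """Slotboom weight B(-phi_ij) * exp(phi_i) / h with phi_ij = phi_j - phi_i."""
    return bernoulli(-(phi_j - phi_i)) * math.exp(phi_i) / h

def solve_slotboom_1d(phi, h, u_left, u_right):
    """Solve w_{i-1} (u_{i-1} - u_i) + w_i (u_{i+1} - u_i) = 0 at interior nodes."""
    n = len(phi)
    w = [weight(phi[i], phi[i + 1], h) for i in range(n - 1)]  # one weight per edge
    m = n - 2                                                  # interior unknowns
    lower = [w[k] for k in range(m)]
    upper = [w[k + 1] for k in range(m)]
    diag = [-(w[k] + w[k + 1]) for k in range(m)]
    rhs = [0.0] * m
    rhs[0] -= w[0] * u_left            # Dirichlet value moved to the right-hand side
    rhs[-1] -= w[n - 2] * u_right
    for k in range(1, m):              # Thomas algorithm: forward elimination
        factor = lower[k] / diag[k - 1]
        diag[k] -= factor * upper[k - 1]
        rhs[k] -= factor * rhs[k - 1]
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):     # back substitution
        u[k] = (rhs[k] - upper[k] * u[k + 1]) / diag[k]
    return [u_left] + u + [u_right]

phi = [3.0 * math.sin(0.7 * i) for i in range(11)]   # arbitrary potential (assumption)
h = 0.1
u = solve_slotboom_1d(phi, h, u_left=1.0, u_right=4.0)
n = [u_i * math.exp(p) for u_i, p in zip(u, phi)]    # densities n = u * exp(phi)

print(min(u), max(u))   # extremes of the Slotboom variable
print(min(n) > 0.0)     # positivity of the densities
```

In this setting the Slotboom variable stays between its two Dirichlet values, while the densities remain positive but are not bounded by their own boundary values in general.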
Using analogous arguments we also obtain a discrete version of Corollary 1.
Corollary 2 Let $\{n_i : i \in I_I \cup I_N\}$ and $\{m_i : i \in I_I \cup I_N\}$ be solutions of (8), with identical $h(x)$. If $d(n, x)$ is monotonically increasing with respect to $n$, then the Slotboom variables satisfy
\[ \left| \tilde{n}_i - \tilde{m}_i \right| \le \max_{j \in I_D} \left| \tilde{n}_j - \tilde{m}_j \right|, \quad i \in I. \]
i ∈ II ∪ IN ,
References
1. D.L. Scharfetter, H.K. Gummel, Large-signal analysis of a silicon read diode oscillator. IEEE
Trans. Electron Devices 16(3), 391–415 (1969)
2. J. Jerome, Analysis of Charge Transport: A Mathematical Study of Semiconductor Devices
(Springer, Berlin, Heidelberg, 1996)
3. P. Markowich, The Stationary Semiconductor Device Equations. Computational Microelec-
tronics (Springer, New York, 1986)
4. K. Gärtner, Existence of bounded discrete steady-state solutions of the van Roosbroeck system
on boundary conforming Delaunay grids. SIAM J. Sci. Comput. 31(2), 1347–1362 (2009)
5. M. Bessemoulin-Chatard, C. Chainais-Hillairet, A. Jüngel, Uniform L∞ estimates for
approximate solutions of the bipolar drift-diffusion system. in Finite Volumes for Complex
Applications VIII - Methods and Theoretical Aspects (Springer International Publishing,
Cham, 2017), pp. 381–389
6. Z. Kargar, T. Linn, D. Ruić, C. Jungemann, Investigation of transport modeling for plasma
waves in THz devices. IEEE Trans. Electron Dev. 63(11), 4402–4408 (2016)
7. J.W. Slotboom, Computer-aided two-dimensional analysis of bipolar transistors. IEEE Trans.
Electron Dev. 20(8), 669–679 (1973)
8. J. Jerome, T. Kerkhoven, L∞ stability of finite element approximations to elliptic gradient
equations. Numer. Math. 57(6–7), 561–576 (1990)
9. P. Farrell, N. Rotundo, D.H. Doan, M. Kantner, J. Fuhrmann, T. Koprucki, Drift-diffusion
models, in Handbook of Optoelectronic Device Modeling and Simulation, ed. by J. Piprek.
Series in Optics and Optoelectronics, vol. 2 (CRC Press, Boca Raton, 2017), pp. 733–771
10. R.E. Bank, D.J. Rose, W. Fichtner, Numerical methods for semiconductor device simulation.
IEEE Trans. Electron Dev. 30(9), 1031–1041 (1983)
Numerical Calculation of Electronic
Properties of Transition Metal-Doped
mWS2 via DFT
Abstract In this work, we use spin-polarized density functional theory (DFT) to study the atomic structures of transition metal-doped monolayer WS2 (mWS2). The structures of doped mWS2 are simulated via atomic relaxation, which moves the ions according to the interaction forces between the electrons and the ions until the convergence criterion is reached, where the Kohn-Sham equation is solved numerically. We present not only the simulation flow but also an accuracy examination for the explored mWS2. The estimated physical properties are further described and discussed.
1 Introduction
=εi φi , i = 1, 2, . . . , N,
where $T_e$, $V_{\mathrm{nuclear}}$, $V_{\mathrm{Hartree}}$, and $V_{xc}$ are the kinetic energy operator of electron $i$, nucleus-electron potential energy, Hartree potential energy, and exchange-correlation potential energy, respectively. $p_i$, $m_i$, $I$, $Z_I$, $e$, $r_i$, $R_I$, $\varepsilon_i$, $\phi_i(r)$, and $N$ are the momentum operator, electron mass, the index of the nucleus, the charge of the nucleus, the charge of the electron, the position vector of the electron, the position vector of the nucleus, the orbital energy, the Kohn-Sham orbital, and the number of non-interacting electrons, respectively. The number of these equations depends on the number of valence electrons, which come from all of the simulated atoms. The $V_{xc}(r)$ term can be obtained by different approaches; for example, Perdew-Burke-Ernzerhof (PBE) [9] is used as the exchange-correlation functional after our intensive accuracy test. The aforementioned equations are constructed for a single k-point. For sampling of the Brillouin zone, Monkhorst-Pack [10] k-points centered at the Γ point (0, 0, 0) are generated:
\[
\mathbf{k} = \sum_{i=1}^{3} \frac{n_i + 1/2}{N_i}\,\mathbf{b}_i, \qquad n_i = 0, \ldots, N_i - 1, \quad N_i \text{ is even}, \tag{2}
\]
where k, bi , and Ni are the vector in k-space, the basis of reciprocal lattice, and the
mesh numbers for b1 , b2 , and b3 directions.
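As an illustration of Eq. (2), the following minimal sketch generates the k-point list for a given mesh. It is not the authors' code: the function names, the hexagonal cell orientation, and the use of the experimental lattice constants are assumptions made for this example.

```python
import itertools
import numpy as np

def reciprocal_basis(lattice):
    """Rows of `lattice` are a1, a2, a3; rows of the result are b1, b2, b3
    with a_i . b_j = 2*pi*delta_ij."""
    return 2.0 * np.pi * np.linalg.inv(lattice).T

def kpoint_mesh(mesh, recip):
    """k = sum_i (n_i + 1/2) / N_i * b_i for n_i = 0..N_i-1, as written in Eq. (2)."""
    fractions = [(np.arange(n) + 0.5) / n for n in mesh]
    frac = np.array(list(itertools.product(*fractions)))  # shape (N1*N2*N3, 3)
    return frac @ recip

# Assumed hexagonal cell (lengths in angstrom); the orientation is a free choice.
a, c = 3.1532, 12.323
lattice = np.array([[a, 0.0, 0.0],
                    [-a / 2.0, a * np.sqrt(3.0) / 2.0, 0.0],
                    [0.0, 0.0, c]])
recip = reciprocal_basis(lattice)
kpts_bulk = kpoint_mesh((12, 12, 4), recip)   # 576 k-points
```

A 12 × 12 × 1 mesh is obtained in the same way by passing `(12, 12, 1)`; symmetry reduction of the k-point list, which production DFT codes usually apply, is not shown here.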
To solve the Kohn-Sham equation, the block Davidson algorithm [7] is used in the
numerical calculation. It consists of five steps: basis initialization,
subspace construction, residual vector calculation, correction vector calculation,
and subspace expansion. In the basis initialization, a set of orthonormal basis vectors
ψi, i = 1, 2, ..., m, with m ≥ n, is guessed for the lowest n states. In the
subspace construction, the full-size Hamiltonian matrix H is projected onto the
subspace, $\tilde{H}_{ij} = \psi_i^T H \psi_j$, and the projected problem is solved for its
eigenpairs ($\tilde{H}\varphi^k = \varepsilon^k \varphi^k$, k = 1, 2, ..., n). Next, the calculated
eigenpairs are used for the residual vector calculation. The residual vector is defined as
ri = (H − εi I)ϕi, where εi and ϕi are the eigenvalue and the corresponding eigenvector
(expressed in the full space) obtained from the projected matrix. We check the
individual elements of the residual vector for convergence. If the elements are
larger than the tolerance, we calculate the correction vector based on the residual vectors
and the eigenpairs of the projected matrix. Several ways to calculate the correction
vector have been developed, which result in different branches of the Davidson algorithm.
The correction vectors $\{g^k, k = 1, 2, \ldots, n\}$ are given by
$g_I^k = (\varepsilon^k - H_{II})^{-1} r_I^k$, I = 1, 2, ..., N, and are normalized, where N is
the number of determinants of H and $r^k = \sum_{i=1}^{m} \varphi_i^k (H - \varepsilon^k)\psi_i$.
In the final step, the correction vectors $g^k$ are orthonormalized against the basis by the
Gram-Schmidt process and appended to the set {ψi} if the norm remaining after
orthogonalization is larger than a threshold, say $10^{-3}$. The number of basis vectors may
thus increase by a, where 1 ≤ a ≤ n. The whole process returns to subspace construction
using the updated orthonormal basis {ψi, i = 1, 2, ..., m′, m′ = m + a} until the
residual vectors reach convergence.
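The five steps above can be sketched for a small dense real symmetric matrix as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name, the choice of unit vectors as the initial guess, the guard against small denominators, and the dense NumPy storage are all choices made for this example.

```python
import numpy as np

def block_davidson(H, n_states, tol=1e-8, drop_tol=1e-3, max_iter=100):
    """Lowest n_states eigenpairs of a real symmetric matrix H (dense here)."""
    dim = H.shape[0]
    diag = np.diag(H).copy()
    # Basis initialization: unit vectors on the n smallest diagonal entries.
    basis = np.zeros((dim, n_states))
    basis[np.argsort(diag)[:n_states], np.arange(n_states)] = 1.0
    for _ in range(max_iter):
        # Subspace construction: project H and solve the small eigenproblem.
        eps, phi = np.linalg.eigh(basis.T @ H @ basis)
        eps, phi = eps[:n_states], phi[:, :n_states]
        ritz = basis @ phi
        # Residual vectors, one column per state.
        residual = H @ ritz - ritz * eps
        if np.max(np.abs(residual)) < tol:
            break
        added = []
        for k in range(n_states):
            if np.max(np.abs(residual[:, k])) < tol:
                continue  # this state is already converged
            # Correction vector g_I = r_I / (eps^k - H_II), then normalize.
            denom = eps[k] - diag
            denom[np.abs(denom) < 1e-8] = 1e-8  # assumed guard, not in the text
            g = residual[:, k] / denom
            g /= np.linalg.norm(g)
            # Subspace expansion: Gram-Schmidt against old and new vectors
            # (two passes for numerical stability).
            for _pass in range(2):
                for v in list(basis.T) + added:
                    g = g - v * (v @ g)
            norm = np.linalg.norm(g)
            if norm > drop_tol:
                added.append(g / norm)
        if not added:
            break
        basis = np.column_stack([basis] + added)
    return eps, ritz

# Usage on a diagonally dominant test matrix (illustrative data only).
rng = np.random.default_rng(0)
R = rng.standard_normal((60, 60))
H_test = np.diag(np.arange(1.0, 61.0)) + 0.01 * (R + R.T)
eps_davidson, _ = block_davidson(H_test, 3)
```

The diagonal preconditioner is the classical Davidson choice; other correction formulas lead to the different branches of the algorithm mentioned above.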
Under the Kohn-Sham formalism, the DFT was developed based on the local
and semi-local functionals, such as local density and generalized-gradient approx-
imation. However, it may not work well in describing the long-range charge
dynamics such as van der Waals interaction. This may cause a significant inac-
curacy in delineating the system energy, lattice constant, and electronic prop-
56 C.-Y. Chen and Y. Li
Fig. 1 The comparison between experimental and simulation values from different van der Waals
models. (a) The displacement of lattice parameters, a and c. (b) The displacement of bandgap, Eg
Fig. 2 (a) The ground state energies of bulk WS2 with different examined numbers of k-points.
Because bulk WS2 has the same length of lattice parameters a and b, the numbers of kx and ky
are set equal and that of kz varies. (b) A zoom-in plot of (a)
and preserve good accuracy, we fix the bulk WS2 structure cell and alter the number
of k-points. Since the lattice parameters a and b of bulk WS2 have the same length,
the numbers of k-points along kx and ky are set to be equal. Figure 2 shows the ground
state energies of bulk WS2 versus the number of k-points. The energy becomes stable
when the numbers of k-points along kx and ky are larger than 8. Notably, the energy is
sensitive to them owing to the nature of two-dimensional materials, and the result with
12 k-points along kz is the least sensitive to the numbers along kx and ky. The suitable
number of k-points is related to the lattice parameter; the lengths of a and c are 3.1532
and 12.323 angstrom from the experiment [11], respectively. The length of c is nearly
4 times that of a; thus, the suitable number of k-points along kz is around one-fourth of
that along kx. Since the monolayer WS2 has an additional vacuum layer and lattice
parameters a and c similar to those of bulk WS2, the numbers along kx and ky are the
same as for bulk WS2, but fewer are needed along kz. The number of k-points along kz
for monolayer WS2 is set to 1 owing to its long lattice parameter c.
Finally, 12 × 12 × 4 and 12 × 12 × 1 k-point meshes are chosen for bulk and
monolayer WS2, respectively.
Our structure relaxation flow of monolayer WS2 is shown in Fig. 3a. First,
we obtain a reliable bulk WS2 structure from the previously examined settings; its band
structure is shown in Fig. 3b. The examinations of the exchange-correlation functional
and the k-points indicate the importance of correct structure relaxation along a and b;
thus, the atomic force along these two directions should be preserved.
For this reason, monolayer WS2 is built from the relaxed bulk WS2 by adding a 10
angstrom thick vacuum layer along c and fixing the simulation cell. Figure 3c shows
the calculated band structure of monolayer WS2. The calculated
energy bandgaps from this method are in agreement with the experiment [11]. Then,
we analyze the TM-doped mWS2. The discussed characteristics of
TM-doped mWS2 include formation energy, work function, and band structure. As
Fig. 3 (a) The structure relaxation flow of monolayer WS2. The monolayer structure relaxation
is carried out under a fixed unit cell to maintain the same atomic force along the a and b
directions. The cutoff kinetic energy is 500 eV; the force acting on each atom of the relaxed
structure is smaller than 0.01 eV/angstrom; the energy difference is less than $10^{-6}$ eV per
atom. From the simulation, the atom-projected band structures of (b) bulk WS2 and (c) monolayer
WS2 are obtained. The color bars indicate the weighting of the bands dominated by tungsten atoms [1]
shown in Fig. 4a, because the new structure is built from different materials, we
consider the formation energy formula:
\[
E_{\mathrm{form}} = E_{\mathrm{doped,mWS_2}} - E_{\mathrm{mWS_2}} + \sum_i n_i \mu_i, \tag{3}
\]
where $E_{\mathrm{doped,mWS_2}}$ and $E_{\mathrm{mWS_2}}$ are the total energies of the doped mWS2
system and the pristine mWS2, and ni and μi are the number of atoms of species i added (-1) or
removed (+1) and the corresponding chemical potential, respectively. The four possible
doping sites are discussed, as illustrated in Fig. 4b. To study the effect of
two different doping concentrations, 4 × 4 and 2 × 2 supercells with one doping
atom were built from pristine mWS2. The work function is defined as the energy required
to excite an electron from the surface of a solid into the vacuum.
It can be calculated from the energy difference between the simulated vacuum
energy and the Fermi level, i.e., Evacuum − Efermi.
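The sign convention of the formation energy and the definition of the work function can be made concrete with a short sketch. The function names and all numerical energies below are hypothetical placeholders chosen for illustration; they are not results from this work.

```python
def formation_energy(e_doped, e_pristine, exchanged):
    """E_form = E_doped - E_pristine + sum_i n_i * mu_i.

    `exchanged` is a list of (n_i, mu_i) pairs, with n_i = -1 for an atom
    added to the cell and n_i = +1 for an atom removed from it.
    """
    return e_doped - e_pristine + sum(n * mu for n, mu in exchanged)

def work_function(e_vacuum, e_fermi):
    """Energy difference between the vacuum level and the Fermi level."""
    return e_vacuum - e_fermi

# Hypothetical substitutional doping: one W atom removed (+1), one TM atom added (-1).
e_form = formation_energy(e_doped=-100.0, e_pristine=-102.0,
                          exchanged=[(+1, -13.0), (-1, -6.0)])
phi = work_function(e_vacuum=4.0, e_fermi=-1.5)
```

For an interstitial site only the added dopant atom enters the sum, so `exchanged` would contain a single `(-1, mu_TM)` pair.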
Fig. 4 (a) The formula of the formation energy calculation. Edoped,mWS2 and EmWS2 are the
total energies of relaxed doped and undoped monolayer WS2, respectively. For the calculation of
the chemical potential μi, an atom is placed in a large cell, and the cell size is enlarged until
the value becomes stable. (b) The structures of the four possible doping sites.
“TM” denotes the doping atom. The notations “I-” and “S-” mean interstitial and
substitutional sites, respectively. The structures are built by repeating the monolayer cell
and adding a doping atom, which gives the different doping concentrations
From the atomically relaxed structures, we can plot the band structure according to the
solved eigenenergies, as shown in Fig. 5a and b. Both plots are shifted so that
the simulated Fermi energy is located at zero. Our simulated Fermi energy
is located at the band which is occupied by the last valence electron, i.e., the band
contributed by the doping atom. For example, comparing Fig. 5a with Fig. 5b shows
that the Sc doping contributes additional bands between the original conduction
and valence bands. The calculated formation energies of the discussed doping sites with
two different concentrations are summarized in Fig. 5c. The formation energies of the
discussed interstitial sites are lower than those of the substitutional sites, which indicates
structural stability. Figure 5d plots the work function of pristine and TM-doped
mWS2 with respect to two concentrations. For simplicity, only the results of the I-T
doping site are shown. The titanium (Ti)-doped mWS2 has the lowest work function
at the higher concentration, while zinc (Zn)-doped mWS2 has the highest one at both
concentrations. Sc possesses the largest range of work function modulation,
1.63 eV, among the discussed doping species and concentrations. It implies that there is
high flexibility in tuning the work function of mWS2, which is promising for the design
and fabrication of future advanced nano-devices.
Fig. 5 The band structure of (a) Sc substitutional doping, in which a tungsten atom is replaced, and (b) the undoped monolayer WS2. The solid red lines and
dotted orange lines are the spin-up and spin-down states, respectively. (c) The calculated formation energy of the discussed doping sites. (d) The work functions of
TM-doped mWS2 with respect to different TM materials and concentrations. Only the results of the doping site with the lowest formation energy are shown.
The arrows indicate how the work function changes as the doping concentration increases
4 Conclusions
In this work, the numerical method and simulation flow for studying doped
monolayer tungsten disulfide have been described. The studies are completed with
the examined exchange-correlation model and k-point sampling. The key simulated
results indicate that the work functions of Sc- and Cr-doped mWS2 can be
modulated over a relatively large range via doping technology.
Acknowledgments This work was supported in part by the Ministry of Science and Technology,
Taiwan, under Grant MOST 108-2221-E-009-008, Grant MOST 108-3017-F-009-001, Grant
MOST 109-2221-E-009-033, Grant MOST-109-2634-F-009-030, and Grant MOST 110-2221-E-
A49-139, and in part by the “Center for mmWave Smart Radar Systems and Technologies” under
the Featured Areas Research Center Program within the framework of the Higher Education Sprout
Project by the Ministry of Education in Taiwan.
References
Abstract To provide sufficient power for trillions of sensors in the era of the internet
of things, thermoelectric materials and devices have been of great interest
recently. In this paper, we construct a model for periodic silicon nanowires
(SiNWs) embedded in Si0.7Ge0.3 (SiNWs-Si0.7Ge0.3 composite) and propose a
simulation flow for the calculation of its thermoelectric properties. The electron
band structure and phonon energy dispersion of the SiNWs-Si0.7Ge0.3 composite are
simulated by using the effective mass Schrödinger equation formulated with the Bloch
theorem and the elastodynamic wave equation, respectively. These
equations are discretized by the finite element method and the corresponding
eigenvalue problems are solved by the implicitly restarted Arnoldi method. Then,
the thermoelectric properties of the SiNWs-Si0.7Ge0.3 composite are estimated by
the Landauer approach.
1 Introduction
\[
ZT = \frac{S^2 \sigma T}{\kappa_{ph} + \kappa_{el}} \tag{1}
\]
As shown in Fig. 1, the direction of the carrier and phonon transport is parallel to
the x-y plane, so that the nanowires act as interfaces for the phonon transport and
reduce the thermal conductivity. To simplify the simulation structure, we assume
that the nanowires are periodic with radius r, spacing s between the two closest
nanowires, and height h, as shown in Fig. 1b. For the band profile calculation,
Fig. 1 (a) The three-dimensional (3D) schematic structure of the periodic nanowires. (b) The
geometry parameters of the simulation structure. (c) The CB and VB of the SiNWs-Si0.7Ge0.3
composite [10]. (d) The definition of the irreducible Brillouin zone (IBZ), where Γ, X, and M are
in the x-y plane of the k-space. Γ is the origin
Numerical Simulation of Thermal Conductivity of Silicon Nanowires 65
Table 1 The adopted physical parameters. m∗l and m∗t are the effective masses of electrons in the
longitudinal and transverse directions, respectively. They are used for the quantized energy band
calculation in the CB. Similarly, m∗hh and m∗lh are the effective masses of heavy holes and light
holes, respectively. They are used for the quantized energy band calculation in the VB [10, 13, 14]

             Bandgap   Electron mass (m0)   Hole mass (m0)    Elastic constant (GPa)
Material     (eV)      m∗l      m∗t         m∗hh     m∗lh     C11      C12     C44
Si           1.12      0.98     0.19        0.49     0.16     165.8    63.9    79.6
Si0.7Ge0.3   1.00      1.14     0.12        0.41     0.10     154.6    59.2    75.8
we simply consider the undoped situation, where the conduction and valence
bands are plotted in Fig. 1c. For an electron or a hole in a periodic potential, the
Bloch theorem [8] is used to describe the phase change; thus, the corresponding
Schrödinger equation with the effective mass approximation is given by [9]
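\[
-\frac{\hbar^2}{2m^*}\nabla\cdot\left[\nabla u_{\mathbf{k}}(\mathbf{r})\right]
- \frac{i\hbar^2}{m^*}\,\mathbf{k}\cdot\nabla u_{\mathbf{k}}(\mathbf{r})
+ \left[V(\mathbf{r}) + \frac{\hbar^2 k^2}{2m^*}\right] u_{\mathbf{k}}(\mathbf{r})
= E_{n,\mathbf{k}}\, u_{\mathbf{k}}(\mathbf{r}), \tag{2}
\]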
where ħ, m∗, V(r), En,k, and uk(r) are the reduced Planck's constant, the effective
mass, the position-dependent potential energy, the quantum energy levels, and the
corresponding wave function, respectively. Notably, V(r) is equal to the conduction
band (CB) or valence band (VB) for electrons and holes [10], respectively. In
addition, the phononic band structure is calculated by the elastodynamic wave
equation [11]
where C is the elastic constant matrix, u is the Fourier transform of the displacement
vector [12], ρ is the mass density, and ω is the eigenfrequency. The
elastic constant matrix C describes the second-order strain energy density. Since Si has
cubic symmetry, the number of independent elastic constants reduces to three:
C11, C12, and C44 [13]. The elastic constants of Si0.7Ge0.3 are determined by linear
interpolation between the values of Si and Ge, as listed in Table 1.
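The linear interpolation can be checked with a few lines of code. The Si values are taken from Table 1; the Ge values are commonly quoted literature values and are an assumption of this sketch, since they are not given in the text. The unit (GPa) and the function name are likewise assumptions.

```python
# Elastic constants in GPa (assumed unit).
SI = {"C11": 165.8, "C12": 63.9, "C44": 79.6}   # Table 1
GE = {"C11": 128.5, "C12": 48.3, "C44": 66.8}   # assumed literature values, not from the text

def interpolate_elastic(x_ge):
    """Linear interpolation C(x) = (1 - x) * C_Si + x * C_Ge for Si(1-x)Ge(x)."""
    return {key: (1.0 - x_ge) * SI[key] + x_ge * GE[key] for key in SI}

alloy = interpolate_elastic(0.3)
```

With these assumed Ge constants, the interpolated values for x = 0.3 agree with the Si0.7Ge0.3 row of Table 1 to within rounding.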
3 Simulation Techniques
To estimate the FOM in (1), by considering the physical transparency and the
computational efficiency of the solution method, the Landauer approach has
been implemented on the TE region [15–17]. In situations close to equilibrium,
the Landauer approach is mathematically equivalent to the Boltzmann transport
66 M.-H. Chuang and Y. Li
equation under the relaxation time approximation if the mean-free-path (MFP) for
backscattering is
\[
\lambda(E) = \frac{2 v_x^2 \tau}{|v_x|}, \tag{4}
\]
where v is the group velocity, τ is the momentum relaxation time, and the
subscript x represents the transport direction of the carriers or phonons [16].
Within the Landauer approach, the number of modes and the MFP for backscattering
are two important physical parameters. The calculation flows for the TE properties
related to the electrons and phonons are listed in Algorithms 1 and 2, respectively.
In Algorithm 1, the differential conductivity is given by [15]
\[
\sigma(E) = \frac{2q^2}{h}\,\lambda_e(E)\,\frac{M_e(E)}{A}\left(-\frac{\partial f_0}{\partial E}\right), \tag{5}
\]
where $\pi^2 k_B^2 T/3h$ is the quantum of thermal conductance, $\lambda_{ph}$ is the MFP of the
phonon transport, $M_{ph}/A$ is the number of modes per area, and
$\frac{3}{\pi^2}\left(\frac{E}{k_B T}\right)^2\left(-\frac{\partial n_0}{\partial \omega}\right)$
is the window function. n0 is the Bose-Einstein distribution, which is related to the
phonon dispersion. The transport is assumed to be diffusive, with the MFP for
backscattering calculated by Matthiessen's rule; the average MFP for backscattering
without the nanowire structure is extracted by fitting the thermal conductivity
to 150 W/m-K, the measured value for bulk silicon [18].
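Two ingredients of the phonon calculation can be sketched numerically. The block below assumes that the window function is expressed in the dimensionless variable x = ħω/(k_B T), with the derivative of the Bose-Einstein distribution taken with respect to x; in this form the window integrates to one, which is a convenient sanity check. The function names and the mean-free-path values in the usage lines are hypothetical.

```python
import numpy as np

def phonon_window(x):
    """W(x) = (3/pi^2) * x^2 * exp(x) / (exp(x) - 1)^2, with x = hbar*omega/(kB*T).

    exp(x) / (exp(x) - 1)^2 is -d n0 / dx for n0 = 1 / (exp(x) - 1).
    """
    x = np.asarray(x, dtype=float)
    return (3.0 / np.pi**2) * x**2 * np.exp(x) / np.expm1(x)**2

def matthiessen(*mfps):
    """Matthiessen's rule: 1/lambda_total = sum_j 1/lambda_j."""
    return 1.0 / sum(1.0 / m for m in mfps)

# Sanity check: the window function should integrate to 1 (trapezoidal rule).
x = np.linspace(1e-6, 60.0, 200001)
w = phonon_window(x)
window_area = np.sum(0.5 * (w[1:] + w[:-1]) * np.diff(x))

# Hypothetical MFPs in nm: a bulk value combined with an interface-limited value.
lambda_total = matthiessen(100.0, 25.0)
```

Because the rates add, the combined MFP is always shorter than the shortest individual one, which is how additional nanowire interfaces lower the lattice thermal conductivity in this picture.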
For the calculation of the electronic and phononic band structures, periodic boundary
conditions are imposed on (2) and (3) owing to the highly periodic array of
SiNWs [6]. We solve these two discretized eigenvalue problems by the implicitly
restarted Arnoldi method [19]. The finite element method with Lagrange elements
is used to discretize the Schrödinger and elastodynamic equations. A finite
element is a triple consisting of a geometric domain, a function space on this domain, and
a set of linear functionals (the so-called degrees of freedom) [20]. The band structure
is calculated by sampling in k-space, more specifically, in the irreducible Brillouin
zone (IBZ) [21], as shown in Fig. 1d. The calculation flow to get the band diagrams
is listed in Algorithm 3.
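The structure of this calculation (assemble a k-dependent matrix, solve for a few of the lowest eigenvalues with ARPACK, repeat along a k-path) can be illustrated in one dimension. The sketch below is not the finite element implementation of the paper: it swaps in a periodic finite-difference discretization, uses a constant effective mass with ħ²/(2m∗) = 1, and a made-up potential well. It relies on `scipy.sparse.linalg.eigs`, SciPy's interface to the implicitly restarted Arnoldi method in ARPACK; all function names are assumptions of this example.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs

def bloch_matrix(k, potential, cell_length):
    """1D finite-difference form of Eq. (2) with hbar^2/(2 m*) = 1:
    H_k u = -u'' - 2i k u' + (V + k^2) u, periodic over the unit cell."""
    n = potential.size
    h = cell_length / n
    eye = np.eye(n)
    up = np.roll(eye, 1, axis=1)      # (up @ u)[j] = u[(j + 1) % n]
    down = up.T                       # (down @ u)[j] = u[(j - 1) % n]
    second = (up - 2.0 * eye + down) / h**2
    first = (up - down) / (2.0 * h)
    dense = -second - 2j * k * first + np.diag(potential + k**2)
    return sp.csc_matrix(dense)

def lowest_energies(k, potential, cell_length, num=4):
    """Lowest `num` eigenvalues via shift-invert ARPACK (implicitly restarted Arnoldi)."""
    matrix = bloch_matrix(k, potential, cell_length)
    sigma = float(potential.min()) - 1.0   # shift placed below the spectrum
    values, _ = eigs(matrix, k=num, sigma=sigma, which="LM")
    return np.sort(values.real)

# Sample a few k-points between the zone centre and the zone boundary.
cell = 1.0
grid = np.arange(64) * cell / 64
well = np.where(np.abs(grid - 0.5) < 0.2, -5.0, 0.0)   # hypothetical potential
k_path = np.linspace(0.0, np.pi / cell, 5)
bands = np.array([lowest_energies(k, well, cell) for k in k_path])

# Free-particle check: with V = 0 the lowest level at wave number k is k^2.
free = lowest_energies(0.3, np.zeros(64), cell, num=3)
```

The matrix is Hermitian, so `scipy.sparse.linalg.eigsh` (the Lanczos variant) would also apply; `eigs` is used here only to match the method named in the text.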
Fig. 2 (a)–(c) The first ten energies of electrons in the CB. (d)–(f) The first ten energies of light
holes in the VB. Ei represents the ith energy state. For electrons, E0 is the ground state and E1 is
the first excited state
Fig. 3 (a)–(c) The energy dispersion of phonons in the SiNWs-Si0.7 Ge0.3 composite with s = 2,
15, and 50 nm, respectively
Fig. 3, the phonons meet more interfaces and the scattering rate is high. At room
temperature, low-energy phonons play an important role in carrying heat. Thus, the
thermal conductivity is expected to decrease as s decreases.
Notably, the number of eigenvalues will influence the accuracy of the results.
Thus, to find an optimal sampling number with a minimal time cost, Fig. 4a shows
Fig. 4 (a) The computational time versus the number of sampling points and (b) the relative error
versus the number of iterations with respect to different s, where the results are solved from the
Schrödinger equation. The lines are the results with s = 2, 15, and 50 nm, respectively. The tested
PC is equipped with an Intel® Core™ i7-7500 CPU and 16 GB of RAM
the computational time when solving (2) versus the number of sampling points for
different s. For each s, we solve for the first ten eigenvalues for electrons in the
CB. There are 894, 1126, and 1344 elements in our simulation with s = 2, 15, and
50 nm, respectively; the sizes of the corresponding matrices are 1856, 2329, and 2765,
respectively. The computational time increases linearly with the number of sampling
points. Figure 4b shows the relative error between iterations of the implicitly
restarted Arnoldi method for different s. The stopping criterion is
a relative error < $10^{-6}$. Notably, the accuracy of the computed eigenenergy is
almost the same when the matrix size increases from several thousand to ten
thousand; however, the time cost increases significantly. Although not shown here, we
have performed similar numerical tests when solving (3), for which both the computation
and the convergence are faster than for (2).
When the doping effect is considered, the TE properties calculated via the Landauer
approach vary with the Fermi level. The lattice thermal conductivity calculated with
Algorithm 2 is about 2.2 W/mK, which is close to the experimentally measured
value of 3.5 W/mK [6] when the density of SiNWs is $1.6 \times 10^{11}$ cm$^{-2}$ (r = 5 nm and
s = 15 nm, i.e., a pitch of 2r + s = 25 nm). For the SiNWs-Si0.7Ge0.3 composite with p-type
doping of $1.16 \times 10^{15}$ cm$^{-3}$, ZT calculated from (1) is about $1.5 \times 10^{-4}$ at room
temperature.
5 Conclusions
In this paper, we have applied a numerical method to calculate the electronic and
phononic band structures of silicon nanowires embedded in Si0.7Ge0.3 by solving
the Schrödinger equation and the elastodynamic wave equation. The Landauer
approach is used for the calculation of the thermoelectric properties. The
simulated thermal conductivity is close to the measurement.
Acknowledgments This work was supported in part by the Ministry of Science and Technology,
Taiwan, under Grant MOST 108-2221-E-009-008, Grant MOST 108-3017-F-009-001, Grant
MOST 109-2221-E-009-033, Grant MOST-109-2634-F-009-030, and Grant MOST 110-2221-E-
A49-139, and in part by the “Center for mmWave Smart Radar Systems and Technologies” under
the Featured Areas Research Center Program within the framework of the Higher Education Sprout
Project by the Ministry of Education in Taiwan.
References
18. A.I. Hochbaum, R. Chen, R.D. Delgado, W. Liang, E.C. Garnett, M. Najarian, A. Majumdar, P.
Yang, Enhanced thermoelectric performance of rough silicon nanowires. Nature 451, 163–167
(2008)
19. R.B. Lehoucq, D.C. Sorensen, C. Yang, ARPACK Users’ Guide: Solution of Large-Scale
Eigenvalue Problems with Implicitly Restarted Arnoldi Methods (Society for Industrial and
Applied Mathematics, Philadelphia, 1998)
20. J. Sun, A. Zhou, Finite Element Methods for Eigenvalue Problems (CRC Press, New York,
2017)
21. M.S. Kushwaha, P. Halevi, L. Dobrzynski, B. Djafari-Rouhani, Acoustic band structure of
periodic elastic composites. Phys. Rev. Lett. 71, 2022 (1993)
A Novel Surface Mesh Simplification
Method for Flux-Dependent Topography
Simulations of Semiconductor
Fabrication Processes
1 Introduction
a regular grid. This approach is attractive due to the robust handling of topographical
changes in a level set framework [2]. The particle flux on the semiconductor surface
denotes the number of particles interacting on the surface. One possible numerical
method for calculating the surface flux is Monte Carlo ray tracing [3]. At practically
relevant surface resolutions the flux calculation dominates the overall execution
time of an etching or deposition simulation [1]. It is thus useful to investigate
approaches that speed up the flux calculation. One promising approach is to use
temporary explicit surface meshes as there exists a large body of knowledge about
ray tracing on explicit surfaces. The marching cubes algorithm [4] is commonly
used to extract an explicit surface from the level set. However, the resulting surface
meshes typically contain very narrow and long triangles (needles) or small triangles
in flat regions that contain no geometric variation. Therefore eliminating those
surface elements reduces the total surface element count which speeds up the ray
tracing tasks, further underlining the attractiveness of an explicit surface mesh
approach.
There exist several algorithms that reduce the resolution of surface meshes with
respect to a given metric; several metrics have been proposed in literature [5–8].
However, some of these algorithms try to simplify the geometry homogeneously
[5, 6] or use computationally expensive metrics [7, 8]. The latter is particularly
relevant when considering the entire etching or deposition workflow where the mesh
simplification has to be conducted at every single time step. Mesh simplification, or,
more generally, domain simplification, is a commonly used approach in process TCAD
simulations [9, 10]. In particular, in [11] the authors evaluate the flux on a mesh by
sampling only a sparse set of surface elements to accelerate the simulation.
In this paper we introduce a flexible and computationally lightweight simplification
method based on the local surface curvature. We evaluate the impact of our
mesh simplification method on typical process TCAD topography simulations by
conducting a ray tracing performance analysis with the high performance ray tracing
library Embree [12]. Specifically, we compare surfaces obtained with the presented
method against surfaces generated by the reference Lindstrom-Turk algorithm [5] in
terms of the execution time of the simplification process and the performance of the
flux calculation using Monte Carlo ray tracing.
geometric properties in each region. This simplification method has been designed
to simplify regions of the mesh offering negligible geometric variation (e.g. flat
areas) to a higher degree, so that a higher resolution can be maintained in regions of
the mesh with high geometric variation. Furthermore, our method is not limited to
the Lindstrom-Turk simplification algorithm; other simplification algorithms
[6] can be used in combination with our method.
The first step in our simplification method is the detection of geometric features in
the mesh: We use the absolute mean curvature of each vertex and calculate it via a
discrete approximation of the Laplace-Beltrami operator [13] in the vertex xi
\[
|H(x_i)| = \frac{\left\| \sum_{j \in N_1(i)} (\cot\alpha_{ij} + \cot\beta_{ij})(x_i - x_j) \right\|}{4 A_{\mathrm{avg}}}, \tag{1}
\]
where H (xi ) denotes the mean curvature in the vertex xi and N1 (i) is the set of
all vertices adjacent to xi . The angles αij , βij are the angles of the triangles that
share the edge between xi and xj , which are opposite to this edge and Aavg is
the average area of the triangles surrounding the vertex xi . The mean curvature
is used to categorize each vertex to be either a flat or a feature vertex. In particular,
an empirical threshold is used to identify vertices with small curvature (numerical
artifacts), which are considered to be flat.
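The curvature estimate and the flat/feature classification can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the authors' C++ code: it uses the common cotangent-weight form of the discrete Laplace-Beltrami operator, in which the two cotangents of each edge are summed, takes Aavg as the mean area of the triangles incident to the vertex as described above, and assumes an interior vertex. The function names and the threshold value are hypothetical.

```python
import numpy as np

def _cot(u, v):
    """Cotangent of the angle between vectors u and v."""
    return np.dot(u, v) / np.linalg.norm(np.cross(u, v))

def abs_mean_curvature(vertices, triangles, i):
    """|H(x_i)| from cotangent weights for an interior vertex i."""
    xi = vertices[i]
    total = np.zeros(3)
    areas = []
    for tri in triangles:
        if i not in tri:
            continue
        a, b = [v for v in tri if v != i]      # the two other corners
        pa, pb = vertices[a], vertices[b]
        areas.append(0.5 * np.linalg.norm(np.cross(pa - xi, pb - xi)))
        # The angle at b is opposite edge (i, a); the angle at a is opposite edge (i, b).
        total += _cot(xi - pb, pa - pb) * (xi - pa)
        total += _cot(xi - pa, pb - pa) * (xi - pb)
    return np.linalg.norm(total) / (4.0 * np.mean(areas))

def is_feature(vertices, triangles, i, threshold=1e-3):
    """Vertices below the (empirical, here hypothetical) threshold count as flat."""
    return abs_mean_curvature(vertices, triangles, i) >= threshold

# Test geometry: a hexagonal fan around vertex 0, flat and with a raised centre.
angles = np.linspace(0.0, 2.0 * np.pi, 6, endpoint=False)
ring = np.column_stack([np.cos(angles), np.sin(angles), np.zeros(6)])
flat = np.vstack([[0.0, 0.0, 0.0], ring])
fan = [(0, j + 1, (j + 1) % 6 + 1) for j in range(6)]
peak = flat.copy()
peak[0, 2] = 0.5
h_flat = abs_mean_curvature(flat, fan, 0)
h_peak = abs_mean_curvature(peak, fan, 0)
```

Each interior edge is visited once per adjacent triangle, so the two opposite-angle cotangents accumulate on the same edge vector without an explicit edge data structure.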
The mesh is partitioned into the feature regions and the transition regions according
to the metrics above. The feature region encompasses the triangles of the mesh
with significant geometric variation. The transition region contains the triangles
that do not hold information about the geometric variation. This partition of the
mesh allows the transition region to be simplified to a greater extent, which reduces the
overall number of mesh elements without losing information about the geometric
variation. Furthermore, this approach keeps a high resolution in regions of
the mesh with high geometric variation while limiting the overall mesh
size in terms of the number of triangles. However, simplifying the flat region to a higher
degree than the feature region leads to low quality triangles (e.g. needles).
To prevent the formation of low quality elements the transition region is
simplified with linearly increasing parameters, thus creating a reasonable mesh
grading. Figure 1 schematically depicts two steps of the discussed process. At first
the whole mesh, including the feature region, is simplified until the smallest edge
has an edge length of l0 . If the feature region should not be simplified l0 is set
76 C. Lenz et al.
Fig. 1 Example of the simplification process: (1) shows the mesh after it has been divided into
regions. (2) shows the simplification of the feature region. (3) shows the extension of the feature
region. (4) shows again the simplification of the transition region with an increased edge length
to 0. After this initial simplification step the transition region is simplified until
the smallest edge has an edge length of l1 = l0 + sl, where sl denotes the step
length. Next, the feature region is expanded into the transition region. Afterwards
the now smaller transition region is simplified until the smallest edge has an edge
length of li+1 = li + sl with i ∈ {0, 1, ..., n}, n ∈ N. These last two steps are repeated
until the feature region cannot expand any further into the transition region, which
terminates the simplification process. To avoid unwanted side effects of the
potentially large edge lengths produced by our iterative scheme, another parameter
lmax is used to terminate the iteration once the edge length li in the transition
region has reached lmax.
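The two termination criteria of the iterative scheme can be summarized by the sequence of target edge lengths it produces. The sketch below only models that schedule; the mesh operations themselves are not implemented, and the number of possible feature-region expansions is passed in as a hypothetical stand-in for the geometric test described above.

```python
def edge_length_schedule(l0, step, l_max, max_expansions):
    """Target edge lengths l_1, l_2, ... with l_{i+1} = l_i + step.

    The transition region is simplified once before the first expansion and
    once after every expansion of the feature region, and the iteration stops
    as soon as the next target would exceed l_max.
    """
    lengths = []
    for i in range(1, max_expansions + 2):
        target = l0 + i * step
        if target > l_max:
            break
        lengths.append(target)
    return lengths

# Limited by l_max: the feature region could still grow, but 2.5 is the cap.
capped = edge_length_schedule(l0=1.0, step=0.5, l_max=2.5, max_expansions=10)
# Limited by the geometry: only two expansions are possible.
exhausted = edge_length_schedule(l0=1.0, step=0.5, l_max=10.0, max_expansions=2)
```

Choosing `step` close to the edge length reached in the feature region, as recommended in the text, keeps consecutive targets from differing too strongly and thus limits the formation of needles.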
The parameter for the simplification of the feature region l0 , when using the level
set method, can be connected to the level-set and is chosen in concordance with
the minimal grid size Δt . When using meshes not originating from a level-set, this
parameter can be chosen by averaging the edge length of all feature vertices. We
have empirically determined that the step length sl should be approximately the
edge length of the feature region after the simplification with the parameter l0 stops.
A larger step length increases the number of edges that are removed. However, the
larger the difference between the edge length of the feature region and the step
length, the worse the triangle quality of the mesh.
3 Results
The simplification method has been evaluated in the context of process TCAD
in three ways: geometric distance to the original geometry, execution time of
the simplification method, and the execution time of a subsequent surface flux
calculation by ray tracing. In this study two example geometries have been analyzed
and each example geometry has been simplified applying eight different degrees of
simplification, resulting in a reduction of vertices from 20–90%. Figure 2 shows the
Fig. 2 Process TCAD surface meshes simplified with our method. (a) Surface 1 with 78% of the
vertices of the original mesh removed by our simplification method. (b) Surface 2 with 52% of the
vertices of the original mesh removed by our simplification method
two surface meshes after they have been simplified with our method. The original
surface meshes of Surface 1 and Surface 2 have 70,831 and 175,550 vertices,
respectively. The performance benchmarks presented in the following are based on a
serial C++ implementation of our method executed on a 64-bit GNU/Linux platform
equipped with an Intel Devil’s Canyon CPU.
Fig. 3 Hausdorff distance from each vertex of the original mesh to a mesh simplified with our
method and a mesh simplified using the Lindstrom-Turk algorithm. (a) Surface 1 with 78% of the
vertices of the original mesh removed by our simplification method. (b) Surface 2 with 52% of the
vertices of the original mesh removed by our simplification method
Fig. 4 Average simplification time of our method and the Lindstrom-Turk algorithm. The amount
of simplification denotes the number of vertices which have been removed from the original mesh.
(a) Surface 1. (b) Surface 2
Fig. 5 Execution time of Monte Carlo ray tracing using $10^8$ rays. The amount of simplification
denotes the number of vertices which have been removed from the original mesh. (a) Surface 1.
(b) Surface 2
on which particles move through the simulation domain (modeling the surface
flux) we use the Embree ray tracing library [12]. In Embree a bounding volume
hierarchy data structure [3, 12] is used to efficiently compute the paths on which
the particles move through space. The internal structure of the bounding volume
hierarchy depends eminently on the structure and the coarseness of the surface mesh.
Figure 5 shows the execution times measured to perform the Monte Carlo ray
tracing on the meshes with different degrees of simplification. As the simplified
meshes contain less triangles the bounding volume hierarchy data structure used for
ray tracing will have less elements than the data structure for the original mesh. As
the size of the data structure is decreased the memory footprint is reduced and this
leads to faster flux calculations because less data has to be processed and the caches
of the processor are used more effectively. Figure 5a and b show that the empirical
speedup in flux calculation depends on the shape of the surface mesh. When tracing
Surface 1, the meshes of both simplification methods perform approximately the
same and are faster than the original mesh. When tracing Surface 2, the meshes
generated by our simplification method clearly outperform the meshes simplified
with the Lindstrom-Turk algorithm and the original geometry. Surface 2 contains
deep trenches and the rays of the tracing algorithm need to travel towards the
bottom of these trenches. As the walls of the trenches do not have high curvature,
the bounding volume hierarchy data structure created from the mesh simplified
with our method is less complex within the deep trenches and hence the
traces of the rays down the trench can be computed with fewer operations.
Also, the rays which travel towards the bottom of the trench usually reflect off the
surface many times, which makes the reduction in computational effort from using
a bounding volume hierarchy built from a mesh simplified with our method even more
evident. Figure 5b for Surface 2 shows a speedup of about 12% compared to the
Lindstrom-Turk algorithm for simplification levels of 52 and 67%.
80 C. Lenz et al.
4 Summary
We introduce a new surface mesh simplification method that uses the curvature of
the surface mesh to identify regions which can be simplified with different sets of
parameters depending on the local surface properties. Our approach is well suited for
meshes that are common in flux-dependent process TCAD simulations since such
meshes often contain large flat regions with high resolutions from the originating
regular grid. We have evaluated our method with respect to geometric distances and
execution times for simplification and subsequent computations of flux estimates.
The geometric distances in the experiments have improved in comparison to the
reference algorithm. In particular, the average Hausdorff distance of the investigated
geometries has improved by 20–40%. The ray tracing time in all our experiments has
been improved on average by 15%. Furthermore, demanding real-world geometries
from process TCAD have shown an improvement of 12% in the time spent
on ray tracing. The execution time of our simplification method is on average
17% longer than that of the reference algorithm. When considering entire topography
simulations, the time saved by the accelerated ray tracing significantly exceeds the
additional time spent on our simplification method.
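The trade-off between the simplification overhead and the ray tracing gain can be illustrated with a rough calculation. The following Python sketch uses the 17% overhead and the 12% gain quoted in the summary; the absolute timings and the number of ray tracing passes are hypothetical placeholders chosen only for illustration, not measured values.

```python
def total_time(t_simplify, t_trace, n_passes, simplify_factor=1.0, trace_factor=1.0):
    """Cost of one simplification followed by n_passes ray tracing passes."""
    return simplify_factor * t_simplify + n_passes * trace_factor * t_trace


def break_even_passes(t_simplify, t_trace, overhead=0.17, gain=0.12):
    """Number of ray tracing passes after which the extra simplification time is amortized."""
    return overhead * t_simplify / (gain * t_trace)


# Hypothetical timings in seconds (assumptions, not taken from the experiments).
t_simplify_ref = 2.0   # simplification with the reference algorithm
t_trace_ref = 30.0     # one ray tracing pass on the reference-simplified mesh

reference = total_time(t_simplify_ref, t_trace_ref, n_passes=10)
ours = total_time(t_simplify_ref, t_trace_ref, n_passes=10,
                  simplify_factor=1.17, trace_factor=0.88)
break_even = break_even_passes(t_simplify_ref, t_trace_ref)
print(f"reference: {reference:.2f} s, ours: {ours:.2f} s, break-even passes: {break_even:.3f}")
```

Under these assumed timings the overhead is already amortized within the first ray tracing pass; with other timing ratios the break-even point shifts accordingly.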
Acknowledgments The financial support by the Austrian Federal Ministry for Digital and
Economic Affairs and the National Foundation for Research, Technology and Development is
gratefully acknowledged.
References
Abstract A peculiar geometry for a graphene double gate field effect transistor
is proposed. It allows us to overcome the problems encountered for a standard
MOSFET geometry due to the zero gap in monolayer graphene. It is found that for
a wide range of the gate voltage the current is in an off state, with an on/off
current ratio of about 10^4.
1 Introduction
As quoted in [1] “Graphene has changed from being the exclusive domain of
condensed-matter physicists to being explored by those in the electron-device
community. In particular, graphene-based transistors have developed rapidly and
are now considered an option for post-silicon electronics. However, many details
about the potential performance of graphene transistors in real applications remain
unclear.”
Device engineers devote considerable effort to developing transistor designs in
which short-channel effects are suppressed and series resistances are minimized.
Scaling theory predicts that a FET with a thin barrier and a thin gate-controlled
region will be robust against short-channel effects down to very short gate lengths.
The possibility of having channels that are just one atomic layer thick is perhaps
the most attractive feature of graphene for its use in transistors. The main drawback of
large-area monolayer graphene is the zero gap. As a consequence, the
current versus the gate voltage is no longer a monotone function and the off region
is very narrow (see [2]), so graphene cannot be used in a straightforward way for
transistors. Moreover, graphene on a substrate also suffers from a degradation of the
mobility because of the additional interaction with the phonons of the oxide.
Here we propose a special geometry for a double gate graphene FET (DG-GFET)
which overcomes the problem related to the zero gap, as will be shown by the
numerical simulations. The device is depicted in Fig. 1. The active area is made
of just one graphene layer.
Usually the GFETs are investigated by adopting reduced one dimensional models
of the Poisson equation with some averaging procedure [3, 4]. Here a full two-
dimensional simulation is presented based on a drift-diffusion-Poisson system with
the mobilities proposed in [5].
Other approaches are based on hydrodynamical models, e.g. those deduced with
the maximum entropy principle [6–9], or the direct solution of the Boltzmann
equation [10–14] or Monte Carlo methods [15]. Thermal effects can also be
included [16–21]. Here the crystal lattice will be kept at a constant temperature
and considered as a thermal bath. For the inclusion of quantum effects the interested
reader is referred to [22, 23].
2 Mathematical Model
The mathematical model we adopt to simulate the charge transport in the graphene
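layer of the DG-GFET is the bipolar drift-diffusion model in the 1D case,
\[ \frac{\partial n}{\partial t} - \frac{1}{e} \frac{\partial}{\partial x} \left( \mu_n k_B T_L \frac{\partial n}{\partial x} - e n \mu_n \frac{\partial \phi}{\partial x} \right) = 0, \]
\[ \frac{\partial p}{\partial t} + \frac{1}{e} \frac{\partial}{\partial x} \left( -\mu_p k_B T_L \frac{\partial p}{\partial x} - e p \mu_p \frac{\partial \phi}{\partial x} \right) = 0, \]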
Simulations of a Novel DG-GFET 85
where $n(t, x)$ and $p(t, x)$ are the graphene electron density and hole density, respectively,
$e$ is the positive elementary charge, $k_B$ is the Boltzmann constant, $T_L$ is the
lattice temperature (kept constant), $\mu_n(x)$ and $\mu_p(x)$ are the mobility models for
electrons and holes, respectively, and $\phi(x, y)$ is the electric potential. We adopt the
mobility model proposed in [5] (for other models the interested reader is referred to
[2, 24, 25]) given by
\[ \mu_s(x) = \frac{\nu_s}{\left[ 1 + (\nu_s E / v_{sat})^{\gamma} \right]^{1/\gamma}}, \]
where $E = |\partial\phi/\partial x|$ is the absolute value of the x-component of the electric field,
$v_{sat}$ is the saturation velocity (we take the value 0.2 µm/ps), $\gamma \approx 2$ and
\[ \nu_s(x) = \frac{\mu_0}{(1 + s/n_{ref})^{\alpha}}, \]
where $\mu_0 = 0.4650$ µm^2/(V ps) is the low field mobility, $n_{ref} = 1.1 \times 10^5$ µm^-2
and $\alpha = 2.2$. The symbol $s$ indicates the carrier density: $s = n$ for electrons and
$s = p$ for holes.
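As an illustration, the mobility model above can be evaluated as in the following Python sketch. The parameter values are the ones quoted in the text (with γ set to 2); the function names and the use of NumPy are assumptions made for this example and are not taken from the authors' code.

```python
import numpy as np

# Parameter values quoted in the text (units: micrometres, picoseconds, volts).
MU_0 = 0.4650   # low field mobility [um^2 / (V ps)]
N_REF = 1.1e5   # reference density [um^-2]
ALPHA = 2.2
V_SAT = 0.2     # saturation velocity [um / ps]
GAMMA = 2.0     # the text gives gamma ~ 2


def low_field_mobility(density):
    """nu_s = mu_0 / (1 + s / n_ref)^alpha for a carrier density s [um^-2]."""
    return MU_0 / (1.0 + np.asarray(density, dtype=float) / N_REF) ** ALPHA


def mobility(density, e_field):
    """mu_s = nu_s / [1 + (nu_s E / v_sat)^gamma]^(1/gamma) with E = |dphi/dx| [V/um]."""
    nu = low_field_mobility(density)
    e_abs = np.abs(np.asarray(e_field, dtype=float))
    return nu / (1.0 + (nu * e_abs / V_SAT) ** GAMMA) ** (1.0 / GAMMA)


# At vanishing field and density the model returns mu_0; at very high fields
# the drift velocity mu_s * E approaches the saturation velocity v_sat.
mu_low = mobility(0.0, 0.0)
v_high = mobility(1.0e3, 1.0e6) * 1.0e6
```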
In order to determine the electric potential, a 2D Poisson equation is coupled to
the drift-diffusion system,
\[ \nabla \cdot (\epsilon \nabla \phi) = h, \]
where
\[ h(x, y) = \begin{cases} e\,(n(x) - p(x) - N_{imp})/t_{gr} & \text{if } y = y_{gr}, \\ 0 & \text{if } y \neq y_{gr}, \end{cases} \]
$y_{gr}$ being the y-coordinate of the graphene layer (see Fig. 1), $N_{imp} = 3.5 \times 10^3$ µm^-2 the impurity density
due to the SiO2, and $t_{gr}$ the distance between the two layers of oxide, which is assumed
to be equal to 1 nm. We remark that the charge in the graphene layer is considered
distributed in the volume enclosed by the parallelepiped whose base is the area of the
graphene and whose height is $t_{gr}$. Recall that $n$ and $p$ are areal densities. Moreover, $\epsilon$ is given
by
\[ \epsilon(x, y) = \begin{cases} \epsilon_{gr} & \text{if } y = y_{gr}, \\ \epsilon_{ox} & \text{if } y \neq y_{gr}, \end{cases} \]
where $\epsilon_{gr}$ and $\epsilon_{ox}$ are the dielectric constants of the graphene and oxide, respectively.
The source and drain contacts are assumed to be thermal bath charge reservoirs.
86 G. Nastasi and V. Romano
3 Numerical Results
Here some numerical results are presented in order to show that the proposed
DG-GFET is able to perform as a transistor. The length of the device is 100 nm. The width of both
oxide layers (SiO2) is 10 nm. The source and drain contacts are positioned in the
direction transversal to the graphene sheet and they occupy the whole device
height (21 nm). The two gate potentials are set equal. At the metallic contacts the
total voltage including the work function is considered equal to 0.25 V plus the bias
voltage, which is zero at the source. Indeed, the work function depends on the specific
material the contacts are made of.
A full 2D discretization of the Poisson equation is adopted in the whole device
by standard central differencing combined with a Gummel iteration, while the
drift-diffusion equation is solved only in the graphene sheet as a 1D problem with
a Scharfetter-Gummel method (indeed only one row of grid points is used by
considering a kind of average in the y direction). The interested reader is referred to
[2] for the details. Numerical experiments show that a good resolution is already obtained
with 41×23 grid points.
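For readers unfamiliar with the Scharfetter-Gummel method mentioned above, the following Python sketch shows the standard form of the interface flux for the electron equation on a uniform 1D grid. It is a minimal illustration and not the authors' implementation: the function names are hypothetical, the mobility is assumed constant across the grid, and the flux is the quantity $\mu_n (V_T\, \partial n/\partial x - n\, \partial\phi/\partial x)$ with the thermal voltage $V_T = k_B T_L / e$.

```python
import numpy as np


def bernoulli(x):
    """Bernoulli function B(x) = x / (exp(x) - 1), evaluated stably near x = 0."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-8
    safe = np.where(small, 1.0, x)  # avoids 0/0 in the branch that is discarded
    return np.where(small, 1.0 - 0.5 * x, safe / np.expm1(safe))


def sg_electron_flux(n, phi, dx, mobility, thermal_voltage):
    """Scharfetter-Gummel flux at the interfaces i+1/2 of a uniform grid."""
    n = np.asarray(n, dtype=float)
    d = np.diff(np.asarray(phi, dtype=float)) / thermal_voltage
    return mobility * thermal_voltage / dx * (bernoulli(d) * n[1:] - bernoulli(-d) * n[:-1])


# Check 1: a density in thermal equilibrium, n ~ exp(phi / V_T), carries no flux.
vt = 0.0259
phi = np.linspace(0.0, 0.3, 11)
flux_eq = sg_electron_flux(np.exp(phi / vt), phi, 0.01, 0.465, vt)

# Check 2: for a flat potential the flux reduces to pure diffusion.
flux_diff = sg_electron_flux([1.0, 2.0], [0.0, 0.0], 0.5, 2.0, vt)
```

The exponential fitting built into the Bernoulli weights is what keeps the discrete flux exact for equilibrium profiles, which a plain central difference does not achieve on coarse grids.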
In Figs. 2 and 3 the electric potential is plotted when the source-drain
potential is 0.3 V and the gate-source potential is −1 V and 1 V, respectively. In
the first case the device is off, while in the second case it is on. Figure 4 shows the
characteristic curve of current versus gate voltage with source-drain voltage equal to
0.2 V, while Fig. 5 shows the same on a logarithmic scale.
Fig. 2 Electrostatic potential when the gate-source potential is −1 V and the source-drain-
potential is 0.3 V
Fig. 3 Electrostatic potential when the gate-source potential is 1 V and the source-drain-potential
is 0.3 V
[Plot for Fig. 4: current density (A/cm) versus gate voltage (V) from −1 to 1; curves for Vsd = 0.1 V, 0.2 V, 0.3 V, 0.4 V]
Fig. 4 Current versus gate voltage for several values of the bias voltage
[Plot for Fig. 5: current density (A/cm), logarithmic scale from 10^-8 to 10^2, versus gate voltage (V) from −1 to 1; curves for Vsd = 0.1 V, 0.2 V, 0.3 V, 0.4 V]
Fig. 5 Current versus gate voltage for several values of the bias voltage in logarithmic scale
[Plot for Fig. 6: current density (A/cm) from 0 to 2.5; horizontal axis from 0 to 1 V, labeled "Gate voltage (V)" in the plot; curves for VG = 0.1 V, 0.3 V, −0.1 V, −0.3 V]
Fig. 6 Current versus bias voltage for several values of the gate voltage
[Plot for Fig. 7: current density (A/cm), logarithmic scale from 10^-10 to 10^0; horizontal axis from 0 to 1 V, labeled "Gate voltage (V)" in the plot; curves for VG = 0.1 V, 0.3 V, −0.1 V, −0.3 V]
Fig. 7 Current versus bias voltage for several values of the gate voltage in logarithmic scale
4 Conclusions
A novel geometry for a double gate FET with active area made of a single layer of
graphene has been proposed and simulated with a drift-diffusion model by solving
a full 2D Poisson equation for the electrostatic potential. The results are rather
encouraging because a good transistor effect is obtained, in contrast to other
GFETs proposed in the literature. Simulations based on more sophisticated
models are currently under investigation by the authors in order to get a further
validation of the proposed device.
Acknowledgments The authors acknowledge the support from INdAM (GNFM). The author
G.N. acknowledges the support from Progetto Giovani GNFM 2019 Modelli matematici, numerici
e simulazione del trasporto di cariche e fononi nel grafene. This work has also been supported by
the Università degli Studi di Catania, Piano della Ricerca 2018/2020 Linea di intervento 2.
References
14. M. Coco, G. Nastasi, Simulation of bipolar charge transport in graphene on h-BN. COMPEL
39(2), 449–465 (2020)
15. M. Coco, A. Majorana, V. Romano, Cross validation of discontinuous Galerkin method and
Monte Carlo simulations of charge transport in graphene on substrate. Ricerche mat. 66, 201–
220 (2017)
16. M. Coco, G. Mascali, V. Romano, Monte Carlo analysis of thermal effects in monolayer
graphene. J. Comput. Theor. Trans. 45(7), 540–553 (2016)
17. M. Coco, V. Romano, Simulation of electron–phonon coupling and heating dynamics in
suspended monolayer graphene including all the phonon branches. J. Heat Transfer. 140(9),
092404 (2018)
18. M. Coco, V. Romano, Assessment of the constant phonon relaxation time approximation in
electron-phonon coupling in graphene. J. Comput. Theor. Trans. 7(1–3), 246–266 (2018)
19. G. Mascali, V. Romano, Charge transport in graphene including thermal effects. SIAM J. Appl.
Math. 77, 593–613 (2017)
20. G. Mascali, V. Romano, Exploitation of the maximum entropy principle in mathematical
modeling of charge transport in semiconductors. Entropy 19(1), 36 (2017)
21. G. Mascali, V. Romano, A hierarchy of macroscopic models for phonon transport in graphene.
Physica A 548, 124489 (2020)
22. O. Morandi, F. Schürrer, Wigner model for quantum transport in graphene. J. Phys. A Math.
Theor. 44(26), 265301 (2011)
23. L. Luca, V. Romano, Quantum corrected hydrodynamic models for charge transport in
graphene. Ann. Phys. 406, 30–53 (2019)
24. A. Majorana, G. Mascali, V. Romano, Charge transport and mobility in monolayer graphene.
J. Math. Industry 7(4), 4 (2016)
25. G. Nastasi, V. Romano, Improved mobility models for charge transport in graphene. Commun.
Appl. Ind. Math. 10(1), 41–52 (2019)
26. G. Nastasi, V. Romano, Simulation of graphene field effect transistors, in Scientific Computing
in Electrical Engineering, SCEE 2018, Taormina, September 23–27, ed. by G. Nicosia, V.
Romano. Mathematics in Industry, vol. 32 (Springer Nature, Switzerland AG, 2020), pp. 171–
178
Part III
Computational Electromagnetics
Electric Circuit Element Boundary
Conditions in the Finite Element Method
for Full-Wave Frequency Domain Passive
Devices
1 Motivation
Many EM devices with distributed parameters and field effects specific to full-wave
(FW) or Magneto-Quasi-Static (MQS) EM field regime are connected to circuits
with lumped parameters (e.g. in measuring and control applications). For this, the
EM devices need boundary conditions compatible with external circuits (Fig. 1, left).
Fig. 1 Left: Coupling of electric circuits and EM device models is naturally ensured by means
of terminals. Right: To ensure the coupling, “node voltages” (potentials) and electric currents of
non-isolated circuits must have a correspondent in the EM device model
By definition, an isolated electric circuit has a finite number of components
connected to common terminals. Each terminal is characterized by its voltage with
respect to the ground. A non-isolated circuit, i.e. a sub-circuit with m terminal nodes,
has each of these terminals characterized by a pair of scalar quantities: a current $i_k$
entering into the sub-circuit and a “node voltage” (potential) $v_k$ (Fig. 1, right). The
power transferred to it is
\[ P = \sum_{k=1}^{m} i_k v_k = \sum_{k=1}^{m-1} i_k (v_k - v_m) = \sum_{k=1}^{m-1} i_k v_k \tag{1} \]
if $i_m$ is expressed according to Kirchhoff's current law for a cutset and terminal m
is connected to ground. This power expression shows that the state of an m-terminal
circuit is characterized by 2(m−1) independent quantities: m−1 currents and m−1
voltages. The assumption $v_m = 0$ is not a restriction for the purpose of this paper,
which is stated at the end of Sect. 2. A natural coupling of this sub-circuit with
an EM device is possible if some connecting surfaces are defined on the device's
boundary, for which currents and potentials are defined, in order to satisfy Kirchhoff's
relationships and provide the same transmitted power formula (1) as sub-circuits do.
The conditions that satisfy these requirements are the ones proposed in [10], used
in [4, 8] and called Electric Circuit Element (ECE) boundary conditions.
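The equalities in the power formula (1) can be checked numerically. The short Python sketch below draws random terminal currents and potentials (purely illustrative data, not from the paper), enforces Kirchhoff's current law on the last terminal, and compares the full sum over m terminals with the reduced sum over m−1 terminals.

```python
import numpy as np


def terminal_power(currents, potentials):
    """P = sum_k i_k v_k over the given terminals."""
    return float(np.dot(currents, potentials))


rng = np.random.default_rng(0)
m = 5
i_free = rng.normal(size=m - 1)               # m-1 independent terminal currents
currents = np.append(i_free, -i_free.sum())   # i_m follows from Kirchhoff's current law
potentials = rng.normal(size=m)               # arbitrary node potentials v_1 .. v_m

p_full = terminal_power(currents, potentials)
p_reduced = terminal_power(i_free, potentials[:-1] - potentials[-1])
# Shifting all potentials by a constant does not change the power either.
p_shifted = terminal_power(currents, potentials + 3.0)
```

Because the currents sum to zero, only potential differences matter, which is why grounding terminal m loses no generality.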
The ECE boundary conditions, combined with current excited terminals, are the
“realistic boundary conditions” used in [1] to solve eddy current problems with the
finite element method (FEM) using a formulation in H, and in [2] using an ungauged
T − ϕ, ϕ formulation. Similar conditions, although with a different definition for the terminal
voltages, are proposed in [5] and used for A, V eddy current formulations [7].
The use of ECE in MQS problems for inductance extraction with an A, V
formulation is discussed in [9]. Our aim is to use ECE boundary conditions to
solve full-wave (FW) problems with FEM. We have successfully used ECE to model
passive on-chip components such as resistors, inductors, capacitors, interconnects or
RF-MEMS switches in FW [3], with the Finite Integration Technique as numerical
method. To our knowledge, the ECE conditions are not available in FEM
codes which implement the formulation of microwave ports for FW. Theoretical
studies exist, e.g. in [4], based on an E, V formulation for the whole domain. In
this paper we use E strictly inside the domain and V solely on the boundary. During
the reviewing process of this paper, Hiptmair and Ostrowski released a relevant
report [6], confirming the interest in this subject.
Fig. 3 Each non-grounded terminal of the EM device with ECE boundary conditions can be either
current excited or voltage excited. Its hybrid transfer matrix is obtained after computing voltages
of the current excited terminals and currents of the voltage excited terminals in linear problems
by (2), and thus the ECE boundary conditions are perfectly compatible with the
power transferred through its terminals by a multipolar circuit [8, 10].
If we assume that the terminals have known potentials, then it can be proved that
the problem of EM field analysis in a linear domain with ECE boundary conditions
has a unique solution. Consequently, the terminal currents are output signals and
are obtained by solving the field problem [10]. As the domain is linear, so are
the equations, hence the device with ECE conditions is a linear system, defining
a multiple input multiple output (MIMO) type dynamic system with m − 1 inputs
and m − 1 outputs (Fig. 3).
In the frequency domain, the input-output relationship is expressed as:
\[ \begin{bmatrix} V_1 & \dots & V_n & I_{n+1} & \dots & I_{m-1} \end{bmatrix}^T = \begin{bmatrix} Z & A \\ B & Y \end{bmatrix} \begin{bmatrix} I_1 & \dots & I_n & V_{n+1} & \dots & V_{m-1} \end{bmatrix}^T. \tag{3} \]
The problem to be solved is: “Find $\begin{bmatrix} Z(f) & A(f) \\ B(f) & Y(f) \end{bmatrix}$, where $f$ is the frequency in
a given frequency range of interest, defined by its minimum and maximum values
$f_{min}$ and $f_{max}$, $f \in [f_{min}, f_{max}]$, from the EM field solution.” If this hybrid matrix is
known, then the “field” element can be realized with common circuit elements and
included in any circuit simulator.
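To make the block structure of the hybrid matrix in (3) concrete, the following Python sketch builds it from a full impedance matrix by partial inversion. This starting point is an assumption made for illustration only (in the paper the hybrid matrix is obtained from the field solution), and the function name is hypothetical. The first n terminals are taken as current excited, the remaining ones as voltage excited.

```python
import numpy as np


def hybrid_from_impedance(z_full, n_current):
    """Hybrid matrix [[Z, A], [B, Y]] from a full impedance matrix with V = z_full @ I.

    The first n_current terminals are current excited, the rest voltage excited.
    """
    c = slice(0, n_current)
    v = slice(n_current, z_full.shape[0])
    z_cc, z_cv = z_full[c, c], z_full[c, v]
    z_vc, z_vv = z_full[v, c], z_full[v, v]
    y = np.linalg.inv(z_vv)   # admittance block seen at the voltage excited terminals
    a = z_cv @ y
    b = -y @ z_vc
    z = z_cc - a @ z_vc       # Schur complement of z_vv
    return np.block([[z, a], [b, y]])


# Consistency check with random (illustrative) data for m - 1 = 4 terminals.
rng = np.random.default_rng(1)
z_full = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)) + 5.0 * np.eye(4)
currents = rng.normal(size=4) + 1j * rng.normal(size=4)
voltages = z_full @ currents
n_c = 2
hybrid = hybrid_from_impedance(z_full, n_c)
outputs = hybrid @ np.concatenate([currents[:n_c], voltages[n_c:]])
expected = np.concatenate([voltages[:n_c], currents[n_c:]])
```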
3 ECE in FEM
∂Ω, either exclusively $E_t$ or $H_t$ are known (given). The subscript t indicates the
tangential component of the vector on the surface. It is useful to denote a disjoint
partition of the boundary: $\partial\Omega = S_E \cup S_H$, $S_E \cap S_H = \emptyset$, and thus $E_t : S_E \to \mathbb{C}^2$,
$H_t : S_H \to \mathbb{C}^2$. The imposed boundary conditions are: $E_t(r) = n \times (E(r) \times n)$, for
$r \in S_E$ and $H_t(r) = n \times (H(r) \times n)$, for $r \in S_H$. In what follows we will name them
classical boundary conditions. The uniqueness of the field solution can be proven
on the basis of the complex form of Poynting's theorem, which gives the expression
of the transmitted power (assuming a linear field domain, with no moving parts):
\[ -\oint_{\partial\Omega} (E_t \times H_t^*) \cdot n \, ds = \int_{\Omega} E \cdot J^* \, dx + 2j\omega \int_{\Omega} \left( \frac{B \cdot H^*}{2} - \frac{E \cdot D^*}{2} \right) dx. \tag{4} \]
The proof assumes that there exist two such fields that satisfy the same boundary
conditions. This means that the Poynting theorem in complex form is valid for
the difference field, which satisfies Maxwell's equations (due to linearity) and
zero boundary conditions. This implies that the real part is zero, which leads
to a zero difference electric field (the conductivity of the domain is assumed non-zero
everywhere), and that the imaginary part is zero, which leads to a zero difference
magnetic field.
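The second order equation is:
\[ \nabla \times (\nu \nabla \times E) + j\omega(\sigma + j\omega\varepsilon) E = 0. \tag{5} \]
Multiplying (5) by a test function $E'$, integrating by parts and replacing the expression
of the magnetic field strength in the right hand side, we get
\[ \int_{\Omega} \left[ (\nu \nabla \times E) \cdot (\nabla \times E') + j\omega(\sigma + j\omega\varepsilon) E \cdot E' \right] dx = j\omega \oint_{\partial\Omega} (H \times E') \cdot n \, ds. \tag{6} \]
With classical boundary conditions, the boundary integral on the right hand side is equal to
$\int_{S_E} (E'_t \times n) \cdot H \, ds + \int_{S_H} (n \times H_t) \cdot E' \, ds$. $E_t$ are essential boundary conditions, which is why the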
100 G. Ciuprina et al.
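test functions $E'$ are chosen so that $E'_t$ is zero on $S_E$. Thus, the weak equation for the
trial functions $E$ is:
\[ \int_{\Omega} \left[ (\nu \nabla \times E) \cdot (\nabla \times E') + j\omega(\sigma + j\omega\varepsilon) E \cdot E' \right] dx = j\omega \int_{S_H} (n \times H_t) \cdot E' \, ds. \tag{7} \]
The boundary conditions $H_t$ are natural; they appear in the functional equation.
In conclusion, the weak formulation in E with classical boundary conditions is:
Find $E$ in $H$, such that $a(E, E') = f(E')$, $\forall E' \in H_0$, where
\[ a(E, E') = \int_{\Omega} \left[ (\nu \nabla \times E) \cdot (\nabla \times E') + j\omega(\sigma + j\omega\varepsilon) E \cdot E' \right] dx, \tag{8} \]
\[ f(E') = j\omega \int_{S_H} (n \times H_t) \cdot E' \, ds, \tag{9} \]
\[ H = \{ u \in H(\mathrm{curl}, \Omega) \mid n \times (u \times n) = E_t \text{ on } S_E \}, \tag{10} \]
\[ H_0 = \{ u \in H(\mathrm{curl}, \Omega) \mid n \times (u \times n) = 0 \text{ on } S_E \}. \tag{11} \]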
(12)
The initial assembling is carried out for all the edges in the domain. The next step
refers to the boundary conditions. Assume that the edges were numbered in the
following order: first—the inner edges, second—the edges on the boundary SH and
finally, the edges on the boundary $S_E$. This leads to the following partitioning:
\[ \begin{bmatrix} A_{in-in} & A_{in-SH} & A_{in-SE} \\ A_{SH-in} & A_{SH-SH} & A_{SH-SE} \\ A_{SE-in} & A_{SE-SH} & A_{SE-SE} \end{bmatrix} \begin{bmatrix} U_{in} \\ U_{SH} \\ U_{SE} \end{bmatrix} = \begin{bmatrix} 0 \\ b_{SH} \\ 0 \end{bmatrix} \tag{13} \]
The group of equations that correspond to edges on the $S_E$ boundary is deleted and
the essential boundary conditions $E_t$ are translated into imposed values of electric
voltages along edges on the $S_E$ boundary. The system to be solved is
\[ \begin{bmatrix} A_{in-in} & A_{in-SH} \\ A_{SH-in} & A_{SH-SH} \end{bmatrix} \begin{bmatrix} U_{in} \\ U_{SH} \end{bmatrix} = \begin{bmatrix} 0 \\ b_{SH} \end{bmatrix} - \begin{bmatrix} A_{in-SE} \\ A_{SH-SE} \end{bmatrix} U_{SE}, \tag{14} \]
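The elimination of the essential unknowns that leads from the partitioned system (13) to the reduced system (14) can be sketched as follows. This is a minimal Python illustration under simplifying assumptions: a small dense matrix stands in for the assembled FEM matrix, the essential unknowns are numbered last, and the function name is hypothetical.

```python
import numpy as np


def eliminate_essential_dofs(a_full, b_full, n_free, u_essential):
    """Solve for the free unknowns when the last unknowns have imposed values.

    The rows belonging to the essential unknowns are dropped and their known
    values are moved to the right hand side, as in the reduced system (14).
    """
    a_ff = a_full[:n_free, :n_free]   # blocks coupling free unknowns with each other
    a_fe = a_full[:n_free, n_free:]   # blocks coupling free and essential unknowns
    rhs = b_full[:n_free] - a_fe @ u_essential
    u_free = np.linalg.solve(a_ff, rhs)
    return np.concatenate([u_free, u_essential])


# Illustrative check with a manufactured solution (random data, 4 free + 2 essential unknowns).
rng = np.random.default_rng(2)
a_full = rng.normal(size=(6, 6)) + 6.0 * np.eye(6)
u_exact = rng.normal(size=6)
b_full = a_full @ u_exact
u_solved = eliminate_essential_dofs(a_full, b_full, 4, u_exact[4:])
```

In an actual FEM code the matrix would be sparse and a sparse solver would replace the dense solve; the block manipulation stays the same.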
where $I_c$ is the set of indices of current excited terminals. Similarly, we will denote
by $I_v$ the set of indices of voltage excited terminals. We need an equation for the
electric potential on the boundary as well. Let us denote the normal component of
the total current density at any point on the boundary as $J_n \overset{\text{not}}{=} (\nabla \times H) \cdot n$. We will
project $J_n$ onto a set of scalar test functions $V'$:
\[ \oint_{\partial\Omega} (\nabla \times H) \cdot n \, V' \, ds = \oint_{\partial\Omega} J_n V' \, ds \overset{\text{(ECE2)}}{=} \sum_{k=1}^{m} \int_{S_k} J_n V' \, ds = \sum_{k \in I_c} V'_k I_k \]
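The integrand of the left hand side can be further computed by using the
integration by parts formula that involves the surface differential operators and the
substitution of the magnetic field with its expression with respect to the electric
field, as it follows from Faraday's law:
\[ \oint_{\partial\Omega} (\nabla \times H) \cdot n \, V' \, ds = \oint_{\partial\Omega} V' \, n \cdot \mathrm{curl}\, H \, ds \overset{\text{def}}{=} \oint_{\partial\Omega} V' \, \mathrm{div}_2(H) \, ds = \oint_{\partial(\partial\Omega)} V' (n \times H) \, ds - \oint_{\partial\Omega} H \cdot \mathrm{grad}_2 V' \, ds = \oint_{\partial\Omega} \frac{\nu}{j\omega} \mathrm{curl}\, E \cdot \mathrm{grad}_2 V' \, ds \]
Consequently it follows that the weak form of the equation on the boundary is
\[ \oint_{\partial\Omega} (\nu \nabla \times E) \cdot \nabla_2 V' \, ds = j\omega \sum_{k \in I_c} V'_k I_k \tag{16} \]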
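where
\[ a(E, E') = \int_{\Omega} \left[ (\nu \nabla \times E) \cdot (\nabla \times E') + j\omega(\sigma + j\omega\varepsilon) E \cdot E' \right] dx, \qquad f(E') = j\omega \sum_{k \in I_c} V'_k I_k; \]
\[ b(E, V') = \oint_{\partial\Omega} (\nu \nabla \times E) \cdot \nabla_2 V' \, ds, \qquad g(V') = j\omega \sum_{k \in I_c} V'_k I_k; \]
\[ H_{V,0} = \{ u \in H(\mathrm{grad}, \partial\Omega) \mid u = 0 \text{ on } S_k, k \in I_v; \; u = \text{constant (unknown) on } S_k, k \in I_c \} \]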
After solving, we get the unknown potentials $V$ and $V_{t,c}$. The currents through
the terminals in $I_v$ can be computed as a postprocessing step.
Figure 4 shows a quantitative validation for a simple 2D case with two terminals
and an analytical solution. It is a single input single output (SISO) system; both
current and voltage excitations give accurate results. The domain is a brick that
occupies the space $x \in [-a, a]$, $y \in [0, l]$ and $z \in [0, h]$. One excited terminal
(in voltage or in current) is on the z = 0 boundary and the grounded terminal is on
the z = h boundary. The material inside is assumed homogeneous with ε, μ, σ.
The analytic solution can be obtained by solving the Helmholtz equations and
considering the current excited terminal (I). The complex power absorbed by this
[Plots for Fig. 4: R [Ω] (0 to 0.04) and L [H] (scale ×10^-13) versus frequency [GHz] from 0 to 100; curves: Analytic, Voltage excitation, Current excitation]
Fig. 4 Quantitative validation of the implementation for a 2D case with analytical solution. The
problem is a rectangle with two opposite terminals, consequently the system is SISO. Both voltage
and current excitations lead to relative errors less than 2% for the whole frequency range
Fig. 5 Qualitative validation for a 2D case, MIMO (3 terminals), hybrid excitation (one terminal
grounded, one is voltage excited and one is current excited)
4 Conclusions
The advantages of ECE BC for Maxwell equations are that the ports are clearly and
unambiguously defined and fully compatible with the circuit terminals. There
is no restriction on the field regime (full wave, nonlinear). For MIMO systems, the
hybrid excitation is obtained in a natural way. This paper proposed a FEM algorithm
for ECE, which uses E strictly inside the domain and V on the boundary. The degrees of
freedom are the electric voltages on the inner edges and the potentials of the floating
nodes on the boundary (nodes outside terminals and current excited terminals). Our
next research will compare the three mentioned formulations.
References
1 Introduction
with e, h and d, b denoting the electric and magnetic fields and fluxes, respectively,
which are mutually related by the constitutive relations
\[ b = \mu_0 h, \qquad d = \epsilon_0 \epsilon_\infty e + p. \tag{2} \]
the high frequency limit of the relative permittivity. Further, p denotes the memory
J. Dölz
University of Twente, Enschede, Netherlands
e-mail: [email protected]
H. Egger () · V. Shashkov
TU Darmstadt, Darmstadt, Germany
e-mail: [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 107
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_11
108 J. Dölz et al.
We further assume throughout the paper that the susceptibility kernel $\chi$ can be
written as a superposition of simple Debye functions [4], i.e.,
\[ \hat\chi(s) = \sum_i \hat\chi_i(s) \quad \text{with} \quad \hat\chi_i(s) = \frac{\epsilon_{i,s} - \epsilon_{i,\infty}}{1 + s\tau_i}, \tag{5} \]
where $\tau_i$ denotes the relaxation time and $\epsilon_{i,s} > \epsilon_{i,\infty}$ are the static and
high-frequency limits of the electric susceptibility of the $i$th component with
$\sum_i \epsilon_{i,\infty} = \epsilon_\infty$. Such multipole Debye models have been used, e.g., for the modeling of the
dielectric response of biological tissue; see [2, 5] and the references given there. In
general, the summation in (5) may be over infinitely many terms.
One of the key features of the multipole Debye model is its provable passivity,
which follows from the energy–dissipation principle [1, 12]
\[ \frac{d}{dt} \mathcal{E} = -\sum_i \frac{\tau_i}{\epsilon_0 (\epsilon_{i,s} - \epsilon_{i,\infty})} \|\partial_t p_i\|^2, \tag{6} \]
where $\mathcal{E}$ denotes the electromagnetic energy of the system. Due to the rational structure
of the transfer functions $\hat\chi_i$, the individual polarizations $p_i$ can be characterized
equivalently by the differential equations
\[ \tau_i \partial_t p_i + p_i = \epsilon_0 (\epsilon_{i,s} - \epsilon_{i,\infty}) e, \tag{8} \]
A Convolution Quadrature Method for Maxwell’s Equations in Dispersive Media 109
with initial values $p_i(0) = 0$, which is the basis for various simulation methods.
Corresponding finite difference and finite element schemes have been considered,
for instance, in [1, 6, 9, 10, 12, 13, 17, 20]. Let us note that with an increasing number
of internal states $p_i$, all methods become computationally more and more expensive.
In this paper, we consider a different approach for the numerical solution of (1)–
(3), which allows us to compute the time evolution of e, h, and p without explicitly
computing the internal states pi . As indicated in [8], this can be accomplished
through discretization of the integral (4) by means of appropriate convolution
quadratures [14, 16], instead of integrating (8) with time-differencing schemes. The
complexity of every time step is then independent of the number of internal states
pi . Moreover, using ideas of [18, 19], the additional memory cost for storing the
history of the field e can be reduced to the logarithm of the number of time steps.
\[ M_h d_\tau h^n + C e^n = 0, \tag{9} \]
\[ M_e d_\tau e^{n+1/2} + \sum_i d_\tau p_i^{n+1/2} - C^T h^{n+1/2} = 0, \tag{10} \]
\[ M_{d,i} d_\tau p_i^{n+1/2} + M_{p,i} p_i^{n+1/2} = e^{n+1/2}, \qquad i \ge 1. \tag{11} \]
The equations hold for all $n \ge 0$ and are complemented by appropriate initial
conditions. Note that $e^n$ and $h^{n+1/2}$ are the approximations for $e(t^n)$ and $h(t^{n+1/2})$
at staggered grid points $t^r = r\tau$ with $\tau$ denoting the time step size. Furthermore,
$d_\tau e^{n+1/2} = \frac{1}{\tau}(e^{n+1} - e^n)$ and $d_\tau h^n = \frac{1}{\tau}(h^{n+1/2} - h^{n-1/2})$ are the central
difference quotients, and $e^{n+1/2} = \frac{1}{2}(e^{n+1} + e^n)$, $h^{n+1/2} = \frac{1}{2}(h^{n+1} + h^n)$ and
$p^{n+1/2} = \frac{1}{2}(p^{n+1} + p^n)$ are the averages of two consecutive steps. Further note that
Eq. (11) was obtained from (8) after dividing by $\epsilon_0 (\epsilon_{i,s} - \epsilon_{i,\infty})$.
For appropriate space discretization schemes, the mass matrices Mh , Me are
symmetric, positive-definite, and diagonal or block-diagonal [3, 7], such that (9)–
(11) amounts to an explicit time-stepping scheme. Moreover, the method satisfies
the following discrete equivalent of the underlying energy–dissipation identity.
Lemma 1 Set $\|a\|_M^2 = (a, a)_M$ and $(a, b)_M = b^T M a$, and denote by
\[ \mathcal{E}^n = \frac{1}{2} \Big[ (h^{n+1/2}, h^{n-1/2})_{M_h} + \|e^n\|_{M_e}^2 + \sum_i \|p_i^n\|_{M_{p,i}}^2 \Big] \]
the discrete energy at time step $t^n = n\tau$. Then any solution of (9)–(11) satisfies
\[ d_\tau \mathcal{E}^{n+1/2} = -\sum_i \|d_\tau p_i^{n+1/2}\|_{M_{d,i}}^2, \qquad n \ge 0. \]
\[ d_\tau \mathcal{E}^{n+1/2} = \frac{1}{2} (d_\tau h^{n+1} + d_\tau h^n, h^{n+1/2})_{M_h} + (d_\tau e^{n+1/2}, e^{n+1/2})_{M_e} + \sum_i (d_\tau p_i^{n+1/2}, p_i^{n+1/2})_{M_{p,i}}. \]
Note that $(a, b)_M = (Ma, b) = (Mb, a)$ where $(\cdot, \cdot)$ denotes the Euclidean scalar
product. We then test equation (10) with $e^{n+1/2}$ and (11) with $d_\tau p_i^{n+1/2}$. Moreover,
we test the average of Eq. (9) for step n and n+1 with $h^{n+1/2}$. This allows us to replace
all terms on the right hand side of the above formula and leads to
\[ d_\tau \mathcal{E}^{n+1/2} = -(C e^{n+1/2}, h^{n+1/2}) + \Big( C^T h^{n+1/2} - \sum_i d_\tau p_i^{n+1/2}, e^{n+1/2} \Big) + \sum_i (e^{n+1/2} - M_{d,i} d_\tau p_i^{n+1/2}, d_\tau p_i^{n+1/2}). \]
Using that $(Ca, b) = (C^T b, a)$, one can see that most of the terms drop out and we
obtain the assertion of the lemma.
Remark 1 Method (9)–(11) automatically inherits the energy-dissipation principle
of the continuous problem. We therefore call it a structure-preserving discretization
scheme. The first term in the energy E can be estimated from below by
1 k+1/2 2 τ2
= hk+1/2 2Mh − τ (Cek , hk ) ≥ h Mh − Cek 2 −1 ,
2 2 Mh
and the last term can be further bounded from below under the assumption that
This CFL condition, restricting the time step $\tau$ in dependence of the space
discretization, implies stability of the scheme and allows us to show that the energy
$\mathcal{E}^n$ is a positive and symmetric quadratic functional and thus induces a norm on the
space of state vectors $(h^{n+1/2}, e^n, p_1^n, p_2^n, \ldots)$. Together with Lemma 1, this is the
basis for the error analysis of method (9)–(11); we refer to [11] for details.
The dimension of the state space and hence also the computational cost for
computing one time step of method (9)–(11) obviously increases with an increasing
number of internal states $p_i$. We will now show that $e$, $h$, and $p = \sum_i p_i$ can
be computed without explicit reference to the internal states $p_i$, which results in
an algorithm that is independent of the number of internal states. Instead of using
Eq. (8), we directly discretize the integral (4) by a convolution sum
\[ p^n = \sum_{k=0}^{n} \omega_{n-k} e^k. \tag{13} \]
This is the field of convolution quadrature, and we refer to [14, 16] for details on the
mathematical background. As illustrated in [8], a proper choice of the convolution
weights $\{\omega_n\}_{n \ge 0}$ allows us to obtain the following equivalence statement.
Lemma 2 Let $\{\omega_n\}_{n \ge 0}$ be the coefficients of the power series
\[ \epsilon_0 \hat\chi\Big( \frac{2(1-\xi)}{\tau(1+\xi)} \Big) = \sum_{n=0}^{\infty} \omega_n \xi^n. \tag{14} \]
Then the solution $\{h^{n+1/2}, e^n, p^n\}_{n \ge 0}$ of the scheme (9)–(11) with $e^0 = p_i^0 = 0$
coincides with the solution of the convolution-quadrature method (9)–(10) and (13).
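Proof For convenience of the reader, we briefly summarize the basic ideas of the
proof, which closely follows the arguments presented in [8]. We start by multiplying
equations (11) with $\xi^n$ and sum over all $n \ge 0$ to obtain
\[ M_{d,i} \sum_{n \ge 0} \tfrac{1}{\tau} \big( \tfrac{1}{\xi} - 1 \big) p_i^n \xi^n + M_{p,i} \sum_{n \ge 0} \big( \tfrac{1}{2\xi} + \tfrac{1}{2} \big) p_i^n \xi^n = \sum_{n \ge 0} \big( \tfrac{1}{2\xi} + \tfrac{1}{2} \big) e^n \xi^n. \]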
with transfer function $\hat\chi_i$ as defined in (5). Summation over all $i$ and using
$p^n = \sum_i p_i^n$ and the definition of the weights $\omega_n$ then yields the assertion.
Remark 2 According to the above lemma, the convolution quadrature (CQ) method
defined by (9)–(10) and (13)–(14) has the same passivity and stability properties as
the underlying difference scheme (9)–(11). Let us note that instead of the internal
states $\{p_i^n\}_{i \ge 0}$, the CQ approach utilizes the history $\{e^k\}_{k \le n}$ of the electric field
values to compute the memory part $p^n$ of the polarization.
Before closing this section, we briefly comment on the practical computation of
the weights {ωn }n≥0 and the efficient realization of the proposed CQ approach.
Remark 3 Following [14, 15], also see [8], the convolution weights {ωn }n≥0 can be
computed with high accuracy using fast Fourier transforms, i.e.,
\[ \omega_n \approx \frac{1}{L\rho^n} \sum_{\ell=0}^{L-1} \hat\chi\!\left( \frac{2}{\tau}\, \frac{1-\rho e^{i\phi_\ell}}{1+\rho e^{i\phi_\ell}} \right) e^{-in\phi_\ell}, \qquad \phi_\ell = 2\pi\ell/L, \]
and the quadrature error can be controlled by appropriate choice of the parameters
L and ρ; see [14–16] for details. The computation of all weights $\{\omega_n\}_{n=0}^{N}$ with
machine precision requires O(N) evaluations of χ̂. If the material parameters are
inhomogeneous, then the weights ωn will also depend on the spatial variable.
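The FFT-based evaluation of the weights described in Remark 3 can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function name `cq_weights`, the callable `K` (standing for the function of s whose power-series coefficients in ξ are sought after substituting s = 2(1−ξ)/(τ(1+ξ))), and the default choices L = 2(N+1) and ρ = eps^(1/(2L)) are assumptions made here. With these defaults one can only expect an accuracy of roughly the square root of machine precision; better parameter choices are discussed in [14–16].

```python
import numpy as np


def cq_weights(K, tau, N, L=None, rho=None):
    """Approximate omega_0..omega_N in K(2(1-xi)/(tau(1+xi))) = sum_n omega_n xi^n.

    K   : callable accepting a complex NumPy array (hypothetical transfer function)
    tau : time step size
    L   : number of quadrature points on the circle |xi| = rho (assumed default)
    rho : radius of that circle, 0 < rho < 1 (assumed default)
    """
    if L is None:
        L = 2 * (N + 1)
    if rho is None:
        rho = np.finfo(float).eps ** (1.0 / (2 * L))
    phi = 2.0 * np.pi * np.arange(L) / L
    xi = rho * np.exp(1j * phi)
    samples = K(2.0 * (1.0 - xi) / (tau * (1.0 + xi)))
    # np.fft.fft evaluates sum_l samples[l] * exp(-i n phi_l) for n = 0..L-1
    coeffs = np.fft.fft(samples) / L
    n = np.arange(N + 1)
    return coeffs[: N + 1] / rho**n


# Sanity check with K(s) = 1/s (time integration): the expansion of
# tau (1 + xi) / (2 (1 - xi)) gives omega_0 = tau/2 and omega_n = tau for n >= 1.
tau = 0.1
w = cq_weights(lambda s: 1.0 / s, tau, N=16)
```

The returned array is complex; for a transfer function with real power-series coefficients the imaginary parts are at round-off level and can be discarded.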
Remark 4 A direct implementation of the CQ approach requires the storage of
the complete history {ek }k≤n to compute the polarization pn via (13), and a naive
computation of the N convolution sums {pn }n≤N is of $O(N^2)$ complexity; this
can be reduced to $O(N \log^2 N)$ by FFT [19]. Using fast and oblivious convolution
quadrature (FOCQ), the required storage can be reduced to O(log N) field vectors
[18, 19]. The basic idea of these approaches is to split the sum (13) into subsums
with an exponentially growing number of summands
\[ \sum_{k=0}^{n} \omega_k e^{n-k} = \sum_{\ell=0}^{L} \sum_{k=B_{\ell-1}}^{B_\ell - 1} \omega_k e^{n-k} =: \sum_{\ell=0}^{L} U_n^{(\ell)}, \]
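To make the cost statements of Remark 4 concrete, the following Python sketch evaluates the sums (13) for scalar sequences in two ways: step by step from the full history, which costs O(N^2) operations in total, and in one batch by zero-padded FFTs for an input sequence that is known in advance. It is an illustration under simplifying assumptions only: in the coupled scheme the values e^n are not known beforehand, so the batch variant does not apply directly, and the fast and oblivious algorithm itself is not implemented here. The function names are hypothetical.

```python
import numpy as np


def convolve_stepwise(omega, e):
    """p[n] = sum_{k=0}^{n} omega[n-k] * e[k], one step at a time (O(N^2) overall)."""
    N = len(e)
    p = np.zeros(N)
    for n in range(N):
        # omega[n::-1] is omega[n], omega[n-1], ..., omega[0]
        p[n] = np.dot(omega[n::-1], e[: n + 1])
    return p


def convolve_fft(omega, e):
    """The same sums for a sequence e known in advance, via zero-padded FFTs."""
    N = len(e)
    size = 2 * N  # padding avoids wrap-around of the circular convolution
    spectrum = np.fft.rfft(omega[:N], size) * np.fft.rfft(e, size)
    return np.fft.irfft(spectrum, size)[:N]


rng = np.random.default_rng(0)
omega = rng.standard_normal(64)
e = rng.standard_normal(64)
p_step = convolve_stepwise(omega, e)
p_fft = convolve_fft(omega, e)
```

Both results agree up to round-off; the stepwise variant mirrors what a direct implementation of (13) has to do in every time step.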
4 Numerical Illustration
For our computational tests, we consider a plane wave setting, in which the fields
are of the form e = (ex , 0, 0), h = (0, hy , 0), and pi = (px,i , 0, 0), and only
depend on time t and the propagation direction z. Then (1)–(4) lead to a
one-dimensional wave propagation problem for the unknown fields ex , px and hy . As
computational domain, we consider the interval (−1, 1) and we impose periodic
boundary conditions for the electric and magnetic field. The initial values are
described by $e_{x,0}(z) = p_{x,i,0}(z) = 0$ and $h_{y,0}(z) = 10 e^{-10 z^2}$. All quantities are
given in SI-units.
For the spatial discretization, we utilize piecewise linear finite elements for ex
and px,i , and piecewise constants to represent hy . Numerical integration by the
vertex rule is used for the assembling of the mass matrices Me , Mp,i , and Md,i , which
leads to a diagonal structure, and the matrix Mh is diagonal automatically. In the case
of piecewise constant material properties only one scalar convolution weight ωn has
to be stored per time step n and per subdomain covered by a dispersive material.
In Fig. 1, we display the magnetic field component hy for the two schemes
presented in Sects. 2 and 3 for some selected time steps. As predicted, the numerical
solutions cannot be distinguished by visual inspection; in our computations, the
maximal difference, caused by inexact computation of the weights ωn , was of the
order of $10^{-12}$, and thus much smaller than the discretization errors. We tested both
the classical CQ with $O(N^2)$ complexity and the FOCQ approach with $O(N \log N)$
cost; see Remark 4. Both approaches lead to almost identical results. The latter was,
however, substantially faster, in particular for a large number N of time steps.
Fig. 1 Snapshots of the component hy of the numerical solution restricted to the interval [0, 1] at
different time steps. The solution of the leapfrog method (9)–(11) is drawn in red while that of the
convolution-quadrature method (9)–(10) and (13) is depicted in black. The gray area indicates the
location of the dispersive medium. (a) $t = 0.977 \cdot 10^{-9}$. (b) $t = 2.930 \cdot 10^{-9}$. (c) $t = 4.883 \cdot 10^{-9}$.
(d) $t = 6.836 \cdot 10^{-9}$
From the results in Fig. 1, one can also recognize the basic physical behavior: In
the initial phase, the pulse propagates through air and the total energy of the system
is conserved exactly. When impinging on the air-tissue interface, a part of the pulse
gets reflected and the rest penetrates into the dispersive medium. Propagation in
the medium is substantially slower and, moreover, energy is dissipated according to
Lemma 1. We were able to reproduce this energy balance up to machine precision.
5 Summary
Acknowledgments The authors are grateful for support by the German Research Foundation
(DFG) via grants TRR 146 project C03, TRR 154, project C04, and Eg-331/1-1 and through grant
Center for Computational Engineering at TU Darmstadt.
References
1. V.A. Bokil, N.L. Gibson, Convergence analysis of Yee schemes for Maxwell’s equations in
Debye and Lorentz dispersive media. Int. J. Numer. Anal. Model. 11, 657–687 (2014)
2. J. Clegg, M. Robinson, A genetic algorithm for optimizing multi-pole Debye models of tissue
dielectric properties. Phys. Med. Biol. 57, 6227–43 (2012)
3. G. Cohen, Higher-Order Numerical Methods for Transient Wave Equations (Springer, Berlin,
2002)
4. P. Debye, Polar Molecules (Chemical Catalogue Company, New York, 1929)
5. S. Gabriel, R.W. Lau, C. Gabriel, The dielectric properties of biological tissues: III. Parametric
models for the dielectric spectrum of tissues. Phys. Med. Biol. 41, 2271–93 (1996)
6. O.P. Gandhi, B.-Q. Gao, J.-Y. Chen, A frequency-dependent finite-difference time-domain
formulation for general dispersive media. IEEE Trans. Microw. Theory Tech. 41(4), 658–665
(1993)
7. H. Egger, B. Radu, A mass-lumped mixed finite element method for Maxwell’s equations
(2018). arXiv:1810.06243. to appear in Proceedings of SCEE 2018
8. H. Egger, K. Schmidt, V. Shashkov, Multistep and Runge–Kutta convolution quadrature
methods for coupled dynamical systems. J. Comput. Appl. Math. 387, 112618 (2020)
9. D. Jiao, J.-M. Jin, Time-domain finite-element modeling of dispersive media. IEEE Microw.
Wireless Components Lett. 11, 220–222 (2001)
A Convolution Quadrature Method for Maxwell’s Equations in Dispersive Media 115
10. M.J. Jenkinson, J.W. Banks, High-order accurate FDTD schemes for dispersive Maxwell’s
equations in second-order form using recursive convolutions. J. Comput. Appl. Math. 336,
192–218 (2018)
11. P. Joly, Variational methods for time-dependent wave propagation problems, in Topics in
Computational Wave Propagation. LNCSE, vol. 31 (Springer, Berlin, 2003), pp. 201–264
12. S. Lanteri, C. Scheid, Convergence of a discontinuous Galerkin scheme for the mixed time-
domain Maxwell’s equations in dispersive media. IMA J. Numer. Anal. 33, 432–459 (2012)
13. J. Li, Error analysis of finite element methods for 3-D Maxwell’s equations in dispersive media.
J. Comput. Appl. Math. 188, 107–120 (2006)
14. C. Lubich, Convolution quadrature and discretized operational calculus. I. Numer. Math. 52,
129–145 (1988)
15. C. Lubich, Convolution quadrature and discretized operational calculus. II. Numer. Math.
52(4), 413–425 (1988)
16. C. Lubich, A. Ostermann, Runge-Kutta methods for parabolic equations and convolution
quadrature. Math. Comp. 60(201), 105–131 (1993)
17. R. Luebbers, F.P. Hunsberger, K.S. Kunz, R.B. Standler, M. Schneider, A frequency-dependent
finite-difference time-domain formulation for dispersive materials. IEEE Trans. Electromag.
Compat. 32, 222–227 (1990)
18. J. Roychowdhury, Reduced-order modeling of time-varying systems. IEEE Trans. Circuits
Syst. II 46, 1273–1288 (1999)
19. A. Schädle, M. López-Fernandez, C. Lubich, Fast and oblivious convolution quadrature. SIAM
J. Sci. Comput. 28, 421–438 (2006)
20. S. Shaw, Finite element approximation of Maxwell’s equations with Debye memory. Adv.
Numer. Anal. 2010, 923832 (2010)
21. W.A. Strauss, Partial Differential Equations. An Introduction (Wiley, New York, 1992)
On the Stability of Harmonic Coupling
Methods with Application to Electric
Machines
1 Introduction
Electric drives naturally consist of different subdomains, i.e. the stator and rotor,
which move relative to each other. The time-varying geometry and nonlinearities
caused by saturation effects formally require a time-domain analysis, which is
often realized by solving a sequence of quasi-stationary problems at different
working points. Several strategies have been proposed for the simulation of the
corresponding equations of magnetostatics and, in particular, for the coupling of the
fields across the air gap between stator and rotor. As it is common practice, see e.g.
[13, 14, 17], we consider a two dimensional regime, in which the unknown fields are
described by the axial component of the magnetic vector potential. The governing
system then consists of two Poisson-like problems for the stator and the rotor, which
can be coupled via Lagrange multipliers. Such domain decomposition or mortar
methods, which couple subdomains via Lagrange multipliers, have been investigated
intensively in the literature [2, 3, 5, 20]; see [8, 13] for results concerning
electric machines. It is well-known that a careful choice of approximation spaces is
required to obtain stable discretization schemes for the underlying saddlepoint problems
[6, 18]; appropriate stabilization [15] could be used as an alternative approach.
H. Egger
Numerical analysis and Scientific Computing, Technische Universität Darmstadt, Darmstadt,
Germany
e-mail: [email protected]
M. Harutyunyan · M. Merkel · S. Schöps
Computational Electromagnetics, Technische Universität Darmstadt, Darmstadt, Germany
e-mail: [email protected]; [email protected];
[email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_12
In this paper, we investigate the stability of mortar discretizations using trigono-
metric functions as Lagrange multipliers, called harmonic coupling methods in
[4, 13]. We discuss in detail the discrete inf-sup condition which is necessary
and sufficient to guarantee the stability of such approximations. We provide a
simple criterion for the maximal number of harmonics used as Lagrange multipliers
depending on the mesh size and polynomial degree of the subdomain discretizations
which guarantees the stability of the scheme. Our analysis applies to the harmonic
coupling of various discretization methods, e.g. obtained by isogeometric analysis
(IGA) [4, 7, 16], and can in principle be extended to other Lagrange multiplier
spaces.
The remainder of this note is organized as follows: In Sect. 2, we introduce the
model problem to be considered and we summarize some well-known results about
its analysis and discretization. In Sect. 3, we then turn to the harmonic stator-rotor
coupling, and we state and prove our main results. Section 4 is concerned with
numerical tests, in which we demonstrate the validity of our stability criterion for
low and high order discretizations based on IGA.
2 Model Problem
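\[ -\operatorname{div}(\nu_\ell \nabla u_\ell) = j_\ell \quad \text{in } \Omega_\ell, \tag{1} \]
\[ u_\ell = 0 \quad \text{on } \Sigma_\ell, \tag{2} \]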
where u denotes the z-component of the magnetic vector potential, ν the magnetic
reluctivity, and j = js + div m⊥ a generalized current density with js denoting the
z-component of the source currents and m⊥ = (my , −mx ) the rotated magnetization
vector of the permanent magnet. The corresponding in-plane components of the
magnetic flux density and field strength are given by b = (∂y u, −∂x u) = ∇⊥u
and h = νb, respectively. The coupling of the fields across the interface Γ is
accomplished by the conditions
u1 = u2 , on Γ, (3)
n · (ν1 ∇u1 ) = n · (ν2 ∇u2 ), on Γ, (4)
which correspond to the conditions for the normal continuity of b and the tangential
continuity of h, respectively; see e.g. [4, 13]. Here n = n2 is the unit normal vector
at Γ pointing from Ω2 to Ω1 . In the context of electric machines, it is natural to
assume that $\Omega_\ell$ are bounded domains with piecewise smooth boundaries $\Sigma_\ell$ and Γ,
having non-zero measure. Moreover, we can assume that ν is bounded from above
and below by positive constants $\overline{\nu}$, $\underline{\nu}$, i.e. $\underline{\nu} \le \nu(x) \le \overline{\nu}$ for all x ∈ Ω1 ∪ Ω2 .
Fig. 1 Typical structure of a 6-pole permanent magnet synchronous machine (left) and the coarsest
mesh of the stator domain as used in our numerical tests (right). Labels in the left panel: stator,
windings, rotor, permanent magnets
The weak formulation of the interface problem (1)–(4) then reads as follows:
Find $u \in V = \{v \in H^1(\Omega_1 \cup \Omega_2) : v|_{\Sigma_\ell} = 0\}$ and $\lambda \in M = H^{-1/2}(\Gamma)$ such that
Lemma 1 For any js ∈ L2 (Ω) and m ∈ L2 (Ω)2 , the variational problem (5)–(6)
with j = js + div m⊥ has a unique solution (u, λ) ∈ V × M and there holds
\[ \|u\|_{H^1(\Omega_1\cup\Omega_2)} + \|\lambda\|_{H^{-1/2}(\Gamma)} \le C \left( \|j_s\|_{L^2(\Omega_1\cup\Omega_2)} + \|m\|_{L^2(\Omega_1\cup\Omega_2)} \right) \]
\[ \inf_{\mu\in M} \sup_{v\in V} \frac{\langle \mu, [v] \rangle_\Gamma}{\|\mu\|_{H^{-1/2}(\Gamma)} \, \|v\|_{H^1(\Omega_1\cup\Omega_2)}} \ge \beta > 0. \tag{7} \]
Following [18], condition (7) can be proven as follows: Let z1 ∈ H 1 (Ω1 ) be the
weak solution of the mixed boundary value problem
Then by standard arguments for elliptic problems [1, 2], one can show that
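\[ (\nu\nabla u_h, \nabla v_h)_{\Omega_1\cup\Omega_2} + \langle \lambda_N, [v_h] \rangle_\Gamma = \langle j, v_h \rangle_{\Omega_1\cup\Omega_2} \quad \forall v_h \in V_h, \tag{9} \]
\[ \langle [u_h], \mu_N \rangle_\Gamma = 0 \quad \forall \mu_N \in M_N. \tag{10} \]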
Following the usual convention, we assume that Vh and MN are finite dimensional.
where (u, λ) is the solution of the continuous variational problem (5)–(6) and the
constant C depends only on β in (11), the bounds for ν, and the geometry.
Remark 3 All conditions required for the proof of the corresponding result on the
continuous level, except the inf-sup stability condition, are inherited by the Galerkin
approximation. The existence of a unique solution can thus again be deduced from
Brezzi’s saddlepoint theory [6]. The error estimate (12) follows from Galerkin
orthogonality and standard arguments; we refer to [5, 6] for details. Hence any
choice of approximation spaces Vh , MN that allows one to prove the discrete inf-sup
stability condition (11) will lead to a well-posed discrete problem.
\[ M_N = F(\widehat{M}_N) \quad \text{and} \quad V_h = V_h|_{\Omega_1} \cup V_h|_{\Omega_2} \subset F(P^k(\mathcal{T}_h)) \cap V. \tag{13} \]
Fig. 2 Sketch of a subset of the rectangular reference domain Ω̂ and its mesh (left) and the
physical domain Ω = F (Ω̂) and mesh obtained after mapping (right). The boundaries on the left and
right are only introduced for the illustration but not present in our application. Labels in the
sketch: Σ̂1, Ω̂1, Γ̂, Ω̂2, Σ̂2 (reference domain) and Σ1, Ω1, Γ, Ω2, Σ2 (physical domain)
Theorem 1 Assume that there exists a linear operator Πh : V |Ω1 → Vh |Ω1 such
that
Nh/k ≤ 1 − . (17)
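\[ \langle \pi_h\lambda_N, [z_h] \rangle_\Gamma = \langle \pi_h\lambda_N, [\Pi_h z] \rangle_\Gamma = \langle \pi_h\lambda_N, \pi_h [z] \rangle_\Gamma = \langle \pi_h\lambda_N, [z] \rangle_\Gamma, \]
where we used property (15) and the orthogonality of the L2 -projection πh . Together
with the previous estimate and employing condition (14), we thus obtain
\[ \langle \pi_h\lambda_N, [z_h] \rangle_\Gamma \ge \beta \|z\|_{H^1(\Omega_1\cup\Omega_2)} \|\pi_h\lambda_N\|_{H^{-1/2}(\Gamma)} \ge \frac{\beta}{c_3} \|z_h\|_{H^1(\Omega_1\cup\Omega_2)} \|\pi_h\lambda_N\|_{H^{-1/2}(\Gamma)}. \]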
and the last term can be bounded with the approximation error estimate (16) by
In the second estimate, we here used an inverse inequality for the finite dimensional
Lagrange multiplier space MN . In summary, we thus obtain
4 Numerical Results
Table 1 Discrete inf-sup constants obtained for n gridpoints at the interface Γ and harmonic order
N = cn of the Lagrange multipliers for different refinement levels ℓ and scaling parameters c
c \ ℓ    1            2            3            4
1/4      0.135237     0.135556     0.135676     0.135693
1/3      0.135237     0.135556     0.135661     0.135684
3/8      0.135237     0.135536     0.135611     0.135684
1/2      3.526e–08    2.532e–08    2.401e–08    2.401e–08
Table 2 Discrete inf-sup constants for n spline degrees of freedom on the interface Γ and
harmonic order N = cn of the Lagrange multipliers for polynomial degree k and scaling
parameter c
c \ k    2            3            4            5
1/4      0.135721     0.135723     0.135723     0.135723
1/3      0.135721     0.135722     0.135723     0.135723
3/8      0.135720     0.135723     0.135723     0.135723
1/2      3.652e–08    0            8.082e–08    1.825e–08
The largest possible constant β such that the second estimate of (18) remains true
for all μ ∈ H −1/2 (Γ ) can then be characterized by the minimal eigenvalue of
μ, S1 3
μH −1/2 (Γ )×H 1/2 (Γ ) = β 2 (μ, μ̃)2H −1/2 (Γ ) ∀μ̃ ∈ H −1/2(Γ ).
Acknowledgments This work is supported by the ‘Excellence Initiative’ of the German Federal
and State Governments and by the Graduate School of Computational Engineering at Technische
Universität Darmstadt and the grants TRR 154 project C04 and TRR 146 project C03.
References
1. I. Babuška, The finite element method with Lagrangian multipliers. Numer. Math. 20, 179–192
(1973)
2. F. Ben Belgacem, The mortar finite element method with Lagrange multipliers. Numer. Math.
84, 173–197 (1999)
3. C. Bernardi, Y. Maday, A.T. Patera, A new nonconforming approach to domain decomposition:
the mortar element method, in Nonlinear Partial Differential Equations and their Applications.
Pitman Research Notes in Mathematics Series, vol. 299 (1994), pp. 13–51
4. Z. Bontinck, J. Corno, S. Schöps, H. De Gersem, Isogeometric analysis and harmonic stator-
rotor coupling for simulating electric machines. Comput. Methods Appl. Mech. Engrg. 334,
40–55 (2018)
5. D. Braess, W. Dahmen, C. Wieners, A multigrid algorithm for the mortar finite element
method. SIAM J. Numer. Anal. 37, 48–69 (1999)
6. F. Brezzi, On the existence, uniqueness and approximation of saddle-point problems arising
from Lagrangian multipliers. RAIRO Anal. Numer. 8, 129–151 (1974)
7. E. Brivadis, A. Buffa, B. Wohlmuth, L. Wunderlich, Isogeometric mortar methods. Comput.
Methods Appl. Mech. Engrg. 284, 292–319 (2015)
8. A. Buffa, Y. Maday, F. Rapetti, A sliding mesh-mortar method for a two dimensional currents
model of electric engines. ESAIM Math. Model. Numer. Anal. 35, 191–228 (2001)
9. A. Buffa, E.M. Garau, C. Giannelli, G. Sangalli, On quasi-interpolation operators in spline
spaces, in Building Bridges: Connections and Challenges in Modern Approaches to Numerical
Partial Differential Equations. Lecture Notes in Computational Science and Engineering, vol.
114 (Springer, Berlin, 2016), pp. 73–91
10. D. Chapelle, K.J. Bathe, The inf-sup test. Comput. Struct. 47(4/5), 537–545 (1993)
11. P. Clément, Approximation by finite element functions using local regularization. RAIRO 9,
77–84 (1975)
12. C. de Falco, A. Reali, R. Vázquez, GeoPDEs: a research tool for Isogeometric Analysis of
PDEs. Adv. Eng. Softw. 42, 1020–1034 (2011)
13. H. De Gersem, T. Weiland, Harmonic weighting functions at the sliding interface of a finite-
element machine model incorporating angular displacement. IEEE Trans. Magn. 40, 545–548
(2004)
14. G. Gyselinck, L. Vandevelde, P. Dular, C. Geuzaine, A general method for the frequency
domain FE modeling of rotating electromagnetic devices. IEEE Trans. Magn. 39, 1147–1150
(2003)
15. P. Hansbo, C. Lovadina, I. Perugia, G. Sangalli, A Lagrange multiplier method for the finite
element solution of elliptic interface problems using non-matching meshes. Numer. Math. 100,
91–115 (2005)
16. T.J.R. Hughes, J.A. Cottrell, Y. Bazilevs, Isogeometric analysis: CAD, finite elements, NURBS,
exact geometry and mesh refinement. Comput. Methods Appl. Mech. Engrg. 194, 4135–4195
(2005)
17. E. Lange, F. Henrotte, K. Hameyer, A variational formulation for nonconforming sliding
interfaces in finite element analysis of electric machines. IEEE Trans. Magn. 46, 2755–2758
(2010)
18. P.A. Raviart, J.M. Thomas, Primal hybrid finite element methods of 2nd order elliptic
equations. Math. Comp. 31, 391–413 (1977)
19. L.R. Scott, S. Zhang, Finite element interpolation of nonsmooth functions satisfying boundary
conditions. Math. Comp. 54, 483–493 (1990)
20. B.I. Wohlmuth, A mortar finite element method using dual spaces for the Lagrange multiplier.
SIAM J. Numer. Anal. 38, 989–1012 (2000)
Multifidelity Uncertainty Quantification
for Optical Structures
1 Introduction
N. Georg ()
Institut für Dynamik und Schwingungen, Technische Universität Braunschweig, Braunschweig,
Germany
Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
e-mail: [email protected]
C. Lehmann · R. Schuhmann
Theoretische Elektrotechnik, Technische Universität Berlin, Berlin, Germany
e-mail: [email protected]; [email protected]
U. Römer
Institut für Dynamik und Schwingungen, Technische Universität Braunschweig, Braunschweig,
Germany
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_13
with r(j ω) = S11 (j ω) the reflection coefficient at the input port. Note that we omit
the frequency dependency of S in the following to enhance the readability.
For the discretization we apply the efficient Finite Integration Technique (FIT)
time-domain algorithm [6]. It relies on a three-dimensional Cartesian mesh and
allows calculating broadband results with single transient simulation runs (using
Discrete Fourier Transform on the signals). The calculation of scattering parameters
proceeds in two steps: First, the two-dimensional eigenvalue problem of the port
apertures is analyzed to obtain the field patterns and cutoff-frequencies of the so-
called waveguide modes. Note that a lossfree model is considered here, and the array
of SRRs is transversally terminated by perfect electric and magnetic boundary con-
ditions. Second, these mode patterns and their well-known orthogonality properties
are used to both excite the three-dimensional structure and to extract the amplitudes
of the out-going waves at the ports. From one simulation run, one column of the
scattering matrix can be obtained. For further details on the FIT we refer to the
literature.
A technique to reduce the computational cost in the analysis of periodic
structures is to decompose the SRR array into its single unit cells and to calculate
separate scattering matrices S(i) for each of them. The final concatenation of these
single-cell results can be accomplished by switching to the transfer matrices T(i)
which map the wave amplitudes of the right hand side of each cell to the left hand
side (rather than from input to output quantities as with S). For a system with 2
ports:
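\[ \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = S \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \;\leftrightarrow\; \begin{pmatrix} b_1 \\ a_1 \end{pmatrix} = T \begin{pmatrix} a_2 \\ b_2 \end{pmatrix} \quad \text{with} \quad T = \begin{pmatrix} S_{12} - S_{11} S_{21}^{-1} S_{22} & S_{11} S_{21}^{-1} \\ -S_{21}^{-1} S_{22} & S_{21}^{-1} \end{pmatrix}. \]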
Extended formulas for larger S, T, which take several port-modes into account, can
easily be derived. Using transfer matrices, the total system behavior of N cells is
simply given by a matrix multiplication T = T(1) · . . . · T(N) . This approach has
been used previously in [2, 7] and is referred to as SMA.
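The conversion between scattering and transfer matrices and the concatenation of cells can be sketched in a few lines of Python. The sketch assumes the two-port case with a single mode per port, so that all S-parameters are complex scalars; the function names `s_to_t`, `t_to_s` and `cascade` are hypothetical and not taken from the authors' code.

```python
import numpy as np


def s_to_t(S):
    """Transfer matrix T with (b1, a1) = T (a2, b2) from a 2x2 scattering matrix S."""
    S11, S12, S21, S22 = S[0, 0], S[0, 1], S[1, 0], S[1, 1]
    return np.array([[S12 - S11 * S22 / S21, S11 / S21],
                     [-S22 / S21, 1.0 / S21]], dtype=complex)


def t_to_s(T):
    """Inverse conversion, obtained by solving (b1, a1) = T (a2, b2) for (b1, b2)."""
    T11, T12, T21, T22 = T[0, 0], T[0, 1], T[1, 0], T[1, 1]
    return np.array([[T12 / T22, T11 - T12 * T21 / T22],
                     [1.0 / T22, -T21 / T22]], dtype=complex)


def cascade(S_cells):
    """Scattering matrix of a chain of cells, ordered from left to right."""
    T = np.eye(2, dtype=complex)
    for S in S_cells:
        T = T @ s_to_t(S)  # T = T(1) T(2) ... T(N)
    return t_to_s(T)


def line(theta):
    """Toy cell: a matched, lossless line section with phase delay theta."""
    return np.array([[0.0, np.exp(-1j * theta)],
                     [np.exp(-1j * theta), 0.0]])


S_total = cascade([line(0.3), line(0.5)])  # behaves like line(0.8)
```

The conversion requires S21 to be non-zero, which is one reason why long products of transfer matrices need care in floating-point arithmetic.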
This procedure has the intrinsic weakness that the coupling between the unit
cells is not governed by a single waveguide mode alone, but an unknown number
of higher modes may contribute. Of course, the coupling of modes at frequencies
below their cutoff-frequency decreases rapidly with increasing spatial distance of
the single SRRs. However, especially if there are resonances within the frequency
range of interest (which clearly is the case for the SRRs as one of their working
principles), this systematic error may become significant. In theory an extension of
the SMA to an arbitrary number of coupling modes is straight-forward. However, the
required number (and/or selection) of modes is sometimes hard to estimate a-priori,
and the calculation of the extended transfer matrix increases the computational cost.
Our approach removes any possible systematic error introduced in the coupling, by
treating the SMA-based predictions as low-fidelity data and by correcting them with
a couple of time domain solutions of the entire structure.
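\[ S^{(j)}(y_{\mathrm{cell},j}) \approx S_{\mathrm{uc};C}(y_{\mathrm{cell},j}) := \sum_{i=1}^{C} S^{(j)}\big( y^{(i)}_{\mathrm{cell},j} \big) \, \Psi_i(y_{\mathrm{cell},j}) \tag{1} \]
where uc is short for unit cell and j = 1, . . . , N refers to an arbitrary unit cell of the
structure. Also, $\{ y^{(i)}_{\mathrm{cell},j} \}_{i=1}^{C} \subset \Xi$ denotes a set of collocation points, e.g. Chebyshev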
We also emphasize that (2) can be evaluated with negligible computational cost. In
order to highlight the efficiency of the proposed combination of SMA and spectral
surrogates for the unit cell, we give a few comments on the alternative approach,
i.e. spectral approximation of the full structure. Due to spectral convergence
properties, global polynomial approximations can be highly efficient, even up to a
moderately large number of parameters (e.g., up to 10–20) using adaptive sparse
approximations, see e.g. [1]. However, these methods still suffer from the so-
called curse-of-dimensionality, i.e. the rapid growth of computational cost w.r.t.
the number of parameters. As the full structure has a significantly larger number of
parameters, i.e. by a factor of N, this would quickly result in a very large number
of simulation runs. Additionally, the computational cost for each model evaluation
would also be significantly larger, when the full structure is considered instead of a
single unit cell.
MFMC generalizes the multilevel Monte Carlo approach, which was recently used
in [8] for a high-frequency application. MFMC simulation combines low-fidelity
models of different kinds, without quantified model errors, into an efficient sampling
framework. By sampling the high-fidelity model at least one time, the MFMC
approach provides an unbiased estimator. Moreover, a low variance and hence, a
low root-mean-square error, is realized through optimal model management and the
resulting estimator is typically much more efficient than the standard Monte Carlo
(MC) estimator. The MFMC methodology was introduced in a series of papers [3, 9]
and is now well-established. Hence, in the following we limit ourselves to the key
aspects and refer to the literature for a more complete introduction into the field.
We adopt a probabilistic approach to represent uncertainty, where y represents
a realization of a random vector Y. Let g(Y) denote an output quantity derived
from the simulated frequency response. MC simulation is then based on a sample
$\{Y_i, g(Y_i)\}_{i=1}^{K}$, which can be used to estimate for instance the mean value of the
model output. The mean value approximation and its mean-square error read
\[ E[g(Y)] \approx \hat g_K := \frac{1}{K} \sum_{i=1}^{K} g(Y_i), \qquad E[|E[g(Y)] - \hat g_K|^2] = \frac{V[g(Y)]}{K}. \tag{3} \]
high-fidelity model, and g (i) for i ≥ 2 represent low-fidelity models, obtained for
instance by SMA in combination with surrogate modeling. The MFMC estimator
samples all models and combines the results into a single estimator as
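\[ E[g] \approx \hat g_{\mathrm{MFMC}} = \hat g^{(1)}_{K^{(1)}} + \sum_{i=2}^{M} \alpha_i \left( \hat g^{(i)}_{K^{(i)}} - \hat g^{(i)}_{K^{(i-1)}} \right), \]
where $\hat g^{(i)}_{K^{(i)}}$ denotes the standard MC estimator based on the sample
$\{Y_j, g^{(i)}(Y_j)\}_{j=1}^{K^{(i)}}$ and $0 < K^{(1)} \le K^{(2)} \le \ldots \le K^{(M)}$.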
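\[ E[|\hat g_{\mathrm{MFMC}} - E[g(Y)]|^2] = \frac{\sigma_1^2}{K^{(1)}} + \sum_{i=2}^{M} \left( \frac{1}{K^{(i-1)}} - \frac{1}{K^{(i)}} \right) \left( \alpha_i^2 \sigma_i^2 - 2 \alpha_i \rho_{1,i} \sigma_1 \sigma_i \right). \tag{4} \]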
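The estimator and its error formula (4) translate almost literally into code. The following Python sketch is an illustration only and is not the Matlab library used by the authors; the function names and the zero-based indexing (model 0 is the high-fidelity model) are assumptions made here. It expects nested samples, i.e. every model is evaluated on the first K[i] entries of one shared input sample.

```python
import numpy as np


def mfmc_estimate(outputs, K, alpha):
    """MFMC mean estimate from nested samples.

    outputs[i] : outputs of model i on the first K[i] shared inputs
                 (index 0 is the high-fidelity model)
    K          : non-decreasing sample sizes, K[0] <= K[1] <= ...
    alpha      : control-variate coefficients; alpha[0] is unused
    """
    estimate = np.mean(outputs[0][: K[0]])
    for i in range(1, len(outputs)):
        fine = np.mean(outputs[i][: K[i]])
        coarse = np.mean(outputs[i][: K[i - 1]])
        estimate += alpha[i] * (fine - coarse)
    return estimate


def mfmc_mse(sigma, rho, alpha, K):
    """Mean-square error according to (4); rho[i] is the correlation with model 0."""
    mse = sigma[0] ** 2 / K[0]
    for i in range(1, len(sigma)):
        mse += (1.0 / K[i - 1] - 1.0 / K[i]) * (
            alpha[i] ** 2 * sigma[i] ** 2
            - 2.0 * alpha[i] * rho[i] * sigma[0] * sigma[i]
        )
    return mse


# Toy numbers: one low-fidelity model with correlation 0.5 and
# alpha_2 = rho * sigma_1 / sigma_2, a common choice in the MFMC literature.
mse_mc = mfmc_mse([2.0], [1.0], [1.0], [10])
mse_mfmc = mfmc_mse([2.0, 1.0], [1.0, 0.5], [1.0, 1.0], [10, 100])
```

With a single model the formula reduces to the MC error in (3); adding a correlated low-fidelity model with more samples lowers the predicted error for the same number of high-fidelity evaluations.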
4 Numerical Examples
Fig. 1 Numerical model of SRR array. Depicted is only one cell out of seven. (a) Unit cell of size
1 µm × 0.6 µm × 0.6 µm. Red boundaries indicate the ports. (b) Geometry specification with dimension
labels 70 nm, 90 nm and 320 nm. Thickness: 20 nm. Uncertain longitudinal length L of SRR element,
L ∼ (305, 335) nm
Fig. 2 Broadband scattering parameter for different realizations of SRR array. Dashed vertical
lines indicate respective limit frequencies of considered bandgaps. Dotted line refers to −3 dB line.
Axes: frequency f [THz] from 80 to 220; magnitude |S11 | [dB] from −30 to 0
We consider a SRR array with N = 7 cells. The employed Cartesian grid as well
as the geometric dimensions (taken from [4], except for the enlarged cell size) are
presented in Fig. 1, where we consider an uncertain longitudinal length L(j ) of each
SRR element in the range of 320 nm ± 15 nm. Hence, the random vector Y is given
as (L(1) , . . . , L(N) )T , where L(j ) , j = 1, . . . , N are assumed to be independent
and identically uniformly distributed. Figure 2 presents a broadband scattering
parameter, in particular the fundamental reflection coefficient |S11 |, for different
realizations of the structure. Two bandgaps can be observed, which can be defined
by their limit frequencies, where the scattering parameter drops below −3 dB. The
corresponding bandwidths bi and center frequencies fc,i , where i ∈ {1, 2} refers
to the first or second bandgap, can be computed from S11 in a post-processing
step. For brevity, we restrict ourselves to the computation of the mean value of the
center frequencies E[fc,i ] in the following. However, very similar findings hold for
the bandwidths bi as well. We further note that for some parameter sample points
some additional resonances within the second bandgap appear which are due to
the slightly detuned resonances in the series of SRRs. This effect is ignored in
the following evaluation of the MFMC algorithm and only the outer limits of this
bandgap are considered.
Table 1 Employed numerical models of SRR array for MFMC study. The last two columns show
the estimated correlation coefficients for both bandgaps
Symbol   Model                                       Cost wi     ρ1,i for fc,1   ρ1,i for fc,2
g (1)    Full model (FIT, 2 · 10^5 time-steps)       197.50 s    1.000000        1.000000
g (2)    Full model (FIT, 2 · 10^4 time-steps)       11.25 s     0.999236        0.998035
g (3)    SMA (FIT, 1 port-mode)                      9.64 s      0.999943        0.968376
g (4)    SMA (FIT, 2 port-modes)                     115.47 s    0.999998        0.999998
g (5)    SMA + unit cell surrogate (1 port-mode)     0.006 s     0.999943        0.967540
g (6)    SMA + unit cell surrogate (2 port-modes)    0.026 s     0.999998        0.999886
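The post-processing step that turns a sampled reflection coefficient into center frequencies and bandwidths can be sketched as follows. This Python sketch is an assumption about one possible realization, not the authors' procedure: it treats every interval in which |S11| in dB stays at or above the −3 dB level as a bandgap, locates the limit frequencies by linear interpolation between grid points, and assumes that the response lies below the level at both ends of the frequency grid. It does not merge intervals that are split by additional resonances inside a bandgap. The function name and the synthetic test curve are hypothetical.

```python
import numpy as np


def bandgaps_from_s11(f, s11_db, level=-3.0):
    """Return a list of (center, bandwidth) for intervals with s11_db >= level."""
    f = np.asarray(f, dtype=float)
    d = np.asarray(s11_db, dtype=float) - level
    # indices i where the curve crosses the level between f[i] and f[i+1]
    idx = np.flatnonzero(np.signbit(d[:-1]) != np.signbit(d[1:]))
    # linear interpolation of the crossing frequency
    crossings = f[idx] - d[idx] * (f[idx + 1] - f[idx]) / (d[idx + 1] - d[idx])
    lows, highs = crossings[0::2], crossings[1::2]
    return [(0.5 * (lo + hi), hi - lo) for lo, hi in zip(lows, highs)]


# Synthetic curve with one flat-topped reflection band around 110 THz
f = np.linspace(80.0, 220.0, 1401)
s11_db = -20.0 + 20.0 * np.exp(-(((f - 110.0) / 10.0) ** 8))
gaps = bandgaps_from_s11(f, s11_db)
center, bandwidth = gaps[0]
```

For a real response with several bandgaps, the list contains one pair per interval, ordered by frequency.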
An overview of the employed numerical models as well as the corresponding
computational costs (measured in computation time for an in-house MATLAB
implementation on a standard workstation) is given in Table 1. For the full FIT
model g (1) we terminate the time stepping procedure if either the energy decays to
−120 dB or a maximum number of 2 · 105 time-steps is reached. The low-fidelity
model g (2) is obtained by restricting the maximum number of time-steps to 2 · 104 .
The low-fidelity models g (3) and g (4) are obtained by the SMA approach. For g (3)
only the propagating fundamental TEM mode is considered, while g (4) additionally
takes the evanescent first TM mode into account. The selection of suitable models is
based on a pilot run (with a small sample) and model selection techniques, see also
[3, 9].
The construction of the respective unit cell surrogate models for g (5) and g (6) in
the offline-phase is based on C = 7 Chebyshev nodes, which are well-established
non-equidistant interpolation nodes. Note that other choices are equally feasible,
Gauss-Legendre nodes for instance. Surrogate modeling requires some additional
computational effort, which, however, only needs to be invested once. Also, in
this case, even a single model evaluation of g (1) requires a larger computational
effort than constructing the surrogate models. Hence, we will neglect this cost
here, for simplicity. We further note that the evaluation times of all models scale
approximately linearly w.r.t. a possibly increased number of cells N, while
the offline-cost for the surrogate models is independent of N. Accordingly, similar
MFMC results, as presented in the following for N = 7, are also expected for SRR
arrays with a different number of cells. As an example, this has been confirmed
numerically for N = 14. However, we note that for larger models some care has to
be taken regarding the concatenation within the SMA, since the multiplication of
transfer matrices can become numerically unstable.
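A unit cell surrogate of the kind described above can be sketched with plain polynomial interpolation. The following Python sketch assumes that the basis functions Ψi are the Lagrange polynomials associated with C = 7 Chebyshev nodes on the parameter interval [305, 335] nm; the function names and the toy "model" used for the check are hypothetical, and the expensive unit cell simulation is replaced by a cheap polynomial.

```python
import numpy as np


def chebyshev_nodes(C, a, b):
    """C Chebyshev nodes (first kind) mapped to the interval [a, b]."""
    i = np.arange(C)
    x = np.cos((2 * i + 1) * np.pi / (2 * C))
    return 0.5 * (a + b) + 0.5 * (b - a) * x


def lagrange_basis(nodes, y):
    """Values Psi_i(y) of the Lagrange polynomials belonging to the nodes."""
    psi = np.ones(len(nodes))
    for i, yi in enumerate(nodes):
        for k, yk in enumerate(nodes):
            if k != i:
                psi[i] *= (y - yk) / (yi - yk)
    return psi


def surrogate(nodes, samples, y):
    """Evaluate sum_i samples[i] * Psi_i(y); samples[i] may be a scalar or a matrix."""
    psi = lagrange_basis(nodes, y)
    return np.tensordot(psi, np.asarray(samples), axes=1)


def toy_model(y):
    """Stand-in for an expensive unit cell simulation (polynomial of degree 6)."""
    t = (y - 320.0) / 15.0
    return t**6 - 2.0 * t**3 + 1.0


nodes = chebyshev_nodes(7, 305.0, 335.0)
samples = toy_model(nodes)                    # offline phase: C model evaluations
value = surrogate(nodes, samples, 312.3)      # online phase: cheap evaluation
```

In the offline phase only C model evaluations are needed; afterwards each surrogate evaluation costs a handful of floating-point operations, which is in line with the small evaluation times reported for the surrogate-based models.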
In order to evaluate the performance of the proposed methodology for the
considered benchmark problem, we draw an input sample $\{Y_i\}_{i=1}^{\tilde K}$ of size $\tilde K = 500$
and employ each model $g^{(j)}$ to compute the corresponding output samples
$\{g^{(j)}(Y_i)\}_{i=1}^{\tilde K}$, j = 1, . . . , 6. The correlation coefficients with the high-fidelity
model g (1) are then estimated as shown in Table 1. It can be observed that all low-
fidelity models show a strong correlation with the high-fidelity model.
Fig. 3 Estimated RMSE for different MC and MFMC variants, see Table 1. Two panels with
logarithmic axes: budget B [s] from 10^3 to 10^7 (horizontal) and RMSE from 10^−5 to 10^−2 (vertical).
Legend: MC: g(1); MFMC: all models; MFMC: g(1), g(2); MFMC: g(1), g(3); MFMC: g(1), g(4);
MFMC: g(1), g(5); MFMC: g(1), g(6); MC: g(6)
We employ an MFMC implementation which is based on the open-source Matlab
library github.com/pehersto/mfmc, see [9]. In the following, we compare the
root-mean-square errors (RMSEs) of MC and MFMC for given computational budgets B.
Both can be estimated accurately from the samples $\{g^{(j)}(Y_i)\}_{i=1}^{\tilde K}$,
as explained next. The RMSE of standard MC on the high-fidelity model
g (1) is obtained by (3), where K is given by B/w_1 and the variance is replaced by the
MC estimate of the variance using $\{g^{(1)}(Y_i)\}_{i=1}^{\tilde K}$. This is shown in blue
in Fig. 3. Similarly, the RMSE of MFMC can be estimated according to (4), as shown
in black in Fig. 3. We note that the proposed approach yields speedups of
several orders of magnitude w.r.t. standard MC (for a fixed accuracy).
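The two sample-based estimates used above can be sketched in a few lines. This is a minimal illustration, not the authors' Matlab code: it assumes that (3) is the usual MC error estimate sqrt(Var/K), that w_1 is the cost of one high-fidelity evaluation, and the function names `correlation` and `mc_rmse` are hypothetical.

```python
import math
import statistics


def correlation(x, y):
    """Sample Pearson correlation coefficient between two output samples."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mc_rmse(samples_hf, cost_hf, budget):
    """Estimated RMSE of standard MC for a given budget.

    Assumption: RMSE = sqrt(Var / K) with K = budget / cost_hf, where the
    variance is replaced by its sample estimate.
    """
    k = budget / cost_hf
    return math.sqrt(statistics.variance(samples_hf) / k)
```

Evaluating `mc_rmse` for a range of budgets gives a curve of the kind shown for plain MC; the estimated RMSE decays like the inverse square root of the budget.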
We note that the MFMC algorithm discards some models: for example, g (2)
and g (3) have a smaller correlation with the high-fidelity model than the surrogate
model g (6) but a higher computational cost. For completeness, we additionally show
the convergence of MFMC using only g (1) and g (j ) , j ∈ {2, . . . , 6}, with dashed
lines in Fig. 3. As expected, in all cases this approach performs better than MC
but worse than the combination of models chosen by the MFMC algorithm. It can
be observed that, for both bandgaps, it is mainly the proposed unit cell surrogate
models that lead to the large efficiency gains. While for the first bandgap considering only
one port-mode could also be sufficient, for the second bandgap it is clearly necessary
to consider two port-modes for the SMA. This is expected, as the first bandgap is
mainly governed by the fundamental resonance of the SRRs themselves, whereas for the
second one the mutual coupling between the cells plays a larger role.
Finally, we show that the high-fidelity model evaluations within the MFMC
framework are indeed required to remove the bias. If one applies
a standard MC method solely to the surrogate model g (6) (instead of g (1) ),
Multifidelity Uncertainty Quantification for Optical Structures 135
the associated error is represented by the dotted red line in Fig. 3. Both error
contributions, the sampling and the biasing error, are estimated again with a Monte
Carlo sample.
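The split into a sampling and a bias contribution can be illustrated as follows. This is a hedged sketch: it assumes the mean-squared error of MC on a surrogate is the squared bias plus Var/K, that the bias is estimated as the difference of sample means, and the name `surrogate_mc_rmse` is hypothetical.

```python
import math
import statistics


def surrogate_mc_rmse(samples_hf, samples_lf, cost_lf, budget):
    """Estimated RMSE of plain MC applied to a low-fidelity model only.

    Assumption: MSE = bias^2 + Var(low fidelity) / K with K = budget / cost_lf.
    The bias is estimated from paired high- and low-fidelity samples.
    """
    bias = statistics.fmean(samples_lf) - statistics.fmean(samples_hf)
    k = budget / cost_lf
    return math.sqrt(bias ** 2 + statistics.variance(samples_lf) / k)
```

For large budgets the sampling term vanishes and the estimate levels off at the absolute value of the bias, which is why such a curve cannot decay indefinitely.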
5 Conclusions
Acknowledgments The work of Niklas Georg is supported by the DFG grant RO4937/1-1, the
Excellence Initiative of the German Federal and State Governments and the Graduate School of
Computational Engineering at TU Darmstadt. Christian Lehmann’s work is funded by the DFG
grant SCHU1157/11-1.
References
1 Introduction
Every high-voltage device has to pass dielectric type tests, in which a large voltage
is applied to the device. The test is passed if no dielectric breakdown occurs. A
breakdown usually starts from an electrode-surface with high dielectric stress, and
then propagates through the volume along a field-line of the electric field E towards
the opposite electrode, see Fig. 1. The propagation stops if the electric field along
C. Münger
Seminar for Applied Mathematics, ETH Zürich, Zürich, Switzerland
e-mail: [email protected]
S. Börm
Math. Seminar, Christian-Albrechts-Univ. Kiel, Kiel, Germany
e-mail: [email protected]
J. Ostrowski
ABB Corporate Research, Baden, Switzerland
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_14
138 C. Münger et al.
Fig. 1 The electric field strength on the surface of a disconnector and possible breakdown paths
along the field lines
this breakdown path γ is not strong enough. For details see [1]. An inception of a
streamer, i.e. the initial state of a breakdown, only occurs if the criterion
$$\int_{\gamma} \alpha_{\mathrm{eff}}(|\mathbf{E}|)\,\mathrm{d}s > K_{\mathrm{str}} \tag{1}$$
is fulfilled. Here αeff is the effective ionization function that depends on the strength
of the electric field |E|, and Kstr is the (gas-specific) streamer constant. The
prediction of a dielectric breakdown during a type test relies on the evaluation of
this criterion along the most probable breakdown paths. It requires the computation
of the electric field at all surface points and along field lines in the volume.
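As an illustration of how criterion (1) might be evaluated on a discretized field line, the following sketch applies the trapezoidal rule to sampled points. The ionization function `alpha_eff` and the constant `k_str` are user-supplied placeholders here; no gas data from the paper is assumed, and the function names are hypothetical.

```python
import math


def streamer_integral(points, field_strength, alpha_eff):
    """Trapezoidal approximation of the line integral of alpha_eff(|E|) ds.

    points: consecutive 3D points on the breakdown path gamma.
    field_strength: |E| at those points.
    alpha_eff: callable returning the effective ionization coefficient.
    """
    total = 0.0
    for p, q, e_p, e_q in zip(points, points[1:], field_strength, field_strength[1:]):
        ds = math.dist(p, q)
        total += 0.5 * (alpha_eff(e_p) + alpha_eff(e_q)) * ds
    return total


def inception_possible(points, field_strength, alpha_eff, k_str):
    """True if the streamer criterion (1) is fulfilled along the path."""
    return streamer_integral(points, field_strength, alpha_eff) > k_str
```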
Simulation-based dielectric design became a standard procedure because a user-
friendly, i.e., fast, robust, reliable, and easy-to-use computational method was
developed, see [2]. In the following we will first describe this boundary-element-
based method and then introduce how general-purpose graphics processing units
(GPGPUs) can be used to massively reduce computing times.
Dielectric Breakdown Prediction with GPU-Accelerated BEM 139
2 BEM Formulation
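for the electric scalar potential ϕ in each of these subdomains. The permittivity is
denoted by ε. We use an indirect formulation with a single-layer potential
$$\varphi(\mathbf{x}) = \Psi_{SL}[\sigma](\mathbf{x}) = \int_{\partial\Omega} \frac{\sigma(\mathbf{y})}{4\pi|\mathbf{x}-\mathbf{y}|}\,\mathrm{d}S_y, \tag{3}$$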
and search for the unknown scalar virtual surface charge density σ that is related
to the physical surface charge density σs . Each conductor, i.e. each separated
conducting part with electrical conductivity σel > 0, is on a constant electrical
potential. If a conductor is connected to an electric potential V0 , like Ω0 in Fig. 2,
then it holds
ϕ(x) = V0 ∀x ∈ Ω0 . (4)
ϕ(x) = V ∀x ∈ Ω1 , (5)
and is to be determined by a charge neutrality condition, see [4]. The total charge Q
of the floating conductor can be derived from the Gauss law as
$$Q = \int_{\partial\Omega_1} \sigma_s\,\mathrm{d}S = \int_{\partial\Omega_1} \mathbf{D}\cdot\mathbf{n}\,\mathrm{d}S. \tag{6}$$
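Here ε+ denotes the permittivity of the exterior domain. The Neumann trace of the
single-layer potential can be expressed with the help of the adjoint double-layer operator K′ as
$$\operatorname{grad}\Psi_{SL}[\sigma]\cdot\mathbf{n} = \tfrac{1}{2}\sigma + K'\sigma, \quad\text{with} \tag{8}$$
$$K'(\sigma)(\mathbf{x}) = \int_{\partial\Omega} \frac{\mathbf{x}-\mathbf{y}}{4\pi|\mathbf{x}-\mathbf{y}|^{3}}\cdot\mathbf{n}(\mathbf{x})\,\sigma(\mathbf{y})\,\mathrm{d}S_y. \tag{9}$$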
We model thin floating conductive sheets only by a single surface. Then the electric
fields from both sides (±) need to be considered for charge neutrality, since
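$$\sigma_s = \mathbf{n}\cdot(\mathbf{D}^{+} - \mathbf{D}^{-}) \;\Longrightarrow \tag{11}$$
$$\int_{\partial\Omega_1} \left[\tfrac{1}{2}(\varepsilon^{+} + \varepsilon^{-})\,\sigma(\mathbf{y}) + (\varepsilon^{+} - \varepsilon^{-})\,(K'\sigma)(\mathbf{y})\right]\mathrm{d}S_y = 0. \tag{12}$$
$$\tfrac{1}{2}(\varepsilon^{+} + \varepsilon^{-})\,\sigma(\mathbf{x}) + (\varepsilon^{+} - \varepsilon^{-})\,(K'\sigma)(\mathbf{x}) = 0 \quad \forall \mathbf{x}\in\partial\Omega_2. \tag{13}$$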
So for our simple but quite general example of Fig. 2 we have to solve the following
set of equations:
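$$\int_{\partial\Omega} \frac{\sigma(\mathbf{y})}{4\pi|\mathbf{x}-\mathbf{y}|}\,\mathrm{d}S_y = V_0 \quad \forall \mathbf{x}\in\partial\Omega_0 \tag{14}$$
$$\int_{\partial\Omega} \frac{\sigma(\mathbf{y})}{4\pi|\mathbf{x}-\mathbf{y}|}\,\mathrm{d}S_y - V = 0 \quad \forall \mathbf{x}\in\partial\Omega_1 \tag{15}$$
$$\int_{\partial\Omega_1} \left[\tfrac{1}{2}\varepsilon^{+}\sigma(\mathbf{y}) + \varepsilon^{+}(K'\sigma)(\mathbf{y})\right]\mathrm{d}S_y = 0 \tag{16}$$
$$\tfrac{1}{2}(\varepsilon^{+} + \varepsilon^{-})\,\sigma(\mathbf{x}) + (\varepsilon^{+} - \varepsilon^{-})\,(K'\sigma)(\mathbf{x}) = 0 \quad \forall \mathbf{x}\in\partial\Omega_2 \tag{17}$$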
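The solution of the system of Eqs. (14)–(17) yields the virtual surface charge
distribution, from which the electric field can be computed at any point in space as
$$\mathbf{E}(\mathbf{x}) = \int_{\partial\Omega} \frac{\mathbf{x}-\mathbf{y}}{4\pi|\mathbf{x}-\mathbf{y}|^{3}}\,\sigma(\mathbf{y})\,\mathrm{d}S_y \quad \forall \mathbf{x}\in\mathbb{R}^{3}. \tag{18}$$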
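To make the post-processing step (18) concrete, the sketch below evaluates the field with a one-point (centroid) quadrature per triangle and a piecewise-constant charge density. This is an assumption made for illustration only: it is adequate for evaluation points far from the surface, whereas near the surface the adaptive quadrature discussed in Sect. 4 would be needed. The function name `electric_field` is hypothetical.

```python
import math


def electric_field(x, centroids, areas, sigma):
    """Approximate E(x) from Eq. (18) with one quadrature point per panel.

    x: evaluation point (3-tuple), assumed well separated from all panels.
    centroids, areas, sigma: per-panel centroid, area and virtual charge density.
    """
    ex = ey = ez = 0.0
    for c, a, s in zip(centroids, areas, sigma):
        dx, dy, dz = x[0] - c[0], x[1] - c[1], x[2] - c[2]
        r3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        w = s * a / (4.0 * math.pi * r3)  # kernel weight of this panel
        ex += w * dx
        ey += w * dy
        ez += w * dz
    return (ex, ey, ez)
```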
3 Discretization
$$\sigma_h(\mathbf{y}) = \sum_{j=1}^{n} u_j\,\psi_j(\mathbf{y}),$$
4 GPGPU Quadrature
D >η·R (20)
Fig. 3 Near-singular assembly: The gray triangles are near-singular and are computed after the
regular and singular triangles
then it is a regular pair. We used the scaling parameter η = 1.2. All other pairs are
near-singular. Circumcenters and circumradii of the triangles can be precomputed.
All three types of pairs can be integrated by using the well-known Duffy-
transformation, see [5]. This is straightforward for regular and singular pairs, only
the near-singular pairs need to be integrated adaptively for accuracy. They frequently
occur, e.g. in cases with narrow gaps in the geometry, or during the postprocessing,
when a point near the surface is to be evaluated. In this near-singular case we first
compute the point of the triangle that is closest to the collocation point. Next we
subdivide the triangle into smaller triangles such that this closest point is a corner-
node of a subdividing triangle. Then we again employ the Duffy-transformation to
integrate over all smaller triangles. This yields an adaptive quadrature with increased
accuracy around the closest point. The adaptivity can impact the performance of a
GPU-computation badly if no attention is paid, because then there is divergence in
the control flow on the GPU. In order to minimize this divergence, we first compute
the regular and singular pairs, and deal with the near-singular integrals later.
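The subdivision step described above can be sketched as follows. The closest point `p` is assumed to be given and to lie on the triangle (interior, edge or vertex); computing it, and the Duffy quadrature on each sub-triangle, are outside this sketch. The helper names and the absolute area tolerance are illustrative choices, not taken from the paper.

```python
import math


def triangle_area(a, b, c):
    """Area of the 3D triangle (a, b, c) via the cross product."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def subdivide_at(triangle, p, tol=1e-14):
    """Split a triangle so that p becomes a corner of every sub-triangle.

    Degenerate pieces (p on an edge or at a vertex) are dropped; tol is an
    absolute area threshold and should be scaled to the mesh size.
    """
    a, b, c = triangle
    candidates = [(p, a, b), (p, b, c), (p, c, a)]
    return [t for t in candidates if triangle_area(*t) > tol]
```

Since every sub-triangle has the closest point as a corner, a singularity-oriented rule applied at that corner concentrates quadrature points where the kernel varies most.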
The categorization into regular, near-singular and singular pairs is carried out
on the fly during the iteration through the triangles. If a pair is found to be
near-singular, it is marked as not processed, see Fig. 3. The marked pairs are computed in
parallel by using the subdivision method after the regular and singular cases have
been completed. With this strategy, all three types of integration are carried
out in parallel without mixing operations.
For larger problems, the full matrix may not fit in the memory of one GPU.
For this reason, and also to speed up the computation, we use multiple GPUs. Since
the matrix rows are independent, we split the matrix into multiple blocks of rows
that can be computed and stored independently on different GPUs.
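A simple way to form such row blocks is a contiguous, nearly even partition. The sketch below only computes the index ranges; the actual assignment used in the implementation is not described in the text, so this is one plausible choice, and `row_blocks` is a hypothetical name.

```python
def row_blocks(n_rows, n_gpus):
    """Split row indices 0..n_rows-1 into n_gpus contiguous half-open ranges.

    The first (n_rows % n_gpus) blocks receive one extra row, so block sizes
    differ by at most one.
    """
    base, extra = divmod(n_rows, n_gpus)
    blocks = []
    start = 0
    for g in range(n_gpus):
        stop = start + base + (1 if g < extra else 0)
        blocks.append((start, stop))
        start = stop
    return blocks
```

For example, ten rows on three devices yield the ranges (0, 4), (4, 7) and (7, 10).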
5 Numerical Experiments
In this chapter we show some examples that were computed with the novel GPU-
implementation that is based on the H2Lib package, see [6]. We first validated our
implementation for an axial-symmetric case. We compared the results of the H2Lib
with the results of the already existing simulation tools Polopt (3D) see [3], and Elfi
(2D) see [2]. Next we compared the performance of the new GPU-parallel H2Lib
implementation with the performance of the existing MPI-parallel Polopt tool.
5.1 Validation
of the other (floating) sheets are unknown and are to be determined. The sheets
are treated as single surfaces according to Eq. (12). Their potentials were computed
with all three solvers. POLOPT and H2Lib use the same mesh with 3’526 nodes.
The results agree very well, see Table 1. The small remaining differences are due to
the use of different quadratures.
5.2 GPU-Acceleration
Fig. 5 Cumulative times for assembly, solving and surface electric field computation for POLOPT
and H2LIB
Fig. 6 Cumulative times for assembly, solving and surface electric field computation for the
H2Lib only
number of GPUs were chosen such that the matrix fitted in the combined memory
of all GPUs in single precision for H2Lib. Again the GPU-parallel implementation
clearly outperforms the CPU-parallel version. We also evaluated the breakdown
criterion along the most probable breakdown paths (field lines), see Table 3. The
acceleration appears even higher here; however, this evaluation is only implemented
serially in Polopt.
References
Abstract Empirical studies are presented on a certain radio frequency (RF) struc-
ture that has not yet been well understood. The coaxial structure provides almost
ideal conditions to approximate high-pass filter functions. It has been investigated
by the aid of numerical simulations accompanied by the search for appropriate
equivalent microwave networks. A particular feature is a finite transmission zero
which allows not only for maximally flat and Chebyshev approximations but also
the synthesis of elliptic filter functions. The synthesis is drawn by means of two
examples taking into account the topology of the equivalent circuit.
1 Introduction
Coaxial microwave filters have been applied for decades to damp unwanted resonant
modes in accelerating and deflecting type cavities operating at tens of megahertz up
to few gigahertz while the extracted power may reach the level of 1 kW in particular
cases [1–4]. These so-called higher-order mode couplers are essentially high-pass
or pseudo-high-pass filters consisting of coaxial lines and certain discontinuities
in between. Early design procedures were focused on the implementation of
narrow-band band-pass filters using reactance-coupled λ/2 resonators [5, pp. 528].
However, such semi-analytical approaches provide only rough estimates for the
K. Papke
CERN, Geneva, Switzerland
University of Rostock, Rostock, Germany
e-mail: [email protected]; [email protected]
F. Gerigk
CERN, Geneva, Switzerland
e-mail: [email protected]
U. van Rienen
University of Rostock, Rostock, Germany
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_15
150 K. Papke et al.
geometrical parameters of coaxial microwave filters. Since the 1990s, the filter
design was more and more based on numerical simulations which permit the precise
evaluation of scattering properties associated with arbitrarily shaped microwave
structures and their systematic adaptation according to individual requirements.
Still, the selection of a suitable topology, as chosen prior to the numerical analyses,
is very much in the realm of intuition and experience [1]. Even for a specified
topology, the applied numerical optimization scheme may not be able to converge
to the best solution, given a certain set of requirements, as too many variables
may be involved.
This paper proposes a generally applicable procedure to systematically design
coaxial microwave filters on the basis of filter or transfer functions; this further
implies the most suitable topology for the given problem. The synthesis of a filter
function rests on the idea that scattering properties of discontinuities in coaxial
guides are well described by lumped elements within the interesting frequency
range, i.e. by equivalent circuits. A large variety of microwave structures with
certain filter characteristics and appropriate equivalent circuits had already been
worked out by the 1950s [5, 6]. Still, scattering properties of coaxial microwave
structures with multiple discontinuities being relatively close to each other, so that
evanescent modes may interact, are partially unexplored by means of equivalent
circuits. The structure sketched in Fig. 1 is such an example. In the limit of
vanishing coaxial lines, it may equivalently be described by a canonical network
realization of third-order high-pass filter functions. A particular feature is the
transmission zero at finite, non-vanishing frequency which allows not only for
maximally flat and Chebyshev approximations but also the synthesis of elliptic filter
functions. The equivalent circuit can be considered as surrogate system, significantly
cheaper to evaluate than the three-dimensional field problem, and with excellent
approximation properties within a certain frequency range. This together with the
fact that equivalent circuit parameters allow for large adjustment ranges as shown
Fig. 1 Cross-sectional and side view of a coaxial structure with two cylindrical fixings of radii
rfix1, rfix2 between the inner and outer conductor of radii ri and ro, respectively. Both fixings are
rotated against each other in the transverse plane by the angle α. The parameters Δ1 and Δ2
represent the electric thicknesses of the corresponding fixings. The inner conductor is interrupted
at the center by a distance dgap. The sections of coaxial guides are described by the lengths lν
and characteristic impedances Zν with ν = 0, 1, 2. Terminal planes are denoted as θμ or θ′μ with
μ = 1, 2
Empirical Analysis of a Coaxial Microwave Structure with Finite Transmission Zero 151
in the following, makes the microwave structure in Fig. 1 particularly interesting for
the synthesis of high-pass filter functions with finite transmission zeros.
After introducing the equivalent network in Sect. 2, individual circuit elements
are further investigated in Sect. 3, some of which provide unexpected and qualita-
tively new behavior for the microwave circuit theory. Finally, the synthesis is drawn
in Sect. 4 by means of two examples.
2 Equivalent Circuit
S = w S w (1)
where the diagonal matrix w = diag{eiβl1 , eiβl2 } invokes the inward phase shifts
along each terminal translation given the propagation constant β and lengths l1 and
l2 which are per se not known due to the finite thickness of the obstacles in the
structure. Consequently, the elements sij ; i, j = 1, 2 of the scattering matrix S are
considered as functions of these lengths and the frequency.
Let l1 + Δ1 be the distance from the terminal plane θ1 to the center of the left
fixing in Fig. 1, which is known a priori. Similarly, let l2 + Δ2 be the distance from the
terminal plane θ2 to the center of the right fixing. Since Δ1 and Δ2 each define half of
the electric “thickness” of the respective fixing, the sum l0 + dgap + Δ1 + Δ2 must
correspond to the distance between the centers of both fixings. Linear transforms
{Δ1, Δ2} → {l0, l1, l2} are introduced to reduce the number of length variables as
well as to confine their variation to the vicinity of the corresponding fixing.
1Numerical simulations are mostly carried out using CST STUDIO SUITE® software [7] for the
present work. In part, they are verified with COMSOL Multiphysics [8].
[Fig. 2 schematic labels: transmission lines (Z1, l1), (Z0, l0/2), (Z0, l0/2), (Z2, l2); lumped elements iωL1, iωL0, 1/(iωC0), iωL2; terminal planes θ1′, θ1, θ2, θ2′]
Fig. 2 Equivalent circuit model composed of lumped elements and transmission lines
Fig. 3 Approximation of the numerically simulated RF reflection and transmission with respect to
the terminal planes θ1 and θ1 by means of the equivalent circuit model according to Figs. 1 and 2.
The structure is assumed to be symmetric, hence, Z1 = Z2 , l1 = l2 , and rfix1 = rfix2 , with ri = 5 mm,
ro = 22.5 mm, and rfix1 = 3 mm. The cross section of the coaxial guide in between the fixings is
identical to those of the input and output regions. The fixings are separated by a distance of d =
22.5 mm while the inner conductor is separated by a distance of dgap = 0.3 mm. Circuit parameters
L0 , C0 , L1 are derived from the minimization problem (6). (a) Transmission and reflection power
gains |s12 |2 , |s11 |2 . (b) Real and imaginary part of the reflection at the terminal plane θ1 . The R 2
value reveals very good approximation in the considered frequency range f ≤ 2 GHz
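The equivalent circuit in Fig. 2 admits a transmission matrix Tmodel between the
terminal planes θ1 and θ2 whose elements tij ; i, j = 1, 2 are given by
$$t_{11} = \frac{t_{12}}{i\omega L_2} + \cos\beta l_0 - \frac{1}{2Z_0C_0}\,\frac{\omega}{\omega_0^2-\omega^2}\,\sin\beta l_0, \tag{2}$$
$$t_{12} = i\,\frac{1}{C_0}\,\frac{\omega}{\omega_0^2-\omega^2}\,\cos^2\frac{\beta l_0}{2} + iZ_0\sin\beta l_0, \tag{3}$$
$$t_{22} = \frac{t_{12}}{i\omega L_1} + \cos\beta l_0 - \frac{1}{2Z_0C_0}\,\frac{\omega}{\omega_0^2-\omega^2}\,\sin\beta l_0, \tag{4}$$
$$t_{11}t_{22} - t_{12}t_{21} = 1. \tag{5}$$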
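The closed-form elements can be cross-checked against a plain cascade of ABCD matrices. The sketch below assumes the topology suggested by Fig. 2 and by the expressions above: a shunt inductor L1, a line of electrical length βl0/2, a series parallel-LC resonator (L0, C0), a second line of length βl0/2, and a shunt inductor L2. It further assumes ω0² = 1/(L0 C0). All function and parameter names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def line(z0, theta):
    """ABCD matrix of a lossless line with electrical length theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 1j * z0 * s], [1j * s / z0, c]]


def t_cascade(omega, beta_l0, z0, l0_ind, c0, l1, l2):
    """Transmission matrix obtained by multiplying the element matrices."""
    z_res = 1.0 / (1j * omega * c0 + 1.0 / (1j * omega * l0_ind))
    t = [[1.0, 0.0], [1.0 / (1j * omega * l1), 1.0]]          # shunt L1
    for m in (line(z0, beta_l0 / 2.0),
              [[1.0, z_res], [0.0, 1.0]],                      # series resonator
              line(z0, beta_l0 / 2.0),
              [[1.0, 0.0], [1.0 / (1j * omega * l2), 1.0]]):   # shunt L2
        t = matmul(t, m)
    return t


def t_closed_form(omega, beta_l0, z0, l0_ind, c0, l1, l2):
    """Elements t11, t12, t22 according to Eqs. (2)-(4)."""
    x = omega / (c0 * (1.0 / (l0_ind * c0) - omega ** 2))
    a = math.cos(beta_l0) - x * math.sin(beta_l0) / (2.0 * z0)
    t12 = 1j * x * math.cos(beta_l0 / 2.0) ** 2 + 1j * z0 * math.sin(beta_l0)
    return t12 / (1j * omega * l2) + a, t12, t12 / (1j * omega * l1) + a
```

Under these assumptions both routes agree to rounding error, and the determinant of the cascade equals one as stated in (5).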
The necessary condition for the frequency response of the microwave structure
being approximated by the equivalent circuit can be formulated as
$$\min_{\Delta_1, L_1, \Delta_2, L_2, Z_0, C_0}\ \sum_{k} \left\|\mathbf{T}(\omega_k) - \mathbf{T}_{\mathrm{model}}(\omega_k)\right\|^2, \tag{6}$$
where T results from the simulated and phase shifted scattering matrix S sampled
at the frequencies ωk . The relationship between the scattering matrix S and
transmission matrix T can be found in [9, p. 192]. The sufficient condition requires
the residual to become small and, thus, defines the applicable frequency range for
the model. Since the resonant frequency of the LC resonator is directly obtained
from the simulated transmission power gain |s12 |2 only one parameter, either L0 or
C0 is involved in the nonlinear least-square problem (6). It can be solved by iterative
minimization schemes, such as a constrained BFGS algorithm.2
3 Analyses
2 Broyden-Fletcher-Goldfarb-Shanno algorithm.
Fig. 4 (a) Capacitance of the parallel LC resonator normalized to C0∗ = ε0 π ri² / dgap
and (b) ratio of series and shunt inductances, both as functions of the distance d between the
fixings. The remaining dimensions are ri = 5 mm, ro = 22.5 mm, rfix1 = rfix2 = 3 mm, dgap = 0.3 mm
4 Application
where c0 and ε are scalars and Dn (iω) is the filter function, the systematic approach
to derive a microwave circuit being able to approximate this frequency response is
defined as synthesis. In accordance to the equivalent circuit in Fig. 2, any rational
third-order high-pass filter function may be considered. Particularly interesting are
elliptic filter functions as they yield the steepest transition between passband and
stopband given certain attenuation limits in both frequency bands [10, pp. 207]. The
synthesis of elliptic filters is drawn in the following.
Consider the frequency map f : Ω → ω. It maps a normalized frequency
space associated with a low-pass to the frequency space of a high-pass according
to the definition ω = √(ωp ωs)/Ω, with the passband and stopband edges ωp and ωs,
respectively. The filter function of a normalized elliptic low-pass of odd order n is
defined as [11]
$$D_n(i\Omega) = c_1\, i\Omega \prod_{\nu=1}^{(n-1)/2} \frac{\Omega^2 - \Omega_{0\nu}^2}{\Omega_{0\nu}^2\,\Omega^2 - 1}, \tag{8}$$
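A small numerical sketch of the frequency map and of the filter function (8) may help to see their structure. It assumes the map reads ω = √(ωp ωs)/Ω and that the product in (8) runs over the (n − 1)/2 values Ω0ν supplied in `zeros`; these values and the scalar `c1` are inputs here, and the function names are hypothetical.

```python
import math


def highpass_frequency(big_omega, omega_p, omega_s):
    """Map a normalized low-pass frequency to the high-pass frequency axis."""
    return math.sqrt(omega_p * omega_s) / big_omega


def elliptic_filter_function(big_omega, zeros, c1=1.0):
    """Evaluate D_n(i*Omega) of Eq. (8) for given values Omega_0nu.

    Zeros of D_n lie at Omega = Omega_0nu, poles at Omega = 1 / Omega_0nu.
    """
    value = c1 * 1j * big_omega
    for z in zeros:
        value *= (big_omega ** 2 - z ** 2) / (z ** 2 * big_omega ** 2 - 1.0)
    return value
```

The reciprocal placement of zeros and poles about Ω = 1 is what the map turns into a transmission zero at a finite frequency of the high-pass.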
The transfer function H is directly related to |s12|² taking into account the
impedance normalization to Z1 and Z2 at the corresponding terminal planes in
Fig. 1 [10, pp. 163]. Lossless two-ports admit unitary scattering matrices, hence,
Sᴴ S = I. They further fulfill reciprocity, so that s12 = s21. Both properties are used
to derive S from |s12|² only. The corresponding impedance matrix is obtained via
$$\mathbf{Z} = \mathbf{P}^{1/2}\,[\mathbf{I} + \mathbf{S}]\,[\mathbf{I} - \mathbf{S}]^{-1}\,\mathbf{P}^{1/2}, \tag{9}$$
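For a two-port, (9) can be evaluated in closed form. The sketch below assumes that P is the diagonal matrix of the real reference impedances at the two ports (Z1 and Z2 in the text); the definition of P is not visible in this excerpt, so this is an assumption, and `impedance_from_s` is a hypothetical name.

```python
import math


def impedance_from_s(s, z_ref):
    """Impedance matrix of a two-port from its scattering matrix, Eq. (9).

    s: 2x2 nested list of complex scattering parameters.
    z_ref: pair of real, positive reference impedances (assumed P = diag(z_ref)).
    """
    (s11, s12), (s21, s22) = s
    det = (1 - s11) * (1 - s22) - s12 * s21          # det(I - S)
    inv = [[(1 - s22) / det, s12 / det],
           [s21 / det, (1 - s11) / det]]             # (I - S)^-1
    num = [[1 + s11, s12], [s21, 1 + s22]]           # I + S
    prod = [[sum(num[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
    r = [math.sqrt(z_ref[0]), math.sqrt(z_ref[1])]
    return [[r[i] * prod[i][j] * r[j] for j in range(2)] for i in range(2)]
```

As a check, a shunt admittance Y between equal 50 Ω ports should give an impedance matrix with all four entries equal to 1/Y.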
Fig. 5 (a) Optimized microwave structure to approximate a third-order elliptic high-pass filter
characteristics. (b) Insertion loss as given by the predefined transfer function H (iω) in black and
the approximation by the optimized structure in blue. Deviations are caused by transmission lines
Fig. 6 (a) Optimized microwave structure to approximate a fifth-order elliptic high-pass filter
characteristics. (b) Insertion loss as given by the predefined transfer function H (iω) in black and
the approximation by the optimized structure in blue. Deviations are caused by transmission lines
fixings. Another example using the same passband and stopband edges but higher
order is shown in Fig. 6. It results from a cascade of two structures each adjusted
as a third-order filter. The subsequent connection requires significant changes of the
rotation angles in order to achieve the attenuation curve shown in Fig. 6b.
5 Conclusions
To the authors’ best knowledge, this work contains three new scientific contri-
butions. First, the systematic design of coaxial microwave filters on the basis of
abstract filter or transfer functions was demonstrated. It enables both the design of
coaxial high-order mode couplers under completely new aspects and fundamental
predictions about the topology prior to any computational refinement. The synthesis
is based on equivalent circuit models based on a finite cascade of lumped, lossless
two-ports and transmission lines whose parameters are fitted according to simulated
scattering functions. One structure was elaborated and is particularly suitable for the
synthesis of rational high-pass filter functions. A second important finding is that the
ladder network topology for the equivalent circuit remains valid even in the presence
of evanescent mode coupling between adjacent discontinuities of the coaxial guide.
Finally, the empirical studies on the considered microwave structure, i.e. the nature
of its transmission zero at finite, non-vanishing frequency, open up new research
topics for the microwave circuit theory and await field theoretical analyses.
References
1. E. Haebel, Couplers, tutorial and update. Part. Accel. 40, 141–159 (1992)
2. Q. Wu, B. Sergey, I. Ben-Zvi, et al., Operation of the 56 MHz superconducting rf cavity in
RHIC with higher order mode damper. APS 22, 102001 (2019)
3. J. Mitchel, Higher Order Modes and Dampers for the LHC Double Quarter Wave Crab Cavity.
Ph.D. thesis, University of Lancaster, 2019
4. K. Papke, F. Gerigk, U. van Rienen, Comparison of coaxial higher order mode couplers for the
CERN Superconducting Proton Linac study. APS 20, 060401 (2017)
5. G.L. Matthaei, L. Young, E.M.T. Jones, Microwave Filters, Impedance-matching Networks,
and Coupling Structures (Artech House, Norwood, 1980)
6. N. Marcuvitz, Waveguide Handbook (McGraw-Hill Book Company, New York, 1951)
7. CST - Computer Simulation Technology Ver. 2016. CST AG, Darmstadt, Germany (2016)
8. COMSOL Multiphysics Ver. 5.3. COMSOL Multiphysics GmbH, Stockholm, Sweden (2017)
9. D.M. Pozar, Microwave Engineering, 4th edn. (Wiley, New York, 2012)
10. O. Wing, Classical Circuit Theory (Springer, New York, 2008)
11. R. Saal, E. Ulbrich, On the design of filters by synthesis. IRE Trans. Circ. Theory 5, 284–327
(1958)
Frequency-Domain Non-intrusive Greedy
Model Order Reduction Based
on Minimal Rational Approximation
1 Introduction
$$\mathbf{A}(\mu)\,\mathbf{u}(\mu) = \mathbf{f}(\mu), \tag{1}$$
$$\mathbf{A}(\mu) = \mathbf{A}_0 + \mu\mathbf{A}_1 + \mu^2\mathbf{A}_2,$$
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_16
160 D. Pradovera and F. Nobile
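let $\{\ell_j\}_{j=1}^{S} \subset \mathbb{P}_{S-1}(\mathbb{C})$ be the Lagrangian basis associated to the sample points;
2. build the Vandermonde matrix V ∈ C^{S×S} and the diagonal weight matrix D:
$$(\mathbf{V})_{ij} = \psi_j(\mu_i) \quad\text{and}\quad \mathbf{D} = \operatorname{diag}\left(\frac{\mathrm{d}^{S-1}\ell_1}{\mathrm{d}\mu^{S-1}}, \ldots, \frac{\mathrm{d}^{S-1}\ell_S}{\mathrm{d}\mu^{S-1}}\right) \in \mathbb{C}^{S\times S};$$
$$Q \in \mathbb{P}_{S-1}(\mathbb{C}), \quad Q(\mu) = \sum_{i=0}^{S-1} (\mathbf{q})_i\,\psi_i(\mu);$$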
Frequency-Domain Non-intrusive Greedy MOR Based on Minimal Rational. . . 161
where (A)_{:i} denotes the i-th column of matrix A; then the full minimal rational
approximation ũ ≈ u can be found as ũ(μ) = W ů(μ).
A common feature of all the techniques cited above is that a “sufficiently large”
number of samples is needed to guarantee the accuracy of the surrogate model;
in the particular case of frequency-domain problems, there exist lower bounds [3]
for the number of samples required to achieve reasonable accuracy. Unfortunately,
this number depends on the unknown spectral properties of A, and on the
approximability of f. For RB and MRI, one can identify adaptively the correct
number of samples by relying on the so-called greedy algorithm, which can be
summarized as follows:
1. Initialize a set V = {u_1, . . . , u_{S_0}} with some preliminary snapshots at
μ_1, . . . , μ_{S_0}.
2. Build a surrogate model (e.g., by MRI) based on V.
3. Choose a measure r(μ) of the discrepancy between exact and surrogate solution,
and find its maximal point μ̂, i.e. r(μ) ≤ r(μ̂) for all μ.
4. If r(μ̂) is smaller than a prescribed tolerance, terminate.
5. Compute a snapshot at μ̂, add it to V, and go to 2.
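The five steps above can be sketched as a generic loop with pluggable components. In this illustration the surrogate is a plain Lagrange interpolant of a scalar function, used only as a stand-in for MRI, and the maximization in step 3 is carried out over a finite candidate set; both are simplifying assumptions, and all names are hypothetical.

```python
def lagrange_interpolant(nodes, values):
    """Toy surrogate (stand-in for MRI): polynomial interpolant of scalar data."""
    nodes, values = list(nodes), list(values)

    def surrogate(mu):
        total = 0.0
        for j, (xj, yj) in enumerate(zip(nodes, values)):
            w = 1.0
            for k, xk in enumerate(nodes):
                if k != j:
                    w *= (mu - xk) / (xj - xk)
            total += yj * w
        return total

    return surrogate


def greedy(solve, build, indicator, candidates, initial, tol, max_iter=50):
    """Generic greedy sampling loop following steps 1-5."""
    nodes = list(initial)
    snapshots = [solve(mu) for mu in nodes]                  # step 1
    for _ in range(max_iter):
        surrogate = build(nodes, snapshots)                  # step 2
        mu_hat = max(candidates,
                     key=lambda mu: indicator(mu, surrogate))  # step 3
        if indicator(mu_hat, surrogate) < tol:               # step 4
            break
        nodes.append(mu_hat)                                 # step 5
        snapshots.append(solve(mu_hat))
    return nodes, surrogate
```

Any of the indicators discussed in this section could be passed as `indicator`; the loop itself does not depend on how the discrepancy is measured.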
The main difficulty in setting up the greedy algorithm is choosing a good r.
Given the presence of resonances, it is standard [3] to use as a posteriori estimator
the residual of (1), namely, given some suitable norm ‖·‖,
$$r(\mu) = \left\|\mathbf{A}(\mu)\,\widetilde{\mathbf{u}}(\mu) - \mathbf{f}(\mu)\right\|. \tag{3}$$
In an intrusive framework, an efficient way to compute (3) has been known in the
RB literature for quite a while, see e.g. [2], assuming f(μ) to depend affinely on μ,
i.e.
$$\mathbf{f}(\mu) = \sum_{i=0}^{N_f-1} \theta_i(\mu)\,\mathbf{f}_i,$$
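with fi ∈ C^n and θi : C → C for all i. Then, as long as the matrices Ai, the vectors
fi, the weights θi(μ), and the reduced surrogate solution ů(μ) are available, we can
evaluate the residual at μ as
$$\begin{aligned}
r(\mu)^2 ={}& \sum_{i,j=0}^{N_f-1} \overline{\theta_i(\mu)}\,\theta_j(\mu)\,\langle \mathbf{f}_j, \mathbf{f}_i\rangle
+ \mathring{\mathbf{u}}(\mu)^{*}\Bigg(\sum_{i,j=0}^{2} \overline{\mu}^{\,i}\mu^{j}\,\langle \mathbf{A}_j\mathbf{W}, \mathbf{A}_i\mathbf{W}\rangle\Bigg)\mathring{\mathbf{u}}(\mu) \\
&- 2\operatorname{Re}\Bigg(\Bigg(\sum_{i=0}^{N_f-1}\sum_{j=0}^{2} \overline{\theta_i(\mu)}\,\mu^{j}\,\langle \mathbf{A}_j\mathbf{W}, \mathbf{f}_i\rangle\Bigg)\mathring{\mathbf{u}}(\mu)\Bigg)
\end{aligned} \tag{I}$$
in O((S + Nf)²) operations. This idea can be employed in MRI as well, at the cost
of making the procedure intrusive. However, we propose here some alternatives.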
In [7] it was observed that, if ũ is the MRI of u with samples at {μj}, j = 1, . . . , S, and
denominator Q, and both A(μ) and f(μ) depend at most linearly on μ (i.e. A2 = 0
and f(μ) = f0 + μf1) or μ² (i.e. A1 = 0 and f(μ) = f0 + μ²f2), then
$$r(\mu) = \frac{c}{|Q(\mu)|}\,\prod_{j=1}^{S} |\mu - \mu_j|, \tag{4}$$
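Formula (4) is cheap to evaluate once the MRI denominator is known. The following sketch computes it for a given constant `c`; how that constant is handled is not visible in this excerpt, so it is simply left as a parameter here, and `residual_indicator` is a hypothetical name.

```python
def residual_indicator(mu, sample_points, q, c=1.0):
    """Evaluate r(mu) = c * prod_j |mu - mu_j| / |Q(mu)| as in Eq. (4).

    mu: (complex) parameter value.
    sample_points: the sample points mu_1, ..., mu_S.
    q: callable returning the value of the MRI denominator Q at mu.
    """
    prod = 1.0
    for mu_j in sample_points:
        prod *= abs(mu - mu_j)
    return c * prod / abs(q(mu))
```

The indicator vanishes at every sample point, consistent with the interpolatory nature of the surrogate, and grows near the roots of Q.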
For this last indicator, as long as the greedy iterations continue, the extra snapshot
is not wasted, since it is precisely the one which gets added to V in step 5: on
the whole, this procedure computes only one “extra” snapshot, at the final greedy
iteration, with respect to the two previous versions of the algorithm. Actually, one
can adjust the greedy algorithm so as to employ even the extra snapshot in the final
surrogate model: it suffices to build an updated MRI using all the samples, including
the last one, once the termination condition has been satisfied.
We remark that the last two strategies rely on (4), which is valid only under some
strong assumptions (linear dependence on the parameter) on A and f. However, (4)
can still be used for general parametric problems (1), and will give a reasonable
estimation of the residual as long as d²A/dμ² and d²f/dμ² are small.
3 Numerical Examples
Here, through two practical examples, we showcase the usefulness of the greedy
MRI procedure, as well as the effectiveness of the three termination strategies based
on (I), (R), and (C).
and build a surrogate for u using MRI. Then, our estimates for the eigenvalues will
be the roots of the MRI denominator Q.
As a first MOR method, we apply greedy MRI: in particular, we employ
indicator (I) with relative tolerance 10⁻², and the initial snapshots are at the S0 = 30
shifted roots of unity¹ $\{\mu_0 - 0.175\,e^{2i\pi j/S_0}\}_{j=1}^{S_0}$.
1 The shifted roots of unity are chosen as sample points because they allow for very stable and
efficient interpolation schemes, relying on Fast Fourier Transform. We refer to [1] for a more
detailed discussion of their properties.
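Such a set of initial sample points can be generated directly from its definition. In the sketch below the center passed in the usage is an arbitrary placeholder, since the value of μ0 for this example is not visible in this excerpt; the function name is hypothetical.

```python
import cmath
import math


def shifted_roots_of_unity(center, radius, count):
    """Points center - radius * exp(2*pi*i*j/count) for j = 1, ..., count."""
    return [center - radius * cmath.exp(2j * math.pi * j / count)
            for j in range(1, count + 1)]
```

For instance, `shifted_roots_of_unity(1.0, 0.175, 30)` returns 30 equispaced points on a circle of radius 0.175 around the placeholder center 1.0.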
Fig. 1 Results of standard (left) and greedy (right) MOR. The exact and approximate eigenvalues
are pluses and crosses, respectively, whereas the sample points are full dots. The contour plots show
the logarithm of the greedy residual indicator; the dashed line represents the locus {μ : r(μ) = tol},
i.e. the boundary of the set where the prescribed tolerance is not satisfied
Table 1 Timing results of greedy MOR (average over 3 simulations with the same parameters for
each method). All simulations were carried out on a single node of the Fidis cluster at EPFL [6]
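$$S:\ \mathbb{C}\ni\mu \mapsto \mathbf{I} - 2\Bigg(\mathbf{I} + i\mu\sqrt{\frac{1-(\mu_c/\mu_0)^2}{1-(\mu_c/\mu)^2}}\;\mathbf{F}^{*}\underbrace{(\mathbf{K}-\mu^{2}\mathbf{M})^{-1}\mathbf{F}}_{\mathbf{U}(\mu)\in\mathbb{C}^{90{,}258\times 3}}\Bigg)^{-1} \in \mathbb{C}^{3\times 3}, \tag{5}$$
where we set μc = 6.56 GHz, μ0 = 10 GHz, and the state matrix U(μ) has one
column for each port of the waveguide.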
We build a surrogate for U using greedy RB and MRI, employing indicators (I)
and (R) with relative tolerance 10−2 , and (C) with tol = 10−4 . The reduced
tolerance for (C) can be justified by the considerably different nature of the indicator.
To obtain an approximation of S, we just replace the exact state with the surrogate
one in (5).
The results are summarized in Table 1 and visually depicted in Fig. 2. We
remark that, by construction, the snapshot “history” of MRI is independent of
the indicator, i.e. the parameter value μ̂ which is selected at a given iteration is
the same: the only effect of the choice of the indicator, besides timing, is the
number of greedy iterations which are carried out before termination. In this regard,
we observe that MRI+(I) and MRI+(R) yield exactly the same indicator, whereas
MRI+(C) terminates one snapshot sooner,2 causing some slight instability in the
approximations of the scattering parameters for low frequencies, noticeable mostly
in S13 .
A comparison of the surrogate S obtained by MRI+(I) and RB+(I) shows that
the two methods yield very similar approximations, and reconstruct well the exact
values. In fact, the approximated scattering parameters for RB are not included
in Fig. 2, as they are almost indistinguishable from those obtained with MRI+(I).
2 Here we are discarding the final extra snapshot used to check the termination condition (C). If it
had been included, we would have recovered the same surrogate model as MRI+(I)/(R).
[Fig. 2 plots: top panel |Sij| [dB] with curves |S11|, |S12|, |S13|, |S22|, |S23|, |S33|; bottom panel r(μ) [dB] with curves RB+(I) and MRI+(I)]
Fig. 2 Results of greedy MOR. On top the surrogate scattering parameters: the points are
measurements from the original problem (5), whereas full and dotted lines are the surrogates
obtained with MRI+(I) and MRI+(C), respectively. On the bottom the relative residual at the end
of the greedy iterations for RB+(I) and MRI+(I); the points indicate the snapshot positions
However, RB requires one more snapshot, and the locations of the snapshots (and
the residual profiles) for RB and MRI are quite different.
In terms of computing time, the efficiency of MRI seems quite remarkable,
particularly for the two “least intrusive” indicators: the overhead time needed for
the evaluation of indicators (R) and (C) is just a fraction of the time required for
computation of (I) in RB.
4 Conclusions
Acknowledgments This work has been funded by the Swiss National Science Foundation through
FNS Research Project number 182236.
Frequency-Domain Non-intrusive Greedy MOR Based on Minimal Rational. . . 167
References
1. A.P. Austin, P. Kravanja, L.N. Trefethen, Numerical algorithms based on analytic function
values at roots of unity. SIAM J. Numer. Anal. 52, 1795–1821 (2014)
2. P. Benner, S. Gugercin, K. Willcox, A survey of projection-based model reduction methods for
parametric dynamical systems. SIAM Rev. 57, 483–531 (2015)
3. V. De La Rubia, M. Mrozowski, A compact basis for reliable fast frequency sweep via the
reduced-basis method. IEEE Trans. Microw. Theory Tech. 66, 4367–4382 (2018)
4. A.C. Ionita, A.C. Antoulas, Data-driven parametrized model reduction in the Loewner
framework. SIAM J. Sci. Comput. 36, A984–A1007 (2014)
5. M.N. Kooper, H.A. Van Der Vorst, S. Poedts, J.P. Goedbloed, Application of the implicitly
updated Arnoldi method with a complex shift-and-invert strategy in MHD. J. Comput. Phys.
118, 320–328 (1995)
6. SCITAS: EPFL Fidis cluster webpage. Online document (2020). http://www.epfl.ch/research/facilities/scitas/hardware/fidis. Cited 11 Feb 2020
7. D. Pradovera, Interpolatory rational model order reduction of parametric problems lacking
uniform inf-sup stability. SIAM J. Numer. Anal. 58(4), 2265–2293 (2020)
A Comparison Between Different
Formulations for Solving Axisymmetric
Time-Harmonic Electromagnetic Wave
Problems
1 Introduction
When treating a problem exhibiting axial symmetry, a Fourier expansion along the
azimuthal direction can be exploited in order to restrict the computation to a two-
dimensional (2D) angular cross section of the geometry, while still considering a
fully three-dimensional (3D) solution [1]. Therefore, these methods are also referred
to as quasi-3D or 2.5D methods. Let us consider a cylindrical coordinate system
(r, ϕ, z), and let us expand the electric field e(r, ϕ, z) into a Fourier series along ϕ:
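$$
\mathbf{e}(r,\phi,z) =
\begin{bmatrix} e_r^{0}(r,z) \\ e_\phi^{0}(r,z) \\ e_z^{0}(r,z) \end{bmatrix}
+ \sum_{m=1}^{\infty} \left(
\begin{bmatrix} e_r^{m}(r,z)\cos(m\phi) \\ e_\phi^{m}(r,z)\sin(m\phi) \\ e_z^{m}(r,z)\cos(m\phi) \end{bmatrix}
+
\begin{bmatrix} e_r^{-m}(r,z)\sin(m\phi) \\ e_\phi^{-m}(r,z)\cos(m\phi) \\ e_z^{-m}(r,z)\sin(m\phi) \end{bmatrix}
\right),
$$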
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 169
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_17
170 E. Schnaubelt, N. Marsic, and H. De Gersem
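where the Fourier coefficients $\mathbf{e}^n(r,z) = [e_r^n, e_\phi^n, e_z^n]^T$ with $n \in \mathbb{Z}$ are functions of
the radial and axial coordinates only. Furthermore, by exploiting the orthogonality
of the trigonometric functions, we can write the Maxwell eigenvalue problem for an
axisymmetric cavity V with perfect electric conducting boundaries as [1]:

For a given mode $n \in \mathbb{Z}$, find the eigenpairs $(\mathbf{e}^n, \omega^2)$ with $\mathbf{e}^n \in S^n(\Omega)$ such that
$$
\int_\Omega \mu_r^{-1}\, \mathrm{curl}^n \mathbf{e}^n \cdot \mathrm{curl}^n \mathbf{e}'^n \, d\Omega
- \frac{\omega^2}{c_0^2} \int_\Omega \varepsilon_r\, \mathbf{e}^n \cdot \mathbf{e}'^n \, d\Omega = 0
\quad \forall\, \mathbf{e}'^n \in S^n(\Omega),
\tag{1}
$$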
A first approach consists in taking the unknown fields $e_\phi^{*,n} = r e_\phi^n \in H^1(\Omega)$ and
$\mathbf{e}_{rz}^n = [e_r^n, e_z^n]^T \in H(\mathrm{curl}, \Omega)$ [2] together with non-classical discrete conditions at
the symmetry axis [3, Section 4.4]. By following this strategy, all integrals are well-
posed but exhibit singular integrands, hence requiring either i) a classical Gaussian
quadrature with a large number of quadrature points or ii) specialized quadrature
rules [3, Section 5.1] which differ from element to element, thus preventing the
use of fast assembly techniques [4]. In what follows, this approach will be further
referred to as transformation “TA”.
Comparison of Different Formulations for Axisymmetric EM Wave Problems 171
n n eϕn n n
U n = erz + r̂ r α U n = nerz
n + grad (ren )
rz ϕ U n = erz
r r r
Let us start our comparison by determining if the methods discussed previously can
avoid spurious modes. As we search the azimuthal unknown ($e_\phi^{*,n}$ for TA and $u^n$
for TB, TC and TD) in a finite subspace (see footnote 2) of $H^1(\Omega)$ of polynomial order q and the
in-plane unknown ($\mathbf{e}_{rz}^n$ for TA and $\mathbf{U}^n$ for TB, TC and TD) in a finite subspace (see
footnote 2) of $H(\mathrm{curl}, \Omega)$ of polynomial order p, the dimension of each subspace
must be selected with care. In particular, in order to satisfy the exactness of the
discrete de Rham sequence [10], one must impose that q = p + 1 [11].
In order to validate this choice, we ran multiple numerical tests with the different
transformations, different modes n and different values for p and q. As a result, we
observed that, apart from TD, all eigenspectra were free of spurious modes when
q = p + 1. Interestingly, we also observed no spurious modes when q > p + 1. On
the other hand, spurious modes were systematically observed when q < p + 1, and
when transformation TD was used with |n| > 1 (for all possible values of p and q).
For this reason, TD will not be investigated further. As an illustration, Fig. 1 shows
a part of the numerical spectrum of a pillbox cavity for n = 1 and different mesh
densities. It was computed with TB, once for q = 3, p = 2 and once for q = p = 2.
When n = 0, the in-plane and azimuthal unknowns are decoupled from each
other [6, Section 1.6]. Therefore, q and p can be chosen independently.
1 See https://gitlab.onelab.info/gmsh/small_fem/blob/master/simulation/Quasi3D.cpp.
2 In this paper, a finite subspace of $H^1(\Omega)$ resp. $H(\mathrm{curl}, \Omega)$ is built using grad-conforming
(resp. curl-conforming) finite elements from [9, Chapter 4.5]. In particular, we consider complete
subspaces of $H(\mathrm{curl}, \Omega)$ with both irrotational and rotational functions.
[Figure 1. Vertical axis: eigenfrequency from 1·10^10 to 1.4·10^10, with the modes TM112, TE113, TE121, TM113, TE122, TE114, TM120 and TE123 marked. Horizontal axis: number of mesh elements per wavelength of the TM112 mode (2 to 32). Panels (a) and (b).]
Fig. 1 Part of the spectrum of a pillbox cavity obtained with TB and n = 1. (a) Polynomial order
q = 3, p = 2. (b) Polynomial order q = 3, p = 3
[Figure 2. Vertical axis: relative error of the eigenfrequency (10^-1 to 10^-7). Horizontal axis: number of mesh elements per wavelength (2 to 32). Curves for TA, TB and TC(1, 1) with G = 4, 6, 7, 12, 13, 16 and 19; reference slope 4.]
Fig. 2 Convergence results when computing the eigenfrequency of the TE111 mode of a pillbox
cavity with TA, TB and TC(1, 1), using q = 3, p = 2 and G Gauss-Legendre quadrature points
of 0.5, (ii) their sum must be an integer and (iii) β ≥ 1.5 for n = 0 and
α ≥ 0.5, β ≥ 0.5 for n ≠ 0.
Let us now focus on the transformation TC(α, β), and let us carry out a convergence
test similar to the previous section. Now, however, the influence of the parameters
α and β (chosen according to Table 2) on the convergence rate is investigated.
The results of this numerical experiment are displayed in Fig. 3. As can be
observed directly, while all choices converge towards the sought eigenvalue, only
particular pairs (α, β) exhibit the expected convergence rate. This behavior has been
observed for other choices of (n, p, q) with q = p + 1 as well.
This behavior can be easily explained if we assume that $\mathbf{e}^n \in C^\infty$ in the vicinity
of the symmetry axis. This assumption is of course restrictive, but it applies to the
pillbox cavity [8] and already gives good insight into the underlying numerical
mechanisms. In what follows, only the case n = ±1 will be discussed, but the same
methodology applies to the other cases.
Fig. 3 Convergence rate of TC(α, β) for different values of α and β, as allowed by Table 2.
The symbol “∗” in (b) indicates that the result is independent of p due to the decoupling of the
in-plane and azimuthal unknowns (see Sect. 3.1). (a) TM111 mode with q = 4, p = 3. (b) TE022
mode with q = 4, p = ∗
Let us start by expanding $\mathbf{e}^{\pm 1}$ into a Taylor series in the vicinity of r = 0 and
z = z_0. As $e_z^{\pm 1} = 0$ at r = 0 (see [1]), we have:
$$
\begin{cases}
e_\phi^{\pm 1}(r,z) = a_0 + a_1^{r} r + a_1^{z}(z - z_0) + a_2^{rr} r^2 + a_2^{zz}(z - z_0)^2 + 2a_2^{rz} r(z - z_0) + \dots, \\
e_z^{\pm 1}(r,z) = b_1^{r} r + b_2^{rr} r^2 + 2b_2^{rz} r(z - z_0) + \dots, \\
e_r^{\pm 1}(r,z) = c_0 + c_1^{r} r + c_1^{z}(z - z_0) + c_2^{rr} r^2 + c_2^{zz}(z - z_0)^2 + 2c_2^{rz} r(z - z_0) + \dots.
\end{cases}
\tag{2}
$$
Restricting the further analysis to the axial component, we then have that:
$$
\begin{aligned}
U_z^{\pm 1} &= r^{-\alpha} \left[ \pm e_z^{\pm 1} + r\left(a_1^{z} + 2a_2^{zz}(z - z_0) + 2a_2^{rz} r + \dots\right) \right], \\
&\overset{(2)}{=} r^{-\alpha}\, r \left[ \pm\left(b_1^{r} + b_2^{rr} r + 2b_2^{rz}(z - z_0) + \dots\right) + a_1^{z} + 2a_2^{zz}(z - z_0) + 2a_2^{rz} r + \dots \right], \\
&\overset{\mathrm{def}}{=} r^{1-\alpha} f(r,z),
\end{aligned}
\tag{3}
$$
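The scaling $U_z^{\pm 1} = r^{1-\alpha} f(r,z)$ can be checked numerically. The following is a minimal sketch, assuming made-up Taylor coefficients (`B1R`, `B2RR`, `A1Z`, `A2RZ` are illustrative values, not taken from the paper), the upper sign, and an evaluation at z = z_0; the function names are hypothetical.

```python
# Illustrative (made-up) Taylor coefficients; evaluation at z = z0, upper sign.
B1R, B2RR = 0.7, -0.3   # coefficients b_1^r, b_2^rr of e_z
A1Z, A2RZ = 1.1, 0.4    # coefficients a_1^z, a_2^rz entering d/dz (r e_phi)


def e_z(r):
    """Truncated Taylor expansion of e_z(r, z0), cf. Eq. (2)."""
    return B1R * r + B2RR * r**2


def dz_r_e_phi(r):
    """Truncated z-derivative of r * e_phi at z = z0."""
    return r * (A1Z + 2.0 * A2RZ * r)


def u_z(r, alpha):
    """Axial component U_z = r^(-alpha) * (e_z + d/dz (r e_phi))."""
    return r ** (-alpha) * (e_z(r) + dz_r_e_phi(r))


def f_on_axis():
    """Predicted value f(0, z0) = b_1^r + a_1^z."""
    return B1R + A1Z


for alpha in (0.5, 1.0, 1.5):
    # Dividing by r^(1 - alpha) should leave a value close to f(0, z0) as r -> 0.
    ratios = [u_z(r, alpha) / r ** (1.0 - alpha) for r in (1e-2, 1e-4, 1e-6)]
    print(alpha, ratios, f_on_axis())
```

Under these assumptions the ratio tends to the same finite value for every α, so the behaviour of $U_z^{\pm 1}$ near the axis is governed by the factor $r^{1-\alpha}$ alone: it vanishes for α < 1, stays bounded for α = 1, and grows without bound for α > 1 unless f vanishes on the axis.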
4 Conclusion
Acknowledgments The authors would like to express their gratitude to Abele Simona for his
valuable advice and the fruitful discussions on axisymmetric problems.
References
6. S. Cambon, Méthode d’éléments finis d’ordre élevé et d’équations intégrales pour la résolution
de problème de furtivité radar d’objets à symétrie de révolution. Ph.D. Thesis, Institut National
des Sciences Appliquées de Toulouse, 2012
7. M. Oh, de Rham complexes arising from Fourier finite element methods in axisymmetric
domains. Comput. Math. Appl. 70(8), 2063–2073 (2015)
8. T.P. Wangler, RF Linear Accelerators. 2nd, Completely Revised and Enlarged edition. (Wiley-
VCH Verlag GmbH & Co. KGaA, Weinheim, 2008)
9. S. Zaglmayr, High Order Finite Element Methods for Electromagnetic Field Computation.
Ph.D. Thesis, Johannes Kepler Universität Linz, 2006
10. A. Simona, L. Bonaventura, C. de Falco, S. Schöps, Isogeometric approximations for
electromagnetic problems in axisymmetric domains (2019). arXiv preprint arXiv:1912.08570
11. L. Demkowicz, P. Monk, L. Vardapetyan, W. Rachowicz, de Rham diagram for hp finite
element spaces. Comput. Math. Appl. 39(7–8), 29–38 (2000)
12. S. Sauter, hp-finite elements for elliptic eigenvalue problems: error estimates which are explicit
with respect to λ, h, and p. SIAM J. Numer. Anal. 48(1), 95–108 (2010)
The Magnetization Analysis of Motor
Magnet and Its Influence on Cogging
Torque
1 Introduction
Acoustic performance is one of the most important indicators of the comfort of an
automobile. The acoustics of electric machines, which are common devices in
vehicles, is therefore a critical point that needs to be considered in the machine design
process.
However, the increasing precision required in the prediction of the
noise-exciting forces of electric machines poses a significant challenge to the
assumptions and idealizations applied in the motor design process today. In order
to meet higher precision requirements, it is necessary to account for influences from
statistical geometry variations, material fluctuations and the manufacturing process
of the machine. The BLDC motor (Brushless Direct Current Motor) is increasingly used
in automobiles, and its acoustic and vibration performance is
an important quality indicator of the machine. In no-load operation, it is mainly
determined by the cogging torque, which is highly influenced by the motor geometry
and the magnetic field in the air gap [1–3]. Hence, instead of using an idealized
magnetic field, a more accurate and realistic description of the magnetic field of
C. Wang · S. Kurz
Technical University Darmstadt, Darmstadt, Hessen, Germany
M. Willig · K. Gutmann ()
Robert Bosch GmbH, Buehl, Baden-Wuerttemberg, Germany
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 179
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_18
180 C. Wang et al.
The motor considered in this analysis is a BLDC motor in outer rotor configuration.
The topology of the motor is 12/8, i.e. the stator has 12 slots and the rotor has 8
magnet poles. In the motor used for this analysis, these 8 poles are magnetized on a
magnet ring as indicated by the vectors of the magnetic flux density B in Fig. 1.
The generation of cogging torque in the motor is determined by the interaction
of the magnet poles of the rotor and the slots of the stator. Based on the coenergy
in the system, the motor cogging torque is given by the following
equation, which is the basis of the torque calculation in the FEA tool [2, 3]:
$$
T = \frac{dW}{d\theta} = \frac{\partial}{\partial\theta} \left[ \int_V \left( \int_0^{H_B} B(H)\, dH \right) dV \right], \tag{1}
$$
where W is the magnetic coenergy, θ is the rotor position, H is the magnetic field,
B is the flux density, H_B is the magnetic field at the operating point, V is the integration
volume and T is the calculated torque. For the applied method of virtual work, the
change in the coenergy of the system (and therefore the virtual torque) is given by
the change in the coenergy of the virtually distorted finite elements.
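The relation T = dW/dθ can be illustrated with a finite difference in the rotor angle. The sketch below is not the procedure of the FEA tool: it assumes a hypothetical analytic coenergy profile `coenergy(theta)` with made-up amplitudes `W0` and `W1`, where a field solver would normally supply W at two neighbouring rotor positions.

```python
import math


def torque_from_coenergy(coenergy, theta, dtheta=1e-5):
    """Central finite-difference estimate of T = dW/dtheta (theta in rad)."""
    return (coenergy(theta + dtheta) - coenergy(theta - dtheta)) / (2.0 * dtheta)


# Hypothetical coenergy profile in joule: constant part plus a ripple
# with 24 cycles per mechanical revolution (made-up amplitudes).
W0, W1 = 2.0, 1.0e-3


def coenergy(theta):
    return W0 + W1 * math.cos(24.0 * theta)


theta = math.radians(5.0)
t_numeric = torque_from_coenergy(coenergy, theta)
t_exact = -24.0 * W1 * math.sin(24.0 * theta)  # analytic derivative of the toy profile
print(t_numeric, t_exact)
```

The constant part W0 drops out of the difference, which is why only the change in coenergy between the two rotor positions matters for the torque.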
Fig. 2 The cogging torque of BLDC motor with 12 slots and 8 poles
When the 3D FEA tool is used to simulate the magnetization process, the input data
required is the magnetization curve in the first quadrant of the B − H
coordinate system, as shown in Fig. 3. This curve was measured using a standard
Permagraph measurement method and extrapolated to the point of the maximum
excitation field.
Technical ferrites are usually manufactured to have a main axis for the preferred
direction of the magnetic flux. In this case the magnetization process not only
depends on the magnetizing field but also on the axis in which the field is active.
Those effects of anisotropy and isotropy on the simulation results will be shown
later in this paper.
With the defined magnetization curve, the hysteresis effect of the magnet material
needs to be introduced in the analysis in order to evaluate the remanence of the magnet
after magnetization. However, due to the limitations of the FEA tool used,
commonly used hysteresis models such as the Preisach model and the Jiles-Atherton model
are not supported in the simulation. Instead, the software supports a method called
the “classic approach” or “linear approach” to describe the hysteresis [6].
The linear approach is based on the approximate linearity of the descending
branch of the hysteresis loop of the magnet. When the magnetization field is removed,
the magnetic polarization J descends with constant slope, which is identical to the
slope in the saturated region [7], as indicated by the dashed line in Fig. 3. The
intersection of the descending line with the vertical axis is the remanence point, which
is around 415 mT in the analyzed magnet. The slope of the descending part is equal
to the saturated permeability, which is μ0 μm for the curve of the magnetic flux
density B, and μ0 (μm − 1) for the curve of the magnetic polarization J. For the ferrite
magnet used in this motor, the relative permeability in the saturated region is μm ≈
1.05.
In addition, because of the linearity of the descending branch, only the operating
point at the maximum excitation field needs to be simulated. Hence, the
magnetization analysis in the FEA tool can be significantly simplified to a single
magnetostatic simulation with the maximum excitation field.
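The linear approach amounts to following a straight line from the peak operating point back to H = 0. A minimal sketch, assuming a hypothetical peak field `H_MAX` and a peak flux density `B_MAX` chosen so that the numbers match the remanence of about 415 mT quoted above (only μm ≈ 1.05 and the 415 mT come from the text; the function names are illustrative):

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability in H/m


def remanence_from_peak(b_max, h_max, mu_m):
    """Follow the B line of slope mu0*mu_m from (h_max, b_max) back to H = 0."""
    return b_max - MU0 * mu_m * h_max


def polarization_remanence(j_max, h_max, mu_m):
    """Same construction on the J curve, whose slope is mu0*(mu_m - 1)."""
    return j_max - MU0 * (mu_m - 1.0) * h_max


MU_M = 1.05                           # relative permeability in saturation (from the text)
H_MAX = 8.0e5                         # A/m, hypothetical peak excitation field
B_MAX = 0.415 + MU0 * MU_M * H_MAX    # T, hypothetical peak flux density
J_MAX = B_MAX - MU0 * H_MAX           # polarization at the peak, J = B - mu0*H

b_r = remanence_from_peak(B_MAX, H_MAX, MU_M)
j_r = polarization_remanence(J_MAX, H_MAX, MU_M)
print(b_r, j_r)
```

Both constructions end at the same point because J = B at H = 0, which is why a single magnetostatic solution at the peak field is enough to obtain the remanence in each finite element.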
The Magnetization Analysis of Motor Magnet and Its Influence on Cogging Torque 183
Both the magnetization and cogging torque analyses are executed in the 3D
environment of the FEA tool. The ring magnet is first magnetized in the magnetization
analysis, and its remanent field is then transferred into the motor model
to calculate the cogging torque.
A full 3D model of the magnetization unit is built and the magnet inserted as
shown in Fig. 4, where a 90◦ slice of the whole model is depicted.
The excitation current in the windings is generated by the discharge of a
connected capacitor. However, in the analysis only the peak value of the current
impulse is needed when the linear approach is used. Moreover, the possible effects
of eddy currents due to the transient current impulse are not considered in this
analysis.
Figure 5 shows a 2D magnetic field distribution in the middle cross section of the
model, with the maximum excitation current applied. It is obvious that the magnet is
not uniformly magnetized due to the field distribution imposed by the magnetizing
unit. There are different areas in the magnet (particularly areas close to the pole
transition zones) that reach different parts of the virgin curve and therefore will
have different remanent inductions after the magnetizing field is removed.
After the magnetization analysis the magnetized magnet is available for cogging
torque analysis of the motor via an internal datalink in the FEA software. The motor
model is built up as in Fig. 1. The magnetization of the magnet in the motor is
identical to the magnetization achieved in the magnetization calculation.
Since the cogging torque is evaluated at no-load conditions, the coils of the motor
are omitted in the analysis to reduce the number of finite elements and therefore the
computing time. Moreover, due to the symmetry of the motor model, only a section
of 30◦ needs to be analyzed.
Fig. 5 The B field distribution in middle section with maximum excitation current
5 Result of Analysis
The motor cogging torque was calculated for three different magnet models:
1. A calculated magnetization profile based on isotropic magnet material.
2. A calculated magnetization profile based on anisotropic magnet material.
3. An ideal magnetization profile.
The 3D FEA solver of the tool applied uses tetrahedral mesh elements. A fine
mesh was applied to the magnet in the magnetization as well as the motor analysis.
In each model a mesh of approximately 60k tetrahedra for the magnet was achieved
and a maximum energy error of 0.5% for the whole system was set as solver
criterion.
In the isotropic magnet model, the pre-defined magnetization curve is valid for
all magnetization directions, but only valid in the radial direction for the anisotropic
model. The response of the materials to a magnetizing field in the simulation can
be seen in Fig. 6. Whereas the resulting magnetization of the isotropic material is a
vector that is parallel to the applied magnetizing field, the resulting magnetization
of the anisotropic material is a vector representing the component of the applied
magnetizing field in the preferred direction of the material.
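The two material responses can be sketched in two dimensions as follows. This is a simplified illustration under stated assumptions: `curve` is a hypothetical saturating magnetization curve with made-up parameters `M_SAT` and `H_REF`, and the function names are not part of any FEA tool.

```python
import math

M_SAT, H_REF = 1.0, 1.0  # made-up saturation level and field scale


def curve(h_abs):
    """Hypothetical saturating magnetization curve M(|H|)."""
    return M_SAT * math.tanh(h_abs / H_REF)


def isotropic_response(h):
    """Magnetization parallel to the applied field h = (hx, hy)."""
    h_abs = math.hypot(h[0], h[1])
    if h_abs == 0.0:
        return (0.0, 0.0)
    m = curve(h_abs)
    return (m * h[0] / h_abs, m * h[1] / h_abs)


def anisotropic_response(h, easy_axis):
    """Magnetization along the preferred axis, driven by the field component on it."""
    e_abs = math.hypot(easy_axis[0], easy_axis[1])
    e = (easy_axis[0] / e_abs, easy_axis[1] / e_abs)
    h_par = h[0] * e[0] + h[1] * e[1]          # projection of h onto the easy axis
    m = math.copysign(curve(abs(h_par)), h_par)
    return (m * e[0], m * e[1])


field = (1.0, 1.0)                              # applied field, arbitrary units
print(isotropic_response(field))                # parallel to the field
print(anisotropic_response(field, (1.0, 0.0)))  # only a component along the easy axis
```

With a radial easy axis, the anisotropic response has no tangential component even where the magnetizing field is oblique, which matches the purely radial field described for the anisotropic model.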
Compared to the calculated magnetization fields in the last section, the ideal
field is defined as a purely radial and homogeneous magnetic field with constant
magnitude. The magnetic fields of these three cases are shown in Fig. 7.
Results of the cogging torque analysis for the three cases mentioned above are
shown in Fig. 8.
From the comparison, it can be seen that the difference among the curves for the
different magnet settings is slight. The curve of the anisotropic magnet is closer to
the curve under ideal conditions because the two magnetic fields are similar:
both the anisotropic field and the ideal field have only a radial component of the
field vector, and they differ only in the magnitude in the transition
zone.
For a detailed analysis, the curves can be transformed into the frequency domain by
using FFT analysis. The result is depicted in Fig. 9.
From this figure, it can be seen that the differences appear mainly in the 24th
and 48th harmonics, reaching up to 25% in the amplitude of order 48.
Compared to the torque curves in the time domain, the difference in the frequency domain
is much more prominent. The 24th harmonic is the main order of the torque curve.
It is caused by the interaction between the first harmonic of the magnetic field in the
air gap and the stator teeth.
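The harmonic orders discussed above can be reproduced on synthetic data. The sketch below assumes a made-up torque signal sampled uniformly over one mechanical revolution (amplitudes 10 and 2 in arbitrary units are illustrative, not simulation results) and uses a plain discrete Fourier sum instead of an FFT library; the main order follows from the least common multiple of the slot and pole numbers.

```python
import cmath
import math


def harmonic_amplitude(samples, order):
    """Amplitude of one harmonic order from uniform samples over one full period."""
    n = len(samples)
    coeff = sum(s * cmath.exp(-2j * math.pi * order * i / n)
                for i, s in enumerate(samples)) / n
    return 2.0 * abs(coeff)


SLOTS, POLES = 12, 8
MAIN_ORDER = SLOTS * POLES // math.gcd(SLOTS, POLES)  # least common multiple

# Synthetic torque over one mechanical revolution (made-up amplitudes).
N_SAMPLES = 360
torque = [
    10.0 * math.sin(MAIN_ORDER * 2.0 * math.pi * i / N_SAMPLES)
    + 2.0 * math.sin(2 * MAIN_ORDER * 2.0 * math.pi * i / N_SAMPLES)
    for i in range(N_SAMPLES)
]

for order in (12, MAIN_ORDER, 2 * MAIN_ORDER):
    print(order, harmonic_amplitude(torque, order))
```

For the 12/8 topology the main order evaluates to 24, consistent with the 24th and 48th harmonics named in the text; comparing the amplitudes of such orders between magnet models is what makes the differences more visible than in the torque curves themselves.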
Consequently, using the magnetic field from the magnetization analysis can
improve the accuracy of the motor optimization, because the cogging torque
fundamental and harmonics are mainly responsible for the coast down noise.
Fig. 7 3D and 2D figures of the remanence field of all three cases. (a) 3D Field of the isotropic
magnet. (b) 2D Field of the isotropic magnet (Z=0, 0 ≤ ϕ ≤ 90◦ ). (c) 3D Field of the anisotropic
magnet. (d) 2D Field of the anisotropic magnet (Z=0, 0 ≤ ϕ ≤ 90◦ ). (e) 3D Field of the ideal
magnetization. (f) 2D Field of the ideal magnetization (Z=0, 0 ≤ ϕ ≤ 90◦ )
6 Summary
The calculation of the magnetization of the permanent magnet poles of an outer rotor
BLDC motor results in a more realistic field distribution in the motor simulation
and therefore allows a more accurate prediction of the cogging torque of the motor.
This supports the overall design and optimization process of these machines. For
proprietary reasons, the method is described in this paper using a ring magnet
motor design. With real motor samples, an improved agreement of prediction (3D
field simulation) and measurement was found. It has to be pointed out that the
ferrites used are usually manufactured to be anisotropic, i.e. to have a main axis
and a lateral axis that respond differently to a magnetizing field. However, due to
material imperfections and variances in the manufacturing process, the material is
not perfectly anisotropic but has isotropic regions as well.
Therefore, to further improve the accuracy of the cogging torque analysis, an
improved material definition that accounts for anisotropic as well as isotropic
material properties needs to be developed.
References
1. I. Coenen, M. van der Giet, K. Hameyer, Manufacturing tolerances: estimation and prediction of
cogging torque influenced by magnetization faults. IEEE Trans. Magn. 48(5), 1932–1936 (2012)
2. M. Flankl, A. Tüysüz, J.W. Kolar, Cogging torque shape optimization of an integrated generator
for electromechanical energy harvesting. IEEE Trans. Ind. Electron. 64(12), 9806–9814 (2017)
3. C. Breton, J. Bartolome, J.A. Benito, G. Tassinario, I. Flotats, C.W. Lu, B.J. Chalmers, Influence
of machine symmetry on reduction of cogging torque in permanent-magnet brushless motors.
IEEE Trans. Magn. 36(5), 3819–3823 (2000)
4. Z.Q. Zhu, D. Howe, C.C. Chan, Improved analytical model for predicting the magnetic field
distribution in brushless permanent-magnet machines. IEEE Trans. Magn. 38(1), 229–238 (2002)
5. J.R. Hendershot, T.J.E. Miller, Design of Brushless Permanent-Magnet Machines (Motor Design
Books, Venice, 2010). ISBN 0-98406-870-8, 9-780-98406-870-8
6. ANSYS Maxwell support: Using the Hysteresis Model-Based Magnetization Approach
7. ANSYS Maxwell support: Compute Remanent Br from B-H curve
Part IV
Mathematical and Computational Methods
A Combination of Model Order
Reduction and Multirate Techniques
for Coupled Dynamical Systems
Abstract Coupled dynamical systems are often encountered in the field of circuit
simulation. To drastically reduce the simulation cost of these systems a coupling
of model order reduction and multirate techniques is applied. The subject of this
method is a nonlinear coupled thermal-electrical system. By applying a combination
of the slowest first multirate technique with the nonlinear proper orthogonal
decomposition model order reduction the system is solved. Results yield a decrease
in simulation time whilst maintaining accuracy.
1 Introduction
M. W. F. M. Bannenberg ()
Bergische Universität Wuppertal, Wuppertal, Germany
STMicroelectronics, Catania, Italy
A. Ciccazzo
STMicroelectronics, Catania, Italy
M. Günther
Bergische Universität Wuppertal, Wuppertal, Germany
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 191
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_19
192 M. W. F. M. Bannenberg et al.
Equations (PDAEs), where DAEs and partial differential equations, describing the
spatially distributed elements and effects, are coupled via source terms or boundary
conditions. Both physical and structural characteristics of these PDAEs can be
exploited to increase the simulation efficiency, for instance by applying techniques
such as Multirate (MR) time integration and Model Order Reduction (MOR), as
will be presented in the following sections. Circuit simulation has been a driving
force for the application of MOR and MR techniques, see for instance [3, 10].
Much less attention has been given to the combination of these two techniques [13],
and only with respect to linear model order reduction. In this paper a twofold
approach is presented in which the PDAEs are integrated using MR time integration
and parts of the system are reduced. This is done to increase the computational
efficiency whilst maintaining accuracy. In Sect. 2, the mathematical methodology
is formulated for the circuit simulation and the multirate and MOR techniques
are described. Section 3 presents the experimental setup and the numerical results
obtained from the implementations of the previous two sections. Conclusions are
drawn and an outlook is given in Sect. 4.
2 Methodology
In this section the different mathematical concepts and techniques that are needed
for the simulation of electronic circuits are presented. Although most equations are
purposely stated in their most general form, some of them will be restricted by
assumptions with the specific combination of MR and MOR in mind.
The Modified Nodal Analysis (MNA) approach for modelling electronic circuits
yields time-dependent systems of DAEs,
$$
\begin{aligned}
A_C \frac{d}{dt} q + A_R\, r(A_R^T e) + A_L i_L + A_V i_V + A_I\, i(t) &= 0, && (1) \\
\frac{d}{dt} \phi - A_L^T e &= 0, && (2) \\
v(t) - A_V^T e &= 0, && (3) \\
q - q_C(A_C^T e) &= 0, && (4) \\
\phi - \phi_L(i_L) &= 0. && (5)
\end{aligned}
$$
Combined MR-MOR for Coupled Systems 193
Here e, i_L and i_V are the node voltages and the branch currents through inductors and
voltage sources, and q, φ are the charges and fluxes. The functions r, q_C and φ_L are
predetermined. Independent current sources i_I and voltage sources v_V may appear.
The incidence matrices A_C, A_L, A_R, A_V, A_I follow from the topology of the circuit.
This system can be written in the general semi-explicit DAE form [7, 12]:
$$
f : \mathbb{R}^n \times \mathbb{R}^m \times I \to \mathbb{R}^n, \qquad g : \mathbb{R}^n \times \mathbb{R}^m \times I \to \mathbb{R}^m. \tag{6}
$$
$$
\begin{aligned}
\dot{y} &= f(y, z, t), & y(0) &= y_0, && (7) \\
0 &= g(y, z, t), & z(0) &= z_0. && (8)
\end{aligned}
$$
$$
L : D \times I \times V \to \mathbb{R}^m, \qquad L(x, t, u) = 0. \tag{9}
$$
$$
\begin{aligned}
\dot{y} &= f(y, z, u, t), & y(0) &= y_0, && (10) \\
0 &= g(y, z, u, t), & z(0) &= z_0, && (11) \\
\dot{u} &= h(y, z, u, t), & u(t_0) &= u_0. && (12)
\end{aligned}
$$
2.2 Multirate
$$
\begin{aligned}
\dot{x}_F &= f_F(x_F, z_F, x_S), & x_F(0) &= x_{F,0}, && (13) \\
\dot{x}_S &= f_S(x_F, z_F, x_S), & x_S(0) &= x_{S,0}, && (14) \\
0 &= g_F(x_F, z_F, x_S), & z_F(0) &= z_{F,0}. && (15)
\end{aligned}
$$
With differential variables $x_F \in \mathbb{R}^{n_F}$, $x_S \in \mathbb{R}^{n_S}$ and algebraic variables $z_F \in \mathbb{R}^{n_Z}$,
subscripts {F, S} indicating fast or slow dynamics, for $t \in [t_0, t_1]$ with consistent
$$
\frac{\partial g_F}{\partial z_F}(x_F, z_F, x_S) \ \text{is invertible} \tag{16}
$$
With l = 0, . . . , m − 1 for the micro grid and the coupling variables denoted by $\bar{x}_F$,
$\bar{z}_F$, $\bar{x}_S$. The coupling strategy is chosen to be the Coupled-Slowest-First approach as
this is shown to have a consistency of order 1 for the problem posed in [14]. First
the whole system is solved for the macro-step.
$$
x^*_{F,n+1} = x_{F,n} + H f_F(x^*_{F,n+1}, z^*_{F,n+1}, x_{S,n+1}), \tag{20}
$$
Here the step size H is chosen according to the slow dynamics. From this it follows
that the fast solutions, $x^*_{F,n+1}$ and $z^*_{F,n+1}$, are not accurate and are discarded. In
the subsequent micro-step integration the fast solutions are computed for l = 0, . . . , m − 1,
using linearly interpolated values for the slow variables.
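The Coupled-Slowest-First procedure can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: it assumes a linear system of one fast and one slow ODE without algebraic part (the coefficients `a_ff`, `a_fs`, `a_sf`, `a_ss` and the function name are hypothetical), so that every implicit Euler step can be solved in closed form.

```python
import math


def multirate_implicit_euler(a_ff, a_fs, a_sf, a_ss, x_f, x_s, H, n_macro, m):
    """Coupled-Slowest-First implicit Euler for
         x_f' = a_ff*x_f + a_fs*x_s   (fast),
         x_s' = a_sf*x_f + a_ss*x_s   (slow)."""
    h = H / m  # micro step size
    for _ in range(n_macro):
        # Macro-step: implicit Euler on the whole system, (I - H*A) x_new = x_old.
        d11, d12 = 1.0 - H * a_ff, -H * a_fs
        d21, d22 = -H * a_sf, 1.0 - H * a_ss
        det = d11 * d22 - d12 * d21
        x_s_new = (d11 * x_s - d21 * x_f) / det  # keep the slow part only
        # The fast part of the macro-step is discarded and recomputed below.
        for l in range(m):
            # Linearly interpolated slow variable at the new micro time point.
            x_s_bar = x_s + (l + 1) / m * (x_s_new - x_s)
            # Implicit Euler micro-step for the fast variable.
            x_f = (x_f + h * a_fs * x_s_bar) / (1.0 - h * a_ff)
        x_s = x_s_new
    return x_f, x_s


# Toy test case with known solution: x_s(t) = exp(-t), x_f(t) = exp(-t)/49.
x_f_end, x_s_end = multirate_implicit_euler(
    a_ff=-50.0, a_fs=1.0, a_sf=0.0, a_ss=-1.0,
    x_f=1.0 / 49.0, x_s=1.0, H=0.01, n_macro=100, m=5)
print(x_f_end, math.exp(-1.0) / 49.0)
print(x_s_end, math.exp(-1.0))
```

Because the slow values are fixed before the micro-steps start, the interpolated coupling terms do not change during the solution of the fast subsystem, which is what keeps each micro-step cheap.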
Applying a spatial discretization to the PDE can result in large nonlinear ODE
systems. To reduce the computational effort needed in each time step to solve
this system, MOR techniques are used. Due to the nonlinearity of the ODE, most
conventional MOR techniques can be discarded as they are only applicable to linear
systems. Hence the chosen method for this system is a reduction by a Galerkin
projection, with a basis constructed by Proper Orthogonal Decomposition (POD)
[4]. This is then extended by the application of the Discrete Empirical Interpolation
Method (DEIM) [5], using a QR selection procedure (Q-DEIM) [6]. By using
a Galerkin projection a reduced model is constructed [6]. Let $\mathcal{V}_r$ denote an r-
dimensional subspace spanned by the columns of $V \in \mathbb{R}^{n_S \times r}$. The full state of
the slow subsystem $x_S$ is then approximated by $x_S \approx V x_{S,r}$ using the model reduction
basis V. The reduced model of (13)–(15) is then defined by
$$
\begin{aligned}
\dot{x}_F &= f_F(x_F, z_F, V x_{S,r}), & x_F(0) &= x_{F,0}, && (23) \\
\dot{x}_{S,r} &= f_{S,r}(x_F, z_F, x_{S,r}), & x_{S,r}(0) &= x_{S,r,0}, && (24) \\
0 &= g_F(x_F, z_F, V x_{S,r}), & z_F(0) &= z_{F,0}. && (25)
\end{aligned}
$$
With $f_{S,r}(x_F, z_F, x_{S,r}) = V^T f_S(x_F, z_F, V x_{S,r})$. The reduced basis V is
constructed through POD. First a numerical simulation of the full system is
performed. From the numerical results of this simulation snapshots $x_1, \dots, x_i, \dots, x_{N_S}$
are obtained, with $x_i = x(t_i) \in \mathbb{R}^{n_S}$ for $i = 1, \dots, N_S$. Then the POD snapshot
matrix is
$$
X = Z \Sigma Y^T, \tag{27}
$$
$$
f_{S,r}(x_{S,r}) \approx V^T U (S^T U)^{-1} S^T f_S(V x_{S,r}). \tag{29}
$$
Using the interpolation of general nonlinear functions, outlined in Sect. 3.5 of [5],
a general nonlinear function can be represented as
$$
F(y) = [F_i(y)]_i, \qquad F_i(y) = F_i(y_{j_1^i}, y_{j_2^i}, \dots, y_{j_{n_i}^i}) = F_i(y(j^i)), \tag{30}
$$
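The interpolation $U (S^T U)^{-1} S^T f$ used in the DEIM approximation can be sketched in a few lines. The example below is a simplified illustration in plain Python: it uses the greedy index selection of the original DEIM [5] rather than the QR-based Q-DEIM selection [6], the basis vectors are made-up and not orthonormal, and the helper names (`solve`, `deim_indices`, `deim_approx`) are hypothetical.

```python
def solve(A, b):
    """Solve a small dense system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            fac = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= fac * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def deim_indices(U):
    """Greedy DEIM point selection; U is a list of g basis vectors of length n."""
    n = len(U[0])
    idx = [max(range(n), key=lambda i: abs(U[0][i]))]
    for k in range(1, len(U)):
        # Interpolate U[k] at the current points with the previous basis vectors ...
        A = [[U[j][i] for j in range(k)] for i in idx]
        c = solve(A, [U[k][i] for i in idx])
        # ... and pick the entry where the interpolation residual is largest.
        r = [U[k][i] - sum(c[j] * U[j][i] for j in range(k)) for i in range(n)]
        idx.append(max(range(n), key=lambda i: abs(r[i])))
    return idx


def deim_approx(U, idx, f_at_idx):
    """Evaluate U (S^T U)^{-1} S^T f from the sampled entries f_at_idx only."""
    g = len(U)
    A = [[U[j][i] for j in range(g)] for i in idx]  # S^T U
    c = solve(A, f_at_idx)
    return [sum(c[j] * U[j][i] for j in range(g)) for i in range(len(U[0]))]


# Made-up basis and a function vector lying in its span.
U = [[1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 2.0, 3.0, 4.0]]
f_full = [2.0 * U[0][i] + 3.0 * U[1][i] for i in range(5)]
idx = deim_indices(U)
f_deim = deim_approx(U, idx, [f_full[i] for i in idx])
print(idx, f_deim)
```

Only the entries of the nonlinear function at the selected indices are evaluated, which is where the saving comes from once the function can be evaluated componentwise as in (30); a function in the span of U is reproduced exactly, otherwise the result is an interpolation.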
To maximise the effectiveness of the MR and MOR combination the following steps
are taken:
• Perform a benchmark simulation using a very large number of time steps to
obtain a very accurate snapshot matrix X.
• The reduced bases V and U are then constructed by taking the appropriate
columns of Z obtained through POD, and selection matrix S is constructed by
the Q-DEIM approach.
• Using the reduced bases, the reduced order system is integrated through time
using the Coupled-Slowest-First MR approach.
Computationally, this is done by first using the scheme of (17)–(19)
with fS replaced by fS,r, as in (23)–(25), and then incorporating the Coupled-
Slowest-First approach. The coupling for the fast intermediate time steps is done by
using linearly interpolated values. As these values do not change during the Newton
iteration for the faster subsystem, computation time can be saved. By
computing the coupling values once for the first and last value and interpolating
between these values, expensive function evaluations of the lifted state vector can be
avoided.
3 Results
To test the accuracy and convergence of the MR-MOR integration scheme the
system is simulated in three settings:
• The full system with singlerate time integration.
• The full system with multirate time integration.
• The system with a reduced slow part and multirate time integration.
This is done with m = 5 intermediate micro-steps. Furthermore, the POD-
QDEIM reduction orders r and g are chosen to be equal to the number of
singular values with $\sigma_i > 10^{-15}$. The step sizes are obtained by integrating
the system with N_t = [8 16 32 64 128 256 512 1024]. For the simulation, a
thermodynamic discretisation with N = 101 is chosen. The circuit is simulated
over the time interval from t_0 = 0 to t_N = 0.01125 seconds. The input signal v(t)
is set to $\sin\!\left(\frac{\pi t}{2.5 \times 10^{-3}}\right)$ mV. The rest of the circuit and thermal settings are set to the
values described in [2]. The reference solution is obtained from an SR integration
with N = 32,000.
In Fig. 1 we see the difference between the reference solution and the simulated
solution in the final time step. This is done for the output node u3 of the thermal-
electrical circuit. It clearly shows that the MR scheme outperforms the SR scheme, as
for the same order of error the MR approach has a shorter computation time.
[Figure 1. SR, MOR, MR-MOR comparison. Vertical axis: error in output (10^-14 to 10^-11). Horizontal axis: computation time (10^-2 to 10^1). Curves: SR, MR, MR-MOR.]
4 Conclusion
The numerical results show that the multirate implicit Euler scheme
combined with POD/Q-DEIM model order reduction yields an accurate solution
with a reduced computation time. The approximation errors seem to converge along
with the MR errors. An expected positive result is that for similar computation
times the application of MR improves the accuracy of the solution. Furthermore,
the additional application of POD/Q-DEIM reduction has a negligible impact on the
approximation error whilst reducing the computation time even further. Although
these results are positive, a side note should be made. The reduction in computation
time achieved by the POD/Q-DEIM reduction appears to decrease for smaller time steps
with much larger systems. This is likely due to the coupling structure of the test
problem, but further investigation is needed. Other next steps will focus on numerical
analysis for a proof of convergence for the MR-MOR scheme and the extension to
general integration schemes. Besides the further investigation of DAE-ODE coupled
systems, first steps have been made towards an MR-MOR scheme for a DAE-DAE
coupled system.
Acknowledgments The authors are indebted to the funding given by the European Union’s
Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant
Agreement No. 765374.
References
1. E. Anderson et al., LAPACK Users’ Guide, vol. 9 (SIAM, New York, 1999)
2. A. Bartel, M. Günther, From SOI to abstract electric-thermal-1D multiscale modeling for first
order thermal effects. Math. Comput. Modell. Dynam. Syst. 9(1), 25–44 (2003)
3. P. Benner, M. Hinze, E.J.W. Ter Maten (eds.) Model Reduction for Circuit Simulation
(Springer, Berlin, 2011)
4. G. Berkooz, P. Holmes, J.L. Lumley, The proper orthogonal decomposition in the analysis of
turbulent flows. Annu. Rev. Fluid Mech. 25(1), 539–575 (1993)
5. S. Chaturantabut, D.C. Sorensen, Nonlinear model reduction via discrete empirical interpola-
tion. SIAM J. Sci. Comput. 32(5), 2737–2764 (2010)
6. Z. Drmac, S. Gugercin, A new selection operator for the discrete empirical interpolation
method—improved a priori error bound and extensions. SIAM J. Sci. Comput. 38(2), A631–
A648 (2016)
7. D. Estévez Schwarz, C. Tischendorf, Structural analysis of electric circuits and consequences
for MNA. Int. J. Circ. Theory Appl. 28(2), 131–162 (2000)
8. M. Galassi, GNU scientific library (2002). http://www.gnu.org/software/gsl/
Idoia Cortes Garcia, Jonas Pade, Sebastian Schöps, and Caren Tischendorf
Abstract Motivated by the task to design quench protection systems for super-
conducting magnets in particle accelerators we address a coupled field/circuit
simulation based on a magneto-quasistatic field modeling. We investigate how a
waveform relaxation of Gauß-Seidel type performs for a coupled simulation when
circuit solving packages are used that describe the circuit by the modified nodal
analysis. We present sufficient convergence criteria for the coupled simulation of
FEM discretised field models and circuit models formed by a differential-algebraic
equation (DAE) system of index 2. In particular, we demonstrate by a simple
benchmark system the drastic influence of the circuit topology on the convergence
behavior of the coupled simulation.
1 Introduction
Lumped circuit models, such as modified nodal analysis (MNA), are well-
established in electrical engineering. However, they neglect the spatial dimension
and therefore distributed phenomena like the skin effect. For certain devices, this
may lead to inaccuracies of unacceptable magnitude in the simulation, e.g. for
electric machines [14] or the quench protection system of superconducting magnets
in particle accelerators [1]. These cases call for field/circuit coupling [2, 16].
To solve such coupled systems, it is often advisable to use waveform relaxation
(WR) [7], since this iterative method allows for dedicated step sizes and suitable
solvers for the different subsystems, and even for the use of proprietary blackbox
solvers. The coupled field/circuit model considered here is a DAE in the time
domain after space discretisation of the field system. It is well-known that WR
can suffer from instabilities for DAEs unless an additional contraction criterion is
satisfied [7, 12]. This work presents coupled field/circuit models, which are DAEs
of index 2 [5], for the case where WR is convergent and the case where it diverges.
Furthermore, generalizing a convergence criterion of [12], a topological and easy-to-check
criterion is provided. Finally, we present numerical simulations verifying
the topological convergence criterion.
I. C. Garcia · S. Schöps
Technical University of Darmstadt, CEM Group, Darmstadt, Germany
J. Pade · C. Tischendorf
Department of Mathematics, Humboldt University of Berlin, Berlin, Germany
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_20
2 Field/Circuit Model
The first Eq. (1) represents the space-discrete field model based on the matrices
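\[ (M)_{ij} = \int_\Omega \sigma\, \omega_i \cdot \omega_j \, dV, \qquad (K(a))_{ij} = \int_\Omega \nu(a)\, \nabla \times \omega_i \cdot \nabla \times \omega_j \, dV, \tag{3} \]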
which follow from the Ritz-Galerkin approach using a finite set of Nédélec basis
functions ωi [10] defined on the domain Ω; σ denotes the space-dependent
electric conductivity and ν(a) the magnetic reluctivity that can additionally depend
nonlinearly on the unknown magnetic vector potential a. The current through the
field device is described by im . The excitation matrix is computed from a winding
density function χj modelling the j -th stranded conductor [15] as
\[ (X)_{ij} = \int_\Omega \chi_j \cdot \omega_i \, dV. \tag{4} \]
\[ M \dot{a}^k + K(a^k)\, a^k - X i_m^k = 0, \qquad X^\top \dot{a}^k = v_c^{k-1}, \tag{6} \]
\[ E(x^k)\, \dot{x}^k + f(t, x^k) = P i_m^k, \qquad P^\top x^k - v_c^k = 0. \tag{7} \]
The coupling variables are the current through and the voltage over the field device,
i_m and v_c, where i_m^k is computed in (6) and is then given to (7) as input, and vice
versa for v_c^k. The superscript k denotes the iteration index. A common choice for the
initial guess v_c^0 is constant extrapolation of the initial value.
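The structure of a Gauß-Seidel waveform relaxation can be sketched on a toy problem. The example below is an assumption-laden illustration, not the field/circuit system (6)–(7): it couples two scalar ODEs u' = −2u + v and v' = −2v + u, discretises both with the implicit Euler method on the same grid, and starts from a constant extrapolation of the initial value. All function names are hypothetical.

```python
def wr_gauss_seidel(n_steps=100, t_end=1.0, sweeps=30):
    """Gauss-Seidel WR for u' = -2u + v, v' = -2v + u (toy problem)."""
    h = t_end / n_steps
    u0, v0 = 1.0, 0.0
    # Initial guess: constant extrapolation of the initial value.
    v_old = [v0] * (n_steps + 1)
    u, v = [u0], [v0]
    for _ in range(sweeps):
        # Subsystem 1 uses the waveform v^{k-1} of the previous sweep.
        u = [u0]
        for n in range(n_steps):
            u.append((u[n] + h * v_old[n + 1]) / (1.0 + 2.0 * h))
        # Subsystem 2 uses the freshly computed waveform u^k.
        v = [v0]
        for n in range(n_steps):
            v.append((v[n] + h * u[n + 1]) / (1.0 + 2.0 * h))
        v_old = v
    return u, v


def monolithic(n_steps=100, t_end=1.0):
    """Implicit Euler applied to the coupled toy system as a whole."""
    h = t_end / n_steps
    a = 1.0 + 2.0 * h
    det = a * a - h * h
    u, v = [1.0], [0.0]
    for n in range(n_steps):
        # Solve [[a, -h], [-h, a]] (u1, v1) = (u[n], v[n]) by Cramer's rule.
        u.append((a * u[n] + h * v[n]) / det)
        v.append((h * u[n] + a * v[n]) / det)
    return u, v


u_wr, v_wr = wr_gauss_seidel()
u_mono, v_mono = monolithic()
max_diff = max(abs(a - b) for a, b in zip(u_wr + v_wr, u_mono + v_mono))
```

For this ODE toy problem the iteration always converges on a bounded interval; the point of the present work is that for DAEs of index 2 this is no longer guaranteed without an additional criterion.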
We shall proceed as follows:
1. Lemmata 4 and 6 provide a DAE-decoupling of the EM field DAE (1) and the
MNA DAE (2), respectively.
2. Definition 5 introduces the concept of parallel CVR paths. Assuming their
existence and exploiting the previous decoupling Lemmata, Lemma 7 yields a
DAE-decoupling of the coupled WR iteration (6)–(7). Notably, it reveals the
structure of its inherent ODE, given by φ in Eq. (11).
3. The convergence Theorem 8 is a simple consequence of the previous Lemmata;
it shows that the existence of parallel CVR paths guarantees convergence of the
WR scheme (6)–(7).
For visual reasons, we shall write column vectors as (a, b, c).
Lemma 4 Let Assumption 2 hold. Then, for a given source term v_c, there exists a
coordinate transformation (w, u) = T^{-1} a and a system of the form
\[ \dot{u} + A_1 u = A_2 v_c, \qquad w = B u, \qquad i_m = G_1 u + G_2 v_c \tag{8} \]
such that (a, i_m) solves Eq. (1) if and only if (u, w, i_m) solves Eq. (8).
Proof For better readability and shortness we present the proof only for the slightly
more restrictive case where X M = 0, which is usually satisfied.
We equivalently transform the field DAE with new coordinates T α = a:
With α = (w, u) and u = (u1 , u2 ), the transformed DAE (9) has the detailed
form
Tker K(T α)Tker w + Tker K(T α)(X T⊥ )u = 0,
X X u̇1 = vc .
The underlined matrices are nonsingular due to Assumption 2, and Eq. (8) is
obtained by inversion and insertion.
Definition 5 A CVR path in a circuit is a path which consists of only capacitances,
voltage sources and resistances. An element has a parallel CVR path if its incident
nodes are connected by a CVR path.
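Definition 5 is a purely topological condition and can be checked by a graph search. The following sketch assumes a hypothetical branch list format `(name, kind, node_a, node_b)`; the two example circuits are illustrative only and are not the circuits of Fig. 2.

```python
from collections import deque


def has_parallel_cvr_path(branches, element):
    """Check whether `element` has a parallel CVR path.

    branches: list of (name, kind, node_a, node_b) with kind in
              {"C", "V", "R", "L", "I", "EM"} (hypothetical format).
    """
    ends = [(a, b) for name, _, a, b in branches if name == element]
    if not ends:
        raise ValueError("unknown element: " + element)
    start, goal = ends[0]
    # Build the subgraph that contains only C, V and R branches.
    adjacency = {}
    for name, kind, a, b in branches:
        if name != element and kind in ("C", "V", "R"):
            adjacency.setdefault(a, []).append(b)
            adjacency.setdefault(b, []).append(a)
    # Breadth-first search from one incident node to the other.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical circuit with a capacitance in parallel to the field element.
circuit_a = [("EM", "EM", "n1", "n0"), ("L1", "L", "n1", "n2"),
             ("G1", "R", "n2", "n0"), ("C1", "C", "n1", "n0")]
# Hypothetical circuit where only an inductance and a current source are parallel.
circuit_b = [("EM", "EM", "n1", "n0"), ("L1", "L", "n1", "n0"),
             ("I1", "I", "n1", "n0")]
result_a = has_parallel_cvr_path(circuit_a, "EM")
result_b = has_parallel_cvr_path(circuit_b, "EM")
```

In the first circuit the capacitance alone forms the CVR path; in the second circuit every path between the terminals of the field element contains an inductance or a current source.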
Lemma 6 Let Assumption 3 hold. Then, for a given source term im , there exists a
coordinate transformation (y, z1 , z2 ) = T −1 x and a system of the form
Proof We apply Lemmata 4, 6 to the iterated subsystems (6), (7). This yields an
equivalent system
Since each field element has a parallel CVR path, z_2^k = g(t) does not depend on u^k
anymore.
We insert v_c^{k-1} = P^\top x^{k-1} = P^\top T (y^{k-1}, z_1^{k-1}, z_2^{k-1}) and z_1^{k-1} and z_2^{k-1} therein
to obtain, with \tilde{g}_1(t, y^{k-1}, u^{k-1}) = g_1(t, y^{k-1}, g_2(t), \dot{g}_2(t), u^{k-1}),
\[ \dot{u}^k = \varphi_2(t, u^k, y^k, u^{k-1}, y^{k-1}) := -A_1 u^k + A_2 P^\top T \big(y^{k-1}, \tilde{g}_1(t, y^{k-1}, u^{k-1}), g_2(t)\big). \]
Remark 9 The convergence result holds for arbitrary continuous initial guesses x 0
and for bounded intervals of arbitrary size, see e.g. [7, 11].
Remark 10 The MNA decoupling given in Lemma 6 shows that g1 depends on
z2 and the derivative ż2 . Hence, the system is most sensitive to perturbations of z2 .
The input of the EM field subsystem in the WR scheme is in fact a perturbation.
Therefore, the condition QP = 0 from Lemma 6 is crucial to derive Theorem 8.
If at least one EM field element has no parallel CVR path, then QP ≠ 0. In that case,
analogously to Lemma 7 and its proof, we find ṡ^k = φ(t, s^k, s^{k−1}, ṡ^{k−1}), which is
guaranteed to converge only if φ is contractive in ṡ^{k−1}, see [7, 11].
4 Numerical Examples
Fig. 2 Field/circuit coupling with model from Fig. 1 (CVR path is dashed). (a) Convergent case.
(b) Divergent case
Fig. 3 Monolithic (“mon”) and WR solution for k = 1, 2 iterations (potential e in V). (a) Convergent case. (b)
Divergent case
5 Conclusions
Acknowledgments This work is supported by the ‘Excellence Initiative’ of the German Federal
and State Governments, the Graduate School of CE at TU Darmstadt and DFG grant SCHO1562/1-
2. Further, we acknowledge financial support under BMWi grant 0324019E and by DFG under
Germany’s Excellence Strategy – The Berlin Mathematics Research Center MATH+ (EXC-2046/1,
ID 390685689).
References
1 Introduction
The idea of operator splitting methods is based on the splitting of a complex problem
into a sequence of simpler sub-problems. Usually, one exploits some structural
properties of the separated operators belonging to the sub-problems, for example,
the linear behavior, the symmetric behavior or the stiff behavior that allows the
application of efficient integration methods to the sub-problems, see for instance
[8, 10–12]. For dynamical problems like ODEs or parabolic PDEs, additive operator
splitting methods are well established and appropriate. However, for constrained problems an
additive operator splitting method would usually fail. This becomes obvious when
comparing the simple problems
\[ u' = Au = A_1 u + A_2 u \qquad \text{and} \qquad Ax = A_1 x + A_2 x = b. \]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_21
2 Circuit Modeling
In contrast to standard circuit modeling using the modified nodal analysis [9] we
consider the branch oriented loop-cutset modeling [3, 4]. It allows us to split the
operators in a natural way exploiting physical properties.
For a given circuit graph G with n nodes and b branches, select any tree and
remove all links. Then put back one link at a time; each link closes exactly one loop, which is
called a fundamental loop. We choose the orientation of the loop to coincide with that
of the link completing it. A fundamental cutset with respect to
a tree, on the other hand, is a cutset formed by one tree branch and otherwise only links. The orientation
of a cutset is the same as that of its tree branch.
Definition 1 The fundamental loop matrix B ∈ ℝ^{(b−(n−1))×b} is defined by its entries
\[ b_{ij} = \begin{cases} 1, & \text{if the branch } j \text{ has the same orientation as the fundamental loop } i, \\ -1, & \text{if the branch } j \text{ has the opposite orientation to the fundamental loop } i, \\ 0, & \text{else.} \end{cases} \]
Lemma 1 (Loop Equations, KVL [3]) Let v be the vector of branch voltages in
an electric network, then we have
\[ Bv = 0. \tag{1} \]
In general, the matrix B is arranged such that the first columns correspond to the
links and the remaining columns correspond to the tree branches, therefore
\[ B = \begin{pmatrix} B_l & B_t \end{pmatrix} = \begin{pmatrix} I & B_t \end{pmatrix}. \]
Lemma 2 (Cut-set Equations, KCL [3]) Let i be the vector of branch currents in
an electric network, then we have
\[ Qi = 0 \tag{2} \]
and, similar to the column re-arrangement of B, we get Q = (Q_l  Q_t) = (Q_l  I).
Theorem 1 (Orthogonality Relation [3]) For a given connected graph G, the
orthogonality relation between the fundamental loop matrix B and the fundamental
cutset matrix Q is given by B Q^⊤ = 0.
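With the column ordering B = (I  B_t) and Q = (Q_l  I), the orthogonality relation reads Q_l^⊤ + B_t = 0, i.e. Q_l = −B_t^⊤. The sketch below checks this numerically; the entries of `B_t` are a hypothetical example for a graph with two links and three tree branches and are not taken from the paper.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def hstack(a, b):
    return [row_a + row_b for row_a, row_b in zip(a, b)]


# Hypothetical tree part of the loop matrix (2 links, 3 tree branches).
B_t = [[1, -1, 0],
       [0, 1, 1]]
B = hstack(identity(2), B_t)                         # B = (I  B_t)
Q_l = [[-x for x in row] for row in transpose(B_t)]  # Q_l = -B_t^T
Q = hstack(Q_l, identity(3))                         # Q = (Q_l  I)
product = matmul(B, transpose(Q))                    # should be the zero matrix
```

In practice this means that only B_t has to be stored: the cutset matrix follows from it.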
The circuit equations consist of the loop equations (1) and the cutset equations (2),
reflecting Kirchhoff’s laws, together with the constitutive equations of the elements.
For simplicity, we consider only RLC circuits since our focus is to demonstrate the
new splitting approach. We assume that all resistances, conductances, capacitances
and inductances show a globally passive behavior, i.e. their corresponding matrices
R, G, C and L are positive definite. In addition, the independent functions vs and
is for voltage and current sources are assumed to be continuously differentiable.
Notice that in our approach we use the conductive description for all resistances
that belong to the tree and the resistive description for all resistances that do not
belong to the tree, see below.
An index-1 circuit DAE models a circuit network that has neither an LI-cutset
nor a CV-loop, see [5]. Then we can construct a tree as follows [14]:
1. All capacitive elements and voltage sources belong to the tree.
2. All inductive elements and current sources do not belong to the tree.
3. Split resistors in such a way that all G-resistances belong to the tree and all R-
resistances do not belong to the tree.
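\[ D = \begin{pmatrix} L & 0 \\ 0 & C \end{pmatrix}, \quad J = \begin{pmatrix} 0 & B_{LC} \\ Q_{CL} & 0 \end{pmatrix}, \quad M = \begin{pmatrix} 0 & B_{LG} \\ Q_{CR} & 0 \end{pmatrix}, \quad S = \begin{pmatrix} R & B_{RG} \\ Q_{GR} & G \end{pmatrix} \]
and
\[ K_x = \begin{pmatrix} 0 & B_{IC} \\ Q_{VL} & 0 \end{pmatrix}, \quad K_y = \begin{pmatrix} 0 & B_{IG} \\ Q_{VR} & 0 \end{pmatrix}, \quad r_x = -\begin{pmatrix} B_{LV} v_s \\ Q_{CI} i_s \end{pmatrix}, \quad r_y = -\begin{pmatrix} B_{RV} v_s \\ Q_{GI} i_s \end{pmatrix}, \quad r_z = -\begin{pmatrix} B_{IV} v_s \\ Q_{VI} i_s \end{pmatrix}. \]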
with the positive definite diagonal matrix S_1 and the skew-symmetric matrix S_2
since B_RG = −Q_GR^⊤. Consequently, S is not symmetric (unless B_RG = 0) but
positive definite and hence non-singular. Furthermore, we see that system (4) is a
port-Hamiltonian DAE in the sense of the definitions given in [6] and [13]. For
[13], one can choose x̃ = (x, y), z̃(x̃) = x̃, ỹ := −z and ũ = (is , vs ) where the
tilde notation refers to the variables in [13]. For [6], one can choose the space V =
{(x, y, z) : z + Kx x + Ky y = rz } with x̄ = (x, y, z), z̄(x̄) = (x, y), ȳ := B̄ z̄(x̄)
and ū = (is , vs ), where the bar notation refers to the variables in [6]. Since (4c)
can be interpreted as output equation for z, we consider only the reduced DAE
system (4a)–(4b) in the following.
Regarding the fact that additive splitting makes no sense for solving the con-
straints (4b), we propose a splitting approach based on the inherent ODE. Therefore,
we rewrite the DAE system (4a)–(4b) equivalently as
Next, we reformulate (6) with (5b) back as DAE and obtain the following splitting
approach (SADAE) for circuit index-1 DAEs.
1. Initialize x2 (t0 ) := x0 and n = 0.
2. Solve on [tn , tn+1 ] the first subsystem
The first subsystem (splitDAE 1) is in fact a Hamiltonian ODE system with the
Hamiltonian
\[ H(x) = \frac{1}{2} x^\top D x = \frac{1}{2} v_C^\top C v_C + \frac{1}{2} i_L^\top L i_L =: H(v_C, i_L) \tag{7} \]
describing the total energy stored in the capacitors and inductors. We have
\[ \frac{d}{dt} H(x) = x^\top D \dot{x} = -x^\top J x = 0 \]
since J is skew-symmetric. Obviously, H is a quadratic form. Consequently, we can
apply symplectic numerical methods to (splitDAE 1). They have the advantage of
preserving the total energy H stored in the capacitors and inductors [7].
The second subsystem (splitDAE 2a)–(splitDAE 2b) leads to non-symmetric but
positive definite linear systems after time discretization that allows the exploitation
of suitable iterative methods [2].
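The energy behaviour can be illustrated on a single LC loop. The sketch below is not the paper's benchmark: it assumes a hypothetical scalar system L i' = −v, C v' = i with illustrative element values, and it uses the implicit midpoint rule (a symplectic method that preserves quadratic invariants exactly) instead of the symplectic Euler method used later; the explicit Euler method is included only for contrast.

```python
def energy(i_l, v_c, ind, cap):
    """Stored energy H = 1/2 L i^2 + 1/2 C v^2."""
    return 0.5 * ind * i_l ** 2 + 0.5 * cap * v_c ** 2


def midpoint_step(i_l, v_c, h, ind, cap):
    """One implicit midpoint step for L i' = -v, C v' = i."""
    p, q = h / (2.0 * ind), h / (2.0 * cap)
    rhs_i = i_l - p * v_c
    rhs_v = v_c + q * i_l
    det = 1.0 + p * q
    # Solve [[1, p], [-q, 1]] (i1, v1) = (rhs_i, rhs_v).
    return (rhs_i - p * rhs_v) / det, (q * rhs_i + rhs_v) / det


def explicit_euler_step(i_l, v_c, h, ind, cap):
    """One explicit Euler step (not symplectic), shown for contrast."""
    return i_l - h / ind * v_c, v_c + h / cap * i_l


ind, cap, h = 1.0e-9, 1.0e-12, 1.0e-11   # illustrative values only
state_mid = state_eul = (0.0, 1.0)
h0 = energy(0.0, 1.0, ind, cap)
for _ in range(1000):
    state_mid = midpoint_step(*state_mid, h, ind, cap)
    state_eul = explicit_euler_step(*state_eul, h, ind, cap)
drift_mid = abs(energy(*state_mid, ind, cap) - h0) / h0
growth_eul = energy(*state_eul, ind, cap) / h0
```

The midpoint energy stays at its initial value up to round-off, whereas the explicit Euler energy grows by the factor 1 + h²/(LC) in every step.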
In order to verify the convergence of DAE operator splitting method, one has to rely
on the convergence of the ODE operator splitting method. For this reason, we define
the non-homogeneous Cauchy problem
where the initial condition u_0 and the source function r are bounded. Let Δt denote
the time step such that the following stability condition is satisfied
After time discretization, apply the following operator splitting algorithm (OSA)
\[ u_1'(t) = A_1 u_1(t), \quad t \in [t_n, t_{n+1}], \quad u_1(t_n) = u_{sp}^n, \]
\[ u_2'(t) = A_2 u_2(t) + r(t), \quad t \in [t_n, t_{n+1}], \quad u_2(t_n) = u_1(t_{n+1}). \]
Theorem 2 (See [1]) Under the boundedness and stability conditions formulated
above, the approximated splitting solution obtained from the operator splitting
algorithm (OSA) converges to the exact solution of the ODE (8).
If we denote by T (tn ) the solution operator of (8) at the n-th time step, and by Ts (tn )
the splitting solution operator, then we have: ||T (tn )u0 −Ts (tn )u0 || → 0 as Δt → 0.
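The convergence statement can be observed on a scalar example. The following sketch assumes hypothetical scalar coefficients a1, a2 and a constant source r, solves both sub-problems of the OSA exactly on each step, and compares the result with the exact solution of u' = (a1 + a2) u + r; it illustrates the statement of Theorem 2 and is not a proof.

```python
import math


def split_solve(a1, a2, r, u0, t_end, n):
    """Sequential splitting with exact sub-flows on n equal time steps."""
    h = t_end / n
    u = u0
    for _ in range(n):
        u1 = math.exp(a1 * h) * u                       # u1' = a1 u1
        u = (math.exp(a2 * h) * u1
             + r * (math.exp(a2 * h) - 1.0) / a2)       # u2' = a2 u2 + r
    return u


def exact(a1, a2, r, u0, t):
    """Exact solution of u' = (a1 + a2) u + r with constant r."""
    a = a1 + a2
    return math.exp(a * t) * u0 + r * (math.exp(a * t) - 1.0) / a


reference = exact(-1.0, -2.0, 1.0, 1.0, 1.0)
errors = [abs(split_solve(-1.0, -2.0, 1.0, 1.0, 1.0, n) - reference)
          for n in (50, 100, 200)]
ratio = errors[0] / errors[1]
```

Halving Δt roughly halves the error, which is the first-order behaviour expected from a sequential splitting in which the source term is attached to only one of the sub-problems.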
Regarding the equivalence of the DAE system (splitDAE 2a)–(splitDAE 2b) to the
system
4 Numerical Simulation
We use a small RLC circuit example in order to demonstrate the operator splitting
approach for DAEs. It operates in a GHz regime as often used in chip design.
Using the tree in Fig. 1, we get for the circuit DAE system (4a)–(4c) the matrices
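\[ D = \begin{pmatrix} L_1 & 0 & 0 & 0 & 0 & 0 \\ 0 & L_2 & 0 & 0 & 0 & 0 \\ 0 & 0 & L_3 & 0 & 0 & 0 \\ 0 & 0 & 0 & C_1 & 0 & 0 \\ 0 & 0 & 0 & 0 & C_2 & 0 \\ 0 & 0 & 0 & 0 & 0 & C_3 \end{pmatrix}, \quad J = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 & 0 \end{pmatrix}, \quad M = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & -1 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad r_x = \begin{pmatrix} -v_s \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \]
and
\[ S = \begin{pmatrix} G_1 & 0 \\ 0 & G_2 \end{pmatrix}, \quad K_x = \begin{pmatrix} -1 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad K_y = 0, \quad r_y = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad r_z = 0. \]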
Fig. 1 Benchmark RLC-circuit. The dashed branches form the tree considered for the model
equations
Fig. 2 Reference solution for the inductive currents of the circuit in Fig. 1 (left). Error of the numerical
solution of the three simulation variants with time step size h = 1e−11 (right)
time step size h = 1e−11 and the reference solution. The results show that the
solution of the DAE splitting approach (variant 2) is almost the same as the
unsplit solution (variant 1). This means that the error caused by splitting is negligible
in comparison with the numerical discretization error. The DAE splitting
approach with the symplectic Euler method (variant 3) gives the best results and is
even faster than the other variants, since the symplectic Euler method for the first
subsystem (5a) is an explicit method.
In this paper, we extended the operator splitting method from ODEs to linear
circuit DAEs. Following the topological decoupling of circuit DAEs of index 1
in loop-cutset formulation, we were able to construct a suitable decomposition of
the matrices so that a natural port-Hamiltonian DAE structure becomes visible and can be
exploited for a convergent splitting approach that is explicit and energy preserving
in the dynamic part.
Acknowledgments This project has received funding from the European Union’s Horizon 2020
research and innovation program under grant agreement No 76504. Furthermore, we acknowledge
financial support by DFG under Germany’s Excellence Strategy – The Berlin Mathematics
Research Center MATH+ (EXC-2046/1, ID 390685689).
References
1. M. Bjørhus, Operator splitting for abstract Cauchy problems. IMA J. Numer. Anal. 18, 419–443
(1998)
2. A.T. Chronopoulos, s-step iterative methods for (non)symmetric (in)definite linear systems.
SIAM J. Numer. Anal. 28(6), 1776–1789 (1991)
3. L.O. Chua, C.A. Desoer, E.S. Kuh, Linear and Nonlinear Circuits (McGraw-Hill, Singapore,
1987)
4. C.A. Desoer, E.S. Kuh, Basic Circuit Theory. International student edition (McGraw-Hill, New
York, 1984)
5. D. Estévez Schwarz, C. Tischendorf, Structural analysis of electric circuits and consequences
for MNA. Int. J. Circ. Theory Appl. 28(2), 131–162 (2000).
6. M. Günther, A. Bartel, B. Jacob, T. Reis, Dynamic iteration schemes and port-Hamiltonian
formulation in coupled circuit simulation, 2020. arXiv:2004.12951v1
7. E. Hairer, G. Wanner, C. Lubich, Geometric Numerical Integration, vol. 31. Springer Series
in Computational Mathematics (Springer, Berlin, Heidelberg, 2002)
8. E. Hansen, A. Ostermann, Dimension splitting for quasilinear parabolic equations. IMA J.
Numer. Anal. 30(3), 857–869 (2010)
9. C.-W. Ho, A. Ruehli, P. Brennan, The modified nodal approach to network analysis. IEEE
Trans. Circ. Syst. 22(6), 504–509 (1975)
10. M. Hochbruck, T. Jahnke, R. Schnaubelt, Convergence of an ADI splitting for Maxwell’s
equations. Numer. Math. 129(3), 535–561 (2015)
11. H. Holden, K. Karlsen, K. Lie, N. Risebro, Splitting Methods for Partial Differential Equations
with Rough Solutions (European Mathematical Society, Zürich, 2010)
12. W. Hundsdorfer, J.G. Verwer, A note on splitting errors for advection-reaction equations. Appl.
Numer. Math. 18(1), 191–199 (1995)
13. V. Mehrmann, R. Morandin, Structure-preserving discretization for port-Hamiltonian descrip-
tor systems (2019) arXiv:1903.10451v1
14. R. Riaza, Differential-Algebraic Systems: Analytical Aspects and Circuit Applications (World
Scientific, Singapore, 2008)
15. C. Tischendorf, R. Lamour, R. März, Differential-Algebraic Equations. A Projector Based
Analysis (Springer, Hamburg, 2012)
Reduced Order Modelling for Wafer
Heating with the Method of Freezing
Abstract Accurate and real-time temperature control for wafer heating is one of
the main challenges in semiconductor manufacturing processes. With reduced-order
modelling (ROM), the computational complexity of the mathematical model can be
decreased in order to solve the model quickly at a low computational cost, while
still maintaining the computational accuracy. However, the translating temperature
profile, due to moving sources, renders the standard reduction approaches
ineffective. We propose to invoke the concept of the “Method of Freezing” and
use it in conjunction with the standard ROM approaches to obtain an effective low-
complexity model. We finally assess the effectiveness of the proposed approach on
the 2-dimensional heat equation with moving heat loads. Numerical results clearly
show the potential of the proposed approach over the standard one in terms of
computational accuracy and the dimension of the resulting reduced-order model.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_22
the desired temperature at every place on the wafer [1]. However, this remains a
challenge since standard numerical methods take a lot of computational time, and
the increased resolution requirements due to the reduced feature sizes slow the
model down.
Reduced-order modelling (ROM) reduces the model complexity and aids in real-
time prediction of the quantity of interest. Translating temperature profiles, due to
moving sources, render the standard ROM approaches ineffective [2]. Hence, we
propose to invoke the “Method of Freezing” along with standard ROM approaches
in order to obtain an effective low-complexity model computable in real-time.
The concept of the “Method of Freezing” has been applied to parabolic and
hyperbolic problems in the past [3]. However, [4] is the only work which so far
exploits the “Method of Freezing” for non-linear reduced basis approximations.
This work considers a numerical experiment, which falls in the realm of hyperbolic
problems, namely the parameterized Burgers-type problem in 2D (without source
terms).
The main contribution of this work is to use the “Method of Freezing” in
conjunction with standard ROM approaches to facilitate accurate and real-time
prediction of the temperature. The “Method of Freezing” relies on an ansatz
that decomposes the original dynamics into shape and travelling dynamics. The
resulting shape dynamics is amenable for an efficient basis generation. We then
use these generated bases to apply Proper Orthogonal Decomposition (POD) on
the shape dynamics and, ultimately, obtain a reduced-order model. We finally
assess the performance of the combined approach of the “Method of Freezing” and
reduced basis approximations on a test-case of practical relevance, and discuss the
computational merits of the proposed ROM approach over the standard one.
The paper is organized as follows. In Sect. 2.1, we introduce the 2-dimensional
heat equation and discuss the numerical method for its discretization. We invoke the
idea of the “Method of Freezing”, reformulate the model problem and present the
corresponding discretized representation in Sect. 2.2. A Galerkin-type projection-
based ROM is performed on a semi-discrete model representation in Sect. 3. A
numerical case-study is presented in Sect. 4 to showcase the effectiveness of the
proposed approach. Finally, Sect. 5 ends with conclusions and future work.
2 Theory
In this section, we first introduce the model and the numerical method employed
for the spatio-temporal discretization. We then introduce the idea of the “Method of
Freezing” and present a model reformulation and its discrete representation.
To model the wafer heating, we use the well-known heat equation in two
dimensions. As the height of the wafer is one order of magnitude less than the
length and the width of the wafer, the temperature gradient along the thickness
of the wafer is very small. This makes the 2-dimensional heat equation a good
approximation of the real situation. The 2-dimensional heat equation reads:
\[ \frac{\partial u}{\partial t} - \alpha \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) = Q(x, y, t), \qquad (x, y) \in \Omega, \; t \in [0, t_f], \tag{1} \]
\[ u(x, y, t = 0) = u_0, \tag{2} \]
\[ \frac{\partial u}{\partial x} n_x + \frac{\partial u}{\partial y} n_y = 0 \quad \text{on } \partial\Omega, \tag{3} \]
where u represents the wafer temperature, u0 stands for a constant initial tempera-
ture, Ω stands for the spatial domain of interest, n = (nx , ny ) denotes the normal
to the boundary ∂Ω, tf indicates the final simulation time, and α is the thermal
diffusivity constant. The thermal diffusivity constant can be expressed in terms of the
thermal conductivity k, the specific heat capacity C_p and the density ρ of the wafer
in the form α = k/(ρ C_p). Here, a moving heat load Q(x, y, t) is assumed to be of the
non-affine form:
\[ Q(x, y, t) = \exp\left( -\frac{1}{2} \left( \frac{x - c_x t}{\sigma_x} \right)^2 - \frac{1}{2} \left( \frac{y - c_y t}{\sigma_y} \right)^2 \right), \tag{4} \]
where cx and cy are the speeds of the heat load in the x- and y-direction, respectively
and, the variance of the Gaussian distribution along the x- and the y-direction is
given by σx2 and σy2 , respectively.
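The heat load (4) is straightforward to evaluate. The sketch below uses a hypothetical function name and illustrative parameter values; it also evaluates the load at a point that moves with the source, where its value does not change in time.

```python
import math


def heat_load(x, y, t, cx, cy, sigma_x, sigma_y):
    """Moving Gaussian heat load Q(x, y, t) of Eq. (4)."""
    return math.exp(-0.5 * ((x - cx * t) / sigma_x) ** 2
                    - 0.5 * ((y - cy * t) / sigma_y) ** 2)


# Illustrative parameters (not taken from the paper).
cx, cy, sx, sy = 2.0, 1.0, 0.1, 0.2
peak = heat_load(cx * 0.5, cy * 0.5, 0.5, cx, cy, sx, sy)
# Same offset (0.05, 0.1) from the moving centre at two different times.
early = heat_load(0.05 + cx * 0.1, 0.1 + cy * 0.1, 0.1, cx, cy, sx, sy)
late = heat_load(0.05 + cx * 0.9, 0.1 + cy * 0.9, 0.9, cx, cy, sx, sy)
```

That the load only depends on x − c_x t and y − c_y t is what makes a co-moving coordinate system attractive.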
After multiplying (1) by a smooth test-function w, integrating over the domain
and invoking Green’s theorem, a weak formulation of the 2-dimensional heat
equation can be constructed, resulting in:
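\[ \int_\Omega \frac{\partial u}{\partial t} w \, dA + \alpha \left( \int_\Omega \frac{\partial u}{\partial x} \cdot \frac{\partial w}{\partial x} \, dA + \int_\Omega \frac{\partial u}{\partial y} \cdot \frac{\partial w}{\partial y} \, dA \right) - \alpha \int_{\partial\Omega} \frac{\partial u}{\partial n} w \, ds = \int_\Omega Q w \, dA, \tag{5} \]
where dA = dxdy and ds is a boundary surface element. Using (3), the fourth term
on the left-hand side of (5) vanishes [5].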
In order to solve (5) numerically, discretization in space and time is necessary.
We discretize the domain such that the structured mesh aligns with the orientation
of the features which need to be printed. We then employ a finite element method to
discretize in space. We approximate the solution by a sum over B-spline
basis functions φ_i, u = \sum_{i=1}^{N} u_i(t) φ_i(x) [6]. Here, N is the number of finite
elements used in the domain discretization and u_i is the weight of the i-th basis
function. To discretize in time, the first-order backward Euler method is applied,
as is also done in Chap. 8 of [5]. Discretizing in both space and time results in the
following equation:
where M is the mass matrix, D is the diffusion matrix, Q̃ is the source vector
representative of the moving heat loads and Δt indicates the time-step. Equation (6)
needs to be solved for every time instant k + 1.
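A backward Euler step of this kind can be sketched in one space dimension. The sketch rests on several assumptions that are not taken from the paper: the step is written as (M + αΔt D) u^{k+1} = M u^k + Δt Q̃^{k+1}, linear hat functions on a uniform mesh replace the B-spline basis, and all function names are hypothetical. The tridiagonal system is solved with the Thomas algorithm.

```python
def assemble_1d(n_nodes, length):
    """Mass and diffusion matrices of linear elements on a uniform mesh.

    Both are symmetric tridiagonal and stored as (off_diagonal, diagonal).
    """
    h = length / (n_nodes - 1)
    m_diag = [2.0 * h / 3.0] * n_nodes
    m_diag[0] = m_diag[-1] = h / 3.0
    m_off = [h / 6.0] * (n_nodes - 1)
    d_diag = [2.0 / h] * n_nodes
    d_diag[0] = d_diag[-1] = 1.0 / h
    d_off = [-1.0 / h] * (n_nodes - 1)
    return (m_off, m_diag), (d_off, d_diag)


def tri_matvec(off, diag, x):
    """Product of a symmetric tridiagonal matrix with a vector."""
    y = [d * xi for d, xi in zip(diag, x)]
    for i in range(len(off)):
        y[i] += off[i] * x[i + 1]
        y[i + 1] += off[i] * x[i]
    return y


def thomas(off, diag, rhs):
    """Solve a symmetric tridiagonal system (Thomas algorithm)."""
    n = len(diag)
    c = [0.0] * (n - 1)
    d = [0.0] * n
    c[0] = off[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = off[i] / denom
        d[i] = (rhs[i] - off[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def backward_euler_step(u, q, alpha, dt, mass, diff):
    """Solve (M + alpha*dt*D) u_new = M u + dt*q for u_new."""
    (m_off, m_diag), (d_off, d_diag) = mass, diff
    a_off = [m + alpha * dt * d for m, d in zip(m_off, d_off)]
    a_diag = [m + alpha * dt * d for m, d in zip(m_diag, d_diag)]
    rhs = [mu + dt * qi for mu, qi in zip(tri_matvec(m_off, m_diag, u), q)]
    return thomas(a_off, a_diag, rhs)


mass, diff = assemble_1d(n_nodes=41, length=1.0)
u = [0.0] * 41
u[20] = 1.0                                   # initial hot spot
heat_before = sum(tri_matvec(mass[0], mass[1], u))
for _ in range(50):
    u = backward_euler_step(u, [0.0] * 41, 1.0e-2, 0.1, mass, diff)
heat_after = sum(tri_matvec(mass[0], mass[1], u))
```

Without a source and with zero Neumann boundary conditions the total heat, i.e. the sum of the entries of M u, is conserved by every step, which gives a simple consistency check for an implementation.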
The numerical solution will be at most first-order accurate if the first-order
backwards Euler method is applied in conjunction with the higher-order spatial
discretization. However, in this paper, we are not concerned about the order of
accuracy of the numerical solution, but intend to show the potential of the “Method
of Freezing”. To this end, the first-order temporal discretization is representative
enough for quantifying the numerical performance, while being simple to implement.
The implementation of a higher-order temporal discretization is deferred to
future work.
We will now discuss a change of coordinates or so-called “Method of Freezing”
that we propose to use in conjunction with standard ROM techniques to obtain an
effective complexity reduction for problems with moving heat load(s).
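This modified heat equation is quite similar to the original equation given in (1),
except for the additional second and third terms on the left-hand side, which represent an
extra convection term. The weak formulation of (8) under zero Neumann boundary
conditions is given by:
\[ \int_\Sigma \frac{\partial v}{\partial t} w \, d\xi_x d\xi_y - c_x \int_\Sigma \frac{\partial v}{\partial \xi_x} w \, d\xi_x d\xi_y - c_y \int_\Sigma \frac{\partial v}{\partial \xi_y} w \, d\xi_x d\xi_y + \alpha \left( \int_\Sigma \frac{\partial v}{\partial \xi_x} \frac{\partial w}{\partial \xi_x} \, d\xi_x d\xi_y + \int_\Sigma \frac{\partial v}{\partial \xi_y} \frac{\partial w}{\partial \xi_y} \, d\xi_x d\xi_y \right) = \int_\Sigma Q(\xi_x, \xi_y) w \, d\xi_x d\xi_y, \tag{9} \]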
where Σ represents the transformed domain as per the coordinate transformation.
Discretizing (9) in space and time yields:
where M and D are, respectively, the mass and diffusion matrix, and C is the
convection matrix.
Although we consider constant cx and cy , the “Method of Freezing” can handle
time-dependent speeds by adding an ingredient known as phase conditions; see [3].
In this section, we build a reduced-order model both via the standard and the
proposed ROM approach. The standard and the proposed ROM approaches, both built upon
a Galerkin-type projection-based ROM methodology [7], are discussed in Sects. 3.1
and 3.2, respectively.
where D_red = P^T D P and M_red = P^T M P are the reduced diffusion and mass
matrices, respectively.
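The Galerkin projection of a full-order matrix is a pair of matrix products. In the sketch below the matrix `M` and the basis `P` are small hypothetical examples; in the setting above the columns of P would presumably hold the retained POD basis vectors.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def galerkin_reduce(matrix, basis):
    """Return basis^T * matrix * basis."""
    return matmul(transpose(basis), matmul(matrix, basis))


# Hypothetical 3x3 full-order matrix and a basis with two columns.
M = [[2.0, 0.5, 0.0],
     [0.5, 2.0, 0.5],
     [0.0, 0.5, 2.0]]
P = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]
M_red = galerkin_reduce(M, P)
```

The reduced matrix inherits the symmetry of the full-order matrix, and its size equals the number of basis vectors.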
The proposed novel ROM approach employs the “Method of Freezing” in conjunc-
tion with standard projection-based reduction techniques. We again employ SVD.
However, in this proposed framework, the SVD is performed on the v snapshot
matrix, instead of the u snapshot matrix. We now obtain a projector L^T : V_h →
V_r, where V_h is an h-dimensional high-fidelity space and V_r is an r-dimensional
reduced space spanned by the functions obtained from a truncated singular value
decomposition of the v snapshot matrix. Finally, the proposed (frozen) reduced-order
model is:
\[ M_{red,p} v_{red}^{k+1} + \alpha \Delta t \, D_{red,p} v_{red}^{k+1} - \Delta t \, C_{red,p} v_{red}^{k+1} - L^T \Delta t \, \tilde{Q}^{k+1} = M_{red,p} v_{red}^{k}, \tag{12} \]
4 Numerical Results
Fig. 1 Singular value decay behavior for the proposed and the standard approach
decay behavior is known to give a good expectation about the possible reduction
in the dimensionality of the full-order model. In Fig. 1, the singular value decay
behavior for the proposed and the standard ROM approach is shown. It can be
observed that incrementing the number of POD modes by one yields a sharp
initial decrease in the singular values both for the proposed and the standard ROM
approach. However, after the sharp decay, we can see that the singular values
corresponding to the proposed approach decay faster than those corresponding to
the standard approach. An initial sharp decrease is attributed to the fact that only a
single mode is representative enough to capture the mean temperature on the silicon
wafer. Other modes are required to accurately determine the change (with respect
to the mean) in the temperature due to the moving heat loads. The observed decay
behavior clearly indicates a possibility of an effective dimensionality reduction if
the “Method of Freezing” is used together with the standard ROM techniques.
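How a singular value decay translates into a reduced dimension can be sketched with a common truncation heuristic: keep the smallest number of modes whose discarded "energy" fraction stays below a tolerance. The criterion, the function name `modes_for_tolerance` and the singular values below are assumptions for illustration, not the authors' choice or data.

```python
def modes_for_tolerance(singular_values, tol):
    """Smallest r such that the discarded energy fraction
    sum_{i>r} s_i^2 / sum_i s_i^2 does not exceed tol."""
    energies = [s * s for s in singular_values]
    total = sum(energies)
    if total == 0.0:
        return 0
    kept = 0.0
    for r, e in enumerate(energies, start=1):
        kept += e
        if (total - kept) / total <= tol:
            return r
    return len(energies)


# Made-up singular values in decreasing order (illustration only).
sigma = [10.0, 1.0, 0.1]
r_loose = modes_for_tolerance(sigma, 0.05)   # loose tolerance
r_tight = modes_for_tolerance(sigma, 1e-3)   # tighter tolerance
```

A faster decay means that the same tolerance is met with fewer modes, which is the sense in which the decay curve anticipates the achievable reduction.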
Further computational benefits of the proposed approach over the standard one
can be clearly seen in Fig. 2, which shows the behavior of the reduced-order
modelling (ROM) error for increasing dimensions of the reduced-order model.
We assess the error of the standard and proposed approaches in the (absolute)
L2 -norm in space and time. The error via the standard approach corresponds
to the difference between the finite-element based numerical solution u and the
reconstructed solution obtained by lifting the standard reduced-order solution
ured , obtained in (11), to the high-dimensional problem space. And, the error via
Fig. 2 ROM error for the proposed and the standard approach versus varying dimensions of the
reduced-order model
smaller than the counterpart obtained using the standard reduction approach in order
to have the same accuracy.
Acknowledgments G. van Zwieten, J. van Zwieten, C. Verhoosel, E. Fonn, T. van Opstal, & W.
Hoitinga. (2019, June 11). Nutils (Version 5.0). Zenodo. https://doi.org/10.5281/zenodo.3243447.
References
1. M. Rabus, A.T. Fiory, N.M. Ravindra, P. Frisella, A. Agarwal, T. Sorsch, J. Miner, E. Ferry, F.
Klemens, R. Cirelli et al., Rapid thermal processing of silicon wafers with emissivity patterns.
J. Electron. Mater. 35(5), 877–891 (2006)
2. M. Ohlberger, S. Rave, Reduced basis methods: Success, limitations and future challenges, in
Proceedings of the Conference Algoritmy (2016), pp. 1–12
3. W.J. Beyn, V. Thummler, Freezing solutions of equivariant evolution equations. SIAM J. Appl.
Dynam. Syst. 3(2), 85–116 (2004)
4. M. Ohlberger, S. Rave, Nonlinear reduced basis approximation of parameterized evolution
equations via the method of freezing. Comptes Rendus Mathematique 351(23–24), 901–906
(2013)
5. T.J.R. Hughes, The Finite Element Method: Linear Static and Dynamic Finite Element Analysis
(Courier Corporation, North Chelmsford, 2012)
6. L. Piegl, W. Tiller, Curve and surface constructions using rational B-splines. Comput.-Aided
Des. 19(9), 485–498 (1987)
7. P. Benner, W.H.A. Schilders, S. Grivet-Talocia, A. Quarteroni, G. Rozza, M. Silveira Luís,
Snapshot-Based Methods and Algorithms. Model Order Reduction, vol. 2 (De Gruyter, Berlin,
Boston, 2020)
8. R. Hull, Properties of crystalline silicon. No. 20. IET (1999)
9. M. Barrault, Y. Maday, N.C. Nguyen, A.T. Patera, An ‘empirical interpolation’ method:
application to efficient reduced-basis discretization of partial differential equations. Comptes
Rendus Math. 339(9), 667–672 (2004)
Multirate DAE-Simulation
and Its Application in System Simulation
Software for the Development of Electric
Vehicles
M. Kolmbauer
MathConsult GmbH, Linz, Austria
G. Offner · R. U. Pfau
AVL List GmbH, Graz, Austria
B. Pöchtrager
Radon Institute for Computational and Applied Mathematics (RICAM), Austrian Academy of
Sciences, Linz, Austria
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_23
232 M. Kolmbauer et al.
2 Problem Formulation
1 https://www.avl.com/de/cruise-m.
2 http://www.dynasim.com.
3 http://www.plm.automation.siemens.com.
Multirate DAE-Simulation of Electric Vehicles 233
Electric Network
We consider an electric network $N_E = \{R, C, L, V, I, N, G, B\}$ that is
composed of resistors $R$, capacitors $C$, inductors $L$, voltage sources $V$, current
sources $I$, nodes $N$, grounds $G$ and batteries $B$. The DAE for the network $N_E$ in
input-output form is given by: For given continuous inputs $(u_R^T, u_C^T, u_B^T)^T$, find the
potentials $e = (e_N^T, e_G^T)^T$, the currents $j = (j_R^T, j_C^T, j_L^T, j_V^T, j_B^T)^T$ and the outputs
$y = (y_R^T)^T$, such that
$$\begin{aligned}
A_R j_R + A_C j_C + A_L j_L + A_V j_V + A_B j_B + A_I \bar{j}_I &= 0 \\
r(u_R)\, j_R - A_R^T e &= 0 \\
j_C - \frac{d\left(c(u_C)\, A_C^T e\right)}{dt} &= 0 \\
l\, \frac{dj_L}{dt} - A_L^T e &= 0 \\
A_V^T e &= \bar{v}_V \\
A_B^T e &= \bar{v}_B(j_B, u_B) \\
y_R &= j_R\, A_R^T e
\end{aligned} \tag{1}$$
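$$\begin{aligned}
m_{Sw}\, c_{p,Sw}\, \frac{dT_{Sw}}{dt} &= A_{Sw,HtS} H_{HtS} + A_{Sw,Hs} H_{Hs} + A_{Sw,Hs_u} u_{HsS} \\
0 &= A_{Lw,HtS} H_{HtS} + A_{Lw,Hs} H_{Hs} + A_{Lw,Hs_u} u_{HsS} \\
H_{HtS} &= c_{HtS} \left( A_{Sw,HtS}^T T_{Sw} + A_{Lw,HtS}^T T_{Lw} + A_{Tb,HtS}^T T_{Tb} + A_{Tb_u,HtS}^T u_{TbS} \right) \\
y_{HtS} &= A_{Tb_u,HtS} H_{HtS}
\end{aligned} \tag{2}$$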
for given boundary conditions HH s = H̄H s and TT b = T̄T b and given positive
definite coefficient matrices mSw , cp,Sw and cH tS . The coupling variables are
expressed as the energy fluxes uH sS and uT bS and the temperatures ySw , yLw and
yH tS .
Fluid Network
We consider a fluid network $N_F = \{PI, PU, DE, VJ, LJ, RE, HT, TB\}$ that
is composed of pipes $PI$, pumps $PU$, demands $DE$, volume junctions $VJ$, lumped
junctions $LJ$, reservoirs $RE$, heat transfers $HT$ and temperature boundaries $TB$.
The DAE for the network $N_F$ in input-output form is given by: For given continuous
inputs $(u_{HsF}^T, u_{TbF}^T)^T$, find the pressures $(p_{Lj}^T, p_{Vj}^T)^T$, the mass flows $(q_{Pi}^T, q_{Pu}^T)^T$,
the temperatures $(T_{Vj}^T, T_{Lj}^T)^T$, the heat fluxes $(H_{HtF}^T, H_{Pu}^T, H_{Pi}^T)^T$ and the outputs
$(y_{Vj}^T, y_{Lj}^T, y_{HtF}^T)^T$, such that
$$\begin{aligned}
\frac{dq_{Pi}}{dt} &= c_{1,Pi} \left( A_{Jc,Pi}^T p_{Jc} + A_{Re,Pi}^T p_{Re} \right) + c_{2,Pi}\, \mathrm{diag}(|q_{Pi}|)\, q_{Pi} + c_{3,Pi} \\
f_{Pu}(q_{Pu}) &= A_{Jc,Pu}^T p_{Jc} + A_{Re,Pu}^T p_{Re}
\end{aligned}$$
Multi-Physical Model
The multi-physical model is derived by combining (1), (2) and (3) with appropri-
ate coupling conditions. The coupling conditions describe the relation between the
inputs and outputs of the individual models. For the model used in Sects. 4 and 5,
the following coupling conditions are used, see e.g. [9].
$$\begin{pmatrix} u_R \\ u_C \\ u_B \\ u_{HsS} \\ u_{TbS} \\ u_{HsF} \\ u_{TbF} \end{pmatrix}
=
\begin{pmatrix}
0 & C_{R,Sw} & 0 & 0 & 0 & 0 & 0 \\
0 & C_{C,Sw} & 0 & 0 & 0 & 0 & 0 \\
0 & C_{B,Sw} & 0 & 0 & 0 & 0 & 0 \\
C_{HsS,R} & 0 & 0 & 0 & 0 & 0 & C_{HsS,HtF} \\
0 & 0 & 0 & 0 & C_{TbS,Vj} & 0 & 0 \\
0 & 0 & 0 & C_{HsF,Vj} & 0 & 0 & 0 \\
0 & C_{TbF,Sw} & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} y_R \\ y_{Sw} \\ y_{Lw} \\ y_{HtS} \\ y_{Vj} \\ y_{Lj} \\ y_{HtF} \end{pmatrix} \tag{4}$$
The connectivity equation (4) represents the electro-thermal coupling of the electric
network and the cooling systems. Combining all subsystems and their connectivity
equations (4) yields a DAE:
Find
such that
F (ż, z, t) = 0. (5)
In our multirate approach the full DAE (5) is partitioned due to the physical
background to n ∈ N subsystems (typically n ' 2). Each subsystem is index
reduced according to the available literature, cf. [4–6]. Since in the global network
the individual subsystems interact with each other, i.e. inputs and outputs
are connected according to the connectivity equation (4), each subsystem has to be put into
input-output form. For this purpose, each subsystem $i = 1, \ldots, n$ classifies its
inputs $u_i$, state variables $x_i$, algebraic variables $a_i$ and outputs $y_i$. In summary, this
approach yields a coupled system of n semi-explicit DAEs in input-output form of
(differential) index 1. For inputs $u_i$ given by Eq. (4), find $x_i$, $\dot{x}_i$, $a_i$ and $y_i$, such that
$$\begin{aligned}
\dot{x}_i &= f_i(x_i, a_i, u_i, t) \\
0 &= r_i(x_i, a_i, u_i, t) \\
y_i &= g_i(x_i, a_i, u_i, t)
\end{aligned} \tag{6}$$
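[Figure: time axis with synchronization times; one macro-step from $t_k$ to $t_{k+1}$ is subdivided into micro-steps $t_k = t_k^{i_0}, t_k^{i_1}, t_k^{i_2}, \ldots, t_k^{i_n} = t_{k+1}$]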
Fig. 2 Schematic representation of a BEV with cooling system in AVL CRUISE™M. The
corresponding results are displayed in Fig. 3 and Table 1
Fig. 3 Comparison of elapsed time of a multirate case against a single solver case
Table 1 Comparison of singlerate and multirate approach corresponding CPU-time and average
real time factor (RTF)
Case CPU-time Avg RTF
Singlerate 144.98 0.805447
Multirate 21.03 0.116853
adaptive explicit solvers [7]. Hence the step size of the single solver is limited to
the minimum step size of all subdomains, while the multirate approach is limited
to the synchronization time or to the characteristic of its own domain. Here the
synchronization times are after each macro-step of 20 ms.
The simulation time of a singlerate case (in red) is compared with that of a
multirate case (in blue) using AVL CRUISE™ M, cf. Fig. 3. A significant speed-up
in the calculation time is achieved, while the accuracy of the solution remains
sufficiently high due to the adaptivity of the individual solvers (Table 1).
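As a worked reading of Table 1 (the time unit is not stated in the table and is left open here), the speed-up follows directly from the two CPU times. The second line assumes that the real time factor is CPU time divided by simulated time; this definition is an assumption, under which both rows correspond to the same simulated time span.

```latex
\text{speed-up} = \frac{144.98}{21.03} \approx 6.9,
\qquad
\frac{144.98}{0.805447} \approx 180, \quad \frac{21.03}{0.116853} \approx 180 .
```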
Fig. 4 Comparison of elapsed time of a multirate case against a single solver case
Table 2 Comparison of singlerate and multirate approach corresponding CPU-time and average
real time factor
Case CPU-time Avg RTF
Singlerate 303.38 606.76734
Multirate 37.37 74.73045
In total, this example consists of 178 equations, which are spread over 20 solvers.
A fluid circuit, seven gas circuits and eleven thermal circuits are responsible for
modeling the cooling. In the multirate scheme each circuit is solved individually
with its own scheme. For all of them an explicit fixed-step method with a step size of
1 ms is used. The electric network is solved by its own scheme as
well. Again an explicit fixed-step method is used, where the chosen step size is
now 1 μs. The information exchange takes place after each macro-step of 1 ms. This
model is of special interest, since the electric network and the fluid network run on
completely different time scales (differing by a factor of order O(1000)). Again a significant speed-up in
the calculation time is achieved (Fig. 4 and Table 2).
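The stepping pattern described above (explicit fixed-step solvers with different micro-steps, information exchange once per macro-step) can be illustrated with a minimal sketch. The two scalar toy subsystems, the function `multirate_run` and the zero-order hold of the coupling inputs between synchronization times are assumptions for illustration; they are not the vehicle model or the scheme implemented in the software.

```python
def multirate_run(f_fast, f_slow, x, z, n_macro, macro_step, n_fast, n_slow):
    """Advance two coupled subsystems with explicit Euler micro-steps.

    Coupling inputs are exchanged only at synchronization times (once per
    macro-step) and held constant in between (zero-order hold).
    """
    h_fast = macro_step / n_fast   # micro-step of the fast subsystem
    h_slow = macro_step / n_slow   # micro-step of the slow subsystem
    for _ in range(n_macro):
        u_fast, u_slow = z, x      # information exchange at t_k
        for _ in range(n_fast):    # fast subsystem: many small steps
            x += h_fast * f_fast(x, u_fast)
        for _ in range(n_slow):    # slow subsystem: few large steps
            z += h_slow * f_slow(z, u_slow)
    return x, z


def fast_rhs(x, u):
    # hypothetical stiff relaxation towards the input
    return -1.0e5 * (x - u)


def slow_rhs(z, u):
    # hypothetical slow relaxation towards the input
    return -10.0 * (z - u)


# Macro-step 1 ms; fast micro-step 1 microsecond, slow micro-step 1 ms.
x_end, z_end = multirate_run(fast_rhs, slow_rhs, x=1.0, z=0.0,
                             n_macro=10, macro_step=1.0e-3,
                             n_fast=1000, n_slow=1)
```

A singlerate explicit solver would have to take the 1 μs step for both subsystems; here only the fast subsystem pays for it.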
6 Conclusion
Acknowledgments Part of this work has been supported by the government of Upper Austria
within the programme Innovatives Oberösterreich.
References
1. A. Bartel, M. Günther, PDAEs in refined electrical network modeling. SIAM Rev. 60, 56–91
(2018)
2. A. Bartel, M. Günther, Multirate Schemes - An Answer of Numerical Analysis to a Demand
from Applications. IMACM Preprint, No. 2019–12, University of Wuppertal (2019)
3. A. Bartel, M. Günther, Inter/extrapolation-based multirate schemes – a dynamic-iteration
perspective (2020). Available at https://arxiv.org/abs/2001.02310
4. A.-K. Baum, M. Kolmbauer, G. Offner, Topological solvability and DAE-index conditions for
mass flow controlled pumps in liquid flow networks. Electr. Trans. Num. Anal. 46, 395–423
(2017)
5. D. Estévez Schwarz, C. Tischendorf, Structural analysis of electric circuits and consequences
for MNA. Int. J. Circ. Theor. Appl. 28, 131–162 (2000)
6. S. Grundel, L. Jansen, N. Hornung, T. Clees, C. Tischendorf, P. Benner, Model order reduction
of differential algebraic equations arising from the simulation of gas transport networks, in
Progress in Differential-Algebraic Equations (Springer, Berlin, 2014), pp. 183–205
7. A. Hindmarsh, P. Brown, K. Grant, S. Lee, R. Serban, D. Shumaker, C. Woodward, SUNDIALS:
suite of nonlinear and differential/algebraic equation solvers. ACM Trans. Math. Softw. 31, 363–
396 (2005)
8. L. Jansen, C. Tischendorf, A unified (P)DAE modeling approach for flow networks, in Progress
in Differential-Algebraic Equations (Springer, New York, 2014), pp. 127–151
9. M. Kolmbauer, G. Offner, B. Pöchtrager, Topological index analysis and its application to multi-
physical systems in system simulation software (2020). Available at
https://www.ricam.oeaw.ac.at/files/reports/20/rep20-22.pdf
A Hysteresis Loss Model for Tellinen’s
Scalar Hysteresis Model
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_24
242 J. Kühn et al.
The outline of this work reads: First, we introduce Tellinen’s model [8]. Then,
we define the loss model for a steady state and apply it as an approximation for
any nearly steady state. Based on properties of the model, we justify the use of
steady state approximation also with numerical results. At the end, conclusions and
an outlook are given.
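$$\frac{d}{dh} B^+_{\mathrm{sat}}(h) \ge \mu_0 > 0, \qquad \lim_{|h| \to \infty} \frac{d}{dh} B^+_{\mathrm{sat}}(h) = \mu_0. \tag{2}$$
Any current state of the material $(h_0, b_0)$ has to belong to the loop region $I$,
$$I = \left\{ (h, b) \in \mathbb{R}^2 \,\middle|\, B^+_{\mathrm{sat}}(h) \le b \le B^-_{\mathrm{sat}}(h) \right\}. \tag{3}$$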
Fig. 1 Left: Example of $B^{\pm}_{\mathrm{sat}}$ and path starting from the demagnetized state. Right: Schematic of
Tellinen’s model: the values are defined on the boundary and interpolated in between
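Now, the set of all possible states $I$ can be split into disjoint sets $I^{=}$, $I^{<}$ and $I^{>}$
with
$$I^{\square} = \{(h, b) \in I \mid \mu^+_{\mathrm{diff}}(h, b) \,\square\, \mu^-_{\mathrm{diff}}(h, b)\} = \{(h, b) \in I \mid \lambda \,\square\, \lambda_{\mathrm{eq}}\} = \{(h, b) \in I \mid b_{\mathrm{eq}}(h) \,\square\, b\} \quad \text{for all } \square \in \{=, <, >\}. \tag{9}$$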
Fig. 2 Two examples with different saturation curves $B^{\pm}_{\mathrm{sat}}$ and the resulting equilibrium curve $b_{\mathrm{eq}}$
as well as the sets $I^{\lessgtr}$. We remark that the right example is a very academic version
There exist several hysteresis loss models for different kinds of hysteresis descriptions
[1–3, 7]. A prediction of the losses is already presented in the original
Tellinen model [8]. However, the method used there for the hysteresis losses is based upon
an a posteriori evaluation of the simulated fields b, h. Our model differs in this
respect and provides a method for calculating losses at runtime. An overview and
classification of different hysteresis loss models is given, e.g., in [4]. In principle, the
loss model presented below will work for other hysteresis models, too. However,
it is particularly well suited for Tellinen’s model [8] with the respective thermal
extension [5] due to its structure and properties. First, we define the loss model for
the steady state and then we extend it to almost steady state situations.
Idea We follow the approach of distributed simulation (co-simulation). A simulation
of the heat equation describes the behavior of the temperature. The
presented loss model provides the corresponding source terms. Often in applications,
e.g. electric machines, the magnetic fields change several orders
of magnitude faster than the temperature. In this setting, the assumption of a
constant temperature while handling the magnetic fields is often exploited in distributed
simulation techniques.
For a simple, closed loop in the bh-plane, the enclosed area represents an energy
density (J/m³), and the material-specific volumetric heat capacity $c_V$ (J/(m³ K))
provides the conversion into a temperature change ΔT (K). In a steady state, the
material periodically passes through the same phases over and over again and
therefore runs on a closed bh-curve. For memory reasons, and because the stable
loop is unknown a priori, we do not want to store the complete history of the curve,
but compute the loss from within the simulation on-the-fly, i.e., at runtime. To this end,
we reverse the hysteresis model computation to predict the return path of the curve
from a turning point at the same time as we compute the forward path, see e.g. Fig. 4,
right, where the curves p+ (forward) and p− (backward) are computed at the
same time.
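The conversion from loop area to temperature change can be written compactly. The symbol $w$ for the loss energy density per cycle is introduced here for illustration and is not the authors' notation; the relation assumes that the loss of one cycle heats the material locally, without heat conduction during that cycle.

```latex
w = \oint h \,\mathrm{d}b \quad [\mathrm{J/m^3}],
\qquad
\Delta T = \frac{w}{c_V} \quad [\mathrm{K}] .
```

With made-up values $w = 350\ \mathrm{J/m^3}$ and $c_V = 3.5 \times 10^{6}\ \mathrm{J/(m^3 K)}$, this gives $\Delta T = 10^{-4}\ \mathrm{K}$ per cycle.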
Fig. 3 Left: An example with minor hysteresis loops. This case is excluded by simple excitation.
Right: An example with an intermediate intersection
Fig. 4 Left: paths p+ and p− between the turning points (h0 , b0 ) and (h1 , b1 ) form a simple loop without intersection. Right: an incomplete cycle
and its incremental section of the area. Notice, p− is traversed in reverse direction
Prerequisites First, we consider a simple excitation such that the magnetic field
strength is monotonically increased from $h_0$ to $h_1$ and then monotonically decreased
back to $h_0$. This ensures that there is no minor hysteresis loop (see Fig. 3, left).
Still, this is not sufficient to ensure that the bh-loop has no (intermediate)
intersections (see Fig. 3, right). Below, sufficient conditions are presented.
Now, let $(h_0, b_0)$ and $(h_1, b_1)$ denote the turning points of a simple loop (cf.
Fig. 4). Then, the loop can be split into two paths $p^+, p^- : [h_0, h_1] \to \mathbb{R}$ with
$$p^+(h) < p^-(h) \ \text{ for all } h \in (h_0, h_1), \qquad p^+(h_k) = p^-(h_k) = b_k, \quad k \in \{0, 1\}. \tag{10}$$
On-the-Fly Algorithm To initiate the loss model, we assume that the current state
$(h_0, b_0)$ is a turning point of a simple hysteresis loop, where h is (w.l.o.g.) increased.
Thus, the simulation will follow the curve $p^+$ (using $\mu^+_{\mathrm{diff}}$), see Fig. 4. Now, a
second computation is simultaneously performed based on $\mu^-_{\mathrm{diff}}$ to follow the reverse
direction, which results in a prediction of the return path $p^-$. Both simulations might
use, e.g., an ODE solver. For discrete steps, the resulting trapezoids (see Fig. 5) can
be summed up to approximate the loop area. The incorporated halving of the area
takes into account that there is a forward and a backward phase. This continues until
the second reversal point is reached. The procedure is then restarted from this point.
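A minimal sketch of the accumulation step described above: both paths are advanced side by side from the turning point, and the trapezoids between them are summed. The function name, the explicit midpoint integrator and the toy slopes are assumptions for illustration; any hysteresis model supplying the two differential permeabilities could be plugged in.

```python
def forward_half_loss(mu_plus, mu_minus, h0, b0, h1, n_steps):
    """Integrate p+ and the predicted return path p- side by side from the
    turning point (h0, b0) and accumulate the area between them.

    Returns (half_area, area); half of the loop area is attributed to the
    forward phase, the other half to the backward phase.
    """
    dh = (h1 - h0) / n_steps

    def midpoint_step(mu, h, b):
        # explicit midpoint rule for db/dh = mu(h, b)
        b_mid = b + 0.5 * dh * mu(h, b)
        return b + dh * mu(h + 0.5 * dh, b_mid)

    h, b_plus, b_minus = h0, b0, b0
    area = 0.0
    for _ in range(n_steps):
        gap_old = b_minus - b_plus
        b_plus = midpoint_step(mu_plus, h, b_plus)
        b_minus = midpoint_step(mu_minus, h, b_minus)
        h += dh
        gap_new = b_minus - b_plus
        area += 0.5 * (gap_old + gap_new) * dh   # one trapezoid
    return 0.5 * area, area


# Toy slopes (hypothetical): p+(h) = h^2 and p-(h) = h on [0, 1],
# which enclose the area 1/6 and meet again at h = 1.
half, full = forward_half_loss(lambda h, b: 2.0 * h, lambda h, b: 1.0,
                               h0=0.0, b0=0.0, h1=1.0, n_steps=100)
```

Only the two current path values and the running sum are stored, so the memory cost is independent of the number of steps.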
Fig. 5 For discrete points, the model results in trapezoids
Fig. 6 Nearly steady state: actual turning point after (left) and before (right) the intersection of
the curves p+ and p−
Non-steady State Loss Computation For simple closed bh-loops (steady state)
the proposed method is accurate up to the numerical precision of the employed
solver. If the actual turning point $(h_1, b_1)$ does not coincide with the
intersection of $p^+$ and the predicted curve $p^-$ (as depicted in Fig. 6), we are not in
steady state. We can prove (via a fixed-point argument) that this model exhibits
convergent behavior for simple periodic inputs. Hence, non-steady states
converge to the steady state. Numerical examples are presented in Sect. 4. If the
difference between the actual and the predicted reversal point is small enough (criterion
set by the user), we consider our model a valid approximation and call the state nearly
steady.
Analytical Results Next, we develop criteria that guarantee the existence of at least
one further intersection point $(h_1, b_1)$ based on a turning point $(h_0, b_0)$ and the
corresponding paths $p^+$ and $p^-$. In a second step, we investigate when
$(h_0, b_0)$ and $(h_1, b_1)$ are the only intersection points. It is then shown that both
Tellinen’s model [8] and the thermal extension [5] converge towards the steady
state.
Fig. 7 Example with the same $B^{\pm}_{\mathrm{sat}}$ as in Fig. 2, two paths $p^+$ and $p^-$ and $b_{\mathrm{eq}}$. Left: Only two
intersections of $p^{\pm}$. Right: More than two intersections of $p^{\pm}$
Lemma 5 Let $B^+_{\mathrm{sat}}$ fulfil (12) and the temperature T be constant. We start from the
operation point $(h_0, b_0) \in I$ and h varies periodically between $h_0$ and $h_1$ with
$h_0 < h_1$. The sequence of b-values $B_k$ ($k \in \mathbb{N}$) at the turning point given by $h_0$
(computed by Tellinen’s model) is convergent for $k \to \infty$. The resulting stable loop
is unique and depends only on the choice of $h_0$ and $h_1$, but not on $b_0$.
Proof (sketch) First, we define one iteration. To this end, let $b^+(h)$ be the solution
of the ODE $\frac{db^+}{dh} = \mu^+_{\mathrm{diff}}(h, b^+)$ with $b^+(h_0) = b_0$ and $\mu^+_{\mathrm{diff}}$ as in (4). At $h = h_1$,
we have
$$b_1 = b^+(h_1) = b_0 + \int_{h_0}^{h_1} \mu^+_{\mathrm{diff}}(h, b^+(h))\, dh. \tag{13}$$
The reverse direction is the same. Let $b^-(h)$ be the solution of $\frac{db^-}{dh} = \mu^-_{\mathrm{diff}}(h, b^-(h))$
with $b^-(h_1) = b_1$. Evaluated at $h_0$, this results in $b_2 := b^-(h_0)$
(analogous to (13)).
Let $\varphi : [B^+_{\mathrm{sat}}(h_0), B^-_{\mathrm{sat}}(h_0)] \to [B^+_{\mathrm{sat}}(h_0), B^-_{\mathrm{sat}}(h_0)]$ denote the resulting b-value
at $h_0$ after one iteration (starting from $\bar{b}$), i.e.,
$$\varphi(\bar{b}) = \bar{b} + \int_{h_0}^{h_1} \mu^+_{\mathrm{diff}}(h, b^+(h))\, dh + \int_{h_1}^{h_0} \mu^-_{\mathrm{diff}}(h, b^-(h))\, dh \tag{14}$$
with $b^+$ defined w.r.t. $(h_0, \bar{b})$ as in (13) and $b^-$ w.r.t. $(h_1, b^+(h_1))$. Now, we construct a
sequence $B_k$ via $B_{k+1} = \varphi(B_k)$ and $B_0 = b_0$. If $\varphi(\bar{b}) = \bar{b}$ holds, the resulting loop
is closed, i.e., it is stable.
$\mu^+_{\mathrm{diff}}(h, b)$ and $\mu^-_{\mathrm{diff}}(h, b)$ are strictly monotone in the second component. This
causes two ODE solutions of $\frac{db}{dh} = \mu^+_{\mathrm{diff}}$ with different initial values to become
closer to each other. An analogous statement can be made for $\mu^-_{\mathrm{diff}}$.
Given (h0 , b0 ), (h0 , b1 ) ∈ I , we can prove, that q ∈ [0, 1) exists, such that
Fig. 8 Example of convergence. A random starting point (h0 , b0 ) ∈ I and h1 > h0 is chosen.
Periodically and monotonously alternating between h0 and h1 converges to a stable loop
4 Numerical Results
As an academic example, a material is defined by $B^{\pm}_{\mathrm{sat}}$ as depicted in Fig. 8, left. We
choose $h_0 = -1.8 \times 10^5$ A/m, $h_1 = 2.5 \times 10^5$ A/m and $b_0 = -1.5$ T. Starting at
$(h_0, b_0) \in I$, the ordinary differential equation $\frac{db}{dh} = \mu^+_{\mathrm{diff}}(h, b)$ is solved numerically
on the interval $[h_0, h_1]$ by a Runge-Kutta method. Then, $\frac{db}{dh} = \mu^-_{\mathrm{diff}}(h, b)$
is solved backward from $h_1$ to $h_0$, where the initial value is the final value of
the previous computation. This procedure is repeated n times. This results in the
sequence of b-values at $h_0$: $B_0, \ldots, B_n$. As seen in Fig. 8, right, the absolute
difference $|B_n - B_{n-1}|$ converges to zero. Even a low number of loops n results in a nearly
steady state and thus allows us to apply the loss model presented above.
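The repeated forward and backward solve can be sketched as a fixed-point iteration. Everything specific below is an assumption for illustration: the tanh-shaped limiting curves in normalized units (not the material of Fig. 8), the constant `MU0` standing in for $\mu_0$, the interpolation form used for the two differential permeabilities, and the classical fourth-order Runge-Kutta scheme with a fixed number of steps.

```python
import math

MU0 = 0.1  # stand-in for mu_0 in normalized toy units (assumption)

def b_plus_sat(h):    # ascending (lower) limiting curve
    return math.tanh(h - 0.5) + MU0 * h

def b_minus_sat(h):   # descending (upper) limiting curve
    return math.tanh(h + 0.5) + MU0 * h

def d_b_plus_sat(h):
    return 1.0 - math.tanh(h - 0.5) ** 2 + MU0

def d_b_minus_sat(h):
    return 1.0 - math.tanh(h + 0.5) ** 2 + MU0

def mu_plus(h, b):    # assumed interpolation form for increasing h
    w = (b_minus_sat(h) - b) / (b_minus_sat(h) - b_plus_sat(h))
    return MU0 + w * (d_b_plus_sat(h) - MU0)

def mu_minus(h, b):   # assumed interpolation form for decreasing h
    w = (b - b_plus_sat(h)) / (b_minus_sat(h) - b_plus_sat(h))
    return MU0 + w * (d_b_minus_sat(h) - MU0)

def rk4(mu, h_start, h_end, b, n=400):
    dh = (h_end - h_start) / n   # negative when integrating backward
    h = h_start
    for _ in range(n):
        k1 = mu(h, b)
        k2 = mu(h + 0.5 * dh, b + 0.5 * dh * k1)
        k3 = mu(h + 0.5 * dh, b + 0.5 * dh * k2)
        k4 = mu(h + dh, b + dh * k3)
        b += dh * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        h += dh
    return b

def phi(b_bar, h0, h1):
    # one cycle: forward along p+, then backward along p-
    b1 = rk4(mu_plus, h0, h1, b_bar)
    return rk4(mu_minus, h1, h0, b1)

h0, h1 = -2.0, 2.0
B = [b_plus_sat(h0)]             # start on the lower limiting curve
for _ in range(6):
    B.append(phi(B[-1], h0, h1))
diffs = [abs(B[k + 1] - B[k]) for k in range(6)]
```

In this toy setting the entries of `diffs` shrink from one cycle to the next, which mirrors the contraction argument of the proof sketch.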
We have proposed a loss model with on-the-fly computation for Tellinen’s hysteresis
model. In steady state, it yields a precise computation of the hysteresis loss
associated with the model. It is thus a valid approximation for a nearly steady state. Moreover,
the convergence towards a stable bh-loop is proven. We note that our model is not
suited for complex waveforms or rotating fields, cf. [4]. As a model feature, we
stress that this hysteresis loss model employs only rough material data and incurs
little computational and memory cost. Moreover, this model can be combined with
hysteresis models other than Tellinen’s, if the value of b can be computed for changes
in h. However, the properties of Tellinen’s model make it an almost optimal candidate.
Our next step will be the integration of this loss model into a finite element
magnetoquasistatic field simulation and its analysis. Special attention will be paid
to whether this model can be applied per node.
References
1. A.P.S. Baghel, S. Kulkarni, Dynamic loss inclusion in the Jiles–Atherton (JA) hysteresis model
using the original JA approach and the field separation approach. IEEE Trans. Magn. 50, 369–
372 (2014)
2. L.R. Dupre, R. Van Keer, J.A.A. Melkebeek, An iron loss model for electrical machines using
the Preisach theory. IEEE Trans. Magn. 33(5), 4158–4160 (1997)
3. G. Friedman, I.D. Mayergoyz, Hysteretic energy losses in media described by vector Preisach
model. IEEE Trans. Magn. 34(4), 1270–1272 (1998)
4. A. Krings, J. Soulard, Overview and comparison of iron loss models for electrical machines. J.
Electr. Eng. 10, 162–169 (2010)
5. J. Kühn, A. Bartel, P. Putek, A thermal extension of Tellinen’s scalar hysteresis model, in
Proceedings of the SCEE 2018, ed. by G. Nicosia, V. Romano. Springer (2020), pp. 55–63
6. S. Steentjes, K. Hameyer, D. Dolinar, M. Petrun, Iron-loss and magnetic hysteresis under
arbitrary waveforms in NO electrical steel: A comparative study of hysteresis models. IEEE
Trans. Ind. Electron. 64(3), 2511–2521 (2017)
7. C.P. Steinmetz, On the law of hysteresis. Proc. IEEE 72(2), 197–221 (1984)
8. J. Tellinen, A simple scalar model for magnetic hysteresis. IEEE Trans. Magn. 34(4), 2200–2206
(1998)
Hybrid Modeling: Towards the Next
Level of Scientific Computing
in Engineering
Stefan Kurz
S. Kurz
Bosch Center for Artificial Intelligence, Renningen, Germany
Centre for Computational Engineering, Technical University of Darmstadt, Darmstadt, Germany
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_25
252 S. Kurz
Following again [3], we call this approach the Keplerian paradigm. Both paradigms
complement each other. For a simple enough model system, Kepler’s laws can be
derived from Newton’s theory. Conversely, starting from a two-body model system,
actual trajectories of celestial bodies can be modeled by Newton’s laws plus data-
driven terms that correct for perturbations due to effects that are not present in the
model.
In modern terms, we call this complementary approach hybrid modeling.
Definition Hybrid models combine first principle-based models with data-based
models into a joint architecture, supporting enhanced model qualities, such as
robustness and explainability.
First principles express formalized domain knowledge. For the purpose of this
paper the domain knowledge results from physics. But there are other possibilities,
such as statistics (e.g., probabilistic graphical models [7, Ch. 8]) or discourse
(e.g., ontologies [8]). Data may be obtained from any source, in particular from
observation or simulation. We find also the somewhat narrower terms scientific
machine learning [9], physics-based machine learning [10] and predictive data
science [4], respectively.
Consider a high-dimensional manifold that contains some big data. It might
be that a submanifold can be identified, which is dictated to us by the laws of
physics, e.g. regarding admissible system dynamics. Learning algorithms can then
be used to project the data into this submanifold. In other words, the structure of
the submanifold embeds physical constraints. A classical example is Kálmán filtering
[11], and an example is presented in Sect. 2. Kálmán filtering was actually an
enabling technology for the moon landing in 1969, where the goal was landing
within ≈500 m after ≈400,000 km of travel. More preference is given to physics
or data, depending on the level of uncertainty. Ensemble Kálmán filtering is used
in weather forecasting centres worldwide [12]. They have to deal with about $10^6$
incoming data points per hour, and mathematical models with about $10^9$ states.
Ensemble Kálmán filtering can be recognized as a Gaussian hidden Markov model
[13]. This use case is similar to digital twinning, since data from the field is acquired
and used to update the models. Citing [4, p. 39]: “Learning from data through the
lens of models is a way to exploit structure in an otherwise intractable problem.”
Looking closer into engineering, we notice that a large class of physics models
can be decomposed into conservation laws and constitutive laws [14, Ch. 1.3], [15].
The conservation laws are of topological nature and can therefore be discretized
easily, leaving little room for data-driven techniques. The situation is different for
the constitutive relations, which are of metric nature, and encode phenomenological
material properties. Except for simple media (local, linear) there are many potential
complications (non-local, hysteretic, non-linear, multi-scale, multi-physics, etc.).
Here, data-driven models can be useful, provided that the models fulfil certain
admissibility criteria, which can often be expressed in terms of invariance with
respect to symmetry groups (orthogonal group, Lorentz group, etc.). This is
showcased in Sect. 3.
Hybrid Modeling 253
To sum up, hybrid modeling has the potential to improve the Pareto tradeoff
between simulation accuracy and simulation cost significantly, and therefore bring
scientific computing in engineering to the next level. In the remainder of the paper
we will showcase this by some recent achievements.
Fig. 1 Field quality maps of a dipole field in a rectangular magnet cross section (dimensions indicated in the figure: 40 mm and 360 mm). Top: interpolation
of measurement data. Bottom: reconstruction from a BEM model. The colours ranging from blue
to red indicate the deviation from an ideal dipole field, in logarithmic scale
Table 1 Bayesian update and Kálmán update. Quantities ν and d are to be understood as random
variables with probability distributions p(·). The number of degrees of freedom of the model is
denoted by N, and the dimension of measurement data by M
$\nu$                                    State vector of BEM model
$\nu \sim \mathcal{N}(\bar{\nu}, Q)$     $\bar{\nu} \in \mathbb{R}^{N}$ mean values
                                         $Q \in \mathbb{R}^{N \times N}$ covariance matrix, process noise
$d$                                      Measurement data vector
$d\,|\,\nu \sim \mathcal{N}(M\nu, R)$    $M \in \mathbb{R}^{M \times N}$ discrete measurement operator
                                         $R \in \mathbb{R}^{M \times M}$ covariance matrix, measurement noise
Under the assumption of normal distributions, this can easily be computed explicitly. Then,
the Bayesian update turns into a Kálmán update, which can be readily expressed in
terms of linear algebra operations,
$$\bar{\nu} \mapsto \bar{\nu} + K(d - M\bar{\nu}), \tag{2a}$$
$$Q \mapsto (I - KM)\,Q, \tag{2b}$$
where
$$K := Q M^{\top} \left( M Q M^{\top} + R \right)^{-1} \in \mathbb{R}^{N \times M} \tag{3}$$
is the Kálmán gain matrix. Matrix M is the measurement matrix. It maps the degrees
of freedom of the BEM model to the measured quantities. These are flux density
vectors in case of Hall probe measurements, and magnetic fluxes in case of coil-
based systems. Technically, this amounts to evaluating the integral operator of the
double layer potential, in terms of the discrete model. In actual applications this
approach is extended to a box-shaped domain in three dimensions, cf. Fig. 2. The
Kálmán update results in a three-step procedure.
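Before the individual steps, the update (2a), (2b) with the gain (3) can be sketched for the simplest case of a single scalar measurement, where the matrix inverse in (3) reduces to a division. The function name, the plain-list data layout and the numbers are assumptions for illustration; a BEM model would supply the actual measurement row.

```python
def kalman_update_scalar(nu, Q, m, r, d):
    """Kalman update (2a), (2b) for a single scalar measurement d.

    nu: prior mean (length N), Q: prior covariance (N x N, list of rows),
    m: the single row of the measurement matrix (length N),
    r: measurement noise variance, d: measured value.
    """
    n = len(nu)
    Qm = [sum(Q[i][j] * m[j] for j in range(n)) for i in range(n)]   # Q m^T
    s = sum(m[i] * Qm[i] for i in range(n)) + r                      # m Q m^T + r
    K = [q / s for q in Qm]                                          # gain (3)
    residual = d - sum(m[i] * nu[i] for i in range(n))               # d - m nu
    nu_new = [nu[i] + K[i] * residual for i in range(n)]             # (2a)
    mQ = [sum(m[k] * Q[k][j] for k in range(n)) for j in range(n)]   # m Q
    Q_new = [[Q[i][j] - K[i] * mQ[j] for j in range(n)]
             for i in range(n)]                                      # (2b)
    return nu_new, Q_new


# Zero-mean prior with unit covariance; the first component is measured.
nu_post, Q_post = kalman_update_scalar(
    nu=[0.0, 0.0], Q=[[1.0, 0.0], [0.0, 1.0]], m=[1.0, 0.0], r=1.0, d=2.0
)
```

With equal prior and measurement variance, the posterior mean of the measured component lies halfway between prior and measurement and its variance is halved, while the unmeasured component is unchanged.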
1. We select some prior from previous measurements or simulations. In the simplest
case, we start from zero, with some estimate for the covariance matrix, i.e. ν ∼
N (0, Q). This is a so-called smoothing prior. In fact, the reconstruction of
the dipole layer from the measured field boils down to an inverse problem,
Fig. 2 Hybrid model combining measured data with the BEM. Top left: A translating induction
coil consists of multiple single-wire loops in x direction. For fixed y, it measures magnetic flux
increments ΔΦ between successive trigger points along the z axis. Bottom left: The covariance
matrix can be estimated from an ensemble of runs. The figure shows frequency distributions for
three exemplary positions. Right: Contour plots of posterior mean and variance field magnitudes
where H denotes the magnetic field strength, j the imposed source current density,
B the magnetic flux density, and Ω the considered domain. Moreover, we assume
1 We do not delve into regularity considerations or functional analytical frameworks here.
2 This formulation was proposed at the Compumag Conference 1983 in Genoa. The related
variational principle was called “Ligurian”, in honor of the Genoa region, and in similarity to
“Lagrangian” [21, p. 49].
Fig. 3 Iterative data-driven solver. Left: Measured B(H )-characteristic. Active measurement data
(red crosses) are closest to given field points (blue circles). Right: The outer fixed point iteration
combines solutions of a variational principle by a modified FE solver (blue circles) with discrete
optimizations that select states associated with active measurement data (red crosses)
3 In this model problem, an array of slot machines is considered. The gambler must balance the
goal of finding the slot machine with the highest gain (exploration) with the goal of achieving good
results on every play (exploitation).
Fig. 4 Schematic for system simulation: extracted equivalent electrical circuit (EEC) of the trace
pair, surrounded by mode converters
Fig. 5 Trace pair with bend. Left: The outer trace is fixed, the inner trace has a free interior surface.
Right: Shape gradient vectors at the free surface for the two objectives (i) and (ii)
achieved so far and computes the Expected Improvement (EI). The next sample is
taken at the point with the largest EI; this yields yet another optimization problem.
The BO algorithm stops if the EI drops below some threshold. The BO approach can
be generalized in various ways, such as BO with noise, BO in several dimensions,
and BO for several objectives.
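The selection rule described above can be sketched with the standard closed-form Expected Improvement for minimization under a Gaussian prediction. The surrogate that would supply the mean and standard deviation at each candidate point is not shown; the candidate list, the function names and the threshold value are assumptions for illustration, and the finite candidate list replaces the inner optimization problem mentioned in the text.

```python
import math


def expected_improvement(mu, sigma, f_best):
    """EI for minimization under a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (f_best - mu) * cdf + sigma * pdf


def next_sample(candidates, f_best, threshold):
    """candidates: list of (x, mu, sigma) from a surrogate model.

    Returns the candidate with the largest EI, or None as the point if
    the largest EI falls below the stopping threshold.
    """
    best_x, best_ei = None, -1.0
    for x, mu, sigma in candidates:
        ei = expected_improvement(mu, sigma, f_best)
        if ei > best_ei:
            best_x, best_ei = x, ei
    if best_ei < threshold:
        return None, best_ei
    return best_x, best_ei


# Hypothetical surrogate predictions (x, mean, standard deviation).
candidates = [(0.0, 1.0, 0.0), (0.5, 0.0, 1.0), (1.0, 2.0, 0.1)]
x_next, ei_next = next_sample(candidates, f_best=0.0, threshold=1e-3)
```

The uncertain candidate wins here although its predicted mean is no better than the best value so far, which is how EI trades off exploration against exploitation.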
As an industrial example we consider BO of a differential trace pair on a printed
circuit board. Differential signalling benefits from high immunity against electro-
magnetic interference and low crosstalk. However, bend discontinuities in transmis-
sion lines introduce (i) reflection and (ii) differential-to-common-mode conversion.
An optimal design hence requires multi-objective optimization of the geometry.
A parametric case was studied in [25], while we aim at free-shape optimization.
Figure 4 shows a schematic for system simulation. The trace pair with ports 1,1’
and 2,2’, respectively, is described by an equivalent electrical circuit (EEC). Mode
converters admit a separation of differential mode (D) and common mode (C) signal
components. The optimization objectives can be stated in terms of S-parameters:⁴
(i) reflection $|S_{DD11}| \overset{!}{=} \min$; (ii) mixed-mode conversion $|S_{CD21}| \overset{!}{=} \min$.
The geometric setting is depicted in Fig. 5 left. The outer trace is fixed, while the
inner trace has a free interior surface. The geometry is described by a finite element
mesh, and the free surface can be re-shaped by mesh morphing. This corresponds
to a high-dimensional design space with ≈200 dimensions. This should be put
in contrast to the six-dimensional design space that was considered in [25]. The
optimization problem is: Find the Pareto front for the shape of the free surface that
minimizes the objectives.
The ingredients for solving the optimization problem are: finite element electro-
magnetic field solver, EEC extraction, and adjoint sensitivity analysis. Figure 5 right
shows the shape gradient vectors at the free surface for the two objectives (i) and
(ii). The two gradient vector fields point in opposite directions, so the objectives are
conflicting. However, the gradient fields are not exactly negatives of each other, so
there is still subtle room for improvement.
The BO is extended to the multi-objective case as follows. The Pareto front is
approached via a sequence of auxiliary optimization problems, each with respect to
a certain 2D affine subspace of the high-dimensional design space. This particular
affine subspace is spanned by the adjoint-based gradients; it is the subspace of
maximum objective variance. For each optimization problem of the sequence, BO
learns and optimizes GP surrogate models for the objective functions, restricted to
this subspace. Once the intermediate Pareto front is converged in this subspace,
new subspaces may be chosen on the intermediate Pareto front. Figure 6 shows the
result of this algorithm, after only ≈100 design evaluations. Note that even subtle
improvement potentials will be exploited by the hybrid free-shape optimizer.
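The restriction to a 2D affine subspace spanned by the two gradients can be illustrated with a small sketch. It assumes that the design is a plain vector of mesh-morphing parameters and that the two adjoint-based gradients are available as lists; all names are hypothetical. Gram-Schmidt orthonormalisation is used here as one possible way to obtain coordinates in the subspace, which is an assumption and not necessarily the construction used by the author.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def gradient_subspace(g1, g2, tol=1e-12):
    """Orthonormal basis (q1, q2) of span{g1, g2} via Gram-Schmidt.

    Raises ValueError if the gradients are (numerically) parallel or
    antiparallel, since then they do not span a 2D subspace.
    """
    n1 = math.sqrt(dot(g1, g1))
    if n1 == 0.0:
        raise ValueError("first gradient is zero")
    q1 = [x / n1 for x in g1]
    proj = dot(q1, g2)
    r = [y - proj * x for x, y in zip(q1, g2)]  # part of g2 orthogonal to g1
    nr = math.sqrt(dot(r, r))
    if nr <= tol * math.sqrt(dot(g2, g2)):
        raise ValueError("gradients are (anti)parallel")
    q2 = [x / nr for x in r]
    return q1, q2


def embed(x0, q1, q2, a, b):
    """Map 2D coordinates (a, b) to the design x0 + a*q1 + b*q2."""
    return [x + a * u + b * v for x, u, v in zip(x0, q1, q2)]
```

The surrogate models are then functions of the two coordinates `(a, b)` only, while each evaluation still acts on the full design vector returned by `embed`.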
Fig. 6 Converged Pareto front for the trace pair with bend (green dots). The algorithm was started
with a parametrically optimized design (black dot). An optimized design from the Pareto front was
selected as an example (red dot)
Fig. 7 Field inversion and machine learning. Step (1): Identify the model errors ε i (x, t) as defined
in (7) for given time series Y i , by solving PDE-constrained optimization problems. Step (2): Train
a NN that describes a mapping from state variables (more precisely: features thereof) to model
errors, by using the results from the previous step. Step (3): Include the correction operator ε̂ in the
state equation
Field Inversion and Machine Learning (FIML) [29] This method stems from
computational fluid dynamics (CFD). For turbulent flows one may either solve
Navier-Stokes equations by direct numerical simulation (DNS) or large eddy
simulation (LES). This approach is accurate but numerically expensive, since it
involves a range of space and time scales. On the other hand, one may use the
Reynolds-averaged Navier-Stokes (RANS) method, where turbulence effects are
accounted for by phenomenological models rather than first principles. This method
is much more efficient but less accurate. With the help of FIML, both approaches
can be combined.
Going beyond CFD, on an abstract level, let some system dynamics be governed
by a low-fidelity state equation of the form
\[ (\partial_t + D)\, u(x, t) = \varepsilon(x, t), \tag{7} \]
where ∂t is the time derivative, D is the differential operator in space, u(x, t) is the
state variable, and ε(x, t) is the (unknown) model error. We consider a discretized
setting. Assume that an observable y(u) is defined by some functional of the state
variable, and several observed time series Y i , i = 1, . . . , N are available, either
measured or from high-fidelity simulation. The idea of FIML is to learn a correction
operator ε̂ to account for the model error, cf. Fig. 7. Note that the NN does not
directly operate on the state variable, but rather on some low-dimensional feature
set f (u). Some achievements, limitations and further developments of this method
applied to airfoil modeling can be found in [30].
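The three FIML steps can be made concrete with a deliberately simplified toy sketch. It assumes a scalar state, a low-fidelity model of the form (7) with D u = d·u discretised by the explicit Euler scheme, and an observable equal to the state itself, so that the inversion step reduces to evaluating the residual instead of solving a constrained optimization problem. A one-parameter least-squares fit on the feature f(u) = u³ stands in for the NN. All names and numbers are hypothetical.

```python
def simulate(u0, dt, steps, d, correction=None):
    """Explicit Euler for du/dt + d*u = eps(u); eps = 0 if no correction."""
    u = [u0]
    for _ in range(steps):
        eps = correction(u[-1]) if correction else 0.0
        u.append(u[-1] + dt * (-d * u[-1] + eps))
    return u


def invert_model_error(Y, dt, d):
    """Step (1): model error that makes the discrete model reproduce Y."""
    return [(Y[k + 1] - Y[k]) / dt + d * Y[k] for k in range(len(Y) - 1)]


def feature(u):
    """Low-dimensional feature of the state (an assumed choice)."""
    return u ** 3


def fit_correction(Y, eps):
    """Step (2): least-squares fit eps ~ w * feature(u), replacing the NN."""
    num = sum(feature(Y[k]) * eps[k] for k in range(len(eps)))
    den = sum(feature(Y[k]) ** 2 for k in range(len(eps)))
    w = num / den
    return w, (lambda u: w * feature(u))


# Synthetic "high-fidelity" data: the true dynamics contain a cubic term
# that the low-fidelity model du/dt + d*u = 0 does not know about.
d, c, dt, steps = 1.0, 0.5, 0.01, 200
Y = simulate(1.0, dt, steps, d, correction=lambda u: -c * u ** 3)

eps = invert_model_error(Y, dt, d)      # step (1)
w, eps_hat = fit_correction(Y, eps)     # step (2)
u_corr = simulate(1.0, dt, steps, d, correction=eps_hat)  # step (3)
```

In this toy setting the fitted weight recovers the hidden coefficient, and the corrected low-fidelity model reproduces the observed series. With partial observations or a PDE, step (1) becomes a genuine inverse problem.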
Epilog We have discussed hybrid modeling mainly from a physics-based perspec-
tive, where significant advantages could be achieved by joining with data-driven
models. Conversely, hybrid modeling is also beneficial from the standpoint of
industrial AI. In contrast to consumer AI, industrial AI focuses on smart products
and their creation. Such AI should be robust, that is sufficiently tolerant against
perturbations, and explainable, that is, the AI function can be made comprehensible
Acknowledgments Support in preparing the examples as well as inspiring discussions with the
following colleagues are acknowledged: Armin Galetzka (TU Darmstadt); Andreas Klaedtke,
Xiaobai Li, Manuel Schmidt (Bosch Corporate Research); Melih Kandemir, Zico Kolter (Bosch
Center for Artificial Intelligence); Melvin Liebsch (CERN).
References
1. M. Walker, Hype cycle for emerging technologies, 2018. Tech. Rep. G00340159, Gartner
Research (2018). https://www.gartner.com/en/documents/3885468/hype-cycle-for-emerging-technologies-2018
2. P.V. Coveney, E.R. Dougherty, R.R. Highfield, Big data need big theory too. Philos. Trans.
R. Soc. A: Math. Phys. Eng. Sci. 374(2080), 20160153 (2016). https://royalsocietypublishing.org/doi/abs/10.1098/rsta.2016.0153
3. W. E, Machine learning: Mathematical theory and scientific applications, in ICIAM – International
Congress on Industrial and Applied Mathematics (2019). https://web.math.princeton.edu/~weinan/ICIAM.pdf
4. K. Willcox, Predictive data science for physical systems – from model reduction to scientific
machine learning, in ICIAM – International Congress on Industrial and Applied Mathematics
(2019). https://kiwi.oden.utexas.edu/papers/Willcox-Predictive-Data-Science-ICIAM-2019.pdf
5. C.F. Higham, D.J. Higham, Deep learning: an introduction for applied mathematicians. SIAM
Rev. 61(4), 860–891 (2019)
6. Wikipedia contributors: Newton’s laws of motion – Wikipedia, the free encyclopedia
(2019). https://en.wikipedia.org/w/index.php?title=Newton%27s_laws_of_motion&oldid=915580692
[Online. Accessed 26 September 2019]
7. C.M. Bishop, Pattern Recognition and Machine Learning (Springer, New York, 2006)
8. N. Guarino, D. Oberle, S. Staab, What is an ontology?, in Handbook on Ontologies (Springer,
New York, 2009), pp. 1–17
9. S. Lee, N. Baker, Basic research needs for scientific machine learning: core technologies for
artificial intelligence. Tech. rep., USDOE Office of Science (SC) (United States) (2018).
https://www.osti.gov/servlets/purl/1484362
10. R. Swischuk, L. Mainini, B. Peherstorfer, K. Willcox, Projection-based model reduction:
Formulations for physics-based machine learning. Comput. Fluids 179, 704–717 (2019)
11. R.E. Kálmán, A new approach to linear filtering and prediction problems. Trans. ASME–J.
Bas. Eng. 82, 35–45 (1960)
12. A. Stuart, The legacy of Rudolph Kálmán – blending data and mathematical models, in Boeing
Distinguished Colloquia, Univ. Washington (2019). https://www.sfb1294.de/fileadmin/user_upload/Kalman_Lectures/1st_Kalman_Lecture_2018_Andrew_Stuart.pdf
13. W. Pieczynski, F. Desbouvries, Kálmán filtering using pairwise Gaussian models, in 2003 IEEE
International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.
(ICASSP’03), vol. 6, pp. VI–57–VI–60 (IEEE, New York, 2003)
14. E. Tonti, The Mathematical Structure of Classical and Relativistic Physics (Springer, New
York, 2013)
15. E. Tonti, Discrete physics – algebraic formulation of physical fields (2014).
http://www.discretephysics.org/en/. [Online; Accessed 26 September 2019]
16. M. Liebsch, S. Russenschuck, S. Kurz, Boundary-element methods for field reconstruction in
accelerator magnets. IEEE Trans. Magn. 56(3), 1–4 (2020)
17. J.M. Bardsley, Computational Uncertainty Quantification for Inverse Problems, vol. 19
(SIAM, New York, 2018)
18. H. De Gersem, A. Galetzka, I.G. Ion, D. Loukrezis, U. Römer, Magnetic field simulation with
data-driven material modeling (2020). Preprint. arXiv: 2002.03715
19. T. Kirchdoerfer, M. Ortiz, Data-driven computational mechanics. Comput. Methods Appl.
Mech. Eng. 304, 81–101 (2016)
20. J. Rikabi, C. Bryant, E. Freeman, An error-based approach to complementary formulations of
static field solutions. Int. J. Numer. Methods Eng. 26(9), 1963–1987 (1988)
21. B. Trowbridge, Compumag conference – the first 25 years (2001).
https://www.compumag.org/wp/wp-content/uploads/2018/07/TwentyFiveYearsOfCompumag.pdf. [Online; Accessed
16 April 2020]
22. S. Conti, S. Müller, M. Ortiz, Data-driven problems in elasticity. Arch. Ration. Mech. Anal.
229(1), 79–123 (2018)
23. S. Schuhmacher, A. Klaedtke, C. Keller, W. Ackermann, H. De Gersem, Adjoint technique for
sensitivity analysis of coupling factors according to geometric variations. IEEE Trans. Magn.
54(3), 1–4 (2018)
24. P.I. Frazier, Bayesian optimization, in Recent Advances in Optimization and Modeling of
Contemporary Problems (INFORMS, Catonsville, 2018), pp. 255–278
25. C. Gazda, I. Couckuyt, H. Rogier, D.V. Ginste, T. Dhaene, Constrained multiobjective
optimization of a common-mode suppression filter. IEEE Trans. Electromagn. Compatibil.
54(3), 704–707 (2012)
26. M. Raissi, P. Perdikaris, G.E. Karniadakis, Physics-informed neural networks: a deep learning
framework for solving forward and inverse problems involving nonlinear partial differential
equations. J. Comput. Phys. 378, 686–707 (2019)
27. Y. Zhu, N. Zabaras, P.S. Koutsourelakis, P. Perdikaris, Physics-constrained deep learning for
high-dimensional surrogate modeling and uncertainty quantification without labeled data. J.
Comput. Phys. 394, 56–81 (2019)
28. F. de Avila Belbute-Peres, K. Smith, K. Allen, J. Tenenbaum, J.Z. Kolter, End-to-end
differentiable physics for learning and control, in Advances in Neural Information Processing
Systems (2018), pp. 7178–7189
29. E.J. Parish, K. Duraisamy, A paradigm for data-driven predictive modeling using field inversion
and machine learning. J. Comput. Phys. 305, 758–774 (2016)
30. J.R. Holland, J.D. Baeder, K. Duraisamy, Towards integrated field inversion and machine
learning with embedded neural networks for RANS modeling, in AIAA Scitech 2019 Forum
(2019), p. 1884
Machine Learning for Initial Value
Problems of Parameter-Dependent
Dynamical Systems
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 265
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_26
266 R. Pulch and M. Youssef
The mass matrix $M : \Pi \to \mathbb{R}^{n \times n}$ and the right-hand side $f : [t_0, t_f] \times \mathbb{R}^n \times \Pi \to \mathbb{R}^n$
include the parameters. Thus the solution $x : [t_0, t_f] \times \Pi \to \mathbb{R}^n$ depends both on
time and the parameters. If the mass matrix is non-singular, then (1) represents a
system of ODEs. If the mass matrix is singular, then a system of DAEs is given. We
examine initial value problems (IVPs)
In the case of DAEs, the initial values have to be consistent, see [5]. Consistent initial
values often depend on the parameters. We define a QoI $y : [t_0, t_f] \times \Pi \to \mathbb{R}$ by a
function $g : \mathbb{R}^n \to \mathbb{R}$ via
\[ y(t, p) = g(x(t, p)). \tag{3} \]
Each selection of the parameters yields a trajectory of the QoI in the time domain.
We obtain the mapping
\[ p \mapsto \{ (t, y(t, p)) : t \in [t_0, t_f] \} \tag{4} \]
for any p ∈ Π. Our aim is to construct an approximation of the mapping (4) that
can be evaluated cheaply, in particular without solving the IVPs (1), (2) any more.
The following approach can also be used for boundary value problems (BVPs)
of dynamical systems, because we only include the trajectories of the QoI in the
method. The trajectories are computed from either IVPs or BVPs.
3 Time Discretisation
We discretise the trajectories of the QoI (3) in the time domain [t0 , tf ]. Let
Each evaluation of (6) requires solving an IVP (1), (2), followed by the extraction of
the QoI (3). The IVPs of the dynamical systems are solved by numerical methods,
see [4, 5], like Runge-Kutta schemes and linear multistep methods. The methods
yield approximations of the solution in discrete time points, which are typically
determined by a local error control. Thus these time points are not identical to our
choice (5). Nevertheless, we obtain the solution in the points (5) by an interpolation
or a dense output in time.
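The transfer from the solver's non-uniform time points to a prescribed uniform grid can be sketched as follows. The sketch assumes the grid t_ℓ = t0 + ℓ Δt with Δt = (tf − t0)/m, ℓ = 1, …, m, and uses piecewise-linear interpolation as a simple stand-in; a dense output of the integrator or an interpolant matching the order of the method would typically be preferred. The function names are hypothetical.

```python
def uniform_grid(t0, tf, m):
    """Equidistant points t_l = t0 + l*dt, l = 1..m, with dt = (tf - t0)/m."""
    dt = (tf - t0) / m
    return [t0 + l * dt for l in range(1, m + 1)]


def resample(ts, ys, grid):
    """Piecewise-linear interpolation of (ts, ys) at the points in grid.

    ts must be strictly increasing and grid must be increasing; both are
    assumed to cover the same interval.
    """
    out = []
    j = 0
    for t in grid:
        # Advance to the solver interval [ts[j], ts[j+1]] containing t.
        while j < len(ts) - 2 and ts[j + 1] < t:
            j += 1
        w = (t - ts[j]) / (ts[j + 1] - ts[j])
        out.append((1.0 - w) * ys[j] + w * ys[j + 1])
    return out
```

For example, `resample(ts, ys, uniform_grid(0.0, 0.5, 200))` would yield the 200 values that form one discretised trajectory of the QoI.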
Stiff systems of ODEs and all DAEs require implicit methods in the time
integration. Therein, a nonlinear system of algebraic equations has to be solved
in each time step. Thus the computational effort becomes large. Our goal is to
determine an approximation of the mapping (6), whose evaluation is cheap.
4 Machine Learning
Fig. 1 Artificial NN with input layer (red), hidden layers (green), and output layer (blue)
including a matrix $A_j \in \mathbb{R}^{N_j \times N_{j-1}}$, a vector $b_j \in \mathbb{R}^{N_j}$, and the input $z \in \mathbb{R}^{N_{j-1}}$. In
the context of machine learning, the entries of $A_j$ and $b_j$ are called weights
and biases, respectively. The operator ρ is a nonlinear transfer function $\rho : \mathbb{R} \to \mathbb{R}$
(also called activation function). Typical choices are, for example, the hyperbolic
tangent sigmoid function
\[ \rho(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{8} \]
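A single layer with this transfer function can be sketched in a few lines. The sketch assumes that ρ is applied componentwise to the affine map A z + b; the function names are hypothetical. Note that (8) is algebraically identical to tanh(x), which the usage lines check numerically.

```python
import math


def tansig(x):
    """Hyperbolic tangent sigmoid (8): 2 / (1 + exp(-2x)) - 1."""
    if x < -350.0:
        # exp(-2x) would overflow; the function value is -1 to double precision.
        return -1.0
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def layer(A, b, z, rho=tansig):
    """Evaluate rho(A z + b) componentwise for a matrix A given as row lists."""
    return [rho(sum(a * zi for a, zi in zip(row, z)) + bj)
            for row, bj in zip(A, b)]


# Usage: a layer with two inputs and two neurons.
out = layer([[1.0, 2.0], [0.0, -1.0]], [0.5, 0.0], [1.0, 1.0])
same_as_tanh = abs(tansig(0.3) - math.tanh(0.3)) < 1e-12
```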
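\[
\begin{aligned}
S_{\text{train}} &= \{p_1, \ldots, p_k\} \subset \Pi \\
S_{\text{valid}} &= \{q_1, \ldots, q_{k'}\} \subset \Pi \\
S_{\text{test}} &= \{r_1, \ldots, r_{k''}\} \subset \Pi.
\end{aligned}
\tag{10}
\]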
For example, random samples can be chosen, where a uniform probability distri-
bution is assumed in the parameter domain Π. The minimisation is based on the
differences Θ(pi )−Ψ (pi ) for parameter tuples pi from the training set. The (vector-
valued) differences are measured using the mean squared error or the mean absolute
error. The error measure decreases monotonically for the parameters in the training set
due to the minimisation. The validation set is included to prevent overfitting. If
the error measure of the validation set increases, then the training is stopped and the
best previous state is returned. The test set is not involved in the minimisation at all.
Hence this set allows for an estimate of the quality of the trained NN.
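The role of the validation set in stopping the training can be sketched independently of any particular toolbox. The function below takes the sequence of validation errors recorded after each iteration step and reports where the training would stop and which earlier step would be returned. The parameter `patience` (the number of consecutive non-improving steps that is tolerated) is a hypothetical name and an assumption of this sketch.

```python
def early_stopping(val_errors, patience=1):
    """Return (stop_step, best_step) for a sequence of validation errors.

    Training stops once the validation error has failed to improve on the
    best value seen so far for `patience` consecutive steps; the state of
    the best step is the one that is kept.
    """
    best_step, best_err = 0, float("inf")
    bad = 0
    for step, err in enumerate(val_errors):
        if err < best_err:
            best_step, best_err, bad = step, err, 0
        else:
            bad += 1
            if bad >= patience:
                return step, best_step
    # No stop was triggered: training ran through all recorded steps.
    return len(val_errors) - 1, best_step
```

A larger `patience` makes the criterion more tolerant of small fluctuations in the validation error, at the price of a few additional iteration steps.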
All numerical computations were performed within the software MATLAB [8]
using the Deep Learning Toolbox.
We investigate an electric circuit introduced in [1], which is illustrated by Fig. 2.
This circuit performs a voltage doubling for specific choices of parameters and input
voltage. A mathematical modelling yields a nonlinear system of DAEs (1) with
n = 3 equations for the three unknown node voltages presented in [1]:
[Fig. 2: circuit diagram with input voltage u_in, resistors R1 and R2, capacitors C1 and C2, and nodes 0 to 3]
with amplitude A = 500 and period T = 0.1. The total time interval of our
simulations is [t0 , tf ] = [0, 0.5]. The initial values (2) are set to zero, which
represents a consistent case in this example. The backward differentiation formulas
(BDF), see [4], yield the numerical solutions of the IVPs. High accuracy requirements
are imposed in the local error control with relative tolerance $\varepsilon_{\text{rel}} = 10^{-4}$ and
absolute tolerance $\varepsilon_{\text{abs}} = 10^{-6}$. The error control generates approximations on
a non-uniform grid in time. We extract the trajectories of the QoI in m = 200
equidistant time points $t_\ell = \ell\,\Delta t$ for $\ell = 1, \ldots, m$ with $\Delta t = \frac{t_f - t_0}{m}$ by
interpolation. The order of accuracy coincides for both the uniform grid and the
non-uniform grid. The associated error of the time integration is negligible in comparison
to the approximation error of the NNs below. Figure 3 gives an impression of the
variability within the trajectories of the QoI for our parameter domain.
We select the number of samples as $k = k' = k'' = 500$ in the sets (10).
Often the validation set and the test set are chosen smaller than the training set
due to a restricted amount of data. In contrast, we are able to use larger sets,
since a high number of trajectories can be produced by numerical simulations. In
particular, a large test set provides reliable statistics in the error analysis. A pseudo
random number generator yields the parameter samples in the multidimensional
cuboid Π. Our NNs include two hidden layers with 400 neurons in each layer.
Fig. 4 Mean squared errors during the fitting of the two NNs in the iterative minimisation (The
green line of the validation set is mostly located behind the red line of the test set.)
Table 1 Number of iteration steps and mean squared error (MSE) of test set for different training
methods in NNs with hard-limit transfer function (i) and purely linear transfer function (ii)
                              (i)                 (ii)
Training method               Steps    MSE        Steps    MSE
Conjugate gradient method       728    1067.5       254    277.21
One-step secant method         1696    1062.2       328    277.24
Gradient descent method       10000    1323.1      1413    277.74
Using more hidden layers or more neurons did not improve the results significantly.
We investigate two NNs, which differ only in the choice of the transfer function:
(i) hard-limit transfer function (9),
(ii) purely linear transfer function.
In the training, a conjugate gradient backpropagation method iteratively solves the
minimisation problem. Figure 4 shows the performance of the training procedure.
In the case of the hard-limit transfer function, the training is stopped at the 728th
iteration step, because the error of the validation set increases slightly. In the case of
the linear transfer function, the training is terminated at the 254th iteration step due
to a too small step size. These two NNs are used in the following error analysis.
In addition, we tried two other backpropagation techniques in the training of
the NNs: a one-step secant method (quasi-Newton method) and a gradient descent
method with momentum and adaptive learning rate. More information on all three
methods can be found in [2], for example. Table 1 lists the number of
iteration steps (until a termination criterion applies) as well as the final mean squared
error of the test set for the three techniques. We observe that the conjugate gradient
method exhibits the best performance.
Figure 5 illustrates several trajectories of the test set. A comparison of the exact
trajectories from the time integration and the approximations from the two NNs is
shown. An interesting property is that NN (i) with the nonlinear transfer function
Fig. 5 Trajectories of the QoI for some samples from the test set
\[ E_i = \frac{t_f - t_0}{m} \sum_{\ell=1}^{m} \frac{|\tilde{y}(t_\ell, p_i) - y(t_\ell, p_i)|}{|y(t_\ell, p_i)|}, \tag{12} \]
where y is the original value from the time integration and ỹ denotes the approximation
from an NN. The initial value is not included because it is zero. The statistics
of the errors are listed in Table 2. We discuss the resulting mean values. In NN (i),
a smaller mean error is achieved in the training set, whereas the other two sets show
larger errors in comparison to NN (ii). Moreover, the mean errors are balanced for
all three sets in NN (ii). This behaviour agrees with the performance of the
training shown in Fig. 4.
Table 2 Mean value and standard deviation of relative errors in discrete L1-norm, see (12), for
the three parameter sets within the two trained NNs
                  Mean                  St.dev.
                  NN (i)    NN (ii)     NN (i)    NN (ii)
Training set      0.044     0.086       0.135     0.228
Validation set    0.130     0.086       0.210     0.158
Test set          0.120     0.079       0.155     0.146
6 Conclusions
References
1 Introduction
In recent years, with the development of technology, energy harvesting systems
have become a popular area of research. They ensure longevity, eco-friendly
operation, and low maintenance, and they have a wide range of applications, from aircraft
and biosensors [1] to telemetry systems [2]. The thermoelectric generator
(TEG) presented here acts as an alternative source of energy to provide stable power to electrically
active implants [3]. Choosing the proper geometry for the TEG is a very important
aspect [4]. To come up with an adequate design, the impact of geometrical param-
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 275
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_27
276 A. Roy et al.
eters needs to be analyzed. It has been seen that the thermoelectric performance
of the TEG depends on the height of the thermocouples [5]. With a change in the
height of the thermocouples, the temperature difference between the top and the bottom
level changes, and so does the generated power. We wish to investigate this influence via
the mathematical method of model order reduction (MOR).
The TEG is modelled as a distributed parameter system via partial differential
equations. Finite element analysis converts these partial differential equations into
large-scale systems of ordinary differential equations, whose solution is computationally
costly. MOR derives a low-dimensional approximation of the higher-order
original system [6, 7]. Furthermore, during design optimization, the system needs
to be simulated repeatedly for different values of geometrical parameters. If these
parameters can be preserved within the reduced models, then the full-scale system
need not be repeatedly synthesized and reduced at each parameter value. This idea
gives rise to parametric model order reduction (pMOR). In this paper, parametric
modeling of a TEG is carried out with a modified matrix-interpolation-based pMOR
method. The parameter considered is the height of the thermocouples.
2 Model Description
The human body is a thermal energy source. When the surrounding temperature
varies, the body temperature varies between 23 °C (at the skin surface) and 37 °C
(in the body core). An implantable TEG utilizes the temperature difference in the
[Fig. 1: (a), (b) thermoelectric generator with heat absorption and heat emission sides, heat flow, N- and P-type thermoelectric components, copper interconnect, ceramic plates, and Peltier height]
Based on the Seebeck effect, electrons and holes in the thermocouples start moving
when a temperature gradient occurs. As a result, thermal energy is being converted
into electrical energy, which can be utilized to power electrical implants. The voltage
(V) generated by the TEG is given by:
V = n · ΔT · (α1 − α2 ) (1)
where, ΔT is the temperature difference between the top level and bottom level of
the TEG, n is the number of thermocouples, α1,2 are the Seebeck coefficients of the
thermocouple legs.
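As a quick numerical illustration of (1), the open-circuit voltage can be evaluated directly. The number of thermocouples, the temperature difference and the Seebeck coefficients used below are hypothetical values chosen only to show the order of magnitude; they are not taken from the chapter.

```python
def teg_voltage(n, delta_t, alpha1, alpha2):
    """Voltage (1): V = n * dT * (alpha1 - alpha2), in volts.

    n        number of thermocouples
    delta_t  temperature difference between top and bottom level in K
    alpha1,  Seebeck coefficients of the two thermocouple legs in V/K
    alpha2
    """
    return n * delta_t * (alpha1 - alpha2)


# Hypothetical example: 100 thermocouples, 1 K temperature difference,
# legs with Seebeck coefficients of +200 and -200 microvolt per kelvin.
v = teg_voltage(100, 1.0, 200e-6, -200e-6)  # about 0.04 V
```

Since V is linear in ΔT, any geometric change that increases the temperature difference across the thermocouples increases the voltage proportionally.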
Here a simplified human tissue model is considered to study the behavior of the
TEG inside human tissue. The human tissue model consists of muscle, fat, and skin
layers as shown in Fig. 2. The TEG is surrounded with a 40×40 mm2 housing made
of Teflon and placed within the fat layer, as maximum temperature difference occurs
there [4].
The material properties of various parts in TEG and different human tissue are
shown in Tables 1 and 2.
[Fig. 2: simplified human tissue model with skin, fat, and muscle layers; dimension labels 8 mm, 35 mm, and 2 mm]
The heat conduction in the human tissue is described by the Pennes’ bioheat
model [8] expressed as:
\[ \nabla(\kappa \nabla T) + \underbrace{\rho_b c_b \omega (T_a - T)}_{Q_b(T)} + Q_m = \rho c \frac{\partial T}{\partial t} \tag{2} \]
where T is the resulting temperature field and κ, ρ and c are the thermal conduc-
tivity, density and specific heat of the tissue, respectively. The heat generation rates
provided by metabolism and perfusion are described by Qm and Qb(T). The density
and specific heat of blood are ρb = 1049.75 kg/m³ and cb = 3617
J/(kg K). ω is the blood perfusion rate in the different tissue layers. The blood temperature
Ta = 37 °C is set as the temperature boundary condition at the bottom surface of the tissue
model. Note that the temperature-dependent perfusion effect Qb(T) can be applied
as a ‘convection-type’ effect as introduced in [9]. The values of the metabolic heat
generation rates Qm in the different tissue layers are listed in Table 2.
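The perfusion term of (2) can be evaluated with the blood properties stated above. The perfusion rate ω used in the usage lines is a hypothetical value in 1/s, since the layer-specific rates are not reproduced here. The sketch also makes the 'convection-type' reading visible: Qb(T) is a volumetric heat exchange with coefficient ρb cb ω towards the blood temperature Ta.

```python
RHO_B = 1049.75   # density of blood in kg/m^3 (from the text)
C_B = 3617.0      # specific heat of blood in J/(kg K) (from the text)
T_A = 37.0        # blood temperature in degrees Celsius (from the text)


def perfusion_coefficient(omega):
    """Volumetric exchange coefficient rho_b * c_b * omega in W/(m^3 K)."""
    return RHO_B * C_B * omega


def perfusion_heat(T, omega):
    """Perfusion heat source Q_b(T) = rho_b * c_b * omega * (T_a - T) in W/m^3."""
    return perfusion_coefficient(omega) * (T_A - T)


# Hypothetical perfusion rate of 5e-4 1/s and a tissue temperature of 36 degC.
q = perfusion_heat(36.0, 5e-4)
```

The source term vanishes at T = Ta, heats tissue that is colder than the blood, and cools tissue that is warmer.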
The heat is dissipated by convection at the skin surface as the external heat loss
effect:
where qconv is the heat flux normal to the boundary skin surface. The heat transfer
coefficient is expressed by h and the ambient temperature by Tamb . The steady
state solution of the system (2), (3) is taken as the initial condition for the transient
thermal simulation.
The finite element discretization of Eq. (2) with convection boundary condition (3)
leads to the following large-scale system of ordinary differential equations:
where $E, A \in \mathbb{R}^{n \times n}$ are the heat capacity and heat conductivity matrices,
respectively, $B \in \mathbb{R}^{n \times m}$ is the input distribution matrix and $C \in \mathbb{R}^{p \times n}$ is the user-defined
output matrix. In this work, the order of the model, $n \approx 4 \times 10^4$, is very large and changes
with the Peltier height, and $T(t) \in \mathbb{R}^n$ is the state vector of unknown
nodal temperatures.
Modeling of Thermoelectric Generator 279
k locally reduced order models (ROMs) are generated by projecting each large-scale
system onto a lower order subspace. We have used the one-sided Arnoldi algorithm [12],
which generates a transformation matrix $V_i \in \mathbb{R}^{n_i \times r}$, where $n_i$ is the order of each
large-scale system and $r \ll n_i$ is the order of the corresponding ROM.
On the basis of the method introduced in [10] and [11], the globally reduced
models obtained by applying the modified matrix interpolation method can be written as:
\[
\begin{aligned}
\underbrace{M_i E_{r,i} T_i^{-1}}_{E^*_{r,i}} \dot{T}_r^*(t) &= \underbrace{M_i A_{r,i} T_i^{-1}}_{A^*_{r,i}} T_r^*(t) + \underbrace{M_i B_{r,i}}_{B^*_{r,i}} u(t) \\
y(t) &= \underbrace{C_{r,i} T_i^{-1}}_{C^*_{r,i}} T_r^*(t)
\end{aligned}
\tag{6}
\]
4 Simulation Results
In this work, a simplified human tissue model consisting of muscle, fat, and
skin layers is considered. The TEG is placed within the fat layer. A geometrical
parameter, the height of the thermocouples, is varied from 3.65 mm to 3.95 mm
and discretized at 3.65, 3.75, 3.85, and 3.95 mm. Large-scale finite element models are
generated at these discrete points by using ANSYS Mechanical [13]. Subsequently,
the corresponding ROMs of order 31 are generated by using “Model Reduction
inside ANSYS” [14]. Utilizing these ROMs in the modified matrix-interpolation-based
pMOR algorithm, a global reduced order model is generated.
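The interpolation step itself can be sketched as a weighted sum of the local reduced matrices. The sketch assumes that the matrices have already been transformed to a common set of coordinates, as in (6), and that piecewise-linear weights between the two neighbouring sample points are used; the weighting scheme and all names are assumptions of this illustration. The same routine would be applied to each of the four reduced matrices.

```python
def linear_weights(p, samples):
    """Piecewise-linear interpolation weights of p w.r.t. sorted sample points."""
    if not samples[0] <= p <= samples[-1]:
        raise ValueError("parameter lies outside the sampled range")
    w = [0.0] * len(samples)
    for i in range(len(samples) - 1):
        lo, hi = samples[i], samples[i + 1]
        if lo <= p <= hi:
            t = (p - lo) / (hi - lo)
            w[i] = 1.0 - t
            w[i + 1] = t
            break
    return w


def interpolate_matrices(weights, matrices):
    """Entrywise weighted sum of equally sized matrices (lists of rows)."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * M[r][c] for w, M in zip(weights, matrices))
             for c in range(cols)]
            for r in range(rows)]


# Sample points of the Peltier height in mm (from the text) and weights
# for the intermediate point 3.8 mm.
heights = [3.65, 3.75, 3.85, 3.95]
weights = linear_weights(3.8, heights)  # roughly [0, 0.5, 0.5, 0]
```

Only the two models neighbouring the query point contribute, so the cost of obtaining a reduced model at a new height is a few small matrix additions rather than a new finite element assembly and reduction.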
To verify the proposed method, an intermediate point is chosen at p = 3.8 mm.
A global reduced model is interpolated at this point and compared to the full-scale
model of order n = 44,942, with 3.8 mm Peltier height. To study the influence
of the convection boundary condition (3), an initial state of the
TEG is obtained with heat transfer coefficient h = 8.8 W/(m² K) through a steady-state
simulation. Afterwards, a transient simulation with heat transfer coefficient
h = 11.18 W/(m² K) is carried out for 7000 s. The ambient temperature is set to the
constant value Tamb = 25 °C. The transient thermal response at the top and the bottom
of the thermocouples, at the intermediate point p = 3.8 mm, is shown in Fig. 4. For
ease of measurement, we have calculated the average temperature of the top and
bottom surfaces of the thermocouples. A comparative analysis of the relative errors of the
average temperatures is shown in Fig. 5. It can be seen that the relative error at the
Fig. 4 Transient thermal response at selected top and bottom nodes of thermocouple at interme-
diate point
intermediate point is around 0.1624%, which is accurate enough for the problem at
hand. Furthermore, the reduced models obtained at each of the discrete points produce
results with still higher accuracy. This is expected, because the reduced models are
obtained at these very discrete points, while the model at the intermediate point is
calculated through interpolation.
In this paper, a potential design optimization strategy, based on modified matrix
interpolation pMOR, for an implantable TEG has been investigated. As the thermoelectric
performance of the device is highly affected by its Peltier height, we have chosen
this height as the parameter of interest. Modified matrix-interpolation-based pMOR
is applied to reduce computational complexity. A reduced model valid for an
arbitrary Peltier height is generated through this method. Numerical simulations
of the original large-scale model and its interpolated surrogate demonstrate the efficacy of
the proposed method.
In future work, we will incorporate the cross-sectional area of the thermocouples
as another parameter and perform multi-parameter model order reduction to obtain
an optimal design of the TEG.
Acknowledgments Financial support of the CRC 1270 ELAINE (Electrically Active Implants) is
acknowledged.
References
1. M. Koplow, A. Chen, D. Steingart, P.K. Wright, J.W. Evans, Thick film thermoelectric
energy harvesting systems for biomedical applications, in Proceedings of the 5th International
Summer School Symposium on Medical Devices and Biosensors (2008), pp. 322–325
2. S. Dalola, V. Ferrari, M. Guizzetti, D. Marioli, E. Sardini, M. Serpelloni, A. Taroni,
Autonomous sensor system with power harvesting for telemetric temperature measurements
of pipes. IEEE Trans. Instrum. Measure. 58(5), 1471–1478 (2009)
3. Y.W. Chong, W. Ismail, K. Ko, C.Y. Lee, Energy harvesting for wearable devices: a review.
IEEE Sensors J. 19(20), 9047–9062 (2019)
4. O. Jadhav, C.D. Yuan, D. Hohlfeld, T. Bechtold, Design of a thermoelectric generator for
electrical active implants, in MikroSystemTechnik Congress (2017), pp. 1–4
5. B. Jang, S. Han, J.Y. Kim, Optimal design for micro-thermoelectric generators using finite
element analysis. Microelectron. Eng. 88(5), 775–778 (2011)
6. W.H.A. Schilders, H.A. Van der Vorst, J. Rommes, Model Order Reduction: Theory, Research
Aspects and Applications, vol. 13 (Springer, Berlin, 2008)
7. B. Lohmann, B. Salimbahrami, Introduction to Krylov subspace methods in model order
reduction, in Methods and Applications in Automation (2000), pp. 1–13
8. C.K. Charny, Mathematical models of bioheat transfer, in Advances in Heat Transfer, vol. 22
(Elsevier, Amsterdam, 1992), pp. 19–155
9. C.D. Yuan, S. Kreß, G. Sadashivaiah, E.B. Rudnyi, D. Hohlfeld, T. Bechtold, Towards efficient
design optimization of a miniaturized thermoelectric generator for electrically active implants
via model order reduction and submodeling technique. Int. J. Numer. Methods Biomed. Eng.
36(4) (2020). https://doi.org/10.1002/cnm.3311
10. H. Panzer, J. Mohring, R. Eid, B. Lohmann, Parametric model order reduction by
matrix interpolation. Automatisierungstechnik 58(8), 475–484 (2010).
https://doi.org/10.1524/auto.2010.0863
11. A. Roy, M. Nabi, Efficient simulation of electro-thermal micro-gripper using PMOR, in Indian
Control Conference (ICC), Kanpur, 2018
12. R.W. Freund, Krylov-subspace methods for reduced-order modeling in circuit simulation. J.
Comput. Appl. Math. 123(1–2), 395–421 (2000)
13. Ansys, Academic Research Mechanical, 2019 R3. ANSYS, Inc.
14. E.B. Rudnyi, Mor for ANSYS, in System-Level Modelling of MEMS, ed. by T. Bechtold, G.
Schrag, L. Feng. Wiley-VCH Book Series on Advanced Micro and Nanosystems (Wiley-VCH,
Weinheim, 2013), pp. 425–438
Nonlinear Model Order Reduction
of a Thermal Human Torso Model
1 Introduction
In recent years, the aging of the population has become a major concern, especially in
European countries [1]. This has prompted various developments in the medical sector.
Electrically active implants are of particular benefit in regeneration therapies and in
deep brain stimulation for treating movement disorders.
Among the various factors affecting the performance of implants, their limited
G. Sadashivaiah
Institute for Electronic Appliances and Circuits, University of Rostock, Rostock, Germany
C. Yuan · T. Bechtold
Institute for Electronic Appliances and Circuits, University of Rostock, Rostock, Germany
Department of Engineering, Jade University of Applied Sciences, Wilhelmshaven, Germany
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_28
2 Case Study
In this section, we present the model of a TEG embedded in the fat tissue of the
chest region of the human torso model. The aim of the numerical simulations is to find
the temperature difference across the TEG.
Figure 1a shows the setup of an electrically active implant, and Fig. 1b
shows the TEG model, which was constructed in ANSYS Workbench [12]
based on available commercial TEGs. The geometry consists of top and bottom
ceramic plates made of aluminum oxide, each with a cross-sectional area of 24.6 × 24.6
mm² and a height of 0.565 mm. The space between the two plates encloses an array
of 16 × 16 p-type and n-type thermocouple legs. The legs are made of bismuth telluride
with temperature-dependent properties, each with a height of 2.27 mm and a cross-section of
Fig. 1 (a) Schematic of a TEG integrated inside the human tissue; (b) TEG model with 16×16
thermocouple legs and housing
Fig. 2 (a) TEG embedded in the fat layer of chest region; (b) human torso model with internal
organs
$$
\rho c \frac{\partial T}{\partial t} = \nabla \cdot (\kappa \nabla T) + \underbrace{\rho_b c_b \omega \, (T_a - T(\mathbf{r}, t))}_{Q_b(T)} + Q_m, \tag{1}
$$
where $\rho$, $c$, and $\kappa$ are the density, specific heat, and thermal conductivity of the tissues,
respectively, and $\rho_b$, $c_b$, and $\omega$ denote the density, specific heat, and perfusion rate of
blood. To maintain the core temperature at 37 °C, heat transfer occurs internally due
to metabolic heat generation $Q_m$, temperature-dependent blood perfusion $Q_b$, and
thermal conduction. $T_a$ is the arterial blood temperature and $T(\mathbf{r}, t)$ is the unknown
temperature of the human tissues.
To maintain the natural balance in the body, excess heat is transferred to the
ambient environment through the skin surface, which is given by:
where $q_{conv}$, $q_{rad}$ and $q_{eva}$ are the convection, radiation, and evaporation effects,
applied as heat flux inputs to the skin surface in ANSYS. $T_{amb}$ is the
ambient temperature, $h_c$ is the heat transfer coefficient in
W/m²/K, $\sigma = 5.6705 \times 10^{-8}$ W/m²/K⁴ is the Stefan-Boltzmann constant,
and $\varepsilon = 0.95$ is the emissivity. Heat loss due to evaporation occurs mainly
in the form of sweating through the skin surface. In the evaporation term, $P_{skin}$ and
$P_a$ represent the saturated vapour pressure at skin temperature and the partial vapour
pressure, respectively. In accordance with the Lewis relation [16], the evaporation
coefficient $h_e$ can be expressed in terms of the heat transfer coefficient as:
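$$
\frac{h_e}{h_c} = 16.5 \ \frac{\mathrm{K}}{\mathrm{kPa}}. \tag{3}
$$
$$
P_{skin} = 0.1 \exp\left(18.956 - \frac{4030.18}{T_{skin} + 235}\right) \ \text{in kPa}, \tag{4}
$$
$$
P_{sa} = 0.1 \exp\left(18.956 - \frac{4030.18}{T_{amb} + 235}\right) \ \text{in kPa}. \tag{5}
$$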
Therefore, the final equation for the evaporation heat loss is represented by:
$$
q_{eva} = 1.65 \, h_c \, w \left\{ \exp\left(18.956 - \frac{4030.18}{T_{skin} + 235}\right) - \phi \cdot \exp\left(18.956 - \frac{4030.18}{T_{amb} + 235}\right) \right\}, \tag{6}
$$
where $w$ is the skin wettedness, whose value ranges between 0.06 and 1.
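Equations (3)–(6) can be evaluated directly. The following minimal Python sketch does so under assumptions that are not stated in the text: temperatures are taken in °C (as the form of Eqs. (4) and (5) suggests), φ is interpreted as the relative humidity of the ambient air, and the skin temperature and humidity used in the example are purely illustrative. The function names are hypothetical.

```python
import math


def saturated_vapour_pressure_kpa(temp_c):
    """Saturated vapour pressure in kPa as in Eqs. (4)-(5); temp_c assumed in deg C."""
    return 0.1 * math.exp(18.956 - 4030.18 / (temp_c + 235.0))


def evaporation_heat_flux(h_c, w, t_skin_c, t_amb_c, phi):
    """Evaporative heat flux in W/m^2 following Eq. (6).

    h_c : heat transfer coefficient in W/m^2/K
    w   : skin wettedness (0.06 ... 1)
    phi : assumed to be the relative humidity of the ambient air (0 ... 1)
    """
    return 1.65 * h_c * w * (
        math.exp(18.956 - 4030.18 / (t_skin_c + 235.0))
        - phi * math.exp(18.956 - 4030.18 / (t_amb_c + 235.0))
    )


if __name__ == "__main__":
    # Illustrative values only: h_c and T_amb appear in the text, the rest is assumed.
    q = evaporation_heat_flux(h_c=5.48, w=0.06, t_skin_c=34.0, t_amb_c=25.0, phi=0.5)
    print(f"q_eva = {q:.2f} W/m^2")
```

The factor 1.65 in Eq. (6) is the product of the Lewis ratio 16.5 K/kPa from Eq. (3) and the prefactor 0.1 of the vapour pressure formulas, so the function is equivalent to evaluating $h_e w (P_{skin} - \phi P_{sa})$.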
The spatial discretization of the model (1) with boundary conditions (2) at
the skin surface leads to the following large-scale system of nonlinear ordinary
differential equations (ODEs):
$$
\begin{cases}
E \cdot \dot{T}(t) = A(T) \cdot T(t) + \underbrace{B \cdot u(T(t))}_{F(T(t))}, \\
y(t) = C \cdot T(t),
\end{cases} \tag{7}
$$
legs. The length of the unknown state vector $T(t) \in \mathbb{R}^N$ is N = 1,340,734, which
defines the dimension of the full system (7). We would like to emphasize that the
nonlinear input effects can be linearized, after which conventional Krylov-subspace-based
MOR can be employed. In our case, however, the thermal conductivity of the thermocouple legs
is temperature-dependent, and hence nonlinear MOR methods have
to be applied.
In this work, we employ two different MOR approaches to compute the reduced
order model (ROM) of system (7), the proper orthogonal decomposition and the
dynamic mode decomposition.
One of the most common methods for reducing the dimensionality of nonlinear
systems is POD, which is also known as Karhunen-Loève (KL) decomposition or
principal component analysis. The method employs the singular value decomposition
(SVD) to construct the optimal projection subspace, also called the reduced
basis, which captures most of the dynamics of the given data set [8, 9]. In this work,
we compute the projection subspace $\phi_{pod}$ by employing the POD technique, which
is used in conjunction with the Galerkin projection [17] to obtain the ROM:
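$$
\begin{aligned}
E_r \cdot \dot{T}_r(t) &= A_r(T_r) \cdot T_r(t) + \phi_{pod}^T F(\phi_{pod} T_r(t)), \\
y(t) &= C_r \cdot T_r(t),
\end{aligned} \tag{8}
$$
where $E_r = \phi_{pod}^T E \phi_{pod}$, $A_r(T_r) = \phi_{pod}^T A(T) \phi_{pod}$, and $C_r = C \phi_{pod}$ are the reduced
matrices, and the accuracy of the ROM relative to the full system is measured by
$\| T(t) - \phi_{pod} T_r(t) \|$.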
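The construction of the POD basis and the Galerkin projection in (8) can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy sizes, and the state-dependent matrix in the usage example are assumptions, and dense matrices are used only for brevity (a model with more than a million unknowns would require sparse storage).

```python
import numpy as np


def pod_basis(snapshots, r):
    """Return the r leading left singular vectors (POD basis) and all singular values."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :r], s


def galerkin_reduce(E, A_of_T, C, phi):
    """Galerkin projection of E, A(T) and C onto the subspace spanned by phi."""
    E_r = phi.T @ E @ phi
    C_r = C @ phi

    def A_r(T_r):
        # A is evaluated at the lifted state phi @ T_r (no hyper-reduction here)
        return phi.T @ A_of_T(phi @ T_r) @ phi

    return E_r, A_r, C_r


def projection_error(snapshots, phi):
    """Relative error of projecting the snapshots onto the POD subspace."""
    residual = snapshots - phi @ (phi.T @ snapshots)
    return np.linalg.norm(residual) / np.linalg.norm(snapshots)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, m, r = 200, 20, 3  # toy sizes, not those of the torso model
    X = rng.standard_normal((N, r)) @ rng.standard_normal((r, m))  # rank-3 toy data
    phi, s = pod_basis(X, r)
    A_of_T = lambda T: -np.diag(1.0 + 0.01 * T**2)  # hypothetical A(T)
    E_r, A_r, C_r = galerkin_reduce(np.eye(N), A_of_T, np.ones((1, N)), phi)
    print(E_r.shape, A_r(np.zeros(r)).shape, C_r.shape, projection_error(X, phi))
```

Note that evaluating `A_r` still requires assembling the full matrix A at the lifted state, which is one reason why the online cost of a POD-Galerkin ROM of a nonlinear system does not scale with r alone.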
$$
\frac{dT(t)}{dt} = N_l\{T(t)\}, \tag{9}
$$
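$$
\frac{d\tilde{T}(t)}{dt} = A_n \tilde{T}(t), \tag{10}
$$
$$
\tilde{T}(t) = \sum_{i=1}^{r} b_i \, \psi_i^{dmd} \exp(\omega_i t), \tag{11}
$$
where $\psi_i^{dmd}$ are the vectors of the DMD basis of rank $r$, $\omega_i$ are the eigenvalues of the matrix $A_n$, and
$b_i$ are the coefficients determined by the initial condition.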
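A compact way to obtain the quantities in (11) from snapshot data is the standard "exact DMD" algorithm. The sketch below assumes equidistant snapshots and uses this standard variant, which may differ in detail from the procedure used in this work. The exponents are computed as $\omega_i = \ln(\lambda_i)/\Delta t$ from the eigenvalues $\lambda_i$ of the reduced discrete-time operator, and the function names are hypothetical.

```python
import numpy as np


def dmd(snapshots, dt, r):
    """Exact DMD of equidistant snapshots.

    Returns the DMD modes psi_i, the continuous-time exponents omega_i
    and the amplitudes b_i of Eq. (11).
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    Y_V_Sinv = (Y @ Vh.conj().T) / s          # Y V Sigma^{-1}
    A_tilde = U.conj().T @ Y_V_Sinv           # reduced linear operator (r x r)
    lam, W = np.linalg.eig(A_tilde)
    modes = Y_V_Sinv @ W                      # DMD modes
    omega = np.log(lam.astype(complex)) / dt  # continuous-time exponents
    # amplitudes: least-squares fit of the first snapshot in the mode basis
    b = np.linalg.lstsq(modes, snapshots[:, 0].astype(complex), rcond=None)[0]
    return modes, omega, b


def dmd_reconstruct(modes, omega, b, t):
    """Evaluate sum_i b_i psi_i exp(omega_i t) at the times in t."""
    return ((modes * b) @ np.exp(np.outer(omega, t))).real


if __name__ == "__main__":
    # Toy data: two decaying exponentials in a four-dimensional state space.
    t = np.arange(21) * 0.5
    v1 = np.array([1.0, 2.0, 0.5, -1.0])
    v2 = np.array([0.3, -0.7, 1.1, 0.2])
    data = np.outer(v1, np.exp(-0.1 * t)) + np.outer(v2, np.exp(-0.5 * t))
    modes, omega, b = dmd(data, dt=0.5, r=2)
    print(np.sort(omega.real))  # decay rates recovered from the data alone
```

The routine only touches the snapshot matrix, which reflects the equation-free character of DMD.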
In this work, all computations are performed on a PC with an Intel Xeon
E5-2680 processor (2.5 GHz, 4 active cores) and 128 GB RAM. Initially, a steady-state
thermal simulation of the model is conducted with $T_{amb}$ = 25 °C and $h_c$ = 3.1
W/m²/K. The result of the steady-state simulation is used as the initial condition
for the transient thermal simulation with the new value $h_c$ = 5.48 W/m²/K.
For both approaches, the snapshot matrix is built from 20 equidistant snapshots with
step size Δt = 350 s for t ∈ [0, 7000] s. The singular values obtained from the
SVD of the snapshot matrix are shown in Fig. 3a.
Figures 3b and 3c show the maximum relative error (%) between the full order
model of dimension N = 1,045,923 and the ROMs of dimension r = 3 (POD) and r = 4
(DMD). The maximum relative error amounts to 0.051% for the POD approach and
to 0.071% for the DMD approach. The runtime of the transient thermal simulation
of the full system is 5460 s. In both approaches, the time required to
construct the snapshot matrix and to perform the SVD in the offline stage is 5460 + 137.5 s,
but in the online stage POD requires only 63.1 s and DMD only 22.7 s for the computation
of the ROMs.
Fig. 3 (a) Singular values σk of the full human torso model, plotted over the snapshot index; (b) and (c) maximum
relative error (%) over time (s) at selected nodes between the full and reduced order models (legend entries: error in POD
approximation, error in DMD approximation, TEG top node)
In this work, low-rank approximations of the large-scale nonlinear thermal human
torso model were generated via the POD and DMD methods. The main advantage of
the DMD method over POD is that it does not require any information about
the governing equations of the system. However, in terms of error
convergence rate, the POD-based MOR method is superior to the DMD method,
because DMD modes are not orthogonal. In the future, both approaches will
be tested on parameterized human torso models, with the skin wettedness w and
the ambient temperature Tamb as parameters.
References
1. C. Casey, J. Gullo, 2018 Aging readiness and competitiveness report. AARP Int. J. 12, 14–15
(2019). https://doi.org/10.26419/int.00036.003
2. S. Priya, D.J. Inman, Energy Harvesting Technologies (Springer, New York, 2009)
3. M.A. Hannan, M. Saad, S.A. Samad, A. Hussain, Energy harvesting for the implantable
biomedical devices: issues and challenges. Biomed. Eng. Online 13, 79 (2014)
4. P. Miao, P.D. Mitcheson, A.S. Holmes, E.M. Yeatman, T.C. Green, B.H. Stark, Mems inertial
power generations for biomedical applications. Microsyst. Technol. 12(10–11), 1079–1083
(2006)
5. O. Jadhav, C.D. Yuan, D. Hohlfeld, T. Bechtold, Design of a thermoelectric generator for
electrical active implants, in MikroSystemTechnik Congress (2017), pp. 1–4
6. R.W. Freund, Krylov-subspace methods for reduced-order modeling in circuit simulation. J.
Comput. Appl. Math. 123, 395–421 (2000)
7. C. Yuan, S. Kreß, G. Sadashivaiah, E.B. Rudnyi, D. Hohlfeld, T. Bechtold, Towards efficient
design optimization of a miniaturized thermoelectric generator for electrically active implants
via model order reduction and submodeling technique. Int. J. Numer. Methods Biomed. Eng.
36, e3311 (2020)
8. P. René, Model Reduction via Proper Orthogonal Decomposition (Springer, Berlin, Heidel-
berg, 2008), pp. 95–109
9. A. Quarteroni, A. Manzoni, F. Negri, Reduced Basis Methods for Partial Differential Equations
(Springer, Cham, 2016), pp. 115–140
10. P. Schmid, Dynamic mode decomposition of numerical and experimental data. J. Fluid Mech.
656, 5–28 (2010)
11. A. Alla, J.N. Kutz, Nonlinear model order reduction via dynamic mode decomposition. SIAM
J. Sci. Comput. 39, 1–20 (2016)
12. Ansys®Academic Research Mechanical, Release 2020 R1, Workbench
13. S. Makarov, G. Noetscher, J. Yanamadala, VHP-Female Datasets. NEVA Electromagnetics,
LLC:VHP-Female 2.2 edn. (2015)
14. P.A. Hasgall, G.F. Di, C. Baumgartner, IT’IS database for thermal and electromagnetic
parameters of biological tissues, IT’IS Foundation, Switzerland, (2018)
15. H.H. Pennes, Analysis of tissue and arterial blood temperatures in the resting human forearm.
J. Appl. Physiol. 1(2), 93–122 (1948)
16. W.K. Lewis, The evaporation of a liquid into a gas. 44, 445–446 (1922)
17. P. Benner, S. Gugercin, K. Willcox, A survey of projection-based model reduction methods for
parametric dynamical systems. SIAM Rev. 57, 483–531 (2015)
18. B.O. Koopman, Hamiltonian systems and transformation in Hilbert space. Proc. Natl. Acad.
Sci. 17, 315–318 (1931)
Multi-Level Iterations for Microgrid
Control with Automatic Level Choice
Abstract Microgrids are considered a key technology for the energy transition,
but the rising penetration of renewable energy sources is pushing current control
approaches to their limits. Nonlinear model predictive control (NMPC) is a promis-
ing approach to address this issue, although achieving real-time feasibility with
standard schemes is challenging. Therefore, we propose to use the Multi-Level
Iteration NMPC scheme with a novel automatic level choice. This allows us to
always use the most accurate linearizations available while being real-time feasible,
even during strongly transient phases where a fixed level choice may be too slow.
We use a realistic-sized microgrid to illustrate the capabilities of this method.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_29
various filters on different levels. However, experience shows that this control
paradigm reaches its limits under high penetration of RES [1].
Nonlinear model predictive control (NMPC) is a general model-based methodol-
ogy to control dynamical processes. Measurements of the process are embedded in
an optimal control problem, which is solved repeatedly with respect to an objective
function and operational limits. Since NMPC offers a flexible control framework, it
appears to be well suited for the control of MGs.
One of the main challenges for NMPC in the field of power engineering is
the real-time requirement. Transient electrical dynamics involve high-frequency
oscillations, which are costly to simulate, and a high sampling rate is necessary to
react to disturbances in time. Therefore, in the literature, NMPC is mainly used on a
higher control level, relying on traditional integral or droop controllers to handle the
electrical dynamics [2]. To deal with the fast electrical behavior of MGs directly
with NMPC, tailored schemes are necessary, like the Advanced Step Real-Time
Iteration [3].
In this paper, we propose to use the Multi-Level Iteration (MLI) scheme for
MG control. This approach is based on the well-established Real-Time Iteration
[4], which eliminates the need to solve the underlying optimal control problems
until convergence. Additional computation time is saved by updating the problem
linearization only partially in every iteration using cheap update formulas to increase
the feedback rates and render NMPC applicable for MG control.
The differential and algebraic states $x(t)$ and $z(t)$ are subject to the DAE system (1b),
with the initial value set to the current system state $\xi_k$ (1c). The control inputs of the
process are represented by $u(t)$, and the objective is defined by the functional $\Phi$.
The NMPC feedback signal applied in the interval $[t_k, t_{k+1})$ is the first part of the
solution, $u_{\xi_k}(t) = u_k^*(t)$. We discretize the problem with the direct multiple shooting
method introduced by Bock [5] and obtain a finite-dimensional, structured nonlinear
program (NLP) of the compact form
Here $l$ is the discretized objective function (1a), the function $b$ together with the
constant matrix $E$ represents the discretized DAE system (1b) with the initial value
embedding constraint (1c), and $w_{lo}$ and $w_{up}$ are the lower and upper bounds on states
and controls. A sequential quadratic programming (SQP) method is used to solve
this NLP. It generates a sequence of primal-dual iterates $(w_j, \lambda_j, \mu_j)_{j \in \mathbb{N}}$ based
on the quadratic program (QP)
$$
\min_{\Delta w} \ \frac{1}{2} \Delta w^\top A \Delta w + a^\top \Delta w
\quad \text{s.t.} \quad
\begin{aligned}
& b(w_j) + E \xi_k + B \Delta w = 0, \\
& w_{lo} \le \Delta w + w_j \le w_{up}.
\end{aligned} \tag{3}
$$
The matrix $A$ is the Hessian (or an approximation thereof) with respect to $w$ of the
Lagrangian $\mathcal{L}(w_j, \lambda_j, \mu_j)$ of the NLP (2). The linear objective term is defined by
the objective gradient $a = \nabla_w l(w_j)$, and the constraints are linearizations based on
$b(w_j)$ and its Jacobian $B = \nabla b(w_j)^\top$. The solution $(\Delta w^{QP}, \lambda^{QP}, \mu^{QP})$ of QP (3)
is used to update the primal-dual variables:
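One such SQP iteration can be sketched in Python as follows. The sketch makes several simplifying assumptions that are not taken from the text: the bound constraints of QP (3) are dropped so that the QP reduces to a linear KKT system, the multiplier sign convention corresponds to a Lagrangian of the form objective plus $\lambda^\top$(constraint), and a full step $w_{j+1} = w_j + \Delta w$ is used. A real implementation would hand the complete QP including bounds to a QP solver such as qpOASES [8].

```python
import numpy as np


def solve_equality_qp(A, a, B, c):
    """Solve  min 0.5 dw^T A dw + a^T dw  s.t.  c + B dw = 0  via its KKT system.

    Here c plays the role of b(w_j) + E xi_k in QP (3); bounds are omitted.
    Returns the step dw and the multipliers lam of the equality constraints.
    """
    n, m = A.shape[0], B.shape[0]
    kkt = np.block([[A, B.T], [B, np.zeros((m, m))]])
    rhs = -np.concatenate([a, c])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]


if __name__ == "__main__":
    # Toy data: minimize 0.5 * |dw|^2 subject to dw_0 + dw_1 = 1.
    A = np.eye(2)
    a = np.zeros(2)
    B = np.array([[1.0, 1.0]])
    c = np.array([-1.0])
    w_j = np.array([2.0, 3.0])
    dw, lam = solve_equality_qp(A, a, B, c)
    w_next = w_j + dw  # assumed full-step update of the primal variables
    print(dw, lam, w_next)
```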
3 Multi-Level Iterations
Depending on the application, the Real-Time Iteration still requires a high compu-
tational effort in every iteration. To set up the QP (3), the constraints, the objective
gradient, the constraint Jacobian and the Hessian (corresponding to b, a, B, A in (3))
have to be computed. MLI can reduce this computational effort drastically and thus
speed up the feedback process by only updating parts of the QP.
The MLI scheme is based on the fact that Newton-type methods (such as the SQP
method described in the previous section) do not require the exact computations of
Table 1 Computations and update formulas for the QP data for the different levels
      | Necessary computations                  | Update formula for QP data
Level | b(w_j) | a(w_j) | B(w_j) | A(w_j, λ_j)  | b      | a                                  | B      | A
D     | ✓      | ✓      | ✓      | ✓            | b(w_j) | a(w_j)                             | B(w_j) | A(w_j, λ_j)
C     | ✓      | ✓      | (✓)^a  | ✗            | b(w_j) | a(w_j) + (B̄_C^T − B(w_j)^T) λ_j    | B̄_C    | Ā_C
B     | ✓      | ✗      | ✗      | ✗            | b(w_j) | ā_B + Ā_B (w_j − w̄_B)              | B̄_B    | Ā_B
A     | ✗      | ✗      | ✗      | ✗            | b̄_A    | ā_A                                | B̄_A    | Ā_A
^a Only the vector-matrix product λ^T B needs to be computed in an adjoint fashion
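The update rules of Table 1 can be written down compactly. The following Python sketch is an illustration under simplifying assumptions: a single set of reference data (barred quantities) is shared by all levels, whereas the table allows separate references per level; the adjoint product in level C is formed from the full Jacobian instead of by an adjoint derivative evaluation; and the toy problem and all names are hypothetical.

```python
import numpy as np
from dataclasses import dataclass


@dataclass
class QPData:
    b: np.ndarray  # constraint residual
    a: np.ndarray  # objective gradient
    B: np.ndarray  # constraint Jacobian
    A: np.ndarray  # Hessian (approximation)


def mli_qp_data(level, w, lam, ref, w_ref, f):
    """Assemble the QP data for one MLI level following Table 1.

    f is a dict of callables 'b', 'a', 'B', 'A'; ref holds the barred
    reference data that was evaluated at the reference point w_ref.
    """
    if level == "D":  # full update of all QP data
        return QPData(f["b"](w), f["a"](w), f["B"](w), f["A"](w, lam))
    if level == "C":  # modified gradient, Jacobian and Hessian kept fixed
        a_mod = f["a"](w) + (ref.B.T - f["B"](w).T) @ lam
        return QPData(f["b"](w), a_mod, ref.B, ref.A)
    if level == "B":  # only the constraint residual is re-evaluated
        return QPData(f["b"](w), ref.a + ref.A @ (w - w_ref), ref.B, ref.A)
    if level == "A":  # pure feedback with fixed QP data
        return QPData(ref.b, ref.a, ref.B, ref.A)
    raise ValueError(f"unknown level {level!r}")


# Hypothetical toy problem: l(w) = 0.5 |w|^2, b(w) = w_0^2 + w_1 - 1
toy = {
    "b": lambda w: np.array([w[0] ** 2 + w[1] - 1.0]),
    "a": lambda w: w.copy(),
    "B": lambda w: np.array([[2.0 * w[0], 1.0]]),
    "A": lambda w, lam: np.eye(2) + lam[0] * np.diag([2.0, 0.0]),
}

if __name__ == "__main__":
    w_ref, lam_ref = np.array([1.0, 0.0]), np.array([0.5])
    ref = mli_qp_data("D", w_ref, lam_ref, None, None, toy)
    w = np.array([0.5, 0.5])
    for level in "ABCD":
        print(level, mli_qp_data(level, w, lam_ref, ref, w_ref, toy))
```

The number of expensive function and derivative evaluations grows from level A to level D, which is what makes the choice of level a trade-off between feedback rate and accuracy of the linearization.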
In practice, the presented levels are operated simultaneously. The lower levels are
used to give fast feedback and the higher, computationally more expensive levels
provide accurate linearizations of the NLP (2). Usually this is done by a sequence
of levels, which is fixed in advance. To ensure real-time feasibility, the required
computation time must be estimated and the sequence chosen accordingly [6].
This approach turns out to be inflexible, because the evaluation of a level needs to
be finished before the next evaluation is scheduled. To guarantee real-time feasibility,
the schedule must be based on the worst-case computation time. If an adaptive integration
method is used, the integration time may vary strongly between steady state
and transients. This leads to an unnecessarily conservative scheduling of higher
levels, even when the computation time is low and would allow a faster rate. To
overcome this issue, we propose to choose the levels automatically online instead.
In this method every level is operated in parallel. In the beginning of the simulation,
the evaluation of all levels is triggered. As soon as an evaluation is finished, the
Fig. 1 Topology of the test MG (component labels in the diagram: DG, DG, BA, PV, and a load with P, Q)
of $P_{load}$ = 5 p.u. and $Q_{load}$ = 1 p.u. equally. After 1 s, a sudden unscheduled load
step of 40% in active and reactive power takes place, which exceeds the capacity of
the generators. To ensure that the operational limits are satisfied, the battery needs
to deviate from the provided reference values and serve the missing load. The simulation
has an overall length of 8 s.
In MG control, we need to consider several objectives with different priorities.
This is modeled by a continuous least squares objective functional
$$
\Phi(x, z, u) = \int_{t_i}^{t_i + T} \| r(x(t), z(t), u(t)) \|^2 \, dt \tag{5}
$$
of OCP (1) with a weighted norm and a residual function $r(x, z, u)$. The most
important goal is to steer the frequency $\omega(t)$ and the voltage at the load $V_{load}(t)$ to
the nominal value 1 p.u. after a disturbance. During transients, we want to utilize the
battery to stabilize frequency and voltage. In steady state, the
battery should follow the setpoints $P_{BA}^{ref}$, $Q_{BA}^{ref}$ from a higher control level, in order to
charge or discharge the battery. The generators are supposed to share the remaining
load equally. These goals are achieved by tracking terms
5 Numerical Results
We discretize OCP (1) with two multiple shooting intervals and the length of the
prediction horizon is fixed to T = 1 s. The length of the first shooting interval
corresponds to the sampling time of 100 ms and the second to 900 ms. The numerical
simulations are carried out with the NMPC framework MLI [6]. For integration and
sensitivity generation, the SolvIND integrator suite is used and the QPs are solved
by qpOASES [8].
The continuous least squares objective functional (5) enables us to use a Gauß-
Newton approximation of the Hessian in QP (3). Besides its favorable numerical
properties, its main advantage is that it relies only on first-order derivatives.
Therefore, we do not have to compute second-order derivatives, which would be the most
costly task when setting up QP (3).
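For a discretized least squares objective, the Gauß-Newton approximation only needs the residual and its Jacobian. The sketch below assumes an objective of the form $\tfrac{1}{2}\|r(w)\|^2$ (the factor 1/2 and the unweighted norm are assumptions made for brevity) and uses a hypothetical linear toy residual.

```python
import numpy as np


def gauss_newton_terms(residual, jacobian, w):
    """Gauss-Newton Hessian approximation J^T J and gradient J^T r of 0.5 * |r(w)|^2.

    Only first-order derivatives (the Jacobian J of the residual) are required.
    """
    r = residual(w)
    J = jacobian(w)
    return J.T @ J, J.T @ r


if __name__ == "__main__":
    # Hypothetical toy residual r(w) = (w_0 - 1, 2 w_1)
    residual = lambda w: np.array([w[0] - 1.0, 2.0 * w[1]])
    jacobian = lambda w: np.array([[1.0, 0.0], [0.0, 2.0]])
    w = np.zeros(2)
    A, a = gauss_newton_terms(residual, jacobian, w)
    step = np.linalg.solve(A, -a)  # unconstrained Gauss-Newton step
    print(A, a, w + step)
```

Because the toy residual is linear, a single Gauss-Newton step reaches the minimizer; for nonlinear residuals the approximation neglects the second-order term that involves the residual itself.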
We compare our proposed MLI-controller with a typical state-of-the-art control
setup for small microgrids: The generators are equipped with an integral controller
for steady-state error elimination of the frequency with a settling time of approxi-
mately 20 s and a sampling time of 100 ms. The voltage setpoint Vref is kept constant
during the full simulation time.
In Fig. 2, the performance of the proposed MLI-controller is shown in comparison
to the traditional control approach. The MLI-controller steers the
frequency back faster and with a lower initial drop after the unforeseen disturbance. The
voltage is stabilized faster and the steady-state offset is eliminated. In the
beginning, the battery follows the setpoints and does not contribute to load sharing.
After the load step, the fast dynamics of the battery are used to stabilize
the system. Since the overall load exceeds the operational bounds of the generators,
the battery temporarily deviates from its reference value and instead serves the
necessary additional load. In contrast, the integral controller is not able
to obey the operational limits of the generators. If no safety measures are
installed, the generators are overloaded, which may cause physical damage.
In Fig. 3, the computation time and the scheduling of the different MLI levels are
shown. In the beginning, the system is in a steady state and the computation time
is low. After the load jump at t = 1 s, the system is in a transient phase and the
computation time rises sharply, which leads to fewer level D evaluations. Afterwards,
the system is steered back to a steady state and the computation time decreases.
As the evaluation time for level B is always below the sampling time, no level A
iteration occurs. Level C is not used, because the Gauß-Newton approximation of the Hessian
implies that the difference in computation time between levels C and D is small.
Fig. 2 (panels: frequency ω [p.u.], voltage V [p.u.], apparent power S [p.u.] of generator #1 with its
operational limit, and apparent power S [p.u.] of the battery, each over time [s]; curves: I-Controller and MLI-Controller)
Fig. 3 Computation times and scheduling of Level B and D. The elapsed computation time
is depicted by the height of the bars while the width shows in which sampling intervals the
computations were performed
6 Conclusion
Acknowledgments This research was funded by the German Federal Ministry of Education and
Research (BMBF) in the research project MOReNet (Grant No. 05M18VHA).
References
1. M. Ilić, R. Jaddivada, X. Miao, Modeling and analysis methods for assessing stability of
microgrids, in 20th IFAC World Congress, vol. 50, IFAC-PapersOnLine (2017), pp. 5448–5455
2. C. Bordons, F. Garcia-Torres, M.A. Ridao, Model predictive control of microgrids, in Advances
in Industrial Control (Springer International Publishing, Cham, 2019)
3. A. Nurkanović, A. Mešanović, A. Zanelli, G. Frison, J. Frey, S. Albrecht, M. Diehl, Real-
time nonlinear model predictive control for microgrid operation, in 2020 American Control
Conference (ACC) (2020), pp. 4989–4995
4. M. Diehl, Real-Time Optimization for Large Scale Nonlinear Processes. Dissertation, Heidel-
berg University, 2001
5. H.G. Bock, M. Diehl, E. Kostina, J. Schlöder, Constrained optimal feedback control of systems
governed by large differential algebraic equations, in Real-Time PDE-Constrained Optimization
(SIAM, Philadelphia, 2007), pp. 3–22
6. L. Wirsching, Multi-level iteration schemes with adaptive level choice for nonlinear model
predictive control. Dissertation, Heidelberg University, 2018
7. A. Nurkanović, S. Albrecht, M. Diehl, Multi level iterations for economic nonlinear model
predictive control, in Recent Advances in Model Predictive Control: Theory, Algorithms, and
Applications, ed. by T. Faulwasser, M.A. Müller, K. Worthmann (Springer, Cham, 2021),
pp. 65–105
8. H.J. Ferreau, C. Kirches, A. Potschka, H.G. Bock, M. Diehl, qpOASES: A parametric active-set
algorithm for quadratic programming. Math. Program. Comput. 6, 327–363 (2014)
Multi-Level Inversion Based on Mesh
Decoupling
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. van Beurden et al. (eds.), Scientific Computing in Electrical Engineering,
Mathematics in Industry 36, https://doi.org/10.1007/978-3-030-84238-3_30
2 Problem Description
$$
\nabla \cdot (k(x) \nabla u_j) = f_j \quad \text{on } \Omega. \tag{1}
$$
The source function $f(x)$ is defined as a set of point sources at the source
locations. The state equation for $u(x)$ is solved as many times as there are
sources. At each solve, one source is set to amplitude one and all other sources to
amplitude zero. After each solve, the computed solution is evaluated at the receiver
locations. After $N_s$ state equation solves, $N_s$ vectors $u_j$ of size $N_r$ are available.
Each of these vectors depends on $k(x)$.
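The data generation described above can be illustrated with a small one-dimensional finite-difference sketch. It is not the authors' code (their implementation is referenced in [8]): the unit interval, homogeneous Dirichlet boundary conditions, the discrete point source of strength 1/h at a grid node, the coefficient used in the example, and all function names are assumptions made for illustration.

```python
import numpy as np


def assemble_operator(k_mid, h):
    """Finite-difference matrix of d/dx (k du/dx) on the interior nodes.

    k_mid holds k at the cell midpoints; homogeneous Dirichlet conditions
    are assumed at both ends of the interval.
    """
    main = -(k_mid[:-1] + k_mid[1:]) / h**2
    off = k_mid[1:-1] / h**2
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)


def forward_data(k_func, n_cells, source_idx, receiver_idx):
    """Solve the state equation once per source and sample at the receivers.

    source_idx and receiver_idx are indices into the interior nodes.
    Returns an array of shape (N_s, N_r), one row u_j per source.
    """
    h = 1.0 / n_cells
    x_mid = (np.arange(n_cells) + 0.5) * h
    L = assemble_operator(k_func(x_mid), h)
    data = []
    for s in source_idx:
        f = np.zeros(n_cells - 1)
        f[s] = 1.0 / h  # discrete point source with amplitude one
        u = np.linalg.solve(L, f)
        data.append(u[receiver_idx])
    return np.array(data)


if __name__ == "__main__":
    k = lambda x: 1.0 + 0.5 * np.sin(2.0 * np.pi * x)  # hypothetical coefficient
    d = forward_data(k, n_cells=40, source_idx=[9, 19, 29], receiver_idx=[4, 14, 24, 34])
    print(d.shape)  # (3, 4): N_s vectors of size N_r
```

Each evaluation of the cost functional for a given k requires $N_s$ such solves, which is why the number of PDE solves is a natural measure of computational cost in this setting.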
4 Numerical Results
$$
k(x) = -0.2 \, e^{-72 (x - 0.45)^2}, \tag{2}
$$
Fig. 1 Comparison of the single-level (SL) and multi-level (ML) algorithm by considering the
convergence history of the cost functional as a function of the number of PDE solves in the
one-dimensional problem. The single-level method converges slowly until reaching the basin of
attraction. Subsequently it converges faster. The multi-level method converges super-linearly on
each level
Fig. 2 Comparison of the single-level (SL) and multi-level (ML) algorithm by considering the
convergence history of the cost functional as a function of the number of PDE solves in the two-
dimensional problem. Similar observations as in the one-dimensional problem in Fig. 1
algorithm is seen not to suffer from slow convergence on the finest level. On the
contrary, the method is seen to significantly reduce the cost functional in a limited number
of PDE solves on the coarsest levels in k. The solution from the coarser level provides
a sufficiently good initial approximation for the Newton algorithm to converge
superlinearly on a given level. The multi-level algorithm requires considerably fewer
PDE solves to reach solutions with moderate accuracy. However, to reach the final
solution, both the single-level and multi-level algorithms require the same number of PDE
solves. This is due to the fact that on the finest level both algorithms
converge very fast. Further research is required to circumvent this issue.
Figure 3 shows the convergence of the single-level and multi-level algorithms in
terms of the evolution of the design variables k. The figure shows that the multi-
level algorithm avoids premature small-scale variations in k. The figure also shows
that the single-level and multi-level algorithms converge to the same solution. The same
message is conveyed in Fig. 4 for the two-dimensional problem.
Fig. 3 Comparison of the single-level (SL) and multi-level (ML) algorithm by considering the
convergence history of the design variables k (plotted as K over X), starting from the same initial guess in the one-
dimensional problem. The second column of pictures shows that the ML algorithm avoids
premature small-scale variations in k. (a) SL, iteration 0. (b) SL, iteration 1. (c) SL, 384 PDE
solves. (d) SL, final result. (e) ML, level 1, iteration 0. (f) ML, level 1, iteration 1. (g) ML, 392
PDE solves. (h) ML, final result
Fig. 4 Assumed exact distribution of the design variables k (left) and the solution found by the
single-level (middle) and multi-level algorithm (right)
5 Conclusions
References
1. D. Chavent, Nonlinear Least Squares for Inverse Problems (Springer, New York, 2010)
2. A.R. Conn, N.I. Gould, P.L. Toint, Trust Region Methods (SIAM, Philadelphia, 2000)
3. R. de Moraes, H. Hajibeygi, J.D. Jansen, A multiscale method for data assimilation. Comput.
Geosci. 24, 425–442 (2020)
4. D. Echeverría, P.W. Hemker, Manifold mapping: a two-level optimization technique. Comput.
Vis. Sci. 11(4–6), 193–206 (2008)
5. E. Haber, Computational Methods in Geophysical Electromagnetics (SIAM, Philadelphia,
2014)
6. D. Lahaye, W. Mulckhuyse, Adjoint sensitivity in PDE constrained least squares problems as a
multiphysics problem. COMPEL: Int. J. Comput. Math. Electr. Electron. Eng. 31(3), 895–903
(2012)
7. B. Shachor, Multi-Level Inversion Based On Mesh Decoupling. Master’s thesis, TU Delft,
2019
8. B. Shachor, Multi-Level Inversion Based on Mesh Decoupling for Poisson inverse problems
with Dirichlet BC. https://github.com/Bennyshachor/Multi-Level. Cited 12 Jan 2020
9. A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation (SIAM,
Philadelphia, 2005)
10. C.R. Vogel, Computational Methods for Inverse Problems (SIAM, Philadelphia, 2002)