Vineet PDF

This thesis presents a finite volume algorithm to solve Maxwell's equations on parallel computers. The algorithm is developed to compute the radar cross section of aerospace vehicles. Both data parallel and message passing approaches are used to solve Maxwell's equations on body-conformal curvilinear grids in parallel. Results showing scattering from various targets like a sphere, ogive, ellipsoid and aircraft components agree well with exact solutions, experiments and other codes.

The Pennsylvania State University

The Graduate School


Department of Aerospace Engineering

A FINITE VOLUME TIME DOMAIN METHOD


FOR MAXWELL'S EQUATIONS ON PARALLEL COMPUTERS

A Thesis in
Aerospace Engineering
by
Vineet Ahuja

Submitted in Partial Fulfillment


of the Requirements
for the Degree of

Doctor of Philosophy

December 1995
We approve the thesis of Vineet Ahuja.

Date of Signature

Lyle N. Long
Associate Professor of Aerospace Engineering
Thesis Adviser
Chair of Committee

Philip J. Morris
Boeing Professor of Aerospace Engineering

Raymond J. Luebbers
Professor of Electrical Engineering

Charles L. Merkle
Professor of Mechanical Engineering

Dennis K. McLaughlin
Professor of Aerospace Engineering
Head of the Department of Aerospace Engineering
ABSTRACT

A finite volume algorithm to solve Maxwell's equations on parallel computers
has been developed. The main goal in developing this algorithm is to compute the
Radar Cross Section (RCS) of components of aerospace vehicles. Both a data parallel
and a message passing approach to solving Maxwell's equations on generalized
body conformal curvilinear grids on parallel computers are presented. The 3-D finite
volume algorithm is explicit in nature and is especially suited for the message passing
paradigm. It utilizes a four stage Runge-Kutta time integration method. Integration
of Maxwell's equations is carried out on a dual grid wherein the electric and
magnetic field quantities are evaluated on different grids. The formulation used in
this case is for the scattered field and the Liao boundary condition is used at the outer
non-reflecting boundary. The far field transformation has also been implemented to
evaluate the far zone scattering results. Although both the message passing and data
parallel paradigms have been utilized, the message passing approach has been found
to be more effective. In utilizing the message passing paradigm, a zonal approach is
taken. Each zone is placed on a separate processor and inter-processor communication
is carried out using the Message Passing Library (MPL). The algorithm has been
successfully tested on the SP-2 and the CM-5 computers in solving scattering problems
of electromagnetic waves from various targets. RCS results are presented for the
problem of scattering from a perfectly conducting sphere, a perfectly conducting ogive,
an ellipsoid and the NASA almond. These results are in extremely good agreement
with the exact solution, experimental data and results obtained with a standard
finite-difference time-domain code. Qualitative results are also provided for scattering from
realistic geometries like a metallic trapezoidal wing and an aircraft engine inlet.

TABLE OF CONTENTS

LIST OF FIGURES ..... viii

LIST OF TABLES ..... xii

ACKNOWLEDGMENTS ..... xiii

VITA ..... xiv

1 INTRODUCTION ..... 1
    1.1 Scattering Problems in Electromagnetics ..... 4
    1.2 Advances in Computational Electromagnetics ..... 4
        1.2.1 Structured Grids ..... 8
        1.2.2 Unstructured ..... 9
        1.2.3 Hybrid Methods ..... 10
    1.3 Parallel Computing ..... 11
    1.4 Thesis scope and outline ..... 15

2 NUMERICAL MODEL ..... 17
    2.1 Governing Equations ..... 17
    2.2 The Electromagnetic-Acoustic Analogy ..... 21
    2.3 Integral Form of the Maxwell's Equations ..... 24
    2.4 Scattered Field Formulation ..... 26
    2.5 Dual Grid ..... 27
    2.6 Discretization and Time Integration ..... 29
    2.7 Dispersion, Dissipation and Artificial Dissipation ..... 36
    2.8 Time Step Calculation ..... 41
    2.9 Boundary Conditions ..... 44
        2.9.1 Surface Boundary Condition ..... 44
        2.9.2 Outer Radiation Boundary Condition ..... 47

3 NEAR TO FAR FIELD TRANSFORMATION ..... 55
    3.1 Steady State Transformation ..... 56
    3.2 Time Dependent Transformation ..... 59

4 PARALLELIZATION ISSUES ..... 63
    4.1 Data Parallel and CM Performance ..... 63
    4.2 Message Passing ..... 69
    4.3 Message Passing Libraries ..... 70
    4.4 Implementation in MPL ..... 72
    4.5 Domain Decomposition Strategies ..... 75
    4.6 Message Passing Performance ..... 81
    4.7 Performance Results ..... 83

5 RESULTS ..... 86
    5.1 Wave Propagation ..... 87
    5.2 Scattering from a sphere ..... 88
    5.3 Scattering from an ogive ..... 101
    5.4 Scattering from an ellipsoid ..... 107
    5.5 Scattering from the NASA almond ..... 110
    5.6 Scattering from a trapezoidal wing ..... 120
    5.7 Scattering from an engine inlet ..... 121

6 CONCLUSIONS AND FUTURE RECOMMENDATIONS ..... 129
    6.1 Summary and Conclusions ..... 129
    6.2 Limitations and Drawbacks ..... 131
    6.3 Future Work And Recommendations ..... 132

REFERENCES ..... 133

APPENDIX ..... 143
    A Different forms of Maxwell's equations ..... 143

LIST OF FIGURES

1.1 Scattering mechanisms for a combat aircraft. ..... 2
1.2 Contribution of RCS for a combat aircraft. ..... 3
1.3 Scattering from an object. ..... 5
1.4 A typical Yee Cell. ..... 7
1.5 Schematic of a shared memory system. ..... 12
1.6 Schematic of a local memory system. ..... 13

2.1 Induced currents for decreasing and increasing flux density B. ..... 18
2.2 A typical dual grid cell. ..... 29
2.3 A H-field cell. ..... 30
2.4 A E-field cell. ..... 30
2.5 Surface area calculation. ..... 32
2.6 Illustration of points for flux calculation. ..... 33
2.7 Points contributing to calculation of flux on surface. ..... 34
2.8 Decomposition of a hexahedron cell into five tetrahedra ..... 35
2.9 Dissipation for a Runge-Kutta with central differencing. ..... 37
2.10 A comparison of the dispersion characteristics. ..... 40
2.11 Dissipation for a staggered system. ..... 42
2.12 Flux Evaluation on PEC Surface. ..... 46
2.13 PEC Surface Boundary implementation. ..... 46
2.14 Comparison of the electric field in a plane parallel to direction of incidence ..... 53
2.15 Comparison of the electric field in a plane perpendicular to direction of incidence ..... 54

3.1 Staircasing errors in scattering from a sphere. ..... 57
3.2 Schematic for farzone potential computation. ..... 60

4.1 Serial-data parallel far field comparison. ..... 68
4.2 Processor Domain Decomposition. ..... 72
4.3 Message Passing Algorithm. ..... 75
4.4 Schematic of 1-D domain decomposition. ..... 76
4.5 Schematic of 2-D domain decomposition. ..... 77
4.6 Schematic of 3-D domain decomposition. ..... 78
4.7 Zonal grid with complex arrangement of zones. ..... 79
4.8 Linkage between zones in a disorderly manner. ..... 80
4.9 Processor performance scaleup. ..... 84

5.1 T-M Wave propagation Ez component. ..... 88
5.2 T-M Wave propagation Ez component. ..... 89
5.3 T-M Wave propagation Ez component. ..... 89
5.4 T-M Wave propagation Ez component. ..... 89
5.5 T-M Wave propagation Ez component. ..... 90
5.6 T-M Wave propagation Ez component. ..... 90
5.7 T-M Wave propagation Hy component. ..... 90
5.8 T-M Wave propagation Hy component. ..... 91
5.9 T-M Wave propagation Hy component. ..... 91
5.10 T-M Wave propagation Hy component. ..... 91
5.11 T-M Wave propagation Hy component. ..... 92
5.12 T-M Wave propagation Hy component. ..... 92
5.13 T-M Wave propagation - comparison with Yee scheme. ..... 92
5.14 T-M Wave propagation - comparison with Yee scheme. ..... 93
5.15 T-M Wave propagation - comparison with Yee scheme. ..... 93
5.16 T-M Wave propagation - comparison with Yee scheme. ..... 93
5.17 T-M Wave propagation - comparison with Yee scheme. ..... 94
5.18 Grid for the sphere. ..... 95
5.19 Points of near zone comparison for the sphere. ..... 96
5.20 Scattered field comparison for the sphere 0.02m and 0.04m in the backscatter direction. ..... 98
5.21 Scattered field comparison for the sphere at  = 90 and  = 180 in the backscatter direction. ..... 99
5.22 RCS for the sphere. ..... 100
5.23 Scattering from the sphere in a plane perpendicular to the direction of incidence. ..... 102
5.24 Scattering from the sphere in a plane parallel to the direction of incidence. ..... 103
5.25 Grid for the ogive. ..... 104
5.26 View of a skewed cell along the axis of symmetry. ..... 105
5.27 RCS for the ogive. ..... 106
5.28 Surface grid for ellipsoid. ..... 107
5.29 Planar slices for the ellipsoid. ..... 108
5.30 Backscatter RCS comparison for the ellipsoid and ogive. ..... 109
5.31 Scattering of an ellipsoid with time. ..... 111
5.32 Schematic of the NASA almond. ..... 112
5.33 Outer grid for the NASA almond. ..... 114
5.34 Surface grid for the NASA almond. ..... 115
5.35 Grid depicting loss of orthogonality around the NASA almond. ..... 116
5.36 Depiction of incidence for the NASA almond. ..... 116
5.37 RCS for the NASA almond - VV Polarization. ..... 117
5.38 RCS vs frequency for the NASA almond - VV Polarization. ..... 118
5.39 Ez Scattered Fields for the NASA almond at 3 GHz. ..... 119
5.40 Grid for the trapezoidal wing. ..... 122
5.41 Scattered Ez fields for the trapezoidal wing. ..... 123
5.42 Scattered H fields for the trapezoidal wing. ..... 123
5.43 Scattered Ez fields for the trapezoidal wing. ..... 124
5.44 Scattered Hy fields for the trapezoidal wing. ..... 124
5.45 Grid for the engine inlet. ..... 126
5.46 Zonal Distribution for the engine inlet. ..... 127
5.47 Scattered Ez fields for different zones of the engine inlet. ..... 127
5.48 Scattered Ez field for a planar slice of the engine inlet. ..... 128

LIST OF TABLES

1.1 Penn State CM-200 and NASA Ames CM-5 characteristics ..... 15
1.2 Penn State and MHPCC SP-2 characteristics ..... 15

2.1 Isomorphism of variables between an acoustic plane wave and a linearly polarised electromagnetic wave ..... 24

4.1 Performance of Message Passing Libraries on the SP-2 ..... 71
4.2 Message Passing Performance without farzone and dissipation ..... 81
4.3 Message Passing Performance with dissipation without farzone ..... 82
4.4 Message Passing Performance with farzone without dissipation ..... 83
4.5 Performance of Codes in secs/node/timestep ..... 85
4.6 Grid Point Comparison for the Ogive ..... 85

ACKNOWLEDGMENTS

I would like to start by thanking Dr S Venkateswaran for injecting the confidence
in me to start working in a field I had little knowledge about. Words fail me in
expressing my heartfelt gratitude to Prof L.N. Long for his encouragement, advice,
support, guidance and patience during the course of this work. I would also like to
thank Prof R.J. Luebbers for his many useful suggestions that have been invaluable
in charting the direction of this thesis. Many thanks are also due to Prof P.J. Morris
and Prof C.L. Merkle for their comments and suggestions.
I would also like to extend my thanks to members of the Computational
Aero-Acoustics (CAA) group at Penn State for their input in aiding this work.
I cannot help but applaud the computational support provided by Kevin
Morooney and Jeffrey J Nucciarone of the Center for Academic Computing at Penn
State.
My sincerest appreciation to all my office-mates, past and present, for their
help, support and patience in putting up with my unpredictable, iconoclastic nature.
In particular, I would like to mention Mr Yusuf Ozyoruk and Major Jeff Little for the
technical assistance they provided me with.
Finally, I would like to thank Mr Joe Schuster and Mr Branko Kosovic for
reviewing my thesis and for their technical collaboration during the entire course of
this thesis.

VITA

Vineet Ahuja was born on the banks of the river Yamuna in Delhi, India on
September 19, 1966. In June 1988, he obtained a B.A.(hons) degree from St. Stephens
College, Delhi University. He graduated from the Indian Institute of Technology, in
Bombay, with a Master of Science in Applied Mathematics in July 1990.

Chapter 1

INTRODUCTION

Five decades ago, the ENIAC computer, armed with its vacuum tubes and relay
memories, did battle in the fields of fixed-point arithmetic. Five relentless revolutions
later, Massively Parallel Processing (MPP) marches on to combat the grand challenge
applications. The evolution of the computer from a serial mechanical device to a
multitude of high speed processors working in unison has led to rapid advances in the
field of scientific computing. Problems exhibiting a fair amount of complexity like
weather forecast modelling, artificial intelligence, design of VLSI circuits, turbulence
modelling involving Large Eddy Simulation, and stealth technology are being tackled
with existing computer technology [1].

Like all the other grand challenge problems, stealth and vehicle signature
technology requires system speeds approaching the teraflop regime and a memory
capacity of hundreds of gigabytes [2], [3], [4]. Fundamental to stealth technology is the
scattering problem in electromagnetics, which has generated a fair amount of interest
since Lord Rayleigh [5] investigated the scattering of electromagnetic waves from a
sphere. Since Radar Cross Section (RCS) is physically described as the fraction of
the electromagnetic radiation scattered back to the radar with respect to the
impinging radar signals, scattering processes have become important in the identification of
objects. Radar based identification of objects based on the scattering patterns that
emanate from them has widespread uses in military and space exploration. Reduction
of the RCS of fighter aircraft, missiles and spy satellites is of paramount importance to
the military. Enhancing the RCS of vehicles that need to be tracked, like transport
airplanes, ships and satellites for remote sensing, is an important application of the
scattering process. Scattering and reflection of electromagnetic waves from airport
hangars and buildings that cause interference and adversely affect radar performance
also need to be analysed. The control of RCS has become a challenging and important
problem, particularly for military applications. Electromagnetic scattering from targets
like full aircraft configurations is a complicated process involving various scattering
mechanisms [6], [7]. Figure 1.1 depicts some of the typical scattering mechanisms for
a fighter aircraft and figure 1.2 shows the RCS contribution of the various
parts of the aircraft. As is evident, prediction of an aircraft's radar signature is an
extremely complicated task. Approximate analytical methods are unsuitable for the
estimation of all the scattering related phenomena and the use of numerical methods
is warranted.

Figure 1.1. Scattering mechanisms for a combat aircraft.

Figure 1.2. Contribution of RCS for a combat aircraft.

In order to obtain accurate and reliable results from complex shaped targets
at realistic frequencies we need a high resolution in our numerical approximations,
thereby making both computational speed and memory important issues. Coupled
with the gain in computing speed associated with parallel computing is the
large increase in working memory. This has enabled researchers in Computational
Fluid Dynamics (CFD) and Computational Electromagnetics (CEM) to model the
fields around complicated three-dimensional bodies involving extremely large
computational domains. For the development of stealth type technology it is imperative
that calculations related to the radar cross section (RCS) of full aircraft configurations
be feasible. The emphasis of this thesis lies in exploiting the parallelism of available
computer technology and adapting it to solve electromagnetic wave propagation
problems around complex bodies.

The next part of this chapter is devoted to scattering problems in
electromagnetics (section 1.1). In section 1.2 the algorithms that are presently used in
Computational Electromagnetics (CEM) are discussed. Section 1.3 deals with parallel
computing and its impact on electromagnetics.


1.1 Scattering Problems in Electromagnetics


The level of complexity of practical problems related to the scattering
phenomenon makes computational methods a unique tool in tackling such problems. In
this section a brief description of a typical scattering problem is provided.

A typical electromagnetic scattering problem consists of the target or scatterer
that is embedded in the computational domain. In reality, a transmitter located at
a large distance from the target directs an incident beam at the scatterer.
Computationally, however, it would be very expensive and inefficient to model the whole
domain between the transmitter and the target. Hence the scatterer is analytically
excited, leading to induced surface currents being set up on the surface of the scatterer
and the scattered waves emanating from it (fig 1.3). In most cases, the receiving or
collecting apparatus is also located very far away from the target.

Since it would be equally expensive and inefficient to have the computational
domain extend to the receiving radar, the computational domain is truncated at a
reasonable distance from the scatterer and the radiated fields in the far-field are
computed with the help of a Green's function. This is usually achieved by defining
a closed surface that encloses the target configuration and computing the scattered
fields and the associated currents on this imaginary surface. Then the scattered fields
in the far-zone are calculated by using a near-to-far-zone transformation, utilizing the
calculated fields on the imaginary surface.
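
Once the far-zone scattered field is available, the RCS follows from it. As a reminder, the conventional textbook definition is given below. The symbols are assumptions introduced here for illustration and are not taken from this section: R is the distance from the target to the observer, E^s is the scattered electric field at the observer, and E^i is the incident electric field at the target.

```latex
\sigma \;=\; \lim_{R \to \infty} 4 \pi R^{2}\,
\frac{\lvert \mathbf{E}^{s} \rvert^{2}}{\lvert \mathbf{E}^{i} \rvert^{2}}
```

The limit expresses that the observer is far enough away for the scattered field to decay as 1/R, so that the resulting value, which has units of area, no longer depends on R.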

1.2 Advances in Computational Electromagnetics


The rapid growth of computational resources in the past four decades has led
to the increased use of Computational Electromagnetics (CEM) in solving problems
related to electromagnetic interactions. Among the common methods used to analyse these
problems are integral equation techniques like the Method of Moments [8] [9], time
domain integration methods like Finite Difference (FDTD) [10] [11] [12] [13], Finite
Volume (FVTD) [14] [15] [16] [17], Finite Element [18] [19] [20] and high frequency
approximate analysis like the Geometric Theory of Diffraction (GTD) [21], [22], [23].
Typical problems that are solved by the above mentioned methods include problems of
Radar Cross Sections (RCS), antenna design, microwave circuits, bio-electromagnetic
analysis, power generation and transmission, and shielding.

Figure 1.3. Scattering from an object. (Schematic labels: computational domain, incident wave, target, scattered waves, farzone surface, outer radiation boundary condition.)

The origin of the use of time domain calculations to solve problems of wave
propagation, scattering and radar cross-section can be traced to Yee. In 1966, Yee
[13] presented an algorithm that provided a numerical solution to an initial boundary
value problem involving Maxwell's equations in the time domain for an isotropic
medium. This novel approach of utilizing digital computers to solve Maxwell's equations
in the time domain deviated from the trend of solving a set of ordinary differential
equations in the frequency domain. Yee's algorithm, as it is now known, utilised the
leapfrog method to integrate the equations on a completely staggered grid, i.e. all
the electric field and magnetic field components were located at completely different
positions on the grid. The electric field components lie in the middle of the edges
and the magnetic field components are located at the center of the faces (fig 1.4).
Yee's approach is notable in that it was the first paper that used finite differences in
electromagnetics, thus giving birth to the discipline that came to be known as
Computational Electromagnetics (CEM). The staggered nature of his scheme is especially
suited to scattering type problems as it helps in easily evaluating boundary conditions
at the surface of the scatterer. This is primarily because only those field components
lie on the surface for which the boundary conditions can be analytically determined.
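
The staggered leapfrog idea described above can be sketched in one dimension. The code below is a minimal illustration, not code from this thesis: it assumes normalized units (wave speed 1, fields scaled so the two update equations are symmetric), perfectly conducting walls at both ends, and a Gaussian initial pulse. The function name `yee_1d` and all parameters are hypothetical.

```python
import numpy as np

def yee_1d(n_cells=200, n_steps=50, courant=1.0):
    """Leapfrog update of Ez and Hy on a staggered 1-D grid (normalized units).

    Ez lives on the integer nodes i*dx, Hy on the half nodes (i + 1/2)*dx,
    and the two fields are advanced at alternating half time steps.
    courant = c*dt/dx must not exceed 1 for stability in 1-D.
    """
    x = np.arange(n_cells + 1)
    ez = np.exp(-((x - n_cells / 2) / 8.0) ** 2)  # initial Gaussian pulse in Ez
    ez[0] = ez[-1] = 0.0                          # tangential E vanishes on PEC walls
    hy = np.zeros(n_cells)                        # Hy starts at zero

    for _ in range(n_steps):
        # Advance Hy by one step using the spatial difference of the neighbouring Ez nodes.
        hy += courant * (ez[1:] - ez[:-1])
        # Advance interior Ez using the freshly updated Hy; the wall values stay zero.
        ez[1:-1] += courant * (hy[1:] - hy[:-1])
    return ez, hy

if __name__ == "__main__":
    ez, hy = yee_1d()
    print("peak |Ez| after splitting:", np.abs(ez).max())
```

Because only Ez sits on the wall nodes, the conducting boundary condition is imposed by leaving two array entries at zero, which is the simplicity the paragraph above attributes to the staggered arrangement. The initial pulse separates into two counter-propagating pulses of roughly half the original amplitude.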

Some of the advantages that accrue from using Yee's leapfrog algorithm include
the use of a uniform size staggered mesh, thereby impeding the production and
propagation of spurious waves, efficiency for parallel computation and simplicity in
specifying boundary conditions on the surface of targets. Over the past three decades, Yee's
algorithm has been applied to various types of scatterers with complicated shapes
and materials [11]. However, this scheme has some serious pitfalls [24]. It is prone to
staircasing errors and requires the use of a large number of grid points for geometries
like the ogive and the NASA almond that have a high degree of curvature or have
corners and sharp edges included in them. In fact, FDTD has difficulty modelling
low RCS geometries, which are often of most interest.

Figure 1.4. A typical Yee Cell. (Labels: Ex, Ey, Ez, Hx, Hy, Hz; z axis.)

In order to overcome the disadvantages of Yee's algorithm, many different methods
have been used to carry out a direct time integration of Maxwell's equations on various
configurations of spatial grids. Apart from the emergence of various finite difference
(FDTD) algorithms there has been considerable development in finite volume type
algorithms (FVTD) [14], [15], [16], [17], that directly exploit the integral form of
Maxwell's equations. In fact, some of the algorithms [2], [17] utilised in obtaining
solutions to Maxwell's equations are a result of the close collaboration of researchers
in Computational Fluid Dynamics (CFD) and Computational Electromagnetics (CEM).
Although the numerous algorithms can be classified in many different ways, the
emphasis here will be on differentiating between algorithms based on the type of mesh
they employ. By doing this, we gain the advantage of being able to encompass all the
different finite difference, finite volume and finite element models that are being
utilised. A mesh based classification would include three broad categories of meshes,
namely: structured grids, unstructured grids and hybrid grids.

1.2.1 Structured Grids


Cartesian
In this category are considered algorithms that essentially use a Cartesian grid.
The grid close to a curved surface is stair-stepped in nature. The Yee algorithm uses
such a grid. Such grids are ideal for wave-propagation problems: since they have a
uniform cell size, no spurious high frequency waves are generated due to grid clustering
[25]. However, the stair-stepped nature of the grid close to the surface of the body
does create spurious waves. Several investigators [26], [3], [27] have started using
body-fitted grids close to the surface of the body and move to Cartesian grids further
away. Most of the algorithms considered in this category are completely staggered
in nature and computationally very efficient. Comparatively, they require far less
computational memory and CPU time.
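
A purely geometric illustration of why stair-stepping matters is sketched below; it is not taken from this thesis and says nothing about the spurious waves themselves. It assumes a circle of radius 1 represented by the Cartesian cells whose centers fall inside it. The function name `staircased_circle` and its parameters are hypothetical. As the grid is refined the enclosed area approaches the exact value, but the length of the stair-stepped boundary stays near 8r instead of approaching the true circumference 2πr.

```python
import numpy as np

def staircased_circle(radius=1.0, h=0.1):
    """Return (area, boundary_length) of a circle stair-stepped on a grid of spacing h."""
    n = int(np.ceil(radius / h))
    centers = (np.arange(-n, n) + 0.5) * h            # cell-center coordinates
    cx, cy = np.meshgrid(centers, centers, indexing="ij")
    inside = (cx**2 + cy**2 <= radius**2).astype(int)  # 1 for cells kept as "body"

    # Surround the grid with outside cells, then count every cell face that
    # separates an inside cell from an outside cell along each axis.
    padded = np.pad(inside, 1)
    faces = (np.abs(np.diff(padded, axis=0)).sum()
             + np.abs(np.diff(padded, axis=1)).sum())

    return inside.sum() * h * h, faces * h

if __name__ == "__main__":
    for h in (0.1, 0.05, 0.01):
        area, length = staircased_circle(1.0, h)
        print(f"h={h}: area={area:.4f}, boundary length={length:.4f}")
```

The surface of a stair-stepped body is therefore not a convergent approximation of the true surface in this sense, which is one way to see why refining a Cartesian grid alone does not remove staircasing errors and why body-fitted grids are attractive.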

Body Conformal Grids

Body-fitted structured grid algorithms use curvilinear orthogonal grids that
map the surface of the body exactly. In this case there exists a transformation between
the physical domain and the computational domain through a set of metrics. The
use of body-fitted grids in Computational Electromagnetics evolved as a result of
technology transfer from Computational Fluid Dynamics. There are several finite
difference and finite volume methods that are employed in solving CEM problems.
The algorithms used by Shankar [2] and Shang [17] are characteristic based methods
that use one-sided differencing based on the direction of wave propagation. Shang
[17] has used a flux-vector splitting Alternating Direction Implicit (ADI) scheme
by Douglas and Gunn for his three-dimensional characteristic formulation. Shankar
[2] uses Osher's Riemann solver for flux linearization with a Lax-Wendroff scheme.
Both of the above approaches use collocated E and H field points, hence failing to
exploit the duality of Maxwell's equations. Goorjian [10] proposed that a staggered
formulation could be used on curvilinear body-fitted grids. However, the nature of the
staggering is different from Yee staggering (fig 1.4) in that all the E-field components
are collocated, as are all the H-field components. Such a staggering procedure keeps
the advantages of Yee staggering with respect to simplicity in implementation of
the boundary conditions, without the limitation of a uniform Cartesian mesh and
staircasing errors. Noack and Anderson [15] extended the Goorjian staggering to a
finite volume formulation. Body conformal structured algorithms are rather efficient
in computational time and memory as compared to unstructured or hybrid methods.
They are relatively easy to parallelize and load balancing between processors is not
a problem.
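
The one-sided differencing used by the characteristic based methods cited above can be illustrated on the simplest possible model. The sketch below is an assumption-laden stand-in, not the scheme of Shankar or Shang, which involve Riemann solvers and flux-vector splitting: it applies first-order upwind differencing to the scalar advection equation u_t + c u_x = 0 on a periodic grid. The function name `upwind_step` is hypothetical.

```python
import numpy as np

def upwind_step(u, c, dt, dx):
    """Advance u_t + c*u_x = 0 by one time step with first-order upwind differencing.

    The difference stencil is chosen from the side the wave comes from:
    backward when c >= 0, forward when c < 0. The grid is periodic.
    """
    nu = c * dt / dx  # signed Courant number; |nu| <= 1 is needed for stability
    if c >= 0:
        return u - nu * (u - np.roll(u, 1))   # uses u[i] and u[i-1]
    return u - nu * (np.roll(u, -1) - u)      # uses u[i+1] and u[i]

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 100, endpoint=False)
    u0 = np.exp(-200.0 * (x - 0.5) ** 2)
    u1 = upwind_step(u0, c=1.0, dt=0.005, dx=0.01)   # Courant number 0.5
    print("peak before:", u0.max(), "peak after one step:", u1.max())
```

At a Courant number of exactly one the update reduces to shifting the profile by one cell in the propagation direction; at smaller Courant numbers the scheme is stable but smears the profile, which is the numerical dissipation that characteristic based schemes trade for robustness.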

1.2.2 Unstructured
Unstructured grid methodology has become very popular in the CFD
community in the last few years, mainly due to the relative ease with which unstructured grids
can be generated around complex configurations [28], [29]. Unstructured grid solvers
are replacing structured solvers for steady state aerodynamic problems [30] [31] [32],
particularly those that need to solve the Euler equations [33] [34] [30]. Since Maxwell's
equations can be readily put in the form of the Euler equations, unstructured grids
seemed to be promising in CEM too. However, unstructured methodology in CFD is
still largely restricted to steady state problems and there has been little
development for time dependent wave propagation problems. This is largely attributed
to problems of flux reconstruction and accuracy. Although there are tremendous time
savings in generating unstructured grids, they seem to be offset by the fact that the
computer time required for unstructured flow solvers is considerably more than that
required by structured solvers. Unstructured grid methods also require more memory
because of the storage of grid connectivity information between the
faces and vertices of the grid cells. Many finite volume and finite element models [19],
[20], [35] fall in this category. However, most of the finite volume methods employed
here have problems of accuracy, especially for time dependent problems. The
algorithms in this category are computationally very intensive. There is an additional
overhead cost of maintaining connectivity arrays and, in the case of finite element
methods, of inverting large matrices.

1.2.3 Hybrid Methods


Hybrid methods refer to methods that utilize an unstructured mesh or a
conformal mesh close to the surface of the scatterer and use Cartesian grids in the
surrounding area. The fields are computed at the interface between the two grid
systems by interpolation. Yee and Chen [26] have used a prismatic grid near the surface
of the scatterer and have transitioned to a regular rectangular grid away from the
scatterer. At each time step a double interpolation needs to be carried out from the
data of one grid to move to the other. Accuracy is presumably different on the two grid
systems and little has been done so far in terms of analysis of the propagation
of errors between grid systems. Recently Riley and Turner [27] have used a finite
volume hybrid technique where they have terminated the unstructured mesh on the
rectangular mesh, thereby removing the need to carry out any sort of interpolation.
However, grid generation in itself has become a tedious process, and the advantage
gained by using an unstructured mesh seems to be lost in the process.
Parallelization of these hybrid methods is quite difficult and load balancing becomes an issue of
serious concern.

1.3 Parallel Computing


Computer technology has come a long way from the days of mechanical levers
to the use of thousands of high speed processors working cohesively in parallel.
Historically speaking, the fourth generation of computers is largely responsible for the
introduction of parallel computing. Representative of this generation are machines
like the VAX/9000, Cray X-MP, and the IBM 3090. During the late 1980's parallelism
in computers was mostly sought in the development of vector supercomputers.
The Cray Y-MP, NEC SX3/14 and the Fujitsu VP2600/10 are some of the products
of this time. The early 1990's have seen the emergence of massively parallel machines
with the Thinking Machines CM-5, MasPar MP-2, the Intel Paragon, the Cray T3D,
and IBM's SP class of machines. Flynn's taxonomy categorises computers into four
basic types:
• SISD - Single Instruction Single Data
• SIMD - Single Instruction Multiple Data
• MISD - Multiple Instruction Single Data
• MIMD - Multiple Instruction Multiple Data
Although the third category is purely academic and the SISD category represents
a truly serial machine, most systems that exhibit some degree of parallelism can be
placed in the SIMD or MIMD classes. On most SIMD machines different data are
stored on each node or processor and the same instruction is applied on all the
processors simultaneously, each processor operating on its own data. In the case
of MIMD machines several different instructions can be processed simultaneously on
different data distributed among the processors [1], [36]. Although Flynn's taxonomy
is the most conventional way of classifying machines, memory based organization
is more convenient for parallel computers. Parallel machines are normally classified
either as shared memory or distributed memory systems. Shared memory systems
are defined as systems consisting of processing elements and memory blocks, wherein
there exists an interconnecting system that permits all processors to be linked to
memory (Fig 1.5). Shared memory systems do not scale well beyond about 16 processors.

Figure 1.5. Schematic of a shared memory system. Processors P1, P2, ..., Pi, ..., Pn
are linked through a connecting network to memory blocks MB1, MB2, ..., MBm.

Distributed or local memory systems (Fig 1.6) on the other hand, have multiple
processing elements each with its independent memory.

Figure 1.6. Schematic of a local memory system. Each processor P1, P2, ..., Pi, ..., Pn
has its own memory block MB1, MB2, ..., MBi, ..., MBn, and the nodes are linked by a
connecting network.


In the recent past, however, this distinction between shared and local memory
systems has become less well-defined since processors on shared memory systems have
large cache memory, thereby exhibiting a structure akin to local memory systems.
However, there are some other key differences between shared and local memory
systems. Shared memory machines provide adequate parallelizing support for code
development. Partitioning data and parallelizing programs is
more complicated on local memory systems, although local memory systems provide
more control to the user. Distributed systems also offer the flexibility of expansion,
since more processing elements can be added.
Parallel algorithms can be classified under either data parallel or message passing
paradigms. Message passing algorithms are mostly suited for distributed memory
systems. Message passing is especially suited for a zonal grid algorithm where different
zones can be placed on different sets of processors and the whole computational
domain can be divided up in a systematic manner. Message passing algorithms are
also computationally very efficient for problems in CEM and CFD as the user has
control in decomposing the domain, and the decomposition can easily be done such that
boundaries are equally divided between processors to achieve near perfect load
balancing. Data parallel type algorithms are better suited to shared memory systems,
although not restricted to them. Data parallel algorithms do not show any specific
preference for zonal grid type algorithms. However, data parallel type algorithms
carry a large overhead cost with processes that are inherently serial in nature.
Programming using a data parallel approach is relatively easy as data partitioning is
usually transparent to the user.
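
The load balancing argument for message passing can be illustrated with a small sketch. It assumes a one-dimensional block decomposition of grid planes among processors; the function names `block_partition` and `load_imbalance` are hypothetical and are not taken from the thesis code.

```python
def block_partition(n_planes, n_procs):
    """Split n_planes grid planes into n_procs contiguous blocks.

    Block sizes differ by at most one plane, which is what keeps the
    work per processor nearly equal. Returns (start, stop) index pairs.
    """
    base, extra = divmod(n_planes, n_procs)
    bounds = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional plane each.
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def load_imbalance(bounds):
    """Ratio of the largest block to the average block (1.0 is perfect)."""
    sizes = [stop - start for start, stop in bounds]
    return max(sizes) / (sum(sizes) / len(sizes))


if __name__ == "__main__":
    parts = block_partition(100, 8)
    print(parts)
    print(load_imbalance(parts))
```

In a zonal algorithm each block would additionally exchange one layer of boundary data with its neighbours at every stage, so the message volume depends on the block surfaces rather than on the block sizes.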
In the recent past, considerable effort has been made in applying both
the data parallel and message passing paradigms to FDTD/FVTD problems
in computational electromagnetics. In [37] parallelization issues related to the Yee
algorithm are discussed for the data parallel paradigm. Parallelization issues have also
been discussed for FVTD type algorithms in [14] for the CM-200/CM-5 machines.
Certain drawbacks of the data parallel paradigm are the overhead costs incurred in
performing the near to far field transformation and in computing the outer
radiation boundary condition. Shang et al. [38] address the domain decomposition
strategies for message passing type electromagnetic algorithms. Rowell et al. [16] have
also incorporated a zone based parallel strategy on the Intel Paragon for their flux
difference algorithm. Nguyen and Hutchinson [12] have used the bicharacteristic form
of Maxwell's equations on parallel architectures supporting message passing. In
[39] the authors have presented a dual grid based approach where the message passing
is reduced because only the E-field variables need to be passed across
processors, thus making the algorithm very efficient on parallel architectures that
support message passing.
Code development for the data parallel part of the current work was done on
the Thinking Machines Corporation CM-200 computer at The Pennsylvania State
University and most of the results shown herein were obtained on the CM-5 at NASA
Ames Research Center. The CM-200 is a SIMD type machine consisting of Weitek
floating point processors while the CM-5 has vector unit processors and can function
in both the SIMD and MIMD modes.

Table 1.1. Penn State CM-200 and NASA Ames CM-5 characteristics
Machine    No. of Processors    Peak Speed        Memory
CM-200     64                   1.2 Gigaflops     0.256 GBytes
CM-5       128                  16.4 Gigaflops    4.096 GBytes

Message passing code development was done on the IBM 9076 Scalable POWERparallel
system (SP-2), which comprises a group of RISC System/6000 processors
connected by a low latency, high bandwidth switch. For the purpose of the present
work both the SP-2 at The Pennsylvania State University and that at the Maui High
Performance Computing Center (MHPCC) were utilized.

Table 1.2. Penn State and MHPCC SP-2 characteristics
Machine       Nodes                     Memory
PSU-SP2       32 RS/6000 model 390      128 MBytes/proc
              16 RS/6000 model 370      128 MBytes/proc
              1 RS/6000 model 590       1 GByte/proc
MHPCC-SP2     400 RS/6000 model 590     64-1024 MBytes/proc

1.4 Thesis scope and outline


The main goal of this work is to develop a generic algorithm that

• solves scattering problems for a wide cross-section of scatterers including those
  that have sharp corners and curvature.

• avoids the production of stair-stepping errors that are typical of Yee's leapfrog
  algorithm.

• is able to take advantage of the dual nature of Maxwell's equations.

• is able to exploit the parallelism in existing computer technology.

• has favourable dispersion and dissipation characteristics.

The governing equations, the numerical algorithm, and the spatial and temporal
discretization are discussed in Chapter 2. This chapter also deals with issues regarding
accuracy and numerical dispersion and dissipation errors. Boundary conditions are
treated in the same chapter: the surface boundary conditions are derived and
discussed, and the outer radiation boundary condition is dealt with.
Chapter 3 discusses the near-to-far-zone transformation and the post-processing
that is required to calculate the Radar Cross Section of targets.
Chapter 4 is devoted to the parallelization of the algorithm. Both the message
passing and data parallel implementations are discussed in detail.
Chapter 5 describes the scattering results for different bodies. Comparisons of
RCS with exact solutions and other FDTD computations are shown in this part of the
thesis.
Chapter 6 summarizes the work described in this thesis and presents recommendations
for future work.
Chapter 2

NUMERICAL MODEL

The numerical algorithm is discussed in this chapter. The formulation, governing
equations, discretization, time integration and the nature of the algorithm are
described in detail. Section 2.1 deals with the theory of Maxwell's equations. In
section 2.2 an analogy is developed between the electromagnetic and acoustic variables.
The implementation of Maxwell's equations in the integral form is shown in section
2.3. Since the algorithm uses a scattered field formulation, the advantages of using
such an approach compared to the total field formulation are discussed in section 2.4.
Section 2.5 explains the dual grid concept around which the algorithm is based and
expounds on the difference in staggering between the dual grid concept used here and
the staggering used by Yee [13] and other conventional FDTD methods. Section 2.6
is devoted to Runge-Kutta time integration, the finite volume formulation used, and
the discretization of the equations. In section 2.7 the dispersion and dissipation
characteristics of the scheme are discussed. The implementation of artificial dissipation
is also addressed in section 2.7. Finally, section 2.8 deals with the calculation of the
timestep used in the algorithm.

2.1 Governing Equations


A brief review of the derivation of Maxwell's equations is provided in this
section. A more comprehensive review can be found in most standard textbooks on
electromagnetic theory [40], [41], [42]. The equations are then recast in the integral
form in which they are used in the numerical algorithm.

Figure 2.1. Induced currents I in wire loops for magnetic flux density B decreasing
with time and increasing with time.

The first of Maxwell's equations originates from Faraday's law, which states that
the total EMF induced in a closed circuit is equal to the time rate of decrease of the
total magnetic flux linking the circuit

$$V = -\frac{d\phi_m}{dt} \tag{2.1}$$

The negative sign in the equation forces the relationship between EMF and current
directions in accordance with the right hand rule (Fig 2.1).
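However, the total magnetic flux associated with the circuit can be written as

$$\phi_m = \iint_S \mathbf{B} \cdot d\mathbf{S} \tag{2.2}$$

and

$$V = \oint \mathbf{E} \cdot d\mathbf{l} \tag{2.3}$$

where the integration in 2.3 is over the closed path that bounds the surface S. Combining
equations 2.1, 2.2 and 2.3 we obtain

$$\oint \mathbf{E} \cdot d\mathbf{l} = -\frac{\partial}{\partial t} \iint_S \mathbf{B} \cdot d\mathbf{S} \tag{2.4}$$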

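The line integral on the left hand side of equation 2.4 can be viewed as the electric
circulation around the closed path that bounds the surface S.
Applying Stokes' theorem to the left hand side of equation 2.4 we have

$$\iint_S (\nabla \times \mathbf{E}) \cdot d\mathbf{S} = -\frac{\partial}{\partial t} \iint_S \mathbf{B} \cdot d\mathbf{S} \tag{2.5}$$

Since equation 2.5 holds for an arbitrary surface S, the differential form of Maxwell's
first equation can be written as

$$\frac{\partial \mathbf{B}}{\partial t} + \nabla \times \mathbf{E} = 0 \tag{2.6}$$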

Maxwell's second equation can be derived from Ampere's law, which states that the
line integral of the magnetic field around a closed contour is equal to the sum of the
conduction and displacement currents enclosed.

$$\oint \mathbf{H} \cdot d\mathbf{l} = \iint_S (\mathbf{J}_{cond} + \mathbf{J}_{disp}) \cdot d\mathbf{S} \tag{2.7}$$

The conduction current can be written as

$$\mathbf{J}_{cond} = \sigma \mathbf{E} \tag{2.8}$$

for a medium having finite electrical conductivity $\sigma$.


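
And the displacement current through the surface is given as

$$\mathbf{J}_{disp} = \frac{\partial \mathbf{D}}{\partial t} \tag{2.9}$$

Combining the above equations 2.7, 2.8 and 2.9 we have

$$\oint \mathbf{H} \cdot d\mathbf{l} = \iint_S \left( \mathbf{J}_{cond} + \frac{\partial \mathbf{D}}{\partial t} \right) \cdot d\mathbf{S} \tag{2.10}$$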

The left hand side of equation 2.10 represents the magnetic circulation around the
closed contour that bounds the surface S.
In an analogous fashion to Maxwell's first equation, by using Stokes' theorem
on the left hand side of equation 2.10 we have

$$\iint_S (\nabla \times \mathbf{H}) \cdot d\mathbf{S} = \iint_S \left( \mathbf{J}_{cond} + \frac{\partial \mathbf{D}}{\partial t} \right) \cdot d\mathbf{S} \tag{2.11}$$

Hence Maxwell's second equation can be written in differential form as

$$\frac{\partial \mathbf{D}}{\partial t} + \mathbf{J}_{cond} = \nabla \times \mathbf{H} \tag{2.12}$$

The final two equations of Maxwell stem from Gauss's law relating to the conservation
of the electric and magnetic flux densities over a closed surface.
The total electric flux through a closed surface is equal to the volume integral
of the charge density $\rho$ enclosed by that surface

$$\iint_S \mathbf{D} \cdot d\mathbf{S} = \iiint_V \rho \, dV \tag{2.13}$$

The differential form can be obtained from the above equation 2.13 by applying the
divergence theorem to the term on the left hand side and assuming an infinitesimal
volume element

$$\nabla \cdot \mathbf{D} = \rho \tag{2.14}$$

The corresponding analog for the magnetic flux density is

$$\iint_S \mathbf{B} \cdot d\mathbf{S} = 0 \tag{2.15}$$

and the corresponding differential form can be written as

$$\nabla \cdot \mathbf{B} = 0 \tag{2.16}$$

Although Maxwell presented his equations in a rather cumbersome form
in his treatise [43], the elegant collection of equations 2.6, 2.12, 2.14 and 2.16 is
universally known as Maxwell's equations. It should be noted here that the
divergence equations 2.14 and 2.16 are implied by the curl equations 2.6 and 2.12,
provided that they are satisfied by the initial fields and that charge is conserved. It
would be interesting to digress from pure electromagnetic theory and draw parallels
between the theories of electromagnetic and acoustic waves. The next section is
devoted to relating the primitive electromagnetic variables to those used in acoustics.
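
The relation between the divergence and curl equations can be checked in one step. The following is a sketch only; it assumes the charge continuity equation, which is not stated in the text above, and uses the identity that the divergence of a curl vanishes.

```latex
% Divergence of equation 2.6:
\frac{\partial}{\partial t}\left(\nabla \cdot \mathbf{B}\right)
  = -\nabla \cdot \left(\nabla \times \mathbf{E}\right) = 0
% so \nabla \cdot \mathbf{B} keeps its initial value; if it is zero at t = 0,
% equation 2.16 holds for all later times.

% Divergence of equation 2.12:
\frac{\partial}{\partial t}\left(\nabla \cdot \mathbf{D}\right)
  + \nabla \cdot \mathbf{J}_{cond}
  = \nabla \cdot \left(\nabla \times \mathbf{H}\right) = 0
% Assumed continuity equation for the charge density \rho:
\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J}_{cond} = 0
% Subtracting the two relations gives
\frac{\partial}{\partial t}\left(\nabla \cdot \mathbf{D} - \rho\right) = 0
% so equation 2.14 also holds for all time once it holds initially.
```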

2.2 The Electromagnetic-Acoustic Analogy


In this section an isomorphism is presented between the primitive electromagnetic
variables used in Section 2.1 and those used in the theory of acoustics. For this
purpose, a comparison is made between a linearly polarized electromagnetic plane
wave (polarised in the y-direction) travelling along the x-axis and an acoustic wave that
is travelling along the same direction. Mathematically, the electromagnetic wave
can be written by assuming $E_x = E_z = H_x = H_y = 0$ in equations 2.6 and 2.12,
with no conduction current. Equation 2.6 reduces to

$$\frac{\partial H_z}{\partial t} = -\frac{1}{\mu} \frac{\partial E_y}{\partial x} \tag{2.17}$$

Similarly, equation 2.12 reduces to

$$\frac{\partial E_y}{\partial t} = -\frac{1}{\epsilon} \frac{\partial H_z}{\partial x} \tag{2.18}$$

Combining equations 2.17 and 2.18 we obtain the wave equation

$$\frac{\partial^2 E_y}{\partial t^2} - c^2 \nabla^2 E_y = 0 \tag{2.19}$$

which is analogous to the wave equation of linear acoustics

$$\frac{\partial^2 p}{\partial t^2} - C^2 \nabla^2 p = 0 \tag{2.20}$$

where
p is the induced perturbation to the pressure,
C is the speed of sound in the medium.
It should be noted here that the above equation is for an ideal compressible fluid
[44] with negligible dissipation processes and constant ambient pressure and density.
The other important acoustic equations that help in establishing the electromagnetic-acoustic
analogy include the equation of conservation of mass 2.21, the adiabatic
equation of state 2.23 and Newton's second law 2.24

$$s = -\frac{\partial \xi}{\partial x} \tag{2.21}$$

where
$\xi$ is the fluid particle displacement,
s is the condensation variable defined as

$$s = \frac{\rho - \rho_0}{\rho_0} \tag{2.22}$$

$$p = \beta_a s \tag{2.23}$$

$$\frac{\partial^2 \xi}{\partial t^2} = -\frac{1}{\rho_0} \frac{\partial p}{\partial x} \tag{2.24}$$

Here $\beta_a$ is the adiabatic bulk modulus and $\rho_0$ is the ambient density.
Comparing equation 2.19 with 2.20 and equation 2.17 with 2.24, a correspondence can
be drawn between the primitive variables in acoustics and those in electromagnetics. This
isomorphism between the acoustic and electromagnetic variables, along with some of
the other physical quantities that are constituted by a combination of these variables,
is shown in table 2.1.
In the next section Maxwell's equations are revisited and put in their integral form,
which is suitable for computation.

Table 2.1. Isomorphism of variables between an acoustic plane wave and a linearly
polarised electromagnetic wave
Electromagnetic                                     Acoustic
Electric field $E_y$                                Pressure $p$
Magnetic field $H_z$                                Velocity $\dot{\xi}$
Inverse permittivity $\epsilon^{-1}$                Adiabatic bulk modulus $\beta_a$
Magnetic permeability $\mu$                         Ambient density $\rho_0$
Magnetic energy / vol. $\frac{1}{2}\mu H_z^2$       Kinetic energy / vol. $\frac{1}{2}\rho_0 \dot{\xi}^2$
Electric energy / vol. $\frac{1}{2}\epsilon E_y^2$  Potential energy / vol. $\frac{1}{2} p^2 / \beta_a$
Speed of light $c = 1/\sqrt{\mu\epsilon}$           Speed of sound $C = \sqrt{\beta_a/\rho_0}$
EM impedance $Z = \mu c$                            Acoustic impedance $Z = \rho_0 C$
Poynting vector $S_x = E_y H_z$                     Intensity $I = p \dot{\xi}$
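
The last rows of Table 2.1 can be checked numerically. The sketch below evaluates the speed and impedance formulas with free-space constants and then feeds the same numbers through the acoustic formulas using the mapping of the table (inverse permittivity as bulk modulus, permeability as ambient density). The function names and the quoted air properties are illustrative assumptions, not values from the thesis.

```python
import math

MU_0 = 4.0e-7 * math.pi      # free-space permeability in H/m (classical value)
EPS_0 = 8.8541878128e-12     # free-space permittivity in F/m


def em_speed(mu, eps):
    """Speed of light c = 1/sqrt(mu*eps)."""
    return 1.0 / math.sqrt(mu * eps)


def em_impedance(mu, eps):
    """EM impedance Z = mu*c, which equals sqrt(mu/eps)."""
    return mu * em_speed(mu, eps)


def acoustic_speed(bulk_modulus, rho0):
    """Speed of sound C = sqrt(bulk_modulus/rho0)."""
    return math.sqrt(bulk_modulus / rho0)


def acoustic_impedance(bulk_modulus, rho0):
    """Acoustic impedance Z = rho0*C."""
    return rho0 * acoustic_speed(bulk_modulus, rho0)


if __name__ == "__main__":
    print(em_speed(MU_0, EPS_0), em_impedance(MU_0, EPS_0))
    # Same numbers obtained through the acoustic formulas and the table mapping.
    print(acoustic_speed(1.0 / EPS_0, MU_0), acoustic_impedance(1.0 / EPS_0, MU_0))
    # Illustrative air properties (assumed): bulk modulus in Pa, density in kg/m^3.
    print(acoustic_speed(1.42e5, 1.21))
```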

2.3 Integral Form of the Maxwell's Equations


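Maxwell's curl equations can be re-written in conservative vector form as

$$\frac{\partial \mathbf{B}}{\partial t} = -\nabla \times \mathbf{E} \tag{2.25}$$

$$\frac{\partial \mathbf{D}}{\partial t} = \nabla \times \mathbf{H} - \mathbf{j} \tag{2.26}$$

For computational purposes equations 2.25 and 2.26 can be integrated over a volume;
taking $\mathbf{j} = 0$, as is appropriate for the free space formulation used here, we obtain

$$\iiint_V \frac{\partial \mathbf{B}}{\partial t} \, dV = -\iiint_V \nabla \times \mathbf{E} \, dV \tag{2.27}$$

$$\iiint_V \frac{\partial \mathbf{D}}{\partial t} \, dV = \iiint_V \nabla \times \mathbf{H} \, dV \tag{2.28}$$

Applying the curl form of Gauss's divergence theorem to the right hand side of the above
set of equations we obtain

$$\frac{\partial}{\partial t} \iiint_V \mathbf{B} \, dV = -\iint_S \mathbf{n} \times \mathbf{E} \, dS \tag{2.29}$$

$$\iiint_V \frac{\partial \mathbf{D}}{\partial t} \, dV = \iint_S \mathbf{n} \times \mathbf{H} \, dS \tag{2.30}$$

The surface integral on the right hand side of equation 2.29 can be interpreted as the
electric vorticity over the surface S and similarly, the right hand side of equation 2.30
is the magnetic vorticity over the surface S.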
In the case of homogeneous, linear, isotropic, and time-invariant materials the
following constitutive relationships can be defined

$$\mathbf{D} = \epsilon \mathbf{E} \tag{2.31}$$

$$\mathbf{B} = \mu \mathbf{H} \tag{2.32}$$

where the electrical permittivity $\epsilon$ and the magnetic permeability $\mu$ can be considered
to be material dependent properties.
Assuming the present formulation to be for free space and combining equations
2.29 and 2.30 with equations 2.31 and 2.32 we have

$$\frac{\partial}{\partial t} \iiint_V \mathbf{H} \, dV = -\frac{1}{\mu} \iint_S \mathbf{n} \times \mathbf{E} \, dS \tag{2.33}$$

$$\frac{\partial}{\partial t} \iiint_V \mathbf{E} \, dV = \frac{1}{\epsilon} \iint_S \mathbf{n} \times \mathbf{H} \, dS \tag{2.34}$$

This is the final form of the equations used in the numerical algorithm
(sec 2.6) to model electromagnetic wave and scattering
phenomena. Note that the time variation of the magnetic field is linked to the spatial
derivatives of the electric field components and vice versa. This facilitates the use
of a staggered grid where the electric and magnetic field components can easily be
obtained on different grids. This is discussed in detail in section 2.5. The next section
deals with the use of the scattered field formalism to solve Maxwell's equations
and its advantages over the total field formalism.

2.4 Scattered Field Formulation


Since Maxwell's equations are linear for most materials, the total electric and
magnetic fields can be split up into their scattered and incident components for these
materials, and Maxwell's equations hold independently for each of these
components.

$$\mathbf{E}^{total} = \mathbf{E}^{incident} + \mathbf{E}^{scattered} \tag{2.35}$$

$$\mathbf{H}^{total} = \mathbf{H}^{incident} + \mathbf{H}^{scattered} \tag{2.36}$$

As defined in [11], the incident field is assumed to travel in free space
throughout the domain of the problem, while the total field propagates in
free space outside the scatterer and in the medium of the scatterer when propagating
within the scatterer. The scattered field is defined through the equations above as
the total field minus the incident field. Scattered fields
emanate from the scattering object when the incident waves impinge upon it.
Separating the two fields has some inherent advantages in solving scattering
problems. The incident field in this case can be specified analytically. Separating the
fields provides the flexibility of making different waveforms incident on the scatterer
with relative ease. The only precaution to be taken is that the specified
field should be Maxwellian, i.e. it should satisfy Maxwell's equations exactly.
Analytically specifying the incident field makes the outer boundary
condition easier to implement since only the outgoing scattered fields have to be dealt
with at this boundary. If a total field formulation had been used, one would have
to keep track of the incoming and outgoing waves and the boundary treatment would be
more complicated. Analysing scattered fields in situations where the scattered field
amplitude is much lower than the incident field amplitude is easier with a separate
field formalism. In order to compute the Radar Cross Section (RCS), scattered fields
are explicitly needed, as the ratio of the far-field scattered to incident amplitudes
is required. Thus the use of the separate field formalism is again preferred, as less
post-processing work is required.
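
The requirement that the incident field be Maxwellian can be made concrete with a small sketch. It assumes a Gaussian pulse travelling in the +x direction with $E_y = f(x - ct)$ and $H_z = E_y / Z$, and checks by central differences that the pair satisfies equation 2.17. The pulse shape, its parameters and the function names are illustrative assumptions, not the waveform used in the thesis.

```python
import math

MU_0 = 4.0e-7 * math.pi                 # free-space permeability in H/m
EPS_0 = 8.8541878128e-12                # free-space permittivity in F/m
C0 = 1.0 / math.sqrt(MU_0 * EPS_0)      # speed of light
Z0 = math.sqrt(MU_0 / EPS_0)            # free-space impedance


def incident_fields(x, t, x0=-1.0, width=0.1):
    """Analytic incident plane wave: Gaussian pulse moving in +x.

    Returns (Ey, Hz) at position x and time t. The pulse is centred
    at x0 when t = 0 (x0 and width are illustrative choices).
    """
    arg = (x - x0 - C0 * t) / width
    ey = math.exp(-arg * arg)
    return ey, ey / Z0


def faraday_residual(x, t, dx=1.0e-4):
    """Central-difference estimate of mu*dHz/dt + dEy/dx (equation 2.17).

    The result should vanish, up to round-off, for a Maxwellian field.
    """
    dt = dx / C0
    dhz_dt = (incident_fields(x, t + dt)[1] - incident_fields(x, t - dt)[1]) / (2.0 * dt)
    dey_dx = (incident_fields(x + dx, t)[0] - incident_fields(x - dx, t)[0]) / (2.0 * dx)
    return MU_0 * dhz_dt + dey_dx


if __name__ == "__main__":
    print(faraday_residual(-0.05, 3.0e-9))
```

Because the incident field is available in closed form like this, it can be evaluated wherever the boundary conditions need it without being stored or propagated on the grid.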

2.5 Dual Grid


Algorithms in CEM are frequently staggered in space. Using a staggered grid
to solve Maxwell's equations clearly has certain advantages. Firstly, it is easier to
specify surface boundary conditions using a staggered grid. The staggering can be
implemented in such a way that only the desired variables lie on the surface of the
boundary. For example, for a perfect electric conductor (PEC) only the electric
field components need lie on the surface of the conductor. In some cases, like the
Yee staggering, only the tangential electric field component lies on the surface. The
magnetic field variables need not lie on the surface in this case and would not have
to be specified at the boundary.
Secondly, the outer radiation boundary condition would not have to solve for
all the field variables, but just those that lie on the outer boundary. Therefore the
number of equations that need to be solved at the outer boundary would be reduced
depending on the type of staggering.
Thirdly, staggering improves the dissipation and dispersion characteristics of
the scheme. Staggering the variables across the grid induces a certain amount of
dissipation that helps in preventing odd-even decoupling and in damping numerical
high frequency errors. In most cases, staggering removes the need to add explicit
artificial dissipation. This point is taken up at length in section 2.7.
Fourthly, in message passing type parallel algorithms staggering usually reduces
the amount of message passing that needs to be carried out, thereby increasing the
overall efficiency. This is primarily because only some of the variables need to be
stored on boundaries between contiguous domains.
The most common type of staggering used in CEM is the Yee staggering where
each electric field variable is stored at the center of the edge that is parallel to the
corresponding coordinate axis. Each magnetic field variable is stored at the center
of the cell face that is perpendicular to its respective coordinate axis (Fig 1.4). The
staggering used in the algorithm presented in this work can be labelled a "Dual Grid"
type of staggering. As is seen in Figure 2.2, all the E-field components lie on the same
grid point and all the H-field components lie on the same dual grid point. However,
the dual grid points are offset from the regular or E-field grid points. Each dual grid
point is constructed by taking the average of the coordinates of the corners of each
E-field cell. This type of staggering exploits the duality of Maxwell's equations since the
time derivative of the electric field is directly dependent on the curl of the magnetic
field and vice versa. Therefore a typical H-field cell (Figure 2.3) has an E-field placed
at its centroid, so all the fluxes for that cell are computed using the H-field variables
at the edges and those are used to update the E-field at the centroid. The dual grid is
generated by first using a standard grid generator to generate the grid for the E-field
points for a scatterer that is a perfect conductor (this grid would correspond to
an H-field grid if the scatterer were composed of a magnetic material). The H-field grid
is then generated from the E-field grid by taking the average of the coordinates of the
corners of each E-field cell.

Figure 2.2. A typical dual grid cell (E: E-field grid points; H: H-field dual grid points).
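
The construction of the H-field grid from the E-field grid can be sketched as follows. This is a minimal illustration that assumes the E-field grid is held as a nested list `grid[i][j][k]` of (x, y, z) tuples; the storage layout and the function name `dual_grid_points` are hypothetical.

```python
def dual_grid_points(grid):
    """Return the dual (H-field) grid points of a structured E-field grid.

    grid[i][j][k] is the (x, y, z) position of an E-field node. Each dual
    point is the average of the eight corner nodes of one E-field cell, so
    the dual grid has one point fewer than the E-field grid per direction.
    """
    ni, nj, nk = len(grid), len(grid[0]), len(grid[0][0])
    dual = []
    for i in range(ni - 1):
        plane = []
        for j in range(nj - 1):
            row = []
            for k in range(nk - 1):
                corners = [grid[i + di][j + dj][k + dk]
                           for di in (0, 1) for dj in (0, 1) for dk in (0, 1)]
                row.append(tuple(sum(c[axis] for c in corners) / 8.0
                                 for axis in range(3)))
            plane.append(row)
        dual.append(plane)
    return dual


if __name__ == "__main__":
    # Uniform 3 x 3 x 3 grid with unit spacing as a simple test case.
    e_grid = [[[(float(i), float(j), float(k)) for k in range(3)]
               for j in range(3)] for i in range(3)]
    print(dual_grid_points(e_grid)[0][0][0])
```

On a body-fitted curvilinear grid the same averaging applies cell by cell; only the corner coordinates change.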

2.6 Discretization and Time Integration


This section deals with the discretization of the equations, time integration, and
implementation of the finite volume algorithm on the dual grid. As has been previously
discussed in section 2.4, the linear nature of Maxwell's equations permits us
to solve only for the scattered fields since the analytical solution for the incident field
is known. Henceforth the fields written without any superscripts will be understood
to be the scattered fields ($\mathbf{E} \equiv \mathbf{E}^{scatt}$ and $\mathbf{H} \equiv \mathbf{H}^{scatt}$). If the total or incident fields
have to be alluded to they will be explicitly denoted by their respective superscripts.
A finite volume algorithm is used to solve the system of equations 2.33 and 2.34
on a dual grid using Runge-Kutta time marching. The four-stage Runge-Kutta time
integration method is chosen because of its high order of accuracy and its successful
application in modelling time dependent phenomena in aero-acoustic type problems
[45], [46].

Figure 2.3. An H-field cell: H-field points at the cell corners and an E-field point (x) at
the centroid.

Figure 2.4. An E-field cell: E-field points at the cell corners and an H-field point (x) at
the centroid.

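The system is solved using a four-stage Runge-Kutta time integration method
given by

$$\begin{aligned}
Q_i^{m=1} &= Q_i^n - \alpha_1 \Delta t \, R_i^0 \\
Q_i^{m=2} &= Q_i^n - \alpha_2 \Delta t \, R_i^1 \\
Q_i^{m=3} &= Q_i^n - \alpha_3 \Delta t \, R_i^2 \\
Q_i^{m=4} &= Q_i^n - \alpha_4 \Delta t \, R_i^3 \\
Q_i^{n+1} &= Q_i^{m=4}
\end{aligned}$$

The time level is denoted by n and each stage of the Runge-Kutta method by m,
where the coefficients are $\alpha_m = \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, 1$ respectively. The solution vectors are

$$Q_1 = \begin{bmatrix} H_x \\ H_y \\ H_z \end{bmatrix} \qquad Q_2 = \begin{bmatrix} E_x \\ E_y \\ E_z \end{bmatrix} \tag{2.37}$$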
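
The four-stage update can be sketched in a few lines. The example below assumes that each stage residual is evaluated from the previous stage solution (with the first one evaluated from the solution at time level n), and applies the scheme to a scalar model problem; the function name `rk4_step` and the model problem are illustrative, not the thesis code.

```python
import cmath


def rk4_step(q, residual, dt, alphas=(0.25, 1.0 / 3.0, 0.5, 1.0)):
    """Advance q by one time step with the four-stage scheme.

    Every stage restarts from the time-level-n value q and uses the
    residual of the previous stage: q_m = q - alpha_m * dt * R(q_{m-1}).
    """
    q_stage = q
    for alpha in alphas:
        q_stage = q - alpha * dt * residual(q_stage)
    return q_stage


if __name__ == "__main__":
    # Model problem dq/dt = lam*q, i.e. residual R(q) = -lam*q.
    lam = 1j            # purely oscillatory, like a wave mode
    dt = 0.3
    q_new = rk4_step(1.0 + 0.0j, lambda q: -lam * q, dt)
    print(q_new, cmath.exp(lam * dt))
```

For a linear residual the four stages reproduce the Taylor polynomial of the exponential up to fourth order in $\Delta t$, which is where the accuracy of the scheme for linear wave problems comes from.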

The residuals $R_i$ are defined as

$$R_1 = \frac{1}{\Delta V} (F + G + K) \tag{2.38}$$

$$R_2 = \frac{1}{\Delta V} (L + M + N) \tag{2.39}$$

where

$$F = \frac{1}{\mu} \left( \mathbf{S}^{\xi}_{i+1,j+1/2,k+1/2} \times \mathbf{E}_{i+1,j+1/2,k+1/2} - \mathbf{S}^{\xi}_{i,j+1/2,k+1/2} \times \mathbf{E}_{i,j+1/2,k+1/2} \right)$$

$$G = \frac{1}{\mu} \left( \mathbf{S}^{\eta}_{i+1/2,j+1,k+1/2} \times \mathbf{E}_{i+1/2,j+1,k+1/2} - \mathbf{S}^{\eta}_{i+1/2,j,k+1/2} \times \mathbf{E}_{i+1/2,j,k+1/2} \right)$$

$$K = \frac{1}{\mu} \left( \mathbf{S}^{\zeta}_{i+1/2,j+1/2,k+1} \times \mathbf{E}_{i+1/2,j+1/2,k+1} - \mathbf{S}^{\zeta}_{i+1/2,j+1/2,k} \times \mathbf{E}_{i+1/2,j+1/2,k} \right)$$

Figure 2.5. Surface area calculation for a cell face with vertices a, b, c and d.


In the above expressions $\mathbf{S}^{\xi}, \mathbf{S}^{\eta}, \mathbf{S}^{\zeta}$ represent the projected surface areas of
constant $\xi, \eta, \zeta$ faces respectively. They are calculated by taking half the vector cross
product of the two diagonal vectors that join the four vertices of a cell face. For
example the area of the surface abcd as seen in Figure 2.5 is calculated as

$$\mathbf{S} = \frac{1}{2} \begin{bmatrix} (y_c - y_a)(z_d - z_b) - (z_c - z_a)(y_d - y_b) \\ (z_c - z_a)(x_d - x_b) - (x_c - x_a)(z_d - z_b) \\ (x_c - x_a)(y_d - y_b) - (y_c - y_a)(x_d - x_b) \end{bmatrix}$$
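
The area formula above is half the cross product of the diagonals ac and bd. A minimal sketch, with the hypothetical helper name `face_area_vector` and vertices given as (x, y, z) tuples in the order a, b, c, d around the face:

```python
def face_area_vector(a, b, c, d):
    """Projected area vector of the quadrilateral face abcd.

    Computed as 0.5 * (c - a) x (d - b), i.e. half the cross product of
    the two face diagonals. The components are the areas projected on
    the x, y and z coordinate planes.
    """
    ac = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    bd = (d[0] - b[0], d[1] - b[1], d[2] - b[2])
    return (
        0.5 * (ac[1] * bd[2] - ac[2] * bd[1]),
        0.5 * (ac[2] * bd[0] - ac[0] * bd[2]),
        0.5 * (ac[0] * bd[1] - ac[1] * bd[0]),
    )


if __name__ == "__main__":
    # Unit square in the z = 0 plane, vertices ordered counter-clockwise.
    print(face_area_vector((0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)))
```

The formula remains well defined when the four vertices are not coplanar, which is the usual situation on a body-fitted grid.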

Figure 2.6. Illustration of points for flux calculation: the E-field values (x) at the centers
of the constant $\xi$, $\eta$ and $\zeta$ faces of a cell.
It will be noticed that the electric field points (Figure 2.6) used in the evaluation
of the above fluxes lie in the center of the cell faces and are not computed directly
from the integration process. Instead they are interpolated from the electric field
points that make up the corners of the cell faces. For example, from Figure 2.7

$$\mathbf{E}_{i,j+1/2,k+1/2} = 0.25 \left( \mathbf{E}_{i,j,k} + \mathbf{E}_{i,j+1,k} + \mathbf{E}_{i,j,k+1} + \mathbf{E}_{i,j+1,k+1} \right)$$
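
The face-center averaging can be written directly from the expression above. The sketch assumes the field is stored as a nested list `field[i][j][k]` of three-component tuples; this layout and the function name are assumptions made for illustration.

```python
def face_centre_value(field, i, j, k):
    """Field value at the center (i, j+1/2, k+1/2) of a constant-xi face.

    The value is the arithmetic mean of the four corner values of the face,
    taken component by component.
    """
    corners = (field[i][j][k], field[i][j + 1][k],
               field[i][j][k + 1], field[i][j + 1][k + 1])
    return tuple(0.25 * sum(c[axis] for c in corners) for axis in range(3))


if __name__ == "__main__":
    # Field that varies linearly with the j and k indices.
    e_field = [[[(float(j), float(k), float(j + k)) for k in range(2)]
                for j in range(2)] for i in range(1)]
    print(face_centre_value(e_field, 0, 0, 0))
```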

Kordulla and Vinokur [47] have proposed an efficient method of calculating cell
volumes for three-dimensional flow predictions. The volume $\Delta V$ is calculated by
partitioning the hexahedral cell into five tetrahedra and computing the volume of
each tetrahedron separately. This decomposition helps avoid gaps and overlaps in
computing cell volumes and is computationally very efficient. Figure 2.8 shows the
breakup of a hexahedron into its respective tetrahedral elements.
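
A five-tetrahedra split of a hexahedron can be sketched as below. This is a generic decomposition (one central tetrahedron plus four corner tetrahedra) written for illustration; it is not claimed to reproduce the exact formula of [47], and the corner-indexing convention and function names are assumptions.

```python
def tet_volume(p0, p1, p2, p3):
    """Volume of a tetrahedron from the triple product of its edge vectors."""
    a = [p1[n] - p0[n] for n in range(3)]
    b = [p2[n] - p0[n] for n in range(3)]
    c = [p3[n] - p0[n] for n in range(3)]
    det = (a[0] * (b[1] * c[2] - b[2] * c[1])
           - a[1] * (b[0] * c[2] - b[2] * c[0])
           + a[2] * (b[0] * c[1] - b[1] * c[0]))
    return abs(det) / 6.0


# Corners are labelled by index offsets (di, dj, dk) in {0, 1}^3.
CENTRAL_TET = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
CORNER_TETS = [
    [(1, 0, 0), (0, 0, 0), (1, 1, 0), (1, 0, 1)],
    [(0, 1, 0), (0, 0, 0), (1, 1, 0), (0, 1, 1)],
    [(0, 0, 1), (0, 0, 0), (1, 0, 1), (0, 1, 1)],
    [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1)],
]


def hexahedron_volume(corner):
    """Volume of a hexahedral cell as the sum of five tetrahedra.

    corner maps each offset (di, dj, dk) to the (x, y, z) position of
    that cell vertex.
    """
    total = 0.0
    for tet in [CENTRAL_TET] + CORNER_TETS:
        total += tet_volume(*(corner[v] for v in tet))
    return total


if __name__ == "__main__":
    box = {(i, j, k): (2.0 * i, 3.0 * j, 4.0 * k)
           for i in (0, 1) for j in (0, 1) for k in (0, 1)}
    print(hexahedron_volume(box))
```

To avoid gaps and overlaps between neighbouring cells with non-planar faces, adjacent cells have to split their shared face along the same diagonal; the sketch does not enforce this and takes absolute values, so it should only be used on non-degenerate cells.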

Figure 2.7. Points contributing to the calculation of flux on a surface: the corner values
E(i, j, k), E(i, j+1, k), E(i, j, k+1) and E(i, j+1, k+1) of a constant $\xi$ face and the
face-center value E(i, j+1/2, k+1/2).

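The residual $R_2$ used to evaluate the E-field components can be computed in a
similar manner:

$$L = -\frac{1}{\epsilon} \left( \mathbf{S}^{\xi}_{d;\,i+1,j+1/2,k+1/2} \times \mathbf{H}_{i+1,j+1/2,k+1/2} - \mathbf{S}^{\xi}_{d;\,i,j+1/2,k+1/2} \times \mathbf{H}_{i,j+1/2,k+1/2} \right)$$

$$M = -\frac{1}{\epsilon} \left( \mathbf{S}^{\eta}_{d;\,i+1/2,j+1,k+1/2} \times \mathbf{H}_{i+1/2,j+1,k+1/2} - \mathbf{S}^{\eta}_{d;\,i+1/2,j,k+1/2} \times \mathbf{H}_{i+1/2,j,k+1/2} \right)$$

$$N = -\frac{1}{\epsilon} \left( \mathbf{S}^{\zeta}_{d;\,i+1/2,j+1/2,k+1} \times \mathbf{H}_{i+1/2,j+1/2,k+1} - \mathbf{S}^{\zeta}_{d;\,i+1/2,j+1/2,k} \times \mathbf{H}_{i+1/2,j+1/2,k} \right)$$

The expressions $\mathbf{S}_d^{\xi}, \mathbf{S}_d^{\eta}, \mathbf{S}_d^{\zeta}$ represent the projected surface areas of constant
$\xi, \eta, \zeta$ faces of the dual grid respectively. It must be pointed out that the subscripts i, j, k
used in the calculation of the above fluxes are subscripts on the dual grid.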
It is worth noting at this point that the dual grid based finite
volume scheme just described can be easily extended to higher order spatial accuracy
with the help of a Gaussian quadrature technique [48]. This technique involves
determining the fields at the Gaussian points using interpolation and obtaining the
pointwise values of the solution using a reconstruction process.

Figure 2.8. Decomposition of a hexahedron cell into five tetrahedra.

2.7 Dispersion, Dissipation and Artificial Dissipation


Most central differencing algorithms require the addition of some sort of explicit
artificial dissipation to damp out the spurious numerical high frequency waves.
Figure 2.9 shows the dissipation characteristics for conventional second order central
differencing in space and four stage Runge-Kutta time integration for the one
dimensional wave equation. It is evident that there is little or no damping of the high
wavenumber components, thereby requiring the addition of explicit artificial dissipation.
However, for a spatially staggered scheme the dissipation characteristics with a
four stage Runge-Kutta method are quite different. This has been analysed with the
help of a one-dimensional system of hyperbolic equations

$$\frac{\partial}{\partial t} \begin{bmatrix} u \\ v \end{bmatrix} + \begin{bmatrix} 0 & c \\ c & 0 \end{bmatrix} \frac{\partial}{\partial x} \begin{bmatrix} u \\ v \end{bmatrix} = 0 \tag{2.40}$$

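The u and v variables were staggered in space so that the system can be written
as

$$\frac{\partial U_j}{\partial t} = \begin{bmatrix} 0 & 0 \\ \frac{c}{\Delta x} & 0 \end{bmatrix} U_{j-1} + \begin{bmatrix} 0 & \frac{c}{\Delta x} \\ -\frac{c}{\Delta x} & 0 \end{bmatrix} U_j + \begin{bmatrix} 0 & -\frac{c}{\Delta x} \\ 0 & 0 \end{bmatrix} U_{j+1} \tag{2.41}$$

where

$$U = \begin{bmatrix} u \\ v \end{bmatrix}$$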

Figure 2.9. Dissipation for a Runge-Kutta with central differencing. Dissipation |G|
plotted against phase angle (k dx) for cfl = 0.5, 0.75, 0.9 and 1.0.

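By analysing the solution of this system for a single harmonic

$$U_j^n = \hat{U}^n e^{i j \phi}, \qquad \phi = k \Delta x \tag{2.42}$$

and inserting it in equation 2.41, a relationship is obtained for the time discretization

$$\hat{U}^{n+1} = G(\phi) \hat{U}^n \tag{2.43}$$

where G is called the amplification matrix and for the system in equation 2.41 using
four stage Runge-Kutta comes out to be

$$G = \begin{bmatrix} 1 - 2\nu^2 \left( 1 - \frac{\nu^2}{3} \sin^2 \frac{k \Delta x}{2} \right) \sin^2 \frac{k \Delta x}{2} & \nu \left( 1 - \frac{2\nu^2}{3} \sin^2 \frac{k \Delta x}{2} \right) \left( 1 - e^{i k \Delta x} \right) \\ \nu \left( 1 - \frac{2\nu^2}{3} \sin^2 \frac{k \Delta x}{2} \right) \left( e^{-i k \Delta x} - 1 \right) & 1 - 2\nu^2 \left( 1 - \frac{\nu^2}{3} \sin^2 \frac{k \Delta x}{2} \right) \sin^2 \frac{k \Delta x}{2} \end{bmatrix}$$

where $\nu$ is the CFL number and can be defined as $c \Delta t / \Delta x$. The eigenvalues of the
matrix G are

$$\lambda_{\pm} = \frac{1}{18} \left[ 18 - 36 \nu^2 \sin^2 \frac{k \Delta x}{2} + 12 \nu^4 \sin^4 \frac{k \Delta x}{2} \pm 12 i \nu \sin \frac{k \Delta x}{2} \left( 3 - 2 \nu^2 \sin^2 \frac{k \Delta x}{2} \right) \right]$$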

The dissipation characteristics obtained from the spectral radius $|\lambda_{\pm}|$ of the
amplification matrix G using the four stage Runge-Kutta time integration are plotted
against the phase angle ($k\Delta x$) in Figure 2.11. As is evident, the low wavenumber
components propagate without being damped. However, the high frequency components
are damped due to the grid induced dissipation caused by the staggering of the spatial
derivatives. In most cases, this amount of grid induced dissipation is enough to
suppress any high frequency numerical oscillations that may arise. This type of
dissipation characteristic is quite commonly seen with implicit and explicit methods
that utilize central differencing [49] and have added artificial dissipation terms.
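The contrast between the two dissipation curves can be reproduced with a short numerical sketch. It assumes that the four stage Runge-Kutta scheme has the classical stability polynomial 1 + z + z^2/2 + z^3/6 + z^4/24, which is consistent with the amplification matrix quoted above; the function names are illustrative and are not taken from the FVTD code.

```python
import math

def rk4_gain(z):
    """Stability polynomial of a four stage Runge-Kutta scheme for dU/dt = (z/dt) U."""
    return 1 + z + z**2 / 2 + z**3 / 6 + z**4 / 24

def gain_nonstaggered(cfl, theta):
    # Second order central differencing on a collocated grid:
    # the semi-discrete eigenvalue is z = -i * cfl * sin(theta), theta = k*dx.
    return abs(rk4_gain(-1j * cfl * math.sin(theta)))

def gain_staggered(cfl, theta):
    # Staggered system (2.41): the eigenvalues are +/- 2i * cfl * sin(theta/2);
    # both signs give the same modulus.
    return abs(rk4_gain(2j * cfl * math.sin(theta / 2)))

if __name__ == "__main__":
    for theta in (0.5, 1.5, 2.5, math.pi):
        print(f"k dx = {theta:.3f}  "
              f"non-staggered |G| = {gain_nonstaggered(1.0, theta):.4f}  "
              f"staggered |G| = {gain_staggered(1.0, theta):.4f}")
```

At a CFL number of 1 the non-staggered gain returns to unity at the highest resolvable wavenumber, while the staggered gain falls to sqrt(5)/3, roughly 0.745, which is the grid induced damping discussed above.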
Apart from improving the dissipation characteristics, spatial staggering is seen
to have an impact on the dispersion characteristics too. In Figure 2.10 a comparison is
made between the dispersion error obtained for the one-dimensional wave equation on
a non-staggered grid using second order central differences and a system of equations
on a staggered grid for various phase angles ($k\Delta x$) at a CFL number of 1. The
dispersion error for the staggered system is obtained by

\[
\epsilon_e = \frac{1}{\nu\,k\Delta x}\tan^{-1}\left(\frac{\mathrm{Imag}\,\lambda}{\mathrm{Real}\,\lambda}\right) \tag{2.44}
\]

Four stage Runge-Kutta time integration has been used in both cases. It is seen
that there is significant improvement of the staggered system over the non-staggered
equation and that the staggered approach can be utilised to resolve a wave with 12 cells
per wavelength at 99 percent accuracy, compared to the 20 cells per wavelength needed
for a standard second order central differencing algorithm.
It should be pointed out that the above analysis is for a one-dimensional case
on a uniformly spaced mesh with perfect staggering. In reality, however, body-fitted
grids are usually clustered and may have imperfections like skewed cells or cells with
high aspect ratios. Also, the staggering discussed in section 2.5 in three dimensions is
not completely perfect. All of the above reasons contribute to the production of spurious
diffusion [25] for the high wavenumber components that are not completely damped
by the grid induced dissipation, hence the need for adding artificial dissipation.
When required, fourth order dissipation [50], [51], [52] is added

\[ DQ = D_{\xi} Q + D_{\eta} Q + D_{\zeta} Q \tag{2.45} \]
Figure 2.10. A comparison of the dispersion characteristics. [Plot: dispersion error versus phase angle for the R-K non-staggered and R-K staggered schemes.]

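where $D_{\xi} Q$, $D_{\eta} Q$ and $D_{\zeta} Q$ are the fourth order dissipation operators in the
$\xi$, $\eta$ and $\zeta$ directions respectively and can be explicitly written as

\[ D_{\xi} Q = d_{i+\frac{1}{2},j,k} - d_{i-\frac{1}{2},j,k} \]

\[ D_{\eta} Q = d_{i,j+\frac{1}{2},k} - d_{i,j-\frac{1}{2},k} \]

\[ D_{\zeta} Q = d_{i,j,k+\frac{1}{2}} - d_{i,j,k-\frac{1}{2}} \]

where the above $d_{i\pm\frac{1}{2},j,k}$ can be represented as

\[
d_{i+\frac{1}{2},j,k} = \epsilon\,\frac{(V_d)_{i-\frac{1}{2},j,k}}{\Delta t}
\left(Q_{i+2,j,k} - 3Q_{i+1,j,k} + 3Q_{i-1,j,k} - Q_{i-2,j,k}\right)
\]

The coefficient $\epsilon$ is a constant whose value is 1/256.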
Hence, inserting the expressions for artificial dissipation in equation 2.39 and
rewriting it

\[ R_2 = \frac{1}{V_d}\left(L + M + N - DQ\right) \tag{2.46} \]

2.8 Time Step Calculation


Figure 2.11. Dissipation for a staggered system. [Plot: dissipation |G| versus phase angle (k dx) for cfl = 0.5, 0.6, 0.75, 0.9 and 1.0.]

Since the grid used in this formulation is mostly non-uniform, the time step
permitted by the stability, dispersion and dissipation characteristics would vary from
cell to cell. As the problem under consideration is a time-dependent problem the time
step is calculated for each cell in the domain and the minimum of these time steps is
utilized in the integration process [53].

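\[ \Delta t = \min\,(\Delta t)_{i,j,k} \]

where

\[
(\Delta t)_{i,j,k} = \frac{(\Delta t_{\xi})_{i,j,k}\,(\Delta t_{\eta})_{i,j,k}\,(\Delta t_{\zeta})_{i,j,k}}
{(\Delta t_{\xi})_{i,j,k}(\Delta t_{\eta})_{i,j,k} + (\Delta t_{\eta})_{i,j,k}(\Delta t_{\zeta})_{i,j,k} + (\Delta t_{\zeta})_{i,j,k}(\Delta t_{\xi})_{i,j,k}} \tag{2.47}
\]

or

\[
(\Delta t)_{i,j,k} = \frac{1}{\dfrac{1}{(\Delta t_{\xi})_{i,j,k}} + \dfrac{1}{(\Delta t_{\eta})_{i,j,k}} + \dfrac{1}{(\Delta t_{\zeta})_{i,j,k}}} \tag{2.48}
\]

and

\[
(\Delta t_{\xi})_{i,j,k} = \mathrm{CFL}\,\frac{(V)_{ijk}}{C\sqrt{\left[S_x^{\xi}\right]^2 + \left[S_y^{\xi}\right]^2 + \left[S_z^{\xi}\right]^2}} \tag{2.49}
\]

where

\[ \left[S_x^{\xi}\right] = \frac{1}{2}\left(\left(S_x^{\xi}\right)_{i+1/2,j,k} + \left(S_x^{\xi}\right)_{i-1/2,j,k}\right) \]

\[ \left[S_y^{\xi}\right] = \frac{1}{2}\left(\left(S_y^{\xi}\right)_{i+1/2,j,k} + \left(S_y^{\xi}\right)_{i-1/2,j,k}\right) \]

\[ \left[S_z^{\xi}\right] = \frac{1}{2}\left(\left(S_z^{\xi}\right)_{i+1/2,j,k} + \left(S_z^{\xi}\right)_{i-1/2,j,k}\right) \]

Similar expressions can be derived for $(\Delta t_{\eta})$ and $(\Delta t_{\zeta})$.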
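As an illustration of equations 2.47 to 2.49, the following is a minimal sketch of the per-cell time step calculation. The function names and the sample cell are hypothetical; the averaged face-area vectors are assumed to have been computed already.

```python
import math

def directional_dt(cfl, volume, wave_speed, s_avg):
    """Equation 2.49: s_avg is the averaged face-area vector (Sx, Sy, Sz) for one direction."""
    return cfl * volume / (wave_speed * math.sqrt(sum(s * s for s in s_avg)))

def cell_dt(dt_xi, dt_eta, dt_zeta):
    """Equation 2.48: combination of the three directional limits."""
    return 1.0 / (1.0 / dt_xi + 1.0 / dt_eta + 1.0 / dt_zeta)

def cell_dt_product_form(dt_xi, dt_eta, dt_zeta):
    """Equation 2.47: the same quantity written with products."""
    num = dt_xi * dt_eta * dt_zeta
    den = dt_xi * dt_eta + dt_eta * dt_zeta + dt_zeta * dt_xi
    return num / den

def global_dt(cell_dts):
    """The smallest per-cell value is used for the whole domain."""
    return min(cell_dts)

if __name__ == "__main__":
    # Hypothetical unit cube cell with unit wave speed and CFL = 1.
    d = directional_dt(1.0, 1.0, 1.0, (1.0, 0.0, 0.0))
    print(cell_dt(d, d, d))
```

For a uniform Cartesian cell the three directional limits are equal, so the combined limit is one third of the one-dimensional value.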


Boundary conditions constitute an important part of any numerical algorithm.
The next section will be devoted to the treatment of the boundary conditions both
at the surface of the scatterer and the outer boundary.

2.9 Boundary Conditions


We have to consider the boundary conditions at the surface of the scatterer and
the radiation condition at the outer boundary. In both cases we have only the E-field
points located at the boundaries. The H-field points lie half a cell away from the
boundaries and are considered as interior points. Hence, we have to determine the
boundary conditions for only the E-field components.

2.9.1 Surface Boundary Condition


At the surface of a perfectly conducting scatterer we know that the tangential
components of the total electric field are equal to zero.

\[ n \times E^{total} = 0 \]

where n is the unit normal to the surface of the conductor.


The flux at the face that lies on the surface of the body can be exactly calculated
by using the above condition and the analytically computed incident fields (Figure
2.12).

\[ S \times E^{scatt} = -S \times E^{incident} \]

To compute the fluxes on the faces perpendicular to the surface of the scatterer, it can
be easily seen that the above condition is not enough to compute the three components
of the electric field for the grid points that lie on the surface of the scatterer. We
need an additional condition in order to make the system determinate. In this case
we take into consideration Gauss's electric law, which can be written in the integral
form as

\[ \iiint_V \left(\nabla \cdot D\right)\,dV = 0 \tag{2.50} \]

or

\[ \iint_S \left(n \cdot D\right)\,dS = 0 \tag{2.51} \]

We need to apply Gauss's electric law to each cell that lies on the boundary.
It is assumed that the grid points that lie on the surface of the conductor do not
contribute anything to the dot product in Gauss's law for those sides of the cell that are
perpendicular to the conducting surface (Figure 2.13).
Combining this condition with the fact that the tangential components of the
total fields are equal to zero we can write explicit expressions for the scattered field
components of the electric fields at the boundaries.

\[ E_x^{total\ normal} = \frac{S_x^{\eta}\,(T)}{\left[S_x^{\eta}\right]^2 + \left[S_y^{\eta}\right]^2 + \left[S_z^{\eta}\right]^2} \tag{2.52} \]

\[ E_y^{total\ normal} = \frac{S_y^{\eta}\,(T)}{\left[S_x^{\eta}\right]^2 + \left[S_y^{\eta}\right]^2 + \left[S_z^{\eta}\right]^2} \tag{2.53} \]

\[ E_z^{total\ normal} = \frac{S_z^{\eta}\,(T)}{\left[S_x^{\eta}\right]^2 + \left[S_y^{\eta}\right]^2 + \left[S_z^{\eta}\right]^2} \tag{2.54} \]

where T is the Gauss Divergence Law summation over the whole cell in accordance
with Figure 2.13.
Figure 2.12. Flux Evaluation on PEC Surface. [Schematic: the flux for the shaded surface on the PEC surface is $-S^{\eta} \times E^{incident}$.]

Figure 2.13. PEC Surface Boundary implementation. [Schematic: for each of the shaded surfaces only points marked o contribute to calculation of the Gauss Divergence law.]


2.9.2 Outer Radiation Boundary Condition


Scattering problems are normally problems that relate to open or unbounded
domains. Computationally, modeling domains that extend to infinity is not feasible.
Therefore, there is the need to truncate the domain at a finite distance from the
scatterer. This is achieved by using some form of outer radiation or absorbing boundary
conditions, so that waves that are travelling outward leave the domain with minimal
reflection.
The most commonly used outer radiation boundary condition for problems
of electromagnetic scattering with the Yee algorithm is the Mur [54] boundary
condition. The Mur boundary condition is based on the formulation of absorbing
boundary conditions for second order wave equations introduced by Engquist and
Majda [55]. In the Yee formulation only two of the three electric (magnetic) field
components that are tangential to the boundary lie on the surface of the boundary.
Therefore, only these two components have to be updated at each time step with the
help of the radiation boundary condition. Because of the staggering, it is imperative
that Maxwell's equations be uncoupled to solve for the variables on the boundary.
Starting with Maxwell's curl equations 2.6 and 2.12

\[ \frac{\partial B}{\partial t} + \nabla \times E = 0 \tag{2.55} \]

\[ \frac{\partial D}{\partial t} + J_{cond} = \nabla \times H \tag{2.56} \]

For a source free lossless medium, $J_{cond} = 0$ and we can combine the two equations
after taking the curl of each of the two equations. This yields the second order wave
equations for both the electric and magnetic field components

\[ \frac{\partial^2 H}{\partial t^2} - c^2 \nabla^2 H = 0 \tag{2.57} \]

\[ \frac{\partial^2 E}{\partial t^2} - c^2 \nabla^2 E = 0 \tag{2.58} \]

Expressing the above equations for the time-harmonic case we have

\[ k^2 H + \nabla^2 H = 0 \tag{2.59} \]

\[ k^2 E + \nabla^2 E = 0 \tag{2.60} \]

where k is referred to as the phase constant.


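A one way wave equation method can be used by factorizing the differential
equation 2.60. For example

\[ \left(\frac{\partial}{\partial x} - L\right)\left(\frac{\partial}{\partial x} + L\right) E = 0 \tag{2.61} \]

where

\[ L = ik\sqrt{1 + \frac{1}{k^2}\left(\frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)} \tag{2.62} \]

For the absorption of waves travelling along the negative x direction at a plane surface
located at x=0 we can apply the following condition

\[ \left(\frac{\partial}{\partial x} - L\right) E = 0 \tag{2.63} \]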

In order to model the boundary condition computationally the square root term in
equation 2.62 needs to be expanded in its Taylor series approximation. For example, the
first order approximation for a time dependent problem would reduce to solving

\[ \left(\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\right) E = 0 \tag{2.64} \]

Equation 2.64 is also the first order approximation boundary condition introduced
by Engquist and Majda [55]. In the field of computational electromagnetics this is
also known as the Mur boundary condition [54]. For this boundary condition to be
effective it has been assumed that the waves incident on the plane represented by x=0
are plane waves travelling along the x-direction. This boundary condition produces
a reflection of waves for angles of incidence that are not normal to the boundary.
Analytically the reflection coefficient can be written as

\[ A_{ref} = A_{inc}\left(\frac{\cos\theta - 1}{\cos\theta + 1}\right) \tag{2.65} \]

where $\theta$ is the angle between the direction of the incident wave and the normal to the
boundary. For a curved boundary the boundary condition can be modified to become

\[ \left(\frac{\partial}{\partial r} - \frac{\partial}{\partial t} + \frac{1}{2r}\right) E = 0 \tag{2.66} \]

where r is the radius of curvature of the boundary.
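To give a feel for how quickly the first order condition degrades away from normal incidence, equation 2.65 can be evaluated directly. This is only a sketch; the function name is illustrative.

```python
import math

def mur1_reflection(theta):
    """Magnitude of A_ref / A_inc from equation 2.65.

    theta is the angle (in radians) between the incident wave direction
    and the boundary normal.
    """
    return abs((math.cos(theta) - 1.0) / (math.cos(theta) + 1.0))

if __name__ == "__main__":
    for degrees in (0, 15, 30, 45, 60, 75):
        r = mur1_reflection(math.radians(degrees))
        print(f"{degrees:2d} deg  |A_ref/A_inc| = {r:.4f}")
```

The reflection vanishes at normal incidence and reaches one third of the incident amplitude at 60 degrees, which motivates the treatment for arbitrary angles of incidence that follows.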


In order to deal with waves that are incident on the boundary at arbitrary angles
of incidence, the Liao [56] formulation of the transmitting boundary has been incorporated.
Liao's boundary condition is based on the fact that the plane wave solution to the
wave equation can be expressed in the form $f(ct - x\cos\theta)$ and that any wave can be
expressed as a summation of plane waves with different angles of incidence.

\[ \psi(x, t) = \sum_i f\left(ct - x\cos\theta_i\right) \tag{2.67} \]

Furthermore, if the solution $\psi(x, t)$ represented a single plane wave, then the value
of $\psi$ at $(t + \Delta t)$ at a position x can be determined from the value at a point located
at $(x - c\Delta t/\cos\theta)$ at a time t, if the angle $\theta$ of the outgoing wave is known.

\[ \psi(x, t + \Delta t) = \sum_i f\left(c(t + \Delta t) - x\cos\theta_i\right) \tag{2.68} \]

In [56] a transformation was used to rewrite equation 2.68 as

\[ \psi(x, t + \Delta t) = \sum_i f\left(\sigma_i + \tau_i\right) \tag{2.69} \]

where

\[ \sigma_i = ct - (x - c\Delta t)\cos\theta_i \]

\[ \tau_i = c\Delta t\,(1 - \cos\theta_i) \]

Using a difference operator

\[ \Delta f(\sigma + \tau) = f(\sigma + \tau) - f(\sigma) \]

and a recursive relation

\[ \Delta^m f(\sigma + \tau) = \Delta^{m-1} f(\sigma + \tau) - \Delta^{m-1} f(\sigma) \]

the function f can be rewritten as

\[ f(\sigma + \tau) = \sum_{m=1}^{N} \Delta^{m-1} f(\sigma) + \Delta^{N} f(\sigma + \tau) \]

Finally, a shifting operator Z is defined such that $Z f(\sigma) = f(\sigma - \tau)$, so that
$f(\sigma + \tau)$ can be redefined with the help of the shifting operator and the binomial
theorem as

\[ f(\sigma + \tau) = \sum_{j=1}^{N} (-1)^{j+1} C_j^N Z^{j-1} f(\sigma) \tag{2.70} \]

which consequently gives

\[ \psi(x, t + \Delta t) = \sum_{j=1}^{N} (-1)^{j+1} C_j^N f\left(x - j\alpha c\Delta t,\; t - (j-1)\Delta t\right) \tag{2.71} \]

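The discrete form of a linear interpolation of the Liao boundary condition 2.71
can be rewritten by replacing the functional f by the electric field E as

\[ E^{n+1}_{j=jmax} = T_{11} E^{n}_{j=jmax} + T_{12} E^{n}_{j=jmax-1} + T_{13} E^{n}_{j=jmax-2} \tag{2.72} \]

where

\[ T_{11} = \frac{(2 - s)(1 - s)}{2}, \qquad T_{12} = s(2 - s), \qquad T_{13} = \frac{s(s - 1)}{2} \]

and

\[ s = \frac{\alpha c\Delta t}{\Delta n} \quad \text{where} \quad 0 \le \alpha \le 2 \]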
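The coefficients in equations 2.70 and 2.72 are easy to check numerically. The sketch below, with illustrative function names, evaluates the binomial weights of equation 2.70 and the three interpolation coefficients of equation 2.72, which have the form of quadratic Lagrange weights on the last three grid points. It assumes Python 3.8 or later for `math.comb`.

```python
from math import comb

def liao_weights(order):
    """Weights (-1)^(j+1) * C(order, j) for j = 1..order, as in equation 2.70."""
    return [(-1) ** (j + 1) * comb(order, j) for j in range(1, order + 1)]

def interpolation_coefficients(s):
    """T11, T12 and T13 of equation 2.72 for a normalized shift s."""
    return ((2 - s) * (1 - s) / 2, s * (2 - s), s * (s - 1) / 2)

def boundary_update(e_jmax, e_jmax_m1, e_jmax_m2, s):
    """New boundary value from the three outermost values at the previous time level."""
    t11, t12, t13 = interpolation_coefficients(s)
    return t11 * e_jmax + t12 * e_jmax_m1 + t13 * e_jmax_m2

if __name__ == "__main__":
    print(liao_weights(3))
    print(interpolation_coefficients(0.5))
```

Two sanity checks follow from the formulas: the three coefficients always sum to one, and for s = 1 the update simply copies the value from one cell inside the boundary, as expected for a wave that travels exactly one cell per time step.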

The effectiveness of the boundary condition was tested for the case of scattering
of a wideband pulse from a perfectly conducting sphere. This case is taken up in
detail in chapter 5. However, at this point, the effect of truncating the domain with
the Liao boundary condition is investigated. The grid used for this case was an
O-O type grid and consisted of 37x46x73 grid points. The sphere is of radius 0.1
meters. There are 46 grid points in the direction running from the scatterer to the
outer boundary. In order to test the effectiveness of the outer boundary condition the
computational domain was truncated in the radial direction from 46 grid points in the
original case to 23 grid points. Figure 2.15 is the comparison of the two solutions as
the scattering process evolved in a plane perpendicular to the direction of propagation
of the incident wave. Figure 2.14 is the comparison of the scattered fields in the two
domains in the plane of propagation of the incident wave. As can be clearly seen, the
truncation of the domain did not lead to any change in the solution due to spurious
wave reflections at the boundary.

Figure 2.14. Comparison of the electric field in a plane parallel to direction of incidence

Figure 2.15. Comparison of the electric field in a plane perpendicular to direction of incidence

Chapter 3

NEAR TO FAR FIELD TRANSFORMATION

Maxwell's equations are solved in the time domain close to the body of the
scatterer, since the computational domain typically ends a few wavelengths away from the
surface of the scatterer, where it is truncated by the outer radiation
boundary condition. This region between the scatterer and the outer boundary is
normally termed the near zone or the near field. Estimation of radar cross section
(RCS) requires the intensity of the scattered wave at infinity. This is normally termed the
far zone or the far field solution. The far field solution can be obtained as frequency
domain transformations that are applicable at single frequencies [3]. In [11] the
authors have discussed the implementation of the far field solution in finite difference
algorithms in great detail for both time harmonic solutions and wideband excitations.
The solution at the far field is usually calculated from the one in the near zone with
the help of a Green's function [57]. A closed surface is first defined somewhere in
the computational domain that encloses the scatterer. The equivalent electric and
magnetic currents are then computed on the surface from the tangential electric and
magnetic fields. The Green's function transformation is then utilised to obtain the
scattered fields in the far field that are radiated by the currents on the surface.
This method of estimating the far field solution, based on a near to far field
transformation, is analogous to the Kirchhoff method that is used to predict the
linear acoustic far field in aerodynamic noise related problems [58] and [59]. In that
case, a control surface is defined in such a manner that it encompasses all nonlinear
flow effects and noise sources, although the surface itself must lie in the linear region.
The complete set of nonlinear equations is solved in the region enclosed by the
surface and the surface integral of the solution over the control surface is utilised
in analytically determining the acoustic far field. A comprehensive review of this
method and its application to acoustics is found in [60].
One major drawback of the Yee algorithm is that, due to the staircased grids that
are employed, high frequency errors appear to be generated close to the surface of the
scatterer. However, as the computational domain is traversed away from the
scatterer these errors seem to get smoothed out. This is quite apparent in Figure
3.1, which depicts the process of scattering from a sphere as viewed along a plane
tangential to the direction of propagation of the incident wave. The positioning of
the far zone integration surface in the computational domain is constrained by the
staircasing errors in problems that employ the Yee algorithm. Body-conformal grids
on the other hand have the advantage that the far zone integration surface can be
defined close to the scatterer.

3.1 Steady State Transformation


Figure 3.1. Staircasing errors in scattering from a sphere.

In this section the implementation of the far field transformation for time harmonic
problems is discussed. Assuming that a closed surface enclosing the scatterer
is defined by S, the electric and magnetic fields are sampled at two distinct time-steps
for each grid point that lies on the far zone surface, once the steady state solution
has been reached. Let us express the electric fields as

\[ E_1 = A\cos\left(\omega t_1 + \delta\right) \tag{3.1} \]

and

\[ E_2 = A\cos\left(\omega t_2 + \delta\right) \tag{3.2} \]

Equations 3.1 and 3.2 can be solved to obtain the values of A and $\delta$.
The time harmonic electric field can be defined as

\[ E(\omega) = A e^{i\delta} \]

A similar treatment is carried out for the magnetic field components at all grid points
that lie on the farzone surface. Since the magnetic field components lie on the dual
grid rather than the actual grid they are obtained by averaging from one dual grid
layer above and the one below the actual surface.
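The two-sample solve described above can be written out explicitly. The sketch below expands A cos(wt + d) as a cos(wt) - b sin(wt), with a = A cos(d) and b = A sin(d), and solves the resulting 2x2 linear system; the function name and the sample values are illustrative only.

```python
import math

def amplitude_and_phase(e1, t1, e2, t2, omega):
    """Recover (A, delta) of E = A*cos(omega*t + delta) from two samples."""
    c1, s1 = math.cos(omega * t1), math.sin(omega * t1)
    c2, s2 = math.cos(omega * t2), math.sin(omega * t2)
    det = s1 * c2 - c1 * s2            # equals sin(omega * (t1 - t2))
    if abs(det) < 1e-12:
        raise ValueError("sample times differ by a multiple of half a period")
    a = (e2 * s1 - e1 * s2) / det      # a = A * cos(delta)
    b = (e2 * c1 - e1 * c2) / det      # b = A * sin(delta)
    return math.hypot(a, b), math.atan2(b, a)

if __name__ == "__main__":
    omega, amp, delta = 2.0, 1.5, 0.4
    t1, t2 = 0.1, 0.6
    e1 = amp * math.cos(omega * t1 + delta)
    e2 = amp * math.cos(omega * t2 + delta)
    print(amplitude_and_phase(e1, t1, e2, t2, omega))
```

The system becomes ill-conditioned when the two sample times are close to a multiple of half a period apart, so in practice the samples would be chosen roughly a quarter period apart.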
The time harmonic equivalent scattered surface currents are then computed from
the time harmonic scattered fields at the surface. These can be written as

\[ j_s(\omega) = n \times H(\omega) \tag{3.3} \]

\[ m_s(\omega) = E(\omega) \times n \tag{3.4} \]

Next the time harmonic vector potentials are obtained by integrating over the
farzone surface. They are obtained from the surface currents by the following expressions:

\[ N(\omega) = \int_s j_s(\omega)\, e^{ik\tilde{r}\cdot\hat{r}}\, ds \tag{3.5} \]

\[ L(\omega) = \int_s m_s(\omega)\, e^{ik\tilde{r}\cdot\hat{r}}\, ds \tag{3.6} \]

where
k is the wave number
$\hat{r}$ is the unit vector to the far zone field point
$\tilde{r}$ is the vector to the source point of integration

The time harmonic far zone electric fields can be obtained by utilizing the time
harmonic vector potentials as

\[ E_{\theta} = -i e^{-ikR}\left(Z N_{\theta} + L_{\phi}\right)/\left(2\lambda R\right) \tag{3.7} \]

\[ E_{\phi} = -i e^{-ikR}\left(-Z N_{\phi} + L_{\theta}\right)/\left(2\lambda R\right) \tag{3.8} \]

where
Z is the impedance of free space
R is the distance from the origin to the far zone field point
$\lambda$ is the wavelength

3.2 Time Dependent Transformation


One of the most important applications of solving Maxwell's equations in the
time domain is the analysis of wideband scattered fields from radar targets. For this
purpose, it is important to compute the far zone scattered fields at various frequencies.
Previously, FDTD calculations were made with a Fourier transform being computed
at each time step to evaluate the RCS response at the frequencies of interest. However,
in [61] Luebbers et al. have presented a method to efficiently transform the FDTD
results to the far zone in the time domain. A fast Fourier transform needs to be
applied as a post-processing step in this method, thus resulting in a tremendous
savings of computational time. It should be mentioned that although the method
developed in [61] is quite generic, the discussion of its application was limited to the
Yee algorithm. In this section, the method has been adapted to the FVTD algorithm.
Since the whole concept of the far zone transform is based on the fact that there exists
a closed surface around the scatterer where the electric and magnetic scattered surface
currents, $j_s$ and $m_s$, can be calculated, it is critical to define the surface first. As
has been previously discussed, the Yee algorithm is susceptible to staircasing errors,
so the far zone surface has to be placed at a significant distance from the scatterer.
However, it is also necessary that the surface should not be too close to the outer
boundary, because then the surface currents may get contaminated by reflections from
the radiation boundary. For body-conformal grids, on the other hand, the integration
surface can very well be defined one grid cell away from the surface of the target. As long
as the grid is perfectly orthogonal at the scatterer surface boundary and the
boundary condition is accurate this would be a prudent choice. It is considered a
safe practice to place the farzone surface 4-6 grid cells away from the surface of the
scatterer.

Figure 3.2. Schematic for farzone potential computation. [Schematic labels: point of reference, point on far zone surface, far zone field point, time differential.]

The surface currents can be computed from the scattered fields as

\[ j_s = n \times H \tag{3.9} \]

\[ m_s = E \times n \tag{3.10} \]

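In [61] the authors show that the retarded potentials w and u can be computed from
the scattered surface currents using the following formulas:

\[ w\left[t - (\tilde{r}\cdot\hat{r})/c\right] = \frac{1}{4\pi c}\,\frac{\partial}{\partial t}\left[\int_s j_s(t)\, ds\right] \tag{3.11} \]

\[ u\left[t - (\tilde{r}\cdot\hat{r})/c\right] = \frac{1}{4\pi c}\,\frac{\partial}{\partial t}\left[\int_s m_s(t)\, ds\right] \tag{3.12} \]

where:
t is the elapsed time
$\hat{r}$ is the unit vector to the far zone field point
$\tilde{r}$ is the vector to the source point of integration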

Depending on the type of mesh employed, i.e. O-O type or O-H type, the integration
surface is appropriately constructed. Assuming that an O-O type grid is being used,
the surface $\eta$ = constant then denotes the integration surface. In that case, the far
zone transform equations can be recast in discretized form as follows:

\[ u\left[t - (\tilde{r}\cdot\hat{r})/c\right] = \left(E \times S^{\eta}\right)^{n+1} - \left(E \times S^{\eta}\right)^{n} \tag{3.13} \]

At each time step the contribution of each cell surface that makes up the
integration surface $\eta$ = constant is calculated and placed in its corresponding w bin,
whose index is computed by subtracting the time delay factor from the elapsed time
and dividing the result by the time step $\Delta t$. As a result, the size of the potential
arrays will not correspond to the number of iterations. The size of the potential
arrays would be slightly larger, the increase in size being dependent on the relative
distances of points on the far zone surface from the reference point. If $R_a$ represents
the closest point on the far zone surface to the far zone field point and $R_b$ the furthest
point from the far field point then

\[ M = \frac{R_a}{c\Delta t} + N + \frac{R_b}{c\Delta t} \]

where $R_a/(c\Delta t)$ is representative of the iteration differential for startup between point A
and the reference point. Similarly, $R_b/(c\Delta t)$ is indicative of the time delay between point B
and the reference point. The surface currents on the far zone surface will affect the radiation
in the far field for M iterations.
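The bookkeeping described above can be sketched as follows. The rounding choices and the `offset` argument (used here to keep the bin index non-negative) are assumptions made for illustration; they are not taken from the thesis code.

```python
import math

def potential_array_length(n_steps, r_a, r_b, c, dt):
    """M = Ra/(c*dt) + N + Rb/(c*dt), with each delay rounded up to whole steps."""
    return math.ceil(r_a / (c * dt)) + n_steps + math.ceil(r_b / (c * dt))

def bin_index(step, r_dot_rhat, c, dt, offset):
    """Bin of the potential array that receives a surface contribution.

    step        : current iteration number
    r_dot_rhat  : projection of the source point vector on the far zone direction
    offset      : assumed shift that keeps the earliest bin at index >= 0
    """
    retarded_time = step * dt - r_dot_rhat / c
    return int(round(retarded_time / dt)) + offset

if __name__ == "__main__":
    # Hypothetical normalized values: c = 1, dt = 0.5.
    print(potential_array_length(100, 1.0, 2.0, 1.0, 0.5))
    print(bin_index(10, 1.0, 1.0, 0.5, 4))
```

Because points on opposite sides of the integration surface map to different bins at the same iteration, the potential arrays must be longer than the number of iterations, as noted above.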

The retarded potential w has to be computed in a slightly different manner since
the magnetic field points do not lie on the surface $\eta$ = constant. Instead, there exists
a magnetic field point half a cell above and half a cell below the surface. Therefore we
can take the average of the time derivatives of the cell above and below and compute
the retarded potential w.

\[ w\left[t - (\tilde{r}\cdot\hat{r})/c\right] = S^{\eta} \times \frac{\left(H^{n+1} - H^{n}\right)_{+\frac{1}{2}} + \left(H^{n+1} - H^{n}\right)_{-\frac{1}{2}}}{2} \tag{3.14} \]

The far zone scattered fields can be computed from the above potentials as

\[ E_{\theta} = -Z w_{\theta} - u_{\phi}, \qquad E_{\phi} = -Z w_{\phi} + u_{\theta} \]

where Z is the impedance of free space.


The farzone scattered fields are then Fourier transformed and divided by the
Fourier transform of the incident pulse to obtain the scattering cross-section, which
can be defined as:

\[ \sigma = \lim_{r_0 \to \infty} 4\pi (r_0)^2\, \frac{\left|E^{scatt}(r_0)\right|^2}{\left|E^{inc}\right|^2} \tag{3.15} \]
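The post-processing step can be sketched with a plain discrete Fourier transform. This is an illustration under stated assumptions: the time histories are short lists of samples, `r0` is a hypothetical normalization distance, and frequency bins where the incident spectrum is negligible are returned as `None` rather than divided.

```python
import cmath
import math

def dft(samples):
    """Direct O(n^2) discrete Fourier transform, adequate for a short sketch."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples))
            for k in range(n)]

def rcs_spectrum(e_scatt, e_inc, r0=1.0, floor=1e-12):
    """4*pi*r0^2 * |E_scatt(f)|^2 / |E_inc(f)|^2 for each frequency bin."""
    f_scatt, f_inc = dft(e_scatt), dft(e_inc)
    result = []
    for s, i in zip(f_scatt, f_inc):
        if abs(i) < floor:
            result.append(None)  # incident pulse carries no energy in this bin
        else:
            result.append(4 * math.pi * r0 ** 2 * abs(s) ** 2 / abs(i) ** 2)
    return result

if __name__ == "__main__":
    pulse = [math.exp(-((m - 8) / 2.0) ** 2) for m in range(16)]
    scattered = [0.5 * p for p in pulse]
    print(rcs_spectrum(scattered, pulse)[:4])
```

A production code would use a fast Fourier transform, as mentioned above; the direct transform is used here only to keep the sketch free of external libraries.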

In the next chapter, implementation of the transformation in both the data
parallel and the message passing paradigms will be discussed. Chapter 6 will deal
with the results obtained using the transformation for a variety of scatterers including
the RCS response at various frequencies.
Chapter 4

PARALLELIZATION ISSUES

Problems related to electromagnetic scattering and stealth technology are
included in the Grand Challenge problems in scientific computing. This creates the
need to obtain near teraflop performance to tackle realistic problems. For example,
the computation of RCS for a fighter jet at 1 GHz, that has a characteristic span of
16 meters, requires a computational domain of 32.76 billion grid points at 20 cells
per wavelength. Parallel computational technology is evolving rapidly to meet the
challenges of problems that require high performance computing. This chapter deals
with the implementation of the algorithm discussed in chapter 2 in both the data
parallel and message passing paradigms. In section 4.1 parallelization issues related
to the data parallel implementation in CM-Fortran are discussed. Drawbacks and
limitations of this approach for time-dependent problems in CFD and CEM are also
discussed in section 4.1. Section 4.2 deals with message passing. Issues relating to
message passing libraries, performance, domain decomposition and implementation
are dealt with in this section.
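The grid count quoted above can be reproduced with a few lines of arithmetic. The sketch assumes a cubic computational domain 48 meters on a side (three times the 16 meter span); this domain size is an assumption chosen because it reproduces the quoted figure, and it is not stated in the text.

```python
def grid_points(frequency_hz, domain_length_m, cells_per_wavelength, c=3.0e8):
    """Number of grid points in a cubic domain at a given spatial resolution."""
    wavelength = c / frequency_hz
    cell_size = wavelength / cells_per_wavelength
    cells_per_side = round(domain_length_m / cell_size)
    return cells_per_side ** 3

if __name__ == "__main__":
    # 1 GHz, 20 cells per wavelength, assumed 48 m cubic domain.
    print(grid_points(1.0e9, 48.0, 20))
```

At 1 GHz the wavelength is 0.3 meters, so 20 cells per wavelength gives a 1.5 centimeter cell and 3200 cells along each side of the assumed domain.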

4.1 Data Parallel and CM Performance


Data parallel algorithms generally refer to algorithms that have data operations
being executed concurrently over the processors that make up the machine.
Synchronization of data parallel operations is usually enforced by the compiler. The
computers used to develop and run the data parallel versions of the FVTD algorithm
are the CM-5 and the CM-200 produced by Thinking Machines Corporation. The
CM-5 has an architecture that comprises 32 - 1024 processors with each processor
supporting 4 vector units. Each vector unit has 8 megabytes of memory. The
language used in code development on these machines is CM-Fortran, which is similar
to the High Performance Fortran standard. It differs from Fortran 90 in mainly
two ways. Firstly, there exist compiler directives that specify how the data is
distributed over processors. This is done with the help of the CMF$ LAYOUT command.
This command instructs the compiler as to how the arrays should be laid out in the
computer's memory. For example the compiler directive

    CMF$ LAYOUT A(:news, :news, :news)

would force A to be spread across all the parallel processors. However, the compiler
directive

    CMF$ LAYOUT A(:news, :news, :serial)

would distribute the elements of array A across all processors in such a way that each
processor has a 2-D array of the size of the first two dimensions.
Secondly, there exist constructs for data parallelism like the FORALL statement.
These constructs are array processing features that provide the programmer a
little more latitude and control in inducing parallelism. For example, the FORALL
statement helps in controlling data motion, thereby enabling the programmer to:
• exercise DO LOOP type control in CM-Fortran
• define a masking operation, much like an embedded IF in a DO LOOP
• perform complicated permutations on multi-dimensional arrays
• operate in a complicated manner on irregularly shaped parts of an array.

A CM-Fortran (essentially High Performance Fortran) version of the FV-TD
code has been developed. Some of the issues that came up during the development
process are discussed below.

In CM-Fortran, and essentially in High Performance Fortran, the lowest level of
working object can be an array or a single variable. This feature permits the
programmer to use conformable arrays to perform operations on arrays without the use
of subscripts for the arrays. For example:

    A = B + C

can be used to sum the elements of an n x n array B with the corresponding
elements of C and store the result in an array A. Since CM-Fortran supports array based data
structures, most of the "Do Loop" processes in the Fortran 77 version of the FV-TD
code were transformed into simple array operations.
Extensive use is made of the CMF$ LAYOUT command. All the electromagnetic
variables are assigned to parallel arrays using the :news directive. In other words, all
the six components of the electric and magnetic fields are stored as parallel variables
spread across all the processors.
CM-Fortran has some data movement functions that reposition elements of arrays.
The most commonly used function is CSHIFT, or circular shift, which permits
the elements of an array to be shifted along a specified dimension. For example, the
loop

    do j=1,jmax
      do i=1,imax
        a(i,j)=b(i,j)+b(i+1,j)
      enddo
    enddo

can be performed by issuing the command

    a = b + cshift(b, dim=1, shift=1)

However, multiple cshifts have been found to be very expensive. Therefore additional
arrays are defined to cut down communication costs. For example

    A = cshift(cshift(b, 1, 1), 2, 1)

can be broken up into

    c = cshift(b, 1, 1) and A = cshift(c, 2, 1)
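The effect of the circular shift used above can be illustrated with a one-dimensional analogue. This is a Python sketch rather than CM-Fortran, and the helper `cshift` below is a hypothetical stand-in for the intrinsic; it shows that the array expression matches the explicit loop everywhere except at the last index, where the shift wraps around to the first element.

```python
def cshift(values, shift):
    """Circular shift of a 1-D list: result[i] = values[(i + shift) % n]."""
    n = len(values)
    return [values[(i + shift) % n] for i in range(n)]

b = [1.0, 2.0, 4.0, 8.0]

# Array form: a = b + cshift(b, shift=1)
a_array = [x + y for x, y in zip(b, cshift(b, 1))]

# Loop form: a(i) = b(i) + b(i+1), valid only while i+1 stays in bounds
a_loop = [b[i] + b[i + 1] for i in range(len(b) - 1)]

if __name__ == "__main__":
    print(a_array)
    print(a_loop)
```

The wrapped value at the end of the array is one reason the boundary points need separate treatment in the data parallel code.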

The more difficult parts of code conversion lie in the implementation of the outer
radiation boundary condition. The Liao boundary conditions use 3-dimensional arrays
but one of the dimensions is smaller than the other two. Additional arrays are defined
that store the electromagnetic variables at previous time steps and a masking function
is used to perform the boundary condition operations only at the boundaries.
This method is not very efficient since most of the processors are sitting idle while
the boundary operations are being performed.
However, the most significant issue is that of the implementation of the far zone
transformation. The far zone transformation is essentially a serial process since the
time delay factor dictates the index of the potential arrays where the contribution
of the surface arrays has to be accounted for. This is achieved by a process
called send with add and the use of indirect array referencing, where the time delays
of different points on the far zone surface are stored in an array that acts like a
set of pointers for the potential arrays. Since the electric and magnetic currents
on the far zone transformation surface are cross products of surface areas and the
time derivatives of the electromagnetic field components they are computed as
two-dimensional quantities.
Once the components of the magnetic and electric field currents are computed,
their contribution is added to the potential arrays, which are essentially one-dimensional
quantities. Hence the current arrays are converted to one-dimensional quantities using
the RESHAPE function. For example:

    dcurntmx = reshape(mold = [imax1*kmax1], source = curntmx)

Issuing the above command converts the two-dimensional array CURNTMX
into a one-dimensional array DCURNTMX with a length of imax1 x kmax1.

For the currents corresponding to each grid point on the far zone transformation
surface a time delay factor is computed that determines the appropriate potential
array into which the contribution of the current is added. In other words the time
delay factor array acts as a pointer array to the currents array.

    FORALL (I = 1:N) WPOT(I) = WPOT(I) + SUM(DCURNTMX(1:lfzc), MASK = MM3(1:lfzc) .eq. I)

Here WPOT is a potential array, and MM3 is a pointer array. The contribution
of dcurntmx is added to the appropriate potential array block by the masking
condition provided by the pointer array MM3. This, essentially, is the logic behind the
SEND-WITH-ADD process [62].

Figure 4.1. Serial-data parallel far field comparison. [Plot: E_theta amplitude versus retarded time steps for the serial FVTD and parallel FVTD codes.]
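The masked summation above can be mimicked in a few lines to show what the accumulation does. This is a Python sketch, not the CM-Fortran implementation; the array names follow the text, the two function names are illustrative, and the sample values are made up.

```python
def send_with_add_masked(wpot, dcurnt, mm3):
    """Literal analogue of the FORALL statement: one masked sum per bin (1-based bins)."""
    return [w + sum(c for c, m in zip(dcurnt, mm3) if m == i)
            for i, w in enumerate(wpot, start=1)]

def send_with_add_scatter(wpot, dcurnt, mm3):
    """Same result with a single pass: each current is added to the bin it points to."""
    out = list(wpot)
    for c, m in zip(dcurnt, mm3):
        out[m - 1] += c
    return out

if __name__ == "__main__":
    wpot = [0.0, 0.0, 0.0, 0.0]
    dcurnt = [1.0, 2.0, 3.0, 4.0, 5.0]   # flattened surface currents
    mm3 = [2, 2, 4, 1, 4]                # bin index for each current
    print(send_with_add_masked(wpot, dcurnt, mm3))
    print(send_with_add_scatter(wpot, dcurnt, mm3))
```

The masked form touches every current once per bin, while the scatter form visits each current once; several currents may land in the same bin, which is why the accumulation cannot be written as a simple element-wise array operation.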
A run on the 32 node CM-5 took 19.92 micro-sec/grid point/time-step.
This time included the overhead of reading/writing files. With the far zone process
excluded, an equivalent run on the 32 node CM-5 took 17.38 micro-sec/grid
point/time-step.
Although a data parallel FVTD algorithm results in an effective speedup of over
70 percent in turnaround time over the serial version of the code, this approach has
some deficiencies. It should be noted here that the parallel version performed at
approximately 2 percent of the peak speed of the machine. Firstly, implementation
of the outer radiation boundary condition proves to be quite costly since most of
the processors wait while a few compute the outer boundary. Secondly, the farzone
transformation, although efficiently implemented, still produces a large overhead due
to its inherently serial nature. These deficiencies prompted the consideration of the
message passing paradigm that is discussed in the next section.

4.2 Message Passing


The computer that was used for the message passing algorithm is the IBM 9076
Scalable Power parallel system (SP-2), which is a tightly clustered group of RISC
System/6000 processors. Efficiency in message passing is provided by a low latency,
high bandwidth switch. "Latency" is the startup time that is required at the time of
communication between processors, or in other words, it is the time taken to send
a message in the limit as the message size goes to zero. "Bandwidth" refers to the
message size that is passed per unit time. Communication time is roughly equal to
the sum total of the time taken for startup and the actual time required to transfer
data. Message passing on the SP-2 is supported by various message passing libraries.
A brief discussion of the different libraries available and the reasons for using MPL
(IBM's proprietary Message Passing Library) are provided in section 4.3. Porting of
the code to MPL and issues regarding implementation of the algorithm in a message
passing type paradigm are delineated in section 4.4.
The efficiency of a message passing algorithm is gauged by its ability to minimize
the communication-to-computation ratio and to balance the workload effectively
by partitioning the data structure evenly. Many different program/data decomposition
techniques are used to achieve this end. The three commonly used types of
decomposition techniques are domain decomposition, control decomposition and object
decomposition. Domain decomposition techniques are best suited for structured
data problems and will be discussed at length in section 4.5. Control decomposition is
best adapted to situations where the data cannot be well organised, hence parallelism
can be enforced by distribution of the flow of control of computation rather than the
partition of the computational domain. An object oriented programming model can
be utilized as a parallelizing tool leading to object based decomposition.

4.3 Message Passing Libraries


Communication in a distributed memory system is provided by passing data
between nodes. This is usually effected by the use of message passing libraries. The
SP-2 supports various message passing libraries. The most common ones supported
are MPI (Message Passing Interface), MPL (Message Passing Library - an in-house
IBM library) and PVM (Parallel Virtual Machine message passing library). Some
of the issues that need to be addressed in the choice of the message passing library
include performance, porting, access and the state of development. MPI is the
result of co-operation between industry, academia and government affiliations in
setting up a message passing standard that would be portable over all platforms that
support message passing. Two versions of MPI are commonly supported on the SP-2.
MPICH has been developed at the Argonne National Laboratory in conjunction with
Mississippi State University and MPI-F is a product of the research division of IBM.
MPL is the message passing library developed by IBM to parallelize applications on
the Scalable POWERparallel Systems (SP1 and SP2). It is a robust library and has
provided the foundation for the development of the MPI class of libraries. PVM
is one of the oldest of this class of message passing libraries and was developed at
the Oak Ridge National Laboratory and the University of Tennessee. The version
supported on most SP platforms is an optimised version called PVMe. The Numerical
Aerodynamic Simulation Program at NASA Ames Research Center carried out
an extensive performance analysis comparing the different message passing libraries.
Table 4.1 shows the latency and bandwidth comparison of the various libraries.

Table 4.1. Performance of Message Passing Libraries on the SP-2

Library    Latency            Bandwidth
MPI-F      43 microseconds    34 Mbytes/sec
MPL        45 microseconds    34 Mbytes/sec
MPICH      58 microseconds    33 Mbytes/sec
PVMe       83 microseconds    31 Mbytes/sec
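
The latency and bandwidth figures can be combined with the cost model of section 4.2
(communication time roughly equal to startup time plus transfer time) to estimate the
time for a single message. The following is a minimal Python sketch of that estimate;
the function name is hypothetical, and treating 1 Mbyte as 1.0e6 bytes is an assumption.

```python
# Illustrative sketch: linear cost model, time = latency + size / bandwidth.
# Treating 1 Mbyte as 1.0e6 bytes is an assumption.

LIBRARIES = {  # (latency in microseconds, bandwidth in Mbytes/sec) from Table 4.1
    "MPI-F": (43.0, 34.0),
    "MPL": (45.0, 34.0),
    "MPICH": (58.0, 33.0),
    "PVMe": (83.0, 31.0),
}


def message_time_us(nbytes, latency_us, bandwidth_mbytes_per_s):
    """Estimated time in microseconds to pass one message of nbytes bytes."""
    # bytes divided by (Mbytes/sec) gives microseconds when 1 Mbyte = 1.0e6 bytes
    return latency_us + nbytes / bandwidth_mbytes_per_s


for name, (latency, bandwidth) in LIBRARIES.items():
    small = message_time_us(64, latency, bandwidth)
    large = message_time_us(1_000_000, latency, bandwidth)
    print(f"{name:6s} 64 bytes: {small:8.1f} us   1 Mbyte: {large:10.1f} us")
```

For short messages the estimate is dominated by latency, and for long messages by
bandwidth, which is why both columns of the table matter when choosing a library.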

While MPI and MPL have given reasonably good performance, PVM has been
found to be lacking in performance on the SP2. PVM, however, has the flexibility of
dynamically adding and deleting tasks during execution of a parallel application. It
should be pointed out here that PVM is structurally different from the other libraries
and cannot take advantage of asynchronous message passing, which enhances
performance where the application permits it. Although MPI is extremely portable and
can just as easily be run over a cluster of workstations as on the SP2, MPL was chosen
as the library for implementation of the FVTD algorithm. Since MPL was developed
mostly to be used on the SP class of parallel machines, MPL based applications can
be easily fine tuned for performance on the SP-2. Also porting codes from MPL to
MPI is rather easy. On the other hand, there exists an asymmetry when it comes to
porting the other way around from MPI to MPL due to the orthodoxy in the syntax of
MPL calls. MPL also provides the ability to have both collective communication and
specific communication synchronously and asynchronously, all of which are required
for message passing of the FVTD algorithm.

Figure 4.2. Processor Domain Decomposition (E-field and H-field grid points at the
interface between Processor Domain i and Processor Domain i+1; interior points on PD i
are boundary points for PD i+1).

4.4 Implementation in MPL


Since the grid for the algorithm discussed in chapter 2 consists of the E-field grid
and the H-field dual grid, a Processor Domain (PD) for our purposes is a combination
of the two. Message passing is carried out between two adjacent processor domains
(PDs) by creating pseudo boundaries between them. As is evident from figure 4.2,
the interior cell values (E-fields in this case) are passed from one processor over to
its neighbouring processor and they constitute the pseudo boundary conditions for
the neighbouring processor and vice versa. Parallelization of the algorithm
was performed on the SP-2 using the Message Passing Library (MPL). MPL uses the
Single Program Multiple Data (SPMD) model wherein the same program resides on
all processors and is executed by each one of the processors. However, each processor
has its own identifying processor number and this permits MPL programs to have
processor based conditional statements. The reason for using an SPMD model is that
in a distributed memory system, memory and address space are local to each processor
and the only way data can be shared among processors is by passing messages. Each
processor has its own processor domain to work with. Each processor domain is bounded
either by pseudo boundaries or by a combination of physical and pseudo boundaries.
Figure 4.3 illustrates the SPMD mode and the manner in which the algorithm is
implemented in MPL. A pre-determined processor (with processor ID=0) acts as the
control processor and carries out the pre-processing tasks involved in setting up the
parallel environment and all parameters required for message passing. It reads in the
grid and distributes it to the various processors depending upon the type of domain
decomposition. Each processor then generates its own dual grid from the information
it has about its own grid. The grid and the constructed dual grid together constitute
the processor domain of the processor. The processor then computes the volumes and
surface areas of the cells that are located in its processor domain. It also calculates the
time step required for computation on its own processor domain. This information
about the minimum time step on each processor domain is then communicated to the
processor with ID = 0. The processor with ID = 0 in turn determines the minimum
time step for the entire computational domain and broadcasts it to all the processors.
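
The gather, reduce and broadcast of the time step can be sketched as follows. This is a
Python illustration in which plain lists stand in for the processors and no MPL calls are
made; the cell sizes, the stability factor and the form of the local time step estimate
are assumptions introduced only for the example.

```python
# Illustrative sketch: plain Python lists stand in for the processors.

def local_time_step(cell_sizes, wave_speed, cfl):
    """Largest stable step on one processor domain, limited by its smallest cell.
    The CFL-type estimate used here is an assumption, not the thesis formula."""
    return cfl * min(cell_sizes) / wave_speed


def global_time_step(local_steps):
    """Role of the processor with ID = 0: gather the local steps and keep the minimum."""
    return min(local_steps)


# Hypothetical cell sizes (metres) on three processor domains.
domains = [[0.010, 0.012], [0.004, 0.009], [0.011, 0.015]]
c = 3.0e8    # wave speed in free space, m/s
cfl = 0.5    # assumed stability factor, not a value from the thesis

local_steps = [local_time_step(cells, c, cfl) for cells in domains]
dt = global_time_step(local_steps)     # gathered and reduced on processor 0
broadcast = [dt for _ in domains]      # every processor receives the same step
print(dt)
```

Using the global minimum keeps every processor domain stable and keeps all processors at
the same physical time, at the price of smaller steps than some domains would need alone.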
Message passing between processors is carried out by issuing explicit MPL calls.
For example

CALL MP_BSEND(qeast, jmax*kmax*3*r4size, taskid+1, msgid)

will send the array "qeast" of size (jmax x kmax x 3 x r4size) to the processor with
a processor identification number equal to (taskid+1). Similarly, the call

CALL MP_BRECV(qeast, jmax*kmax*3*r4size, taskid-1, msgid, nbytes)

will receive the array "qeast" of size (jmax x kmax x 3 x r4size) from the processor
with a processor identification number equal to (taskid-1).
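
The effect of one such exchange on a chain of processor domains can be sketched without
any message passing library. In the Python illustration below, each domain is a list with
one pseudo boundary slot at either end; the function name, the one-value planes and the
use of None for a physical boundary are assumptions made for brevity.

```python
# Illustrative sketch: in-process lists replace real sends and receives.

def exchange_pseudo_boundaries(domains):
    """Fill the pseudo boundary slots of neighbouring domains.

    Each domain is a list [west_slot, interior..., east_slot]. The last interior
    value of a domain becomes the west slot of its east neighbour, and the first
    interior value of that neighbour becomes the east slot of the domain.
    """
    for left, right in zip(domains, domains[1:]):
        right[0] = left[-2]    # 'send east': last interior value of the left domain
        left[-1] = right[1]    # 'send west': first interior value of the right domain
    return domains


# Three processor domains; None marks a physical boundary that is set elsewhere.
domains = [
    [None, 1.0, 2.0, None],
    [None, 3.0, 4.0, None],
    [None, 5.0, 6.0, None],
]
exchange_pseudo_boundaries(domains)
print(domains)
```

Only interior values are copied, so the result does not depend on the order in which the
neighbouring pairs are visited.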
Since the whole simulation is a time dependent process, all processors are forced
to be synchronized after each Runge-Kutta stage with the MPL call:

CALL MP_SYNC(allgrp)

which blocks execution on all processors until all processors have made the
corresponding call. For the far zone transformation to be implemented, each processor
is required to compute the surface currents for that part of the far zone integration
surface that lies on it. Some processors have no part of the surface lying in their
processor domain whereas others may have a significant part of the surface. This creates
minor problems of uneven load balancing. For two contiguous processors that share
intersecting nodes, the far zone currents are computed by the processor that has the
lower ID. Once each of the relevant processors computes the magnetic and electric
surface currents on the far zone integration surface, it then sums their contributions
to the potential arrays depending on the retarded times. Each processor then
communicates its potential arrays to the processor with ID=0. The processor with ID=0
combines the various contributions to the retarded potentials and stores the time
history of the potential variations.

Figure 4.3. Message Passing Algorithm (flowchart: pre-processing on PID=0; each
processor reads its grid from processor 0, generates the dual grid, calculates metrics
and volumes, and calculates its time step; PID=0 gathers the time steps, calculates the
minimum time step and broadcasts it to the processors; each processor calculates fluxes,
gets and sends boundary values from and to neighbour processors, integrates, and
calculates far zone potentials; the potentials are sent to PID=0 and summed based on
retarded times; the loop repeats until done).

4.5 Domain Decomposition Strategies


Domain decomposition in allocating Processor Domains (PD) is very important
to achieving optimum parallel efficiency on distributed memory systems. Careful
consideration has to be paid to issues of load balancing, synchronization, distribution
of boundaries over processors and the amount of message passing involved. There
exist different methods of partitioning among processor domains. As is discussed in
[38] and [63] the simplest form of domain decomposition is that of one-dimensional
parallelization. In this case domains are partitioned along one coordinate direction.
For example, for the case of the ogive the processor domain consisted of (ni x jmax x
kmax) grid points, where ni refers to the number of i-planes in the processor domain.
As has been pointed out in [38] this sort of partitioning achieves near perfect load
balancing. Each processor has the same number of interior points and the same
number of outer boundary points corresponding to the j=jmax boundary. Although
this form of domain decomposition is simple and easy to implement, the ratio of the
number of nodes involved in message passing to the number of nodes in the domain
is comparatively large, thus forcing a limit on scalable performance.

Figure 4.4. Schematic of 1-D domain decomposition.
Pencil or two-dimensional parallelization increases the complexity involved in
the level of parallelization over linear domain decomposition (figure 4.5). Here each
processor contains (ni x nj x kmax) grid points. This decreases the amount of message
passing involved between two contiguous processors considerably. However this sort
of decomposition may not be natural to all geometries and grids. Even distribution
of boundaries is far more difficult in this case, which in turn adversely affects load
balancing. For example, this type of decomposition would be unsuitable for the ogive,
and other axisymmetric bodies.

Figure 4.5. Schematic of 2-D domain decomposition.
Adding another level of parallelization leads to three dimensional or ordered
block parallelization. Here each processor has (ni x nj x nk) grid points residing
on it and has six neighbouring processors (figure 4.6). This type of decomposition
is natural to massively parallel systems that have numerous processors and leads to
a near linear scale-up performance for most problems with extremely large domains.
It is typically suitable for H-H type grids. Processors need to be identified in a
structured manner for this level of parallelization.
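
The trade-off between the three ordered decompositions can be made concrete by counting,
for one interior processor domain, the points passed to neighbours against the points it
owns. The Python sketch below does this for a cubic grid; the function name, the cubic
grid and the assumption that exactly one plane of points is exchanged per face are all
illustrative simplifications.

```python
# Illustrative sketch: one plane of points exchanged per face is an assumption.

def halo_ratio(n, procs_per_axis):
    """Ratio of points exchanged with neighbours to points owned, for one interior
    block of an n x n x n grid split into procs_per_axis = (pi, pj, pk) blocks."""
    pi, pj, pk = procs_per_axis
    ni, nj, nk = n // pi, n // pj, n // pk
    owned = ni * nj * nk
    exchanged = 0
    if pi > 1:                       # two faces normal to the i direction
        exchanged += 2 * nj * nk
    if pj > 1:                       # two faces normal to the j direction
        exchanged += 2 * ni * nk
    if pk > 1:                       # two faces normal to the k direction
        exchanged += 2 * ni * nj
    return exchanged / owned


n = 128  # hypothetical grid of 128 x 128 x 128 points on 64 processors
print("1-D:", halo_ratio(n, (64, 1, 1)))
print("2-D:", halo_ratio(n, (8, 8, 1)))
print("3-D:", halo_ratio(n, (4, 4, 4)))
```

For a fixed grid and processor count the ratio falls as more directions are partitioned,
which is the reason block decomposition scales better on large machines.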
Figure 4.6. Schematic of 3-D domain decomposition.

Apart from the three ordered types of domain decomposition considered, there is
the disordered type where domain decomposition is constrained by grid based zonal
decomposition. An example of such decomposition is the NASA Almond that has
been gridded into six zones that have a pattern as depicted in figure 4.7. The linkage
between the different zones is disorderly and is shown in figure 4.8. Such a disordered
form of decomposition can be handled by means of a connectivity array, such as those
used by unstructured grid solvers in flux computation to identify neighbouring cell
faces/centers. The connectivity arrays act as pointers in disseminating and collecting
information during message passing between processors. This sort of domain
decomposition may create serious load balancing problems but may be the only recourse
when complex zonal gridding patterns are utilized. A good implementation of such
a pointer method of domain decomposition can handle any of the other forms of
domain decomposition techniques discussed, since the pointer method is a superset
of all other forms of domain decomposition methods.
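
A connectivity array of this kind can be sketched as a lookup table from each zone face
to the face it is linked with. The Python illustration below uses hypothetical zone
numbers, face names and data; it does not reproduce the actual linkage of the almond grid.

```python
# Illustrative sketch: zone numbers, face names and values are hypothetical.

# connectivity[(zone, face)] = (neighbour zone, neighbour face)
connectivity = {
    (1, "east"): (4, "west"),
    (4, "west"): (1, "east"),
    (1, "north"): (6, "south"),
    (6, "south"): (1, "north"),
}

# Interior values lying next to each linked face, one list per (zone, face).
faces = {
    (1, "east"): [1.0, 1.1],
    (4, "west"): [4.0, 4.1],
    (1, "north"): [1.5, 1.6],
    (6, "south"): [6.0, 6.1],
}


def exchange(faces, connectivity):
    """Return the pseudo boundary data each (zone, face) receives from its neighbour."""
    received = {}
    for target, source in connectivity.items():
        received[target] = list(faces[source])
    return received


received = exchange(faces, connectivity)
print(received[(1, "east")])
```

An ordered 1-D, 2-D or 3-D decomposition is the special case in which the table links
every east face to the west face of the next domain, and so on for the other directions.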
Figure 4.7. Zonal grid with complex arrangement of zones.

Figure 4.8. Linkage between zones in a disorderly manner (zones 1 to 6).

4.6 Message Passing Performance


Performance of a parallel algorithm is gauged on the basis of how well an algorithm
scales with the increase in the number of processors and its communication to
computation ratio. The lower the communication to computation ratio, the higher
is the efficiency of the program. The test case of scattering from the NASA almond
for a grid size of 225720 grid points was run for 10,000 iterations. The entire
computational domain was spread over 7 processors. A first run was made of the code
with the far zone integration and the artificial dissipation switched off. Table 4.2
indicates the computational time taken by each processor for the iterative process.
The communication time listed in the table includes time to perform inter-processor
communication and latency time including latency time for synchronisation after each
time step.

Table 4.2. Message Passing Performance without farzone and dissipation

Processor ID    Computational time    Communication & wait time/Latency
0               7657.96 sec           417.3 sec
1               7597.17 sec           412.77 sec
2               7603.37 sec           406.27 sec
3               7552.64 sec           457.40 sec
4               7493.74 sec           515.65 sec
5               7416.06 sec           593.97 sec
6               7445.39 sec           629.04 sec
average         7538.04 sec           490.34 sec

\[ \frac{\text{avg. communication \& wait time}}{\text{avg. computational time}} = 0.0650 \]
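
The averages and the ratio quoted with Table 4.2 can be checked with a few lines of
Python. The sketch below only re-does the arithmetic on the tabulated values; the
variable and function names are illustrative.

```python
# Values copied from Table 4.2 (seconds, processors 0 to 6).
computation = [7657.96, 7597.17, 7603.37, 7552.64, 7493.74, 7416.06, 7445.39]
communication = [417.3, 412.77, 406.27, 457.40, 515.65, 593.97, 629.04]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


avg_computation = mean(computation)
avg_communication = mean(communication)
ratio = avg_communication / avg_computation   # communication-to-computation ratio

print(f"avg computation   = {avg_computation:.2f} sec")
print(f"avg communication = {avg_communication:.2f} sec")
print(f"ratio             = {ratio:.4f}")
```

The same calculation applies to the other performance tables in this section once their
columns are substituted.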

Next the performance of the code was evaluated after switching on artificial
dissipation. Fourth order artificial dissipation requires that additional message passing
be performed since the fourth order derivative stencils require that an additional field
point be communicated from adjacent domains on either side. Table 4.3 shows the
computational and communication times utilised in this case. In comparison to table
4.2 it is seen that there is a 5.8 percent increase in the computational time required
whereas the communication requirement has increased by 10.2 percent. The relative
increase in the amount of communication over the computations has led to a slight
deterioration in the efficiency ratio.

Table 4.3. Message Passing Performance with dissipation without farzone

Processor ID    Computational time    Communication & wait time/Latency
0               8113.10 sec           417.3 sec
1               8052.68 sec           412.77 sec
2               8047.51 sec           406.27 sec
3               7991.06 sec           457.40 sec
4               7927.19 sec           515.65 sec
5               7848.37 sec           593.97 sec
6               7886.45 sec           629.04 sec
average         7980.90 sec           540.76 sec

\[ \frac{\text{avg. communication \& wait time}}{\text{avg. computational time}} = 0.06775 \]
Lastly, the computer code's performance with the near to far field transformation
was analysed. As seen in table 4.4 the time taken for message passing showed a more
than four-fold increase over the communication time needed with the far zone
transformation switched off (an average of 2057.35 sec against 490.34 sec). This
increase can be attributed to the message passing involved in passing and summing up
the potential arrays over all the processors that have a part of the far zone
integration surface defined in their respective processor domains.

Table 4.4. Message Passing Performance with farzone without dissipation

Processor ID    Computational time    Communication time/Latency
0               7780.98 sec           1919.70 sec
1               7648.11 sec           1987.48 sec
2               7663.26 sec           1972.26 sec
3               7610.09 sec           2024.97 sec
4               7548.55 sec           2086.40 sec
5               7459.22 sec           2175.55 sec
6               7464.41 sec           2235.13 sec
average         7596.37 sec           2057.35 sec

\[ \frac{\text{avg. communication \& wait time}}{\text{avg. computational time}} = 0.2708 \]

In order to determine how well the FVTD algorithm scales up on a distributed
memory architecture, test cases were run on 4, 8 and 16 nodes of the SP-2 for a
computational domain comprising 668388 grid points. The domain was subject to
one-dimensional decomposition. The scaleup map is shown in figure 4.9. This clearly
shows that one-dimensional domain decomposition cannot sustain a linear scaleup
as the number of processors is increased, due to the increase in the ratio of the
number of nodes involved in message passing to the total number of nodes in the
computational domain.

Figure 4.9. Processor performance scaleup (micro-seconds per grid point per iteration
versus number of processors).

4.7 Performance Results


In this section the performance of the FVTD algorithm will be compared against
the FDTD algorithm [11] in terms of the computational time required.
Table 4.5 shows the time taken per grid point per iteration for both the FVTD
and FDTD codes. In the data parallel mode the FDTD algorithm is 33.2 times faster
than the FVTD algorithm. This is attributed to the fact that the FDTD is run on
Cartesian grids whereas the FVTD algorithm is used on curvilinear grids and the
additional cost of the transformation from the physical to the computational grid
is incurred while computing the numerical fluxes. Also, the FVTD algorithm has
a four-stage time integration process which is a further overhead. The numbers in
Table 4.5 can be misleading chiefly for two reasons. Firstly, as was seen in
Chapter 2 the FVTD algorithm requires fewer cells per wavelength than the FDTD
algorithm. Secondly, the curvilinear nature of the FVTD grid requires far fewer cells
in the computational domain than is required by the stair-stepped Cartesian grid
needed for the FDTD algorithm. Table 4.6 shows the number of grid points required
by the two algorithms to adequately resolve the scattering wave patterns for an ogive
shaped target.
Table 4.5. Performance of Codes in micro-seconds per grid point per time step

Algorithm             Computer          micro-sec/grid point/time-step
Serial
  FDTD                RS/6000           14.0
  FVTD                RS/6000           94.0
Data Parallel
  FDTD                CM-5              0.6
  FDTD w/o farzone    CM-5              0.1
  FVTD                CM-5              19.92
  FVTD w/o farzone    CM-5              17.32
Message Passing
  FVTD                SP-2 (4-node)     12.76
  FVTD                SP-2 (16-node)    3.99

Table 4.6. Grid Point Comparison for the Ogive

Algorithm    Grid Dimensions    Total Grid Points
FDTD         400 x 100 x 100    4 Million
FVTD         200 x 31 x 32      198,400

Here, it can be clearly seen that approximately 20 times more grid cells are
required by the FDTD algorithm than by the FVTD algorithm. This shows that even
though the FDTD algorithm is much faster than the FVTD algorithm per grid point, it
may not be that attractive in the overall scheme of things.
It is also found from the data presented in table 4.5 that the message passing
paradigm is better suited to the FVTD algorithm than the data parallel mode. A
16-node SP-2 execution of the FVTD code was 5 times faster than a 32-node run
on the CM-5. This clearly indicates that the FVTD algorithm is more efficient as a
message passing based algorithm.
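
The per-point timings and the grid sizes can be combined into an estimate of wall-clock
time per time step for the ogive. The Python sketch below does only that multiplication;
it assumes that the timings of Table 4.5 carry over unchanged to the grids of Table 4.6,
and it ignores any difference in time step size or in the number of steps each code needs.

```python
# Illustrative sketch: timings from Table 4.5 (micro-seconds per grid point per time
# step) are assumed to apply unchanged to the ogive grids of Table 4.6.

GRID_POINTS = {"FDTD": 400 * 100 * 100, "FVTD": 200 * 31 * 32}

TIMINGS_US = {
    ("FDTD", "RS/6000 serial"): 14.0,
    ("FVTD", "RS/6000 serial"): 94.0,
    ("FDTD", "CM-5"): 0.6,
    ("FVTD", "CM-5"): 19.92,
    ("FVTD", "SP-2 16-node"): 3.99,
}


def seconds_per_step(algorithm, machine):
    """Estimated wall-clock seconds for one time step over the whole ogive grid."""
    return TIMINGS_US[(algorithm, machine)] * GRID_POINTS[algorithm] * 1.0e-6


for algorithm, machine in TIMINGS_US:
    print(f"{algorithm} on {machine}: {seconds_per_step(algorithm, machine):.3f} sec/step")

# Ratio of CM-5 to 16-node SP-2 cost for the FVTD code on the same grid.
speed_ratio = seconds_per_step("FVTD", "CM-5") / seconds_per_step("FVTD", "SP-2 16-node")
print(f"CM-5 / SP-2 (16-node) = {speed_ratio:.2f}")
```

The estimate compares cost per time step only, so it should be read together with the
cells-per-wavelength argument above rather than as a complete cost comparison.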
Chapter 5

RESULTS

The purpose of this chapter is to apply the various tools that have been developed
and discussed in the previous chapters to simulate the process of wave propagation
and scattering from a variety of targets. The algorithm developed in chapter
2 is first used to model the propagation of TEM (Transverse Electromagnetic) waves
in a rectangular domain. The validity, accuracy and effectiveness of the algorithm
are tested in calculating the scattered fields from a perfectly conducting sphere. The
results are compared with those obtained from the standard FDTD algorithm that
uses Yee stencils and the leapfrog method in time. Since the main reason for the
development of the algorithm was to analyse scattering from complex three dimensional
aircraft-like configurations, sections 5.3, 5.4, 5.5, 5.6 and 5.7 are devoted to
examining the effects of scattering from various components of an aircraft. In sections
5.3 and 5.4 the RCS is computed for an ogive and ellipsoid. These configurations
are typical of shapes used to represent the nose sections of aircraft. In section 5.5
the elusive case of scattering from the NASA almond is considered. The results are
compared with available experimental data at all angles of incidence. Section 5.6 is
based on qualitatively analysing scattering for a trapezoidal wing. Lastly, the process
of scattering from an engine inlet is investigated in section 5.7.


5.1 Wave Propagation


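The first simulation carried out was that of propagation of TM waves in an
isotropic rectangular domain. A Cartesian grid comprising 100 x 100 grid points
was used for this case. Maxwell's equations in a two dimensional domain for the T-M
mode reduce to

\[ E_x = E_y = H_z = 0 \tag{5.1} \]

\[ \frac{\partial E_z}{\partial t} = \frac{1}{\epsilon} \left( \frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} \right) \tag{5.2} \]

\[ \frac{\partial H_x}{\partial t} = -\frac{1}{\mu} \frac{\partial E_z}{\partial y} \tag{5.3} \]

\[ \frac{\partial H_y}{\partial t} = \frac{1}{\mu} \frac{\partial E_z}{\partial x} \tag{5.4} \]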

For the case under consideration, the domain is bounded by perfectly conducting
boundaries at x = -5 and x = 5. A Gaussian pulse having an Hy component was
excited over 20 cells in the middle of the domain.

\[ H_y = \frac{1}{2} e^{-6 (x - ct)^2} \tag{5.5} \]

The Gaussian disturbance gives rise to two waves that are transmitted in the
domain in opposite directions. The left travelling wave has an Ez component and an
Hy component that are in phase whereas the right travelling wave has an Ez component
and an Hy component that are opposite in phase. A wave of this type, with the E and
H fields transverse to the direction of propagation, is called a Transverse
Electromagnetic (TEM) wave. Figures 5.1 - 5.6 show the left and right travelling
branches of the Ez component solution at every 40th iteration. As is seen in figure 5.3
the waves reach the conducting boundary at the edge of the domain. At the boundary,
both branches of the Ez component undergo a reversal of phase due to total reflection
(figure 5.4). Figures 5.5 - 5.6 depict the propagation of the reflected waves into the
domain. Evolution of the corresponding Hy solution is shown in figures 5.7 - 5.12. The
H field points are offset by half a grid point from the perfectly conducting boundaries
at the two ends of the domain. Therefore, no boundary conditions need to be specified
for the Hy components at the conducting boundaries. Figures 5.10 - 5.12 show the
characteristics of the Hy component after the wavefront reaches the reflecting
boundaries. The TEM wave solution obtained with the Runge-Kutta algorithm is compared
with that obtained with the Yee algorithm. Figures 5.13 - 5.17 show that there is
excellent agreement between the two schemes for the Ez component.

Figure 5.1. T-M Wave propagation Ez component.
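
The splitting of the pulse and the phase reversal at the conducting walls can be
reproduced with a very small one-dimensional Yee (leapfrog) calculation. The Python
sketch below is not the Runge-Kutta finite volume scheme of this thesis; it uses
normalised units, a Courant number of one, and an initial Ez pulse with zero Hy, all of
which are assumptions made for brevity and differ from the Hy excitation described above.

```python
import math


def run_yee_1d(n_cells, n_steps, sigma_cells):
    """1-D Yee leapfrog for Ez and Hy in normalised units (c = eps = mu = 1, c*dt/dx = 1).

    Ez lives on nodes 0..n_cells with perfectly conducting walls (Ez = 0) at both
    ends; Hy lives half a cell away from the walls and needs no boundary condition.
    """
    centre = n_cells / 2
    ez = [math.exp(-((i - centre) ** 2) / (2.0 * sigma_cells ** 2))
          for i in range(n_cells + 1)]
    ez[0] = ez[-1] = 0.0                  # perfectly conducting walls
    hy = [0.0] * n_cells
    for _ in range(n_steps):
        for i in range(n_cells):          # dHy/dt = (1/mu) dEz/dx
            hy[i] += ez[i + 1] - ez[i]
        for i in range(1, n_cells):       # dEz/dt = (1/eps) dHy/dx, walls untouched
            ez[i] += hy[i] - hy[i - 1]
    return ez, hy


ez_split, _ = run_yee_1d(200, 60, 8.0)        # before the pulses reach the walls
ez_reflected, _ = run_yee_1d(200, 160, 8.0)   # after reflection from the walls
print(max(ez_split), min(ez_reflected))
```

Before the walls are reached the unit pulse has split into two branches of roughly half
amplitude; after reflection both branches of Ez have changed sign.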

Figures 5.2 - 5.6. T-M Wave propagation Ez component.

Figures 5.7 - 5.12. T-M Wave propagation Hy component.

Figures 5.13 - 5.17. T-M Wave propagation - comparison with Yee scheme (Yee and R-K
solutions plotted for x between -5 and 5).

5.2 Scattering from a sphere


The first test case considered here is that of scattering of a wideband pulse from
a perfectly conducting sphere. The grid used for this case is an O-O type grid and
consisted of 37 x 46 x 73 grid points. The sphere is of radius 0.1 meters. The incident
pulse is represented by the derivative of the Gaussian and can be explicitly written
as

\[ f(t) = -2 \alpha (\tau - \beta \Delta t) \, e^{-\alpha (\tau - \beta \Delta t)^2} \tag{5.6} \]

where $\tau$ is the time delay parameter

\[ \tau = t - (\vec{r} \cdot \hat{r}) / c \]

and

\[ 0 < \tau < 2 \beta \Delta t, \qquad \alpha = \left( \frac{4}{\beta \Delta t} \right)^2 \]

and $\beta$ is a user defined parameter that dictates the width of the pulse. The
value of $\beta$ used for the FVTD run in this case was 756. It should be mentioned at
this point that the FVTD results are being compared to the results obtained with
the PSU-FDTD code [11] that employs the Yee algorithm. An equivalent $\beta$ of 64 was
used in the FDTD run. (The FDTD run was made with a cell size of 0.01m, or ten
cells per radius.) The difference in the $\beta$ value can be attributed to the
difference in the time step, because the value of $\beta \Delta t$ has to be the same
for both codes. Since the value of $\beta$ defines the spectral bandwidth it has to be
chosen judiciously so as not to create any unnecessary noise. The highest frequency for
which the RCS is computed corresponds to a wavelength, roughly equal to the radius of
the sphere, that would be resolved by 14 cells per wavelength. It should be noted here
that the FDTD grid comprised 262144 grid cells, which is roughly double the FVTD grid
of 124246 grid points.

Figure 5.18. Grid for the sphere.

Figure 5.19. Points of near zone comparison for the sphere (two near zone points in the
backscatter direction, one at 90 degrees and one at 180 degrees from the incident
direction).
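
The incident pulse can be evaluated with a short Python sketch, assuming the derivative
of Gaussian form f = -2 alpha (tau - beta dt) exp(-alpha (tau - beta dt)^2) with
alpha = (4 / (beta dt))^2. The function names and the time step value are hypothetical;
only beta = 64 is taken from the text.

```python
import math


def pulse_alpha(beta, dt):
    """Width parameter alpha = (4 / (beta * dt))**2 of the Gaussian envelope."""
    return (4.0 / (beta * dt)) ** 2


def gaussian_derivative_pulse(tau, beta, dt):
    """Derivative of exp(-alpha * (tau - beta*dt)**2) with respect to tau."""
    a = pulse_alpha(beta, dt)
    shifted = tau - beta * dt
    return -2.0 * a * shifted * math.exp(-a * shifted ** 2)


beta = 64        # value quoted for the FDTD run
dt = 1.0e-11     # hypothetical time step in seconds, not a value from the thesis

# Sample the pulse over its full duration 0 < tau < 2 * beta * dt.
samples = [gaussian_derivative_pulse(n * dt, beta, dt) for n in range(2 * beta + 1)]
print(max(samples), min(samples))
```

The pulse is zero at tau = beta dt and antisymmetric about that instant, and the
envelope has decayed by a factor of exp(-16) at both ends of the interval, which is
why the pulse can be switched on and off there with little truncation noise.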
Near zone (i.e. points within the computational space) and far-field comparisons
of the FVTD and FDTD solutions were made in order to ascertain the validity of
the FVTD algorithm. For the near zone comparisons two points were chosen in the
backscatter direction at distances of 0.02m and 0.04m from the surface of the sphere.
One point was sampled at an angle of 90 degrees from the direction of incidence and
another point of comparison was chosen in the shadow region behind the sphere at an
angle of 180 degrees (figure 5.19). Figure 5.20 shows the near zone comparisons of the
Ez component at two locations in the backscatter direction. The comparison in the
first plot in Figure 5.20 is taken at a location 0.02m from the surface of the sphere
and the location for the comparison in the second plot of Figure 5.20 is at
a distance of 0.04m from the sphere. Both plots clearly indicate that the FVTD
algorithm captures the creeping wave [64] behind the main component of the reflected
wave. Figure 5.21 shows the comparison of the Ez field component at locations 0.04m
from the surface of the sphere and $\theta$ = 90 and 180 degrees respectively. All four
near zone comparisons are found to be in very good agreement.
Figure 5.22 is a comparison of the backscatter cross-section against frequency
of the FVTD and FDTD algorithms and the exact solution. The exact solution for
the monostatic radar cross section can be written as

\[ \sigma_{3d} = \lim_{r \to \infty} 4 \pi r^2 \frac{|E^s|^2}{|E^i|^2} = \frac{\lambda^2}{4 \pi} \left| \sum_{n=1}^{\infty} \frac{(-1)^n (2n + 1)}{\hat{H}_n^{(2)\prime}(kR) \, \hat{H}_n^{(2)}(kR)} \right|^2 \tag{5.7} \]

and

\[ \hat{H}_n^{(2)\prime}(kR) = \frac{\partial \hat{H}_n^{(2)}(kR)}{\partial (kR)} \]

where
$\hat{H}_n^{(2)}(kR)$ is the spherical Hankel function of the second kind
$\lambda$ is the wavelength
k is the wavenumber
R is the radius of the sphere
The comparison between the exact solution and the FVTD solution is exceptionally
good, especially at low frequencies, and is seen to be an improvement over the
FDTD solution.
The time evolution of the scattering process is depicted in figures 5.23 and 5.24.
Figure 5.23 shows the scattering process evolving in a plane perpendicular to the
direction of incidence whereas figure 5.24 is along a plane parallel to the direction
of incidence. Both figures depict the scattered waves emanating from the surface of the
sphere and travelling outwards towards the outer boundary.

Figure 5.20. Scattered field comparison for the sphere 0.02m and 0.04m in the
backscatter direction (E_z amplitude versus time in nanoseconds; FVTD and PSU-FDTD).

Figure 5.21. Scattered field comparison for the sphere at $\theta$ = 90 and
$\theta$ = 180 in the backscatter direction (E_z versus time in nanoseconds; FDTD and
FVTD).

Figure 5.22. RCS for the sphere (backscatter RCS in dBsm versus frequency in GHz for
the PEC sphere; exact, FDTD and FVTD solutions).

5.3 Scattering from an ogive


The shaping of aerodynamic vehicles based on electromagnetic and radar considerations
is an important aspect of the design process. Shaping the nose of an aeroplane or missile
is of primary importance in reducing the effective area that is visible to the radar. Most
nose sections are ogive shaped, although some may be spheroids or ellipsoids. This makes it
important to study the scattering effects and evaluate the RCS for ogives at nose-on
incidence. Therefore, the next test case investigated is that of the metallic ogive. The
ogive has a half angle of 22.62 degrees and an aspect ratio of 5:1. It is 10 inches long and
is a standard EMCC (Electromagnetic Code Consortium) benchmark target. An O-H type grid has
been used in this case (Figure 5.25). The grid consisted of 200 x 31 x 32 cells and is
clustered near the leading and trailing edges and close to the surface of the scatterer. A
planar grid was first generated and rotated about the axis of symmetry. This grid serves as
the grid for the E-field components. Special treatment has been applied for the calculation
of these fields along the axis of symmetry. In the case of the sphere, the E-fields were
computed along the axis of symmetry by averaging the values of the neighbouring fields along
the rotational direction. In the case of the ogive, due to the presence of the sharp edge,
averaging is not a reasonable remedy to calculate the fields along the line of symmetry.
Therefore, special cells have to be constructed around the points lying on the axis of
symmetry. These cells are formed from the dual grid points that lie adjacent to the axis of
symmetry. These special cells are (n + 2) sided polyhedrons, where n is the number of grid
cells in the rotational direction. A zonal approach is taken in solving this problem and the
entire computational domain is divided equally among eight processors of the IBM SP-2.
Figure 5.23. Scattering from the sphere in a plane perpendicular to the direction of
incidence.

Figure 5.24. Scattering from the sphere in a plane parallel to the direction of incidence.

Figure 5.25. Grid for the ogive.

Figure 5.26. View of a skewed cell along the axis of symmetry. [Schematic labels: dual grid
points, axis of symmetry, special E grid cell]


The far zone integration surface is located four grid points from the surface of
the ogive. A wideband Gaussian pulse is incident normal to the leading edge and
is defined as

\[ f(t) = e^{-\alpha (\tau - \beta \Delta t)^2} \tag{5.8} \]

where $\tau$ is the time delay parameter

\[ \tau = t - (\vec{r} \cdot \hat{r}) / c \]

and

\[ 0 < \tau < 2 \beta \Delta t, \qquad \alpha = \left( \frac{4}{\beta \Delta t} \right)^2 \]

where $\beta$ is chosen such that the frequency content of the incident pulse extends up
to 5 GHz. The direction of propagation of the incident pulse is parallel to the major
axis of the ogive. In figure 5.27 the co-polarized RCS is plotted versus frequency for
the FVTD and FDTD algorithms. The FDTD results have been obtained in this case with
4 million grid points. The FDTD solution has a resolution of 80 cells per wavelength at
5 GHz, whereas the corresponding FVTD solution has been obtained with a resolution of only
20 cells per wavelength at the upper end of the frequency spectrum. The reason for such a
high resolution FDTD solution was the need to resolve the sharp edge adequately with a
stair-stepped grid. The comparison shown in figure 5.27 indicates that the FVTD and the
FDTD RCS data maintain the same trends with variation in frequency and are in extremely
good agreement over almost the entire frequency range that has been resolved.

Figure 5.27. RCS for the ogive. [Plot: Backscatter RCS vs Frequency for PEC Ogive; RCS
(dBsm) versus frequency (GHz); curves: FDTD, FVTD]
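As a minimal sketch of the incident pulse of equation 5.8 (it is not taken from the thesis code), the following Python evaluates the Gaussian for a given time delay. The function names, the choice to return zero outside the stated interval, and the sample values of `beta` and `dt` used with it are assumptions made for illustration. The spectrum helper relies on the standard Fourier transform of an untruncated Gaussian, so it only approximates the truncated pulse.

```python
import math

C0 = 299792458.0  # speed of light in vacuum (m/s)


def time_delay(t, r, r_hat, c=C0):
    """Time delay tau = t - (r . r_hat) / c for position r and unit vector r_hat."""
    dot = sum(ri * hi for ri, hi in zip(r, r_hat))
    return t - dot / c


def gaussian_pulse(tau, beta, dt):
    """Gaussian pulse exp(-alpha (tau - beta dt)^2) with alpha = (4 / (beta dt))^2.

    Assumption: the pulse is taken as zero outside 0 <= tau <= 2 beta dt.
    """
    if tau < 0.0 or tau > 2.0 * beta * dt:
        return 0.0
    alpha = (4.0 / (beta * dt)) ** 2
    return math.exp(-alpha * (tau - beta * dt) ** 2)


def relative_spectrum(frequency, beta, dt):
    """Spectral magnitude relative to its value at zero frequency.

    Uses the transform of an untruncated Gaussian: exp(-(pi f)^2 / alpha).
    """
    alpha = (4.0 / (beta * dt)) ** 2
    return math.exp(-((math.pi * frequency) ** 2) / alpha)
```

With this form the pulse peaks at a delay of `beta * dt` and has decayed to exp(-16) at both ends of the interval, so a larger `beta` gives a longer pulse with a narrower spectrum. A value of `beta` can then be tuned until `relative_spectrum` at the highest frequency of interest is acceptably large.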
Figure 5.28. Surface grid for ellipsoid.


5.4 Scattering from an ellipsoid
As has been discussed in the previous chapter, shaping the target is of great importance
in reducing RCS. The RCS for an ogive at nose-on incidence was evaluated in the previous
section. In this section the scattering from an ellipsoid is investigated and the resulting
RCS is compared to that of the ogive for the purpose of design implications. With this in
mind, the ellipsoid under investigation here is considered to have the same aspect ratio of
5:1 and the same length as the ogive. An O-O type grid was generated for this purpose
(figure 5.29). It consisted of 51 x 32 x 37 grid points. Since the grid was perfectly
axisymmetric, all E-field points along the axis of symmetry were treated exactly as has been
delineated in the previous section. A wideband Gaussian pulse, as defined in the previous
section, is nose-on incident to the ellipsoid. The frequency content of the Gaussian, in
this case, is limited to 3.5 GHz. The resulting RCS is plotted in figure 5.30 and compared
to the corresponding RCS for the ogive.

As is clearly seen in figure 5.30, an ogival section gives a considerably lower RCS for
nose-on incidence than an ellipsoidal section at most frequencies. Since the more important
radar frequencies are high frequencies, this has considerable influence on the shaping of
the nose section of aircraft, particularly those used for military purposes. The time
history of the scattering process from an ellipsoid is detailed in figure 5.31.

Figure 5.29. Planar slices for the ellipsoid.

Figure 5.30. Backscatter RCS comparison for the ellipsoid and ogive. [Plot: RCS (dBsm)
versus frequency (GHz), 0.5 to 3.5 GHz; curves: ellipsoid, ogive]


Up to this point in the thesis, all the geometries considered have been relatively
simple three dimensional shapes. In the next few sections, more complex geometries will be
studied for the scattering effects produced when they are subject to electromagnetic
excitation. The first of these is the NASA almond, which is taken up in the following
section.

5.5 Scattering from the NASA almond


One of the most important EMCC benchmark targets is the NASA almond. It is important
partly because of the nature of its geometry and partly because it remains one of the more
difficult bodies for which to obtain an accurate RCS. In cross-section it resembles a
circular-arc airfoil (figure 5.32). However, it has a finite thickness that varies rapidly
in the spanwise direction. This makes it rather difficult to grid using a standard grid
generator. The almond is 9.936 inches long and its geometrical specifications can be
defined as follows.

For $-0.416667\,d \le x \le 0$:

\[ y = 0.193333\,d \left[ 1.0 - \left( \frac{x}{0.416667\,d} \right)^2 \right]^{1/2} \cos\psi \]

\[ z = 0.064444\,d \left[ 1.0 - \left( \frac{x}{0.416667\,d} \right)^2 \right]^{1/2} \sin\psi \]

For $0 \le x \le 0.583338\,d$:

\[ y = 4.83345\,d \left\{ \left[ 1.0 - \left( \frac{x}{2.08335\,d} \right)^2 \right]^{1/2} - 0.96 \right\} \cos\psi \]

\[ z = 1.611148\,d \left\{ \left[ 1.0 - \left( \frac{x}{2.08335\,d} \right)^2 \right]^{1/2} - 0.96 \right\} \sin\psi \]

where d = 9.936 inches and $0^\circ \le \psi \le 360^\circ$.

Figure 5.31. Scattering of an ellipsoid with time.

Figure 5.32. Schematic of the NASA almond. [Schematic labels: Y axis; 0.53338 d, 0,
-0.41667 d; d = 9.936 in]
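The analytic surface definition above can be sampled with a few lines of Python. This is an illustrative sketch rather than the thesis code: the function name `almond_point`, the normalized coordinate `t = x / d`, and the clamping of small negative round-off values are assumptions. In the aft section the square root is taken before 0.96 is subtracted, which is the reading under which the two halves join continuously at x = 0 and the body closes at its tip.

```python
import math

D_INCHES = 9.936  # almond length scale d, in inches


def almond_point(t, psi, d=D_INCHES):
    """Return (x, y, z) on the almond surface for t = x / d and angle psi (radians)."""
    if -0.416667 <= t <= 0.0:
        # Fore section: elliptical cross-sections.
        s = math.sqrt(max(0.0, 1.0 - (t / 0.416667) ** 2))
        y = 0.193333 * d * s * math.cos(psi)
        z = 0.064444 * d * s * math.sin(psi)
    elif 0.0 < t <= 0.583338:
        # Aft section: cross-sections shrink to a point at the tip.
        s = max(0.0, math.sqrt(1.0 - (t / 2.08335) ** 2) - 0.96)
        y = 4.83345 * d * s * math.cos(psi)
        z = 1.611148 * d * s * math.sin(psi)
    else:
        raise ValueError("t = x / d must lie in [-0.416667, 0.583338]")
    return (t * d, y, z)


def cross_section(t, n_points, d=D_INCHES):
    """Sample one cross-section with n_points equally spaced values of psi."""
    return [almond_point(t, 2.0 * math.pi * k / n_points, d) for k in range(n_points)]
```

Equally spaced values of `psi` are used here only for brevity. A clustered distribution of `psi` could be substituted in `cross_section` without changing `almond_point`.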
Surface data from the above analytic expressions was obtained using a sinusoidal
clustering along the rotational direction. The clustering was necessary to avoid
high aspect ratio cells. Since an O-H type grid was desired, an axis of symmetry was
added to the almond that extended from the leading and trailing edges of the almond
to the outer boundaries at both ends (figure 5.34). Along each equiangular curve of
surface points a standard grid generator, HYPGEN [65], was used to create a planar
grid. All the planar grids were then combined to obtain a fully three-dimensional grid.
The disadvantage of using such a grid is that orthogonality in certain regions close to
the body surface is not optimal (figure 5.35). Two grids were employed for all the
cases run for the almond. One of the grids has 198 x 20 x 57 grid points and the
other consists of 198 x 45 x 57 grid points. One-dimensional domain decomposition
was employed in all cases along the longest dimension. The computational domain
was spread equally across 7 processors on the IBM SP-2.
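One-dimensional domain decomposition along the longest grid dimension can be sketched as follows. This is a hypothetical helper, not the thesis implementation: the name `decompose_1d`, the half-open index ranges, and the way a remainder is distributed are assumptions. The one-layer overlap follows the description of the parallel algorithm given in the conclusions.

```python
def decompose_1d(n_cells, n_procs, overlap=1):
    """Split n_cells along one index into n_procs contiguous blocks.

    Returns one tuple (start, stop, lo, hi) per processor, where
    [start, stop) is the block owned by the processor and [lo, hi) is the
    same block widened by `overlap` layers shared with its neighbours.
    """
    if n_procs < 1 or n_cells < n_procs:
        raise ValueError("need at least one cell per processor")
    base, extra = divmod(n_cells, n_procs)
    blocks = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` processors take one additional cell.
        size = base + (1 if rank < extra else 0)
        stop = start + size
        lo = max(0, start - overlap)
        hi = min(n_cells, stop + overlap)
        blocks.append((start, stop, lo, hi))
        start = stop
    return blocks
```

For example, `decompose_1d(200, 8)` gives every processor 25 cells, which matches an equal split of the 200-cell direction of the ogive grid over eight processors. When the count does not divide evenly, the block sizes differ by at most one.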
The far zone integration surface was defined six cells away from the surface of the
almond. Due to the nature of the grid used for this test case, odd-even decoupling was
observed. It was suppressed by adding fourth order artificial dissipation, as previously
discussed in Section 2.7. Scattering from the almond was investigated with

• a sinusoidal excitation of 3 GHz at nose-on incidence

• a wideband Gaussian pulse (equation 5.8) for all angles of incidence from 0
degrees to 180 degrees at intervals of 15 degrees (figure 5.36).
The monostatic RCS computed was compared to experimental results obtained
for Vertical/Vertical polarization by Woo et al. [66]. For the experimental measurements
the metallic almond target was made out of aluminium. The experimental results have
been plotted by sweeping through the rotational $\phi$ direction at 5 degree intervals. A
comparison of the FVTD and experimental results is shown in figure 5.37. Each
FVTD point in figure 5.37 represents a complete FVTD run for the angle of incidence
represented by $\phi$ on the abscissa. The FVTD computational results are in very
good agreement with the experimental observations and are mostly within 2 dB of the
measurements at all angles except 45 degrees and 135 degrees, where
the algorithm seems to underpredict and overpredict the measured RCS respectively.
The accuracy of the experimental data in [66] is not obvious, since error bars are not
given in the experimental report.

The RCS is plotted against frequency for various angles in figure 5.38. The variation
between two successive null depths with the angle of incidence is representative
of the minimal cross section visible in the plane of incidence. The steady state or
time harmonic scattered Ez field solution for nose-on incidence at 3 GHz is shown in
figure 5.39 for planes both parallel and perpendicular to the direction of polarization.
Figure 5.33. Outer grid for the NASA almond.

Figure 5.34. Surface grid for the NASA almond.

Figure 5.35. Grid depicting loss of orthogonality around the NASA almond.

Figure 5.36. Depiction of incidence for the NASA almond. [Schematic labels: PEC almond, phi
from 0 to 180 degrees, different directions of incidence]

Figure 5.37. RCS for the NASA almond - VV Polarization. [Plot: Monostatic RCS for the Almond
at 1.19 GHz; RCS (dBsm) versus angle phi, 0 to 180 degrees; curves: EXPERIMENTAL, FVTD]

Figure 5.38. RCS vs frequency for the NASA almond - VV Polarization.

Figure 5.39. Ez Scattered Fields for the NASA almond at 3GHz.


5.6 Scattering from a trapezoidal wing


One of the main contributors to aircraft RCS is the scattering from the leading
edge of wings [6]. The radar returns from wings are important for normal and
near-normal incidence. In order to gain some qualitative insight into the scattering from
wings, a perfectly conducting wing was chosen as the next target for investigation.
The wing used in this case is the trapezoidal Lockheed Wing-C [67] with camber
and twist. The purpose of this test case was three-fold: firstly, to demonstrate the
ability of the algorithm to tackle realistic three-dimensional configurations; secondly,
to evaluate the robustness of the dual grid based algorithm in working with different
types of grids, in particular a C-H based grid system with a split line extending from
the trailing edge of the wing to the outer boundary; and lastly, to get a qualitative
perspective on the mechanism of scattering at normal incidence for the wing.

The grid pertaining to a single zone of the wing is depicted in figure 5.40. Ghost
points pertaining to the magnetic field were introduced to solve for the electric field
points that lie on the split line between the trailing edge and the outer boundary. The
grid consisted of 109 x 42 x 146 grid points. It was generated using PACMAPS,
which is part of TEAM (Three-Dimensional Euler/Navier-Stokes Aerodynamic
Method) [68]. Since the grid is designed to simulate steady state aerodynamic problems,
it has some limitations that restrict its feasibility for time dependent electromagnetic
problems. The grid cells at the pinch that is introduced to close the grid
at the wing tip are extremely small, thereby making the time step extremely small.
This makes evaluation of RCS at realistic frequencies a very time consuming process
requiring hundreds of thousands of time steps. However, the grid can still be used to
obtain qualitative scattered field patterns.
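The link between the smallest grid cell and the number of time steps can be illustrated with a rough estimate. This is a sketch under stated assumptions, not the stability analysis of the thesis scheme: the time step is taken as a Courant-type bound `cfl * h_min / c`, where the `cfl` factor and any cell sizes passed in are hypothetical inputs, and the function names are invented for illustration.

```python
import math

C0 = 299792458.0  # speed of light in vacuum (m/s)


def max_time_step(h_min, cfl=1.0, c=C0):
    """Courant-type estimate of the largest stable time step for cell size h_min (m)."""
    return cfl * h_min / c


def steps_required(h_min, frequency, n_periods, cfl=1.0, c=C0):
    """Number of time steps needed to simulate n_periods at the given frequency (Hz)."""
    dt = max_time_step(h_min, cfl, c)
    return math.ceil(n_periods / (frequency * dt))
```

Because the step count scales inversely with `h_min`, halving the smallest cell roughly doubles the cost of a run even if every other cell in the grid is unchanged. A handful of very small cells at the wing-tip pinch therefore sets the cost for the whole domain.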
Domain decomposition is performed along the largest dimension. A wideband
Gaussian pulse (equation 5.8) is incident on the leading edge of the wing. The incident
field is linearly polarized along the z-direction and the direction of propagation is one
of forward incidence to the wing. Figure 5.41 shows the scattered field along various
cross-sections of a part of the wing. Electric field continuity is maintained along the
split line, thereby showing that the dual grid did not create any unnecessary problems
in treating grid configurations of this type. Figure 5.42 depicts the intensity of the
Hy component of the magnetic field along the various cross-sections of the dual
grid of a section of the wing. Both figures indicate that the primary
scattering effects take place normal to the planform of the wing, as the scattered field
patterns do not show a significant difference along the different planes in the spanwise
direction. Next, a sinusoidal excitation was used with exactly the same polarization
and incidence as the Gaussian excitation. The scattered Ez and Hy field patterns are
shown in figures 5.43 and 5.44.

5.7 Scattering from an engine inlet


Since the wing, fuselage and engine are important parts of any aircraft configuration,
the last test case pertains to an aircraft engine. In this case an engine inlet was
used as the scatterer. The center body and the engine cowl are treated as perfectly
conducting. The fan section was also treated as a closed PEC boundary. This seems
to be a reasonable approximation when using wavelengths that are large compared to
the inner annulus height. Figure 5.45 depicts a planar slice of the grid that was used
in this case. It consists of 242 x 40 x 65 grid points. Since one dimension is
considerably larger than the other two, one-dimensional domain decomposition
was utilized along the primary direction for parallelization purposes. The schematic
in figure 5.46 depicts the zonal distribution in this case along a planar slice. As is
evident, outside the cowl this grid is rather coarse. This grid has been used in the
solution of aeroacoustic and aerodynamic problems. However, there are certain
limitations when the grid is used to solve for electromagnetic fields. Low frequency
excitations would lead to the formation of creeping waves that would not be captured,
as the grid does not completely enclose the engine. High frequency excitation,
on the other hand, would be difficult to resolve as one moves away from the inlet and
the center body.

A sinusoidal excitation of 500 MHz was used in this case. Compared to the
diameter of the inlet cowl and the grid used, this excitation is considered to be high
frequency. The incident wave travels along the axial direction of the engine and is
linearly polarized along the transverse direction. Although the geometry is axisymmetric,
the scattering problem is not. Figure 5.47 is a Fore Looking Aft view of the scattering
patterns on equi-angular planes along the rotational direction at three different axial
locations of the grid that are marked out in Figure 5.47. As is illustrated in Figure
16, the intensity of the scattered fields is greater in planes that have a component of
the incident polarisation tangential to them.

Figure 5.48 shows the scattered Ez field pattern along a planar slice of the grid.
The scattered field pattern is clearly seen close to the center body and around the
inlet. Beyond that the grid is very coarse and the disturbance appears to be damped.
There appear to be strong interference effects from within the cavity between the
center body and the shroud, and currents seem to oscillate back and forth between
the center body and the fan.

Figure 5.40. Grid for the trapezoidal wing.

Figure 5.41. Scattered Ez fields for the trapezoidal wing.

Figure 5.42. Scattered H fields for the trapezoidal wing.

Figure 5.43. Scattered Ez fields for the trapezoidal wing.

Figure 5.44. Scattered Hy fields for the trapezoidal wing.
Figure 5.45. Grid for the engine inlet.

Figure 5.46. Zonal Distribution for the engine inlet.

Figure 5.47. Scattered Ez fields for different zones of the engine inlet.

Figure 5.48. Scattered Ez field for a planar slice of the engine inlet.
Chapter 6

CONCLUSIONS AND FUTURE RECOMMENDATIONS

6.1 Summary and Conclusions


A time domain algorithm has been presented to solve Maxwell's equations over
generic body conformal grids for three dimensional geometries. The algorithm employs
a finite volume approach and solves the integral form of the equations. The dual
nature of Maxwell's equations is exploited in that the algorithm uses a dual
mesh. One set of grid points is used to solve for the electric field components and the
dual grid points are used for obtaining the solution of the magnetic field components.
The staggered nature of the algorithm gives it better dispersion and dissipation
characteristics. With a second order staggered scheme, resolution normally provided by
fourth order non-staggered schemes can be obtained. In most cases, the dissipation
provided by the grid staggering precludes the need to explicitly add any artificial
dissipation.
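One simple way to examine the effect of staggering on dispersion is to compare modified wavenumbers of central differences in one dimension. The sketch below is an illustration on a uniform grid and is not the analysis carried out in the thesis; the function names are invented, and the argument `kh` is the product of wavenumber and grid spacing (2 pi divided by the number of points per wavelength).

```python
import math


def staggered_second_order(kh):
    """Modified k*h for a second order central difference over one staggered cell."""
    return 2.0 * math.sin(0.5 * kh)


def unstaggered_second_order(kh):
    """Modified k*h for the second order central difference over two cells."""
    return math.sin(kh)


def unstaggered_fourth_order(kh):
    """Modified k*h for the standard five-point fourth order central difference."""
    return (8.0 * math.sin(kh) - math.sin(2.0 * kh)) / 6.0


def relative_error(modified_kh, kh):
    """Relative error of the numerical wavenumber with respect to the exact one."""
    return abs(modified_kh / kh - 1.0)
```

For well resolved waves the leading error term is (kh)^2/24 for the staggered difference and (kh)^2/6 for the non-staggered one, so staggering alone reduces the second order error by a factor of about four. How close this comes to a fourth order non-staggered scheme depends on the number of points per wavelength, which the helpers above let a reader check for any value of `kh`.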
The algorithm is designed for scattering problems and utilizes the separate field
formalism, where the incident field is calculated analytically and the scattered field
is computed from the solution of Maxwell's equations. It uses a four stage
Runge-Kutta scheme for time integration. A time domain near-to-far zone transformation
is implemented to evaluate the response in the far field.
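A four stage Runge-Kutta update of the kind mentioned above can be sketched generically. The stage coefficients 1/4, 1/3, 1/2 and 1 used here are a common low-storage choice and are an assumption, since this section does not list the coefficients used in the thesis; `residual` stands for whatever spatial operator returns the time derivative of the unknowns.

```python
def runge_kutta_four_stage(u, dt, residual, alphas=(0.25, 1.0 / 3.0, 0.5, 1.0)):
    """Advance u by one time step dt for du/dt = residual(u).

    Every stage restarts from the solution at the beginning of the step, so
    only the starting values and the latest stage need to be stored.
    """
    u_start = u
    u_stage = u
    for alpha in alphas:
        u_stage = u_start + alpha * dt * residual(u_stage)
    return u_stage


def integrate(u, dt, n_steps, residual):
    """Apply the four stage update n_steps times."""
    for _ in range(n_steps):
        u = runge_kutta_four_stage(u, dt, residual)
    return u
```

For a linear residual this choice of coefficients reproduces the Taylor series of the exact solution through fourth order in `dt`. The same function works for scalars, complex numbers or array types that support addition and scalar multiplication.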
Parallel implementation of the algorithm is carried out in both the data parallel
and message passing paradigms. Although both versions of the computer code
performed significantly better than the serial version, the message passing paradigm
was found to be more attractive. The message passing approach provided more control
and flexibility in distributing data over the computational resource and in load
balancing. The far field transformation, too, was better suited to a message passing
environment. The dual nature of the grid aided in parallelization, since only one of
the fields had to be communicated between processors. There is, however, an overlap
of one layer of cells between domains that is a computational overhead, because the
fields are computed for the very same dual grid points in contiguous domains. The
savings in message passing more than offset this marginal increase in computational
work.
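The idea of communicating only one field and recomputing the other on a shared layer can be demonstrated on a one-dimensional staggered model problem. This is a toy sketch, not the thesis algorithm: it uses a simple leapfrog update instead of the Runge-Kutta scheme, plain Python lists instead of message passing calls, and invented function names. The two "messages" are the single H values copied between the subdomains in each step.

```python
def update_h(E, H, c):
    """Advance H, stored between neighbouring E points."""
    for i in range(len(H)):
        H[i] += c * (E[i + 1] - E[i])


def update_e_interior(E, H, c):
    """Advance interior E points; the two end points are handled by the caller."""
    for i in range(1, len(E) - 1):
        E[i] += c * (H[i] - H[i - 1])


def run_serial(E0, n_steps, c):
    """Reference solution on a single domain with fixed end values of E."""
    E, H = list(E0), [0.0] * (len(E0) - 1)
    for _ in range(n_steps):
        update_h(E, H, c)
        update_e_interior(E, H, c)
    return E


def run_two_domains(E0, n_steps, c, m):
    """Same problem split at E point m, which both subdomains store and update."""
    EA, EB = list(E0[:m + 1]), list(E0[m:])
    HA, HB = [0.0] * m, [0.0] * (len(E0) - 1 - m)
    for _ in range(n_steps):
        update_h(EA, HA, c)
        update_h(EB, HB, c)
        # Exchange: each side receives one H value from its neighbour.
        h_from_b, h_from_a = HB[0], HA[-1]
        update_e_interior(EA, HA, c)
        update_e_interior(EB, HB, c)
        # The shared E point is computed redundantly on both sides.
        EA[-1] += c * (h_from_b - HA[-1])
        EB[0] += c * (HB[0] - h_from_a)
    return EA + EB[1:]
```

Because both subdomains apply the same arithmetic to the shared point, the split run reproduces the single-domain result while E itself is never sent. The duplicated update of the shared point is the small computational overhead traded for the reduced communication.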
Unlike algorithms that use stair-stepped grids in computational electromagnetics,
the algorithm developed and discussed in this thesis has been shown to give good RCS
predictions for bodies with varying degrees of curvature, including bodies with
sharp and rounded edges. In certain cases, favourable results have also been obtained
for grids that show poor orthogonality at the surface of the scatterer.

Although the finite volume algorithm is found to be more expensive than conventional
FDTD algorithms that are leapfrogged in time and space and use Cartesian
meshes, the number of cells required to model moderately complex to complex bodies
is normally much smaller for the finite volume algorithm, thus making it more
attractive overall.
It should perhaps be pointed out that the purpose of developing the
FVTD algorithm was to complement the FDTD effort. Simple shapes can be very
efficiently modelled using FDTD, and the FVTD algorithm would be a very expensive
and impractical option in such cases. The Cartesian meshes required for FDTD modelling
are easier to generate than the curvilinear meshes needed for FVTD
modelling. Wave propagation properties are better simulated on regular Cartesian
grids than on orthogonal curvilinear meshes.

The four stage time integration makes the scheme seemingly very expensive
compared to other finite volume methods that are currently being used. Most of
these methods are upwind based, employ smaller time steps and need
to pack the grid very finely in order to get the required resolution and accuracy.

By parallelizing the algorithm in the message passing paradigm with MPL,
it has been demonstrated that the algorithm can effectively handle realistic
three-dimensional scattering targets using a zonal approach.

6.2 Limitations and Drawbacks


One of the most severe limitations of an explicit staggered algorithm such as
that detailed in this thesis is its dependence on grid induced dissipation. Whenever
the grid has high aspect ratio cells, high frequency errors may be generated.
The use of explicit artificial dissipation is usually needed to damp out these errors.
Therefore a judicious balance needs to be maintained between the inherent dissipation
and explicitly added artificial dissipation. Sometimes a trial and error method
needs to be adopted, and this often entails the cumbersome task of re-generating the
computational grid.

Most of the test cases that have been discussed in this thesis have been obtained
using the message passing algorithm that has been implemented using MPL. By
using MPL, the portability of the code has been lost, since MPL is IBM's proprietary
message passing language. However, the modifications needed to adapt
this code to other message passing schemes are not difficult. Secondly, by using
one-dimensional domain decomposition the computer code was made problem specific for
most of the geometries. This required the tedious task of re-writing parts of the
computer code when working with different geometries.

6.3 Future Work And Recommendations


Long time stability has been a matter of concern for most computer codes and
algorithms in computational electromagnetics. This is mostly due to the fact that the
surface currents grow exponentially due to instabilities caused by interior resonances
[64]. More effort needs to be spent in this area, although multi-stage and
multi-level schemes do seem to give better stability characteristics.

Improvement of the spatial accuracy to higher order needs to be investigated.
However, it should be pointed out that higher order accuracy would require larger
stencils that would create problems of book-keeping with staggered grids and make
parallelization less attractive due to the additional message passing involved.

It is strongly recommended that the algorithm be tested on a full scale
airplane configuration where the zonal gridding structure of the airplane is not too
complicated. However, such a simulation would be limited by the number of processors
available on today's message passing computers and may be at frequencies that
are not physically realistic.

As and when an adequate amount of memory becomes available, it would be
desirable to add an extra dimension to the array structures and run the scattering
problem at several angles of incidence simultaneously.

It would also be highly desirable to use the FVTD algorithm to solve the inverse
problem of the design of a configuration based on the radar signature, rather than to
solve the forward problem of the analysis of RCS after a body is designed.
REFERENCES

[1] Hwang, K. Advanced Computer Architecture with Parallel Programming. McGraw Hill, 1993.

[2] Shankar, V. A gigaflop performance algorithm for solving Maxwell's equations of
electromagnetics. AIAA Paper 91-1578. 29th AIAA Aerospace Sciences Meeting
and Exhibit, 1991.

[3] Umashankar, K. R., and Taflove, A. A novel method to analyse electromagnetic
scattering of complex objects. IEEE Transactions in Electromagnetic Compatibility,
EMC-24, pp. 397-405, 1982.

[4] Shankar, V., Hall, W. F., Mohammadian, A., Rowell, C., and Palaniswamy, S.
Advances in time-domain CEM using structured/unstructured formulations and
massively parallel architectures. AIAA Paper 95-1963. 26th AIAA Plasmadynamics
and Lasers Conference, 1995.

[5] Rayleigh, J. W. S. Theory of Sound Vol 1 and 2. Dover Publications Inc., 1945.

[6] Fuhs, A. E. The No-See-Um Book Radar Cross Section Lectures. American
Institute of Aeronautics and Astronautics, 1988.

[7] Hitzel, S. M. Aerodynamics and radar-signature - a theoretical approach to
estimate the radar-signature of complex aircraft configurations compatible with
aerodynamic panel methods. AIAA Paper 86-1770. 4th AIAA Applied Aerodynamics
Conference, 1986.

[8] Harrington, R. F. Field Computations by Moment Method. Macmillan, New
York, 1968.

[9] Moore, J., and Pizer, R. Moment Methods in Electromagnetics. Wiley, New
York, 1984.

[10] Goorjian, P. M. Algorithm development for Maxwell's equations for computational
electromagnetism. AIAA Paper 90-0251. 28th AIAA Aerospace Sciences
Meeting and Exhibit, 1990.

[11] Kunz, K. S., and Luebbers, R. J. The Finite-difference Time Domain Method
for Electromagnetics. CRC Press, 1993.

[12] Nguyen, B. T., and Hutchinson, S. A. The upwind leapfrog algorithm scheme for
3-d electromagnetic scattering and its implementation on two massively parallel
computers. 10th Annual Review of Progress in Computational Electromagnetics, 1994.

[13] Yee, K. S. Numerical solution of initial boundary value problems involving
Maxwell's equations in isotropic media. IEEE Transactions on Antennas and
Propagation, AP-14, 8, pp. 302-307, 1966.

[14] Ahuja, V., and Long, L. N. A FVTD algorithm for Maxwell's equations on massively
parallel machines. 11th Annual Review of Progress in Computational
Electromagnetics, 1995.

[15] Noack, R. W., and Anderson, D. A. Time domain solutions of Maxwell's equations
using a finite volume formulation. AIAA Paper 92-0451. 30th AIAA
Aerospace Sciences Meeting and Exhibit, 1992.

[16] Rowell, C., Shankar, V., Hall, W. F., and Mohammadian, A. Advances in
time-domain CEM using massively parallel architectures. 11th Annual Review
of Progress in Computational Electromagnetics, 1995.

[17] Shang, J. S. A characteristic-based algorithm for solving the 3-d time-domain
Maxwell equations. AIAA Paper 92-0452. 30th AIAA Aerospace Sciences Meeting
and Exhibit, 1992.

[18] Ambrosiano, J. J., Brandon, S. T., Lohner, R., and DeVore, C. R. Electromagnetics
via the Taylor-Galerkin finite element method on unstructured grids.
Journal of Computational Physics, 110, pp. 310-319, 1994.

[19] Madsen, N. K., and Ziolkowski, R. W. Numerical solution of Maxwell's equations
in the time domain using irregular nonorthogonal grids. Wave Motion, 10,
pp. 583-596, 1988.

[20] Lee, R. L., and Madsen, N. K. A mixed finite element formulation for Maxwell's
equations in the time domain. Journal of Computational Physics, 88, pp. 284-304,
1990.

[21] James, G. L. Geometric Theory of Diffraction for Electromagnetic Waves.
Peregrinus, London, third edition, 1986.

[22] Keller, J. B. Geometrical theory of diffraction. Journal of Optical Society of
America, Vol 52, No. 2, pp. 116-130, 1962.

[23] Kouyoumjian, R. G. Asymptotic high-frequency methods. Proc. IEEE, Vol 53,
pp. 864-876, 1965.

[24] Holland, R. Pitfalls of staircase meshing. IEEE Transactions in Electromagnetic
Compatibility, 38, 12, pp. 434-439, 1993.

[25] Vichnevetsky, R. Wave propagation analysis of difference schemes for hyperbolic
equations: A review. International Journal For Numerical Methods in Fluids, 7,
pp. 409-452, 1987.

[26] Yee, K. S., and Chen, J. S. Hybrid finite-difference time domain and finite volume
time domain in solving Maxwell's equations. 11th Annual Review of Progress in
Computational Electromagnetics, 1995.

[27] Riley, D. J., and Turner, C. D. Unstructured finite-volume modeling in
computational electromagnetics. 11th Annual Review of Progress in Computational
Electromagnetics, 1995.

[28] Shostko, A. Three dimensional parallel unstructured grid generation. AIAA
94-0418. 32nd Aerospace Sciences Meeting and Exhibit, 1994.

[29] Lohner, R., and Parikh, P. Three dimensional grid generation by the advancing
front method. Int Journal of Numerical Methods in Fluids, No 8, pp. 1135-1149,
1988.

[30] Frink, N. T., Parikh, P., and Pirzadeh, S. A fast upwind solver for the Euler
equations on three-dimensional unstructured meshes. AIAA 91-0102. 29th Aerospace
Sciences Meeting and Exhibit, 1991.

[31] Weinberg, Z., and Long, L. N. A massively parallel solution of the three
dimensional Navier-Stokes equations on unstructured adaptive grids. AIAA 94-0760.
32nd Aerospace Sciences Meeting and Exhibit, 1994.

[32] Barth, T. J., and Jespersen, D. C. The design and application of upwind schemes
on unstructured meshes. AIAA 89-0366. 27th Aerospace Sciences Meeting and
Exhibit, 1989.

[33] Halt, D. W., and Agarwal, R. K. Compact higher order characteristic-based Euler
solver for unstructured grids. AIAA Journal, Vol 30 No 8, pp. 1993-1999, 1992.

[34] Morano, E., and Mavriplis, D. Implementation of a parallel unstructured Euler
solver on the CM-5. AIAA 94-0755. 32nd Aerospace Sciences Meeting and Exhibit,
1994.

[35] Madsen, N. K., and Ziolkowski, R. W. A three-dimensional modified finite volume
technique for Maxwell's equations. Electromagnetics, 10, pp. 147-161,
1990.

[36] Freeman, T. L., and Phillips, C. Parallel Numerical Algorithms. Prentice Hall,
1992.

[37] Liu, Z. M., Mohan, A. S., Aubrey, T. A., and Belcher, W. R. Parallelized FDTD
for antenna radiation pattern calculations. 11th Annual Review of Progress in
Computational Electromagnetics, 1995.

[38] Shang, J. S., Calahan, D. A., and Vikstrom, B. Performance of a finite volume
CEM code on multicomputers. AIAA Paper 94-0236. 32nd AIAA Aerospace
Sciences Meeting and Exhibit, 1994.

[39] Ahuja, V., and Long, L. N. A message passing finite volume algorithm for
Maxwell's equations on parallel machines. AIAA Paper 95-1967. 26th AIAA
Plasmadynamics and Lasers Conference, 1995.

[40] Balanis, C. A. Advanced Engineering Electromagnetics. John Wiley and Sons,
1989.

[41] Kraus, J. D., and Carver, K. R. Electromagnetics. McGraw-Hill, 2nd edition, 1973.

[42] Ramo, S., Whinnery, J. R., and Duzer, T. V. Fields and Waves in Communication
Electronics. Wiley, 3rd edition, 1993.

[43] Maxwell, J. C. Treatise on Electricity and Magnetism. Dover, 3rd edition, 1954.

[44] Pierce, A. D. Linear acoustics. Encyclopedia of Applied Physics, Vol 1, pp. 137-182,
1991.

[45] Chyczewski, T. S., and Long, L. N. A higher order accurate parallel algorithm
for aeroacoustic applications. AIAA Paper 94-2265, 1994.

[46] Ozyoruk, Y., and Long, L. N. A Navier-Stokes/Kirchhoff method for noise
radiation from ducted fans. AIAA Paper 94-0462. 32nd AIAA Aerospace Sciences
Meeting and Exhibit, 1994.

[47] Kordulla, W., and Vinokur, M. Efficient computation of volume in flow
predictions. AIAA Journal, 21, No. 6, pp. 917-918, 1983.

[48] Rai, M. M., and Chakravarthy, S. Conservative high-order-accurate
finite-difference methods for curvilinear grids. AIAA Paper 93-3380. 11th AIAA
Computational Fluid Dynamics Conference, 1993.

[49] Vichnevetsky, R., and Bowles, J. B. Fourier Analysis of Numerical Approximations
of Hyperbolic Equations. SIAM, 1982.

[50] Jameson, A., Schmidt, W., and Turkel, E. Numerical solutions of the euler
equations by nite volume methods using runge-kutta time-stepping schemes.
AIAA Paper 81-1259. 14th Fluid and Plasma Dynamics Conference, 1981.

[51] Neumann, J. V., and Richtmyer, R. D. A method for the numerical calculations
of hydrodynamical shocks. Journal of Mathematical Physics, 21, pp. {, 1950.

[52] Lax, P. D., and Wendro , B. Systems of conservation laws. Communications in


Pure and Applied Math, 13, pp. 217{237, 1960.

[53] Agarwal, R. K., and Deese, J. E. Transonic wing-body calculations using euler
equations. AIAA 83-0501. 21st Aerospace Sciences Meeting and Exhibit, 1983.

[54] Mur, G. Absorbing boundary conditions for the nite-di erence approximation
of the time-domain electromagnetic- eld equations. IEEE Transactions on Elec-
tromagnetic Compatibility, EMC-23 No. 4 , pp. 377{382, 1981.

[55] Engquist, B., and Majda, A. Absorbing boundary conditions for the numerical
simulation of waves. Mathematics of Computation, 31 No. 139 , pp. 629{651,
1977.

[56] Liao, Z. P., Wong, H. L., Yang, B., and Yuan, Y. A transmitting boundary for
transient wave analyses. Scientia Sinica, 27 No. 10 , pp. 1063{1076, 1984.

[57] King, R. W. P., and Jr., C. W. H. Antennas and Waves: A Modern Approach.
The MIT Press, 1969.

[58] Lighthill, M. J. On sound generated aerodynamically. I. General theory.
Proceedings of the Royal Society, 221A, pp. 564-587, 1952.

[59] Williams, F., and Hawkings, D. L. Sound generated by turbulence and surfaces
in arbitrary motion. Philosophical Transactions of the Royal Society, 264A,
pp. 321-342, 1969.

[60] Lyrintzis, A. S. The use of Kirchhoff's method in computational aeroacoustics.
FED-Vol 147, Computational Aero- and Hydro-Acoustics. ASME, 1993.

[61] Luebbers, R. J., Kunz, K. S., Schneider, M., and Hunsberger, F. A
finite-difference time-domain near zone to far zone transformation. IEEE
Transactions on Antennas and Propagation, 39, No. 4, pp. 429-433, 1991.

[62] CM Fortran Libraries Reference Manual - Version 2.1. Thinking Machines Cor-
poration, 1994.

[63] Wesley, R., Wu, E., and Calahan, D. A. A massively parallel Navier-Stokes
implementation. AIAA Paper 89-1940, 1989.

[64] Jones, D. S. Methods in Electromagnetic Wave Propagation. Oxford Science
Publications, 2nd edition, 1994.

[65] Chan, W. M. User Guide for Hypgen - Version 1.3. 1993.

[66] Woo, A. C., Wang, H. T. G., and Schuh, M. J. Benchmark radar targets for
the validation of computational electromagnetics programs. IEEE Antennas and
Propagation Magazine, 35, No. 1, pp. 84-89, 1993.

[67] Long, L. N., Khan, M. M. S., and Sharp, H. T. Massively parallel
three-dimensional Euler/Navier-Stokes method. AIAA Journal, 29, No. 5,
pp. 657-666, 1991.

[68] Raj, P., Sikora, J. S., and Olling, C. R. Three-dimensional Euler/Navier-Stokes
Aerodynamic Method. Technical Report AFWAL-TR-87-3074, 1989.

[69] Wang, H. T. G., Sanders, M. L., and Woo, A. Radar Cross Section Measurement
Data on Low-Cross-Section Targets Part-1. Report NWC TM 7002, 1991.

[70] Anderson, D. A., Tannehill, J. C., and Pletcher, R. H. Computational Fluid
Dynamics and Heat Transfer. Hemisphere Publishing Corporation, 1989.

[71] Hirsch, C. Numerical Computation of Internal and External Flows Volume 1 and
2. John Wiley and Sons, 1988.

[72] Mott, H. Polarization in Antennas and Radar. John Wiley and Sons, 1986.

[73] Freund, J. B., Lele, S. K., and Moin, P. Calculation of the radiated sound field
using an open Kirchhoff surface. CEAS/AIAA 95-061. First Joint CEAS/AIAA
Aeroacoustics Conference, 1995.

[74] Towne, D. H. Wave Phenomena. Dover Publications Inc., 1988.

[75] Koelbel, C. H., Loveman, D. B., Schreiber, R. S., Steele, G. L., Jr., and
Zosel, M. E. The High Performance Fortran Handbook. The MIT Press, 1994.

[76] IBM AIX Parallel Environment Programming Primer - Version 2.0. IBM Cor-
poration, 1993.

[77] IBM AIX Parallel Environment Parallel Programming Reference - Version 1.0.
IBM Corporation, 1994.

[78] Taflove, A. Re-inventing electromagnetics: Supercomputing solution of Maxwell's
equations via direct time integration on space grids. AIAA Paper 92-0333. 30th
AIAA Aerospace Sciences Meeting and Exhibit, 1992.

[79] Bhattacharyya, A. K., and Sengupta, D. L. Radar Cross Section Analysis and
Control. Artech House, 1991.

[80] Taflove, A., and Umashankar, K. R. The finite-difference time-domain (FD-TD)
method for electromagnetic scattering and interaction problems. Journal of
Electromagnetic Waves and Applications, 1, pp. 243-267, 1987.

[81] Chew, W. C. Waves and Fields in Inhomogeneous Media. Van Nostrand Rein-
hold, 1990.

[82] Senior, T. B. A., and Volakis, J. L. Approximate Boundary Conditions in
Electromagnetics. The Institution of Electrical Engineers, London, 1995.

[83] Ozyoruk, Y., and Long, L. N. A Navier-Stokes/Kirchhoff method for noise
radiation from ducted fans. AIAA 94-0462. 32nd Aerospace Sciences Meeting and
Exhibit, 1994.
Appendix A

Different forms of Maxwell's equations

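Maxwell's equations can be written in a fully conservative vector form in the
physical x, y, z coordinate system as

$$\frac{\partial Q}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} + \frac{\partial G}{\partial z} = S \tag{A.1}$$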
where the solution vector Q can be written as

$$Q = \begin{bmatrix} H_x \\ H_y \\ H_z \\ E_x \\ E_y \\ E_z \end{bmatrix} \tag{A.2}$$

and the flux vectors E, F and G can be defined as

$$E = \begin{bmatrix} 0 \\ -\frac{1}{\mu}E_z \\ \frac{1}{\mu}E_y \\ 0 \\ \frac{1}{\epsilon}H_z \\ -\frac{1}{\epsilon}H_y \end{bmatrix}, \quad
F = \begin{bmatrix} \frac{1}{\mu}E_z \\ 0 \\ -\frac{1}{\mu}E_x \\ -\frac{1}{\epsilon}H_z \\ 0 \\ \frac{1}{\epsilon}H_x \end{bmatrix}, \quad
G = \begin{bmatrix} -\frac{1}{\mu}E_y \\ \frac{1}{\mu}E_x \\ 0 \\ \frac{1}{\epsilon}H_y \\ -\frac{1}{\epsilon}H_x \\ 0 \end{bmatrix} \tag{A.3}$$

while the source vector on the right hand side can be written as

$$S = \begin{bmatrix} 0 \\ 0 \\ 0 \\ -\frac{1}{\epsilon}J_x \\ -\frac{1}{\epsilon}J_y \\ -\frac{1}{\epsilon}J_z \end{bmatrix} \tag{A.4}$$

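The system of governing equations can be transformed to generalised body-fitted
curvilinear coordinates and written as

$$\frac{\partial \tilde{Q}}{\partial t} + \frac{\partial \tilde{E}}{\partial \xi} + \frac{\partial \tilde{F}}{\partial \eta} + \frac{\partial \tilde{G}}{\partial \zeta} = \tilde{S} \tag{A.5}$$

where

$$\tilde{Q} = \frac{Q}{J} \tag{A.6}$$

$$\tilde{E} = \frac{1}{J}\left(\xi_x E + \xi_y F + \xi_z G\right) \tag{A.7}$$

$$\tilde{F} = \frac{1}{J}\left(\eta_x E + \eta_y F + \eta_z G\right) \tag{A.8}$$

$$\tilde{G} = \frac{1}{J}\left(\zeta_x E + \zeta_y F + \zeta_z G\right) \tag{A.9}$$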

and $J = \frac{\partial(x, y, z)}{\partial(\xi, \eta, \zeta)}$ is the determinant of the
Jacobian matrix of the geometric transformation from the physical to the
computational space.
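
The combination in (A.7)-(A.9) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and do not come from the thesis code, the metric derivatives and J are taken as inputs in whatever convention the grid generator supplies, and the placement of $1/\mu$ and $1/\epsilon$ assumes the first three rows are Faraday's law and the last three Ampère's law.

```python
# Illustrative sketch; all names here are hypothetical, not from the thesis code.

def cartesian_fluxes(q, eps, mu):
    """Flux vectors E, F, G of (A.3) for q = [Hx, Hy, Hz, Ex, Ey, Ez]."""
    hx, hy, hz, ex, ey, ez = q
    e_flux = [0.0, -ez / mu, ey / mu, 0.0, hz / eps, -hy / eps]
    f_flux = [ez / mu, 0.0, -ex / mu, -hz / eps, 0.0, hx / eps]
    g_flux = [-ey / mu, ex / mu, 0.0, hy / eps, -hx / eps, 0.0]
    return e_flux, f_flux, g_flux


def transformed_flux(q, metrics, jac, eps=1.0, mu=1.0):
    """Combine E, F, G as in (A.7)-(A.9).

    metrics = (k_x, k_y, k_z) are the metric derivatives of one curvilinear
    coordinate k (xi, eta or zeta); jac is the Jacobian J at the same point.
    """
    kx, ky, kz = metrics
    e_flux, f_flux, g_flux = cartesian_fluxes(q, eps, mu)
    return [(kx * e + ky * f + kz * g) / jac
            for e, f, g in zip(e_flux, f_flux, g_flux)]


# On an identity mapping (xi = x, J = 1) the transformed flux reduces to E.
q = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
e_tilde = transformed_flux(q, metrics=(1.0, 0.0, 0.0), jac=1.0)
```

Passing the metrics of $\eta$ or $\zeta$ instead gives the other two transformed fluxes, so one routine would serve all three coordinate directions.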
For an arbitrary direction of wave propagation n the system of Maxwell's
equations can be written as

$$\frac{\partial Q}{\partial t} + \frac{\partial F}{\partial n} = S \tag{A.10}$$

where Q is defined as earlier and F is now defined as

$$Q = \begin{bmatrix} H_x \\ H_y \\ H_z \\ E_x \\ E_y \\ E_z \end{bmatrix}, \quad
F = \begin{bmatrix}
\frac{1}{\mu}(E_z n_y - E_y n_z) \\
\frac{1}{\mu}(E_x n_z - E_z n_x) \\
\frac{1}{\mu}(E_y n_x - E_x n_y) \\
\frac{1}{\epsilon}(H_y n_z - H_z n_y) \\
\frac{1}{\epsilon}(H_z n_x - H_x n_z) \\
\frac{1}{\epsilon}(H_x n_y - H_y n_x)
\end{bmatrix} \tag{A.11}$$
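
The directional flux in (A.11) is a pair of cross products: the first three rows are $(n \times E)/\mu$ and the last three are $-(n \times H)/\epsilon$. A minimal sketch follows, with hypothetical function names and unit material constants chosen only for illustration.

```python
# Illustrative sketch; the names are hypothetical, not from the thesis code.

def cross(a, b):
    """Cross product of two 3-vectors given as sequences."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def directional_flux(q, n, eps=1.0, mu=1.0):
    """Flux F of (A.11) for q = [Hx, Hy, Hz, Ex, Ey, Ez] and direction n."""
    h, e = q[:3], q[3:]
    upper = [c / mu for c in cross(n, e)]    # (n x E) / mu
    lower = [-c / eps for c in cross(n, h)]  # -(n x H) / eps
    return upper + lower


# Choosing n along x or z should recover the Cartesian fluxes E and G of (A.3).
q = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
flux_x = directional_flux(q, (1.0, 0.0, 0.0))
flux_z = directional_flux(q, (0.0, 0.0, 1.0))
```

Setting n to a coordinate direction is a quick consistency check between (A.3) and (A.11).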
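The Jacobian matrix of F with respect to the independent variables in Q can
be written as

$$A = \begin{bmatrix}
0 & 0 & 0 & 0 & -\frac{n_z}{\mu} & \frac{n_y}{\mu} \\
0 & 0 & 0 & \frac{n_z}{\mu} & 0 & -\frac{n_x}{\mu} \\
0 & 0 & 0 & -\frac{n_y}{\mu} & \frac{n_x}{\mu} & 0 \\
0 & \frac{n_z}{\epsilon} & -\frac{n_y}{\epsilon} & 0 & 0 & 0 \\
-\frac{n_z}{\epsilon} & 0 & \frac{n_x}{\epsilon} & 0 & 0 & 0 \\
\frac{n_y}{\epsilon} & -\frac{n_x}{\epsilon} & 0 & 0 & 0 & 0
\end{bmatrix} \tag{A.12}$$

The eigenvalues of the system are 0, 0, c, c, -c, -c.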
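
The eigenvalue statement can be checked without an eigensolver. The sketch below assumes a unit vector n and $c = 1/\sqrt{\mu\epsilon}$; the function names and the sample values of $\epsilon$, $\mu$ and n are hypothetical. If $A^3 = c^2 A$, every eigenvalue satisfies $\lambda^3 = c^2\lambda$ and so lies in {0, c, -c}; the trace of $A^2$ divided by $c^2$ then counts the non-zero eigenvalues, and a zero trace of A splits them evenly between c and -c.

```python
# Illustrative sketch; names and sample values are assumptions.

def flux_jacobian(n, eps, mu):
    """Matrix A of (A.12) for a direction n = (nx, ny, nz)."""
    nx, ny, nz = n
    return [
        [0.0, 0.0, 0.0, 0.0, -nz / mu, ny / mu],
        [0.0, 0.0, 0.0, nz / mu, 0.0, -nx / mu],
        [0.0, 0.0, 0.0, -ny / mu, nx / mu, 0.0],
        [0.0, nz / eps, -ny / eps, 0.0, 0.0, 0.0],
        [-nz / eps, 0.0, nx / eps, 0.0, 0.0, 0.0],
        [ny / eps, -nx / eps, 0.0, 0.0, 0.0, 0.0],
    ]


def matmul(a, b):
    """Plain-Python product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]


def eigenvalue_checks(n, eps, mu):
    """Return (max |A^3 - c^2 A|, trace(A), trace(A^2) / c^2)."""
    a = flux_jacobian(n, eps, mu)
    c2 = 1.0 / (eps * mu)
    a2 = matmul(a, a)
    a3 = matmul(a2, a)
    residual = max(abs(a3[i][j] - c2 * a[i][j])
                   for i in range(6) for j in range(6))
    trace_a = sum(a[i][i] for i in range(6))
    nonzero_count = sum(a2[i][i] for i in range(6)) / c2
    return residual, trace_a, nonzero_count


# Unit direction (0.6, 0, 0.8) with eps * mu = 0.25, so c = 2.
residual, trace_a, nonzero_count = eigenvalue_checks((0.6, 0.0, 0.8), 2.0, 0.125)
```

A residual near zero together with a count of four non-zero eigenvalues is consistent with the spectrum 0, 0, c, c, -c, -c quoted above.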


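The bicharacteristic form of Maxwell's equations can be written as:

$$\left[\frac{\partial}{\partial t} \pm c\frac{\partial}{\partial x}\right]\left(D_y \pm \frac{H_z}{c}\right) - \frac{\partial H_x}{\partial z} \mp c\frac{\partial D_x}{\partial y} = -J_y \tag{A.13}$$

$$\left[\frac{\partial}{\partial t} \pm c\frac{\partial}{\partial x}\right]\left(D_z \mp \frac{H_y}{c}\right) + \frac{\partial H_x}{\partial y} \mp c\frac{\partial D_x}{\partial z} = -J_z \tag{A.14}$$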
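
As a check on (A.13), the upper-sign equation can be recovered directly from the curl equations. The steps below are a sketch that assumes a homogeneous medium with $D = \epsilon E$, $B = \mu H$ and $c = 1/\sqrt{\mu\epsilon}$; these relations are assumptions made for the check and are not stated in this appendix.

```latex
\begin{align*}
% y-component of Ampere's law
\frac{\partial D_y}{\partial t}
  &= \frac{\partial H_x}{\partial z} - \frac{\partial H_z}{\partial x} - J_y \\
% z-component of Faraday's law, using E = D/\epsilon and B = \mu H
\frac{\partial H_z}{\partial t}
  &= -c^2\left(\frac{\partial D_y}{\partial x} - \frac{\partial D_x}{\partial y}\right) \\
% first equation plus 1/c times the second, terms collected on the left
\left[\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right]
  \left(D_y + \frac{H_z}{c}\right)
  - \frac{\partial H_x}{\partial z} - c\frac{\partial D_x}{\partial y}
  &= -J_y
\end{align*}
```

Subtracting instead of adding gives the lower-sign equation, and the same steps applied to the z-component of Ampère's law and the y-component of Faraday's law lead to (A.14).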
