1997 (6)
2, MARCH 1997
Abstract - This paper presents the application of non-linear neural optimization networks to the analysis of arbitrarily shaped microstrip patch antennas with a very general bianisotropic grounded slab. A neural network approach is presented to solve, in a new and efficient way, the integral equation which describes, from the electromagnetic point of view, the planar integrated structure.

Manuscript received March 18, 1996.

I. INTRODUCTION

Microstrip antennas have been extensively investigated experimentally, analytically, and numerically for decades. Many numerical methods have been serving engineers and researchers in the analysis and design of these conformal antennas for many years. Among them, the moment method in conjunction with various Integral Equation (IE) formulations has played a major role [1]-[3]. IE formulations for the analysis of microstrip structures are the most elaborate techniques available and provide "almost rigorous" solutions. They are also the most computer intensive, requiring considerable computation time and computer memory. They can be used as "standard references", providing data against which results provided by approximations may be validated. Nevertheless, with the increasing availability of computing power, they have become quite popular for analyzing microstrip antennas and microwave and millimetre-wave circuits.

However, IE methods are associated with field representations in which the appropriate Green's function for the specific geometry must be employed, and this, of course, limits their versatility. Moreover, IE techniques are usually formulated on the assumption of an infinite substrate, a model that obviously deviates from the practical configuration, leading to inaccuracies for larger bandwidth antennas. Furthermore, in the context of IE methods, antenna excitations are represented using simplified models that differ more or less from the actual configurations. Also, we have to mention the additional IE complexities due to possible substrate anisotropies or inhomogeneities in the antenna substrates. To overcome all these limitations, in this paper we present a new approach to the solution of the IE which makes use of neural networks.

A crucial point of the method is the accurate determination of the dyadic Green's functions associated with the integrated structure, sustained by an arbitrarily oriented electric point-source embedded in the grounded substrate. Expressions for the Green's functions for a biisotropic chiral medium are set up in a systematic way in [4], well suited, besides, for the extension to very general bianisotropic grounded slabs [5]-[12].

The complexity of the electromagnetic problem asks for sophisticated design optimization and sensitivity analysis tools, which can be achieved employing neural networks. Neural networks have recently been found [13]-[14] to be a very useful tool in the solution of different classes of mathematical optimization problems with general, even non-linear, constraints. The developed neural networks have proved to be very fast and robust in application to the solution of many technical problems that can be expressed as an optimization task.

The primary advantage of using a neural network, in fact, is that it is potentially stable. This is due to the parallel distributed processing structure of neural networks as well as the high degree of interconnectivity and feedback. Neural networks also offer the characteristics of generalization and error tolerance.

II. FORMULATION OF THE PROBLEM

Consider the three-dimensional structure illustrated in Fig. 1, where a microstrip patch antenna or array is residing on or is embedded in a substrate recessed in a ground plane. We are interested in the characterization of the scattering and radiation properties of this antenna fed by an electric point-source.

Fig. 1. Geometry of a microstrip patch antenna (labels: y = d, ground plane).

An IE description for a microstrip circuit or an antenna is obtained by specifying the boundary conditions that must be satisfied by the electromagnetic field. The conditions apply to the total fields, formed by an excitation (the impressed field) and a scattered part (the induced field). The scattered field is produced by a set of unknown induced sources (currents and charges on the conductors). Within any linear system, the scattered field can be expressed as a superposition integral of the unknown sources, weighted by Green's functions G(r|r'), that are the fields produced by elementary point-sources:

$$\mathbf{E}(\mathbf{r}) = \int_{V'} \bar{\bar{G}}(\mathbf{r}|\mathbf{r}') \cdot \mathbf{J}(\mathbf{r}')\, dv' \qquad (1)$$

This expression represents the electric field at the point r, the integration being carried out over the variable r'. The
Authorized licensed use limited to: Meerut Institute of Engineering and Technology. Downloaded on February 23, 2009 at 01:14 from IEEE Xplore. Restrictions apply.
integral is evaluated over the volume V' containing the sources J(r'). Introducing the integral formulation for the scattered field into the boundary conditions yields a set of coupled integral equations for the unknown induced sources. One or several boundary conditions can be included in the Green's function, thus reducing the number of integral equations at the cost of greater complexity of the Green's functions. When a modified Green's function satisfies some boundary conditions, any field obtained as a superposition integral with this Green's function satisfies the same boundary conditions.

The surface currents flowing on the conductors and the corresponding surface charge distribution are not yet known and must be determined by imposing the boundary conditions for the tangential electric field on the surface S' of the upper conductors:

$$\hat{\mathbf{y}} \times [\mathbf{E}_i(\mathbf{r}) + \mathbf{E}_s(\mathbf{r})] = Z_s\, \hat{\mathbf{y}} \times \mathbf{J}_s(\mathbf{r}) \qquad (2)$$

where $\mathbf{E}_i$ is the incident electric field, $\mathbf{J}_s$ is the surface current density on S' induced by $\mathbf{E}_i$, $\mathbf{E}_s$ is the scattered field sustained by $\mathbf{J}_s$, and $Z_s$ is the surface impedance of the metallic patch on S', taking into account the finite conductivity of metallic sheets:

$$Z_s = (1 + j)\sqrt{\frac{\omega\mu}{2\sigma}} \qquad (3)$$

Here $\omega$ is the operating angular frequency, $\mu$ is the metal permeability and $\sigma$ is the effective conductivity of the metal, including the possible effect of roughness. The surface impedance $Z_s$ vanishes on a perfect electric conductor.

The electromagnetic quantities can be expressed in the two-dimensional Fourier domain, and subsequent developments can be carried out in the transformed domain. The analysis and the solution of the electromagnetic field problem become simpler if the Fourier-transformed-domain, or spectral-domain, approach is used. This is mainly due to the fact that Green's function convolution integrals in the spectral domain are turned into algebraic products. In this manner, also, one avoids the (previously) difficult integration of badly behaved functions required to obtain the Green's function in the spatial domain. This method has been generalized by the authors to study the electromagnetic behavior of microstrip antennas with planar symmetry realized on a very general bianisotropic grounded slab [4]-[12].

From the electromagnetic point of view, a very general bianisotropic medium can be characterized by the following constitutive relations:

$$\mathbf{D} = \bar{\bar{\varepsilon}} \cdot \mathbf{E} + \bar{\bar{\xi}} \cdot \mathbf{H}, \qquad \mathbf{B} = \bar{\bar{\zeta}} \cdot \mathbf{E} + \bar{\bar{\mu}} \cdot \mathbf{H} \qquad (4)$$

G(r|r') satisfies the differential equation:

$$L(\mathbf{E}) = \bar{\bar{P}} \cdot \mathbf{E} = j\omega \mathbf{J} \qquad (5)$$

with:

[...] (6)

For constructing the Green's function we take the Fourier transform of (6). Thus:

[...] (7)

being $\mathbf{R} = \mathbf{r} - \mathbf{r}'$ and $\mathbf{k} = k_x \hat{\mathbf{x}} + k_y \hat{\mathbf{y}} + k_z \hat{\mathbf{z}}$.

Substitution of (7) in (1) gives an integral representation for the spatial electric field sustained by the integrated planar structure.

The spatial electric Green's function for integrated structures with planar symmetry can be recovered in terms of the Sommerfeld integrals of the form:
Define the linearly independent sets $\{\phi_k\}$ and $\{w_k\}$, where $\phi_k$ and $w_k$ are called expansion functions and weighting functions, respectively. Define a sequence of approximants to J by:

$$J_n(\mathbf{r}') = \sum_{k=1}^{n} I_k\, \phi_k(\mathbf{r}') \qquad (10)$$

A matrix equation is formed in (9) by the condition that, upon replacement of J by $J_n$, the left side shall be orthogonal to the sequence $\{w_k\}$. We have:

$$\langle L(J_n) - E,\; w_m \rangle = 0, \qquad m = 1, 2, \ldots, n \qquad (11)$$

Substitution of (10) into (11) and use of the linearity property of the inner product gives the matrix equation of the moment method:

[...]

$\mathbf{v} = (E(\mathbf{r}_1), \ldots, E(\mathbf{r}_N))$.

The matrix equation (15) is an example of a functional mapping: $y = \Phi(x)$, with x and y N- and M-dimensional real vectors, respectively.

The solution of the matrix equation leads to the determination of the field of the antenna, and, hence, other antenna parameters can be calculated.

III. NEURAL NETWORKS

In numerical solutions of Maxwell's equations, integral formulations are attractive because they enable researchers to address problems without discretizing air regions. However, IE formulations inherently lead to dense systems of equations and, therefore, they are often thought to be computationally expensive.

The main difficulties with integral formulations for microstrip antennas are the time needed to evaluate the
elements of the matrix equation (15), the amount of memory needed to store the system matrix, and the solution time of the dense system of linear equations. In practice, the storage requirements may easily exceed the computer's memory, and the solution time may also be unacceptable. To make integral methods more appealing, effort must be devoted to solving large, dense, linear systems efficiently. In addition, since parallel computing is an obvious way to increase the computer's memory capacity and computation speed, it is important to develop efficient linear-equation solvers for both sequential and parallel computers.

In this section we discuss how to find and implement efficient solvers by making use of neural networks. Although these networks were initially intended to perform cognitive tasks that have no precise mathematical descriptions, the networks were subsequently found useful in computational problems and as function approximators. In recent times the term neural network is used for denoting any massively parallel computing architecture that consists of a large number of simple "neural" processors. We solved the real-valued IE, obtained by considering the real and imaginary parts separately, using (Fig. 2) the backpropagation network architecture together with a suitable generalization of the delta rule [15].

[...] output [15]. The training process for our problem can be viewed as having four phases:

1. PRESENTATION PHASE: Present an input training vector (i.e. the vector $J_n$) and calculate each successive layer's output until the last layer's output is found.

2. CHECK PHASE: Calculate the network error vector and the sum squared error for the input vector:

[...]

where $\delta_{pk} = y_{pk} - o_{pk}$. The subscript p refers to the p-th exemplar of the network, $o_{pk}$ is the output of the k-th output-layer unit for the p-th exemplar, $y_{pk}$ are the elements of the vector V (i.e. the known tangential component of the incident electric field), and there are M output-layer units. Stop if the sum squared error for all training vectors is less than the error goal or if the specified maximum number of iterations has been reached. Otherwise continue.

3. BACKPROPAGATION PHASE: Calculate the delta vector for the output and the hidden layers using the target vector:
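For reference, a common form of these delta vectors under the generalized delta rule is sketched below. It assumes differentiable activation functions $f^{o}_{k}$ and $f^{h}_{j}$ with net inputs $\mathrm{net}^{o}_{pk}$ and $\mathrm{net}^{h}_{pj}$; these symbols are introduced here for illustration only and are chosen to agree with the layer superscripts and the indices used in the weight-update equations (17) and (18).

```latex
\delta^{o}_{pk} = (y_{pk} - o_{pk})\, f^{o\,\prime}_{k}\!\left(\mathrm{net}^{o}_{pk}\right),
\qquad k = 1, \ldots, M

\delta^{h}_{pj} = f^{h\,\prime}_{j}\!\left(\mathrm{net}^{h}_{pj}\right)
\sum_{k=1}^{M} \delta^{o}_{pk}\, w^{o}_{kj}
```

Under this assumption, a linear output layer has unit derivative, so the output delta reduces to the error $y_{pk} - o_{pk}$ itself, while each hidden delta is the back-propagated, weighted sum of the output deltas scaled by the local slope of the hidden activation.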
The weight-update equations on both layers take on the following forms:

on the output layer:
$$w^{o}_{kj}(t+1) = w^{o}_{kj}(t) + \eta\, \delta^{o}_{pk}\, i_{pj} \qquad (17)$$

on the hidden layer:
$$w^{h}_{ji}(t+1) = w^{h}_{ji}(t) + \eta\, \delta^{h}_{pj}\, x_{pi} \qquad (18)$$

Then, return to phase 1.

The architecture of a backpropagation network is not completely constrained by the problem to be solved. The number of network inputs is constrained by the problem. The number of neurons in the output layer is constrained by the number of outputs required by the problem. The number of layers between the network inputs and the output layer, and the sizes of those layers, are up to the designers. The two-layer sigmoid/linear network has been proven to be able to represent any functional relationship between inputs and outputs, if the hidden layer has enough neurons. Backpropagation follows gradient descent on the error surface to minimize network error. Local minima may trap the network. The more neurons in intermediate layers, the more freedom a network has (i.e. more variables to optimize). The additional neurons increase the chance that even a local minimum will yield a low error. The error surface of a backpropagation neural network may be very complicated. To illustrate this complexity a typical error surface is shown in Fig. 3. The problem is that nonlinear transfer functions in multilayered networks introduce many local minima in the error surface. As gradient descent is performed on the error surface, it is possible for the network solution to become trapped in one of these local minima.

To decrease the training time and the probability that the network will get stuck in a shallow minimum in the error surface, we added a term, called momentum, to the weight-update equations. This term has a significant effect on the learning speed, in terms of the number of iterations required. The idea behind momentum in a neural network is straightforward. After the code adjusts the weights during one training iteration, it saves the value of that adjustment; when calculating the adjustment for the next iteration, the code adds a fraction of the previous change to the new one. In terms of an equation (in this case, for the hidden layer weights):

$$w^{h}_{ji}(t+1) = w^{h}_{ji}(t) + \eta\, \delta^{h}_{pj}\, x_{pi} + \alpha\, \Delta w^{h}_{ji}(t) \qquad (19)$$

where $\alpha$ is called the momentum term, typically a positive number less than one, and $\Delta w^{h}_{ji}(t)$ is the weight change applied at the previous iteration.

Training time can also be decreased by the use of an adaptive learning rate, which attempts to keep the learning rate as large as possible while keeping learning stable. Adding an adaptive learning rate can decrease training time further. This procedure increases the learning rate, but only to the extent that the network can learn without large error increases. Thus, a near-optimal learning rate is obtained. When a larger learning rate could result in stable learning, the learning rate is increased. When the learning rate is too high to guarantee a decrease in error, it is decreased until stable learning resumes.

Networks are also sensitive to the number of neurons in their hidden layers. Too few neurons can lead to underfitting, as shown in Fig. 4. Too many neurons can contribute to overfitting, in which all training points are well fit, but the fitting curve takes wild oscillations between these points.

All of these details are wrapped up in our Mathematica code. The advantage of using backpropagation networks is shown in Table 1. The integral equation (9) is solved on a 90 MHz Pentium PC. This is done using the numerical solver with or without the backpropagation network architecture implemented. The numerical code with the backpropagation network architecture implemented is an order of magnitude faster than the other one.

Number of equations        25      50      100     250
Without backpropagation    5.1     30.3    93.5    375
With backpropagation       0.6     3.8     10.7    42.2

Table 1. Solution time.

This is mainly due to the fact that with the backpropagation network the unknown coefficients $I_k$ in (10) are determined by calculating the gradient of a proper error surface, i.e. by computing derivatives and not integrals as for the Moment Method.

IV. CONCLUSIONS

This paper investigates a new approach utilizing an artificial neural network for solving scattering problems that can be cast in the form of an integral equation. Neural networks offer the advantage of superior computational ability due to the high degree of parallelism and interconnectivity. This ability makes neural networks attractive in many other applications in engineering and sciences.

The integral equation is cast in a discretized form by expressing the unknown induced current on the metallic patch as a sum of weighted basis functions. The formulation is, then, developed in terms of the incident field samplings.

In this paper we restrict our attention to the class of backpropagation networks. The design of the backpropagation network for the scattering problem proposed has been completely carried out.

The resulting algorithm has been implemented and exhibits considerable speedups. Due to the speedups, the computation times become tolerable, allowing for the problem size, hence the accuracy, to be increased.

The implementation of neural networks is finding increasing interest with the emergence of optical and VLSI techniques [15]. With the growing interest in implementation issues, we anticipate the neurocomputer to become commonplace in the near future, making the proposed approach, for solving scattering problems, very promising.

REFERENCES

[1] K. R. Carver and J. W. Mink, "Microstrip Antenna Technology," IEEE Trans. on Antennas Propagat., vol. AP-29, pp. 2-24, January 1981.
[2] R. J. Mailloux, J. F. McIlvenna, and N. P. Kernweis, "Microstrip Array Technology," IEEE Trans. on Antennas Propagat., vol. AP-29, pp. 25-37, January 1981.

[3] J. R. James and P. S. Hall, Handbook of Microstrip Antennas, London, Peregrinus, 1989.

[4] L. Vegni and A. Toscano, "Spatial Electromagnetic Fields in Chiral Integrated Structures via Sommerfeld Integrals," Japanese Transactions on Electronics, Special Issue on Electromagnetic Theory, vol. E78-C, no. 10, October 1995, pp. 1391-1401.

[5] A. Toscano and L. Vegni, "Spectral Dyadic Green's Function Formulation for Planar Integrated Structures with a Grounded Chiral Slab," Journal of Electromagnetic Waves and Applications, special issue on Wave Interaction with Chiral and Complex Media, vol. 6, no. 5/6, May 1992, pp. 751-769.

[6] A. Toscano and L. Vegni, "Electromagnetic Field Computation in Planar Integrated Structure with a Biaxial Grounded Slab," IEEE Trans. Magn., vol. 29, no. 2, March 1993, pp. 1726-1729.

[7] A. Toscano and L. Vegni, "Spectral Electromagnetic Modeling of a Planar Integrated Structure with a General Grounded Anisotropic Slab," IEEE Trans. Antennas Propagat., vol. 41, no. 3, March 1993, pp. 362-370.

[8] A. Toscano and L. Vegni, "Novel Characteristics of Radiation Patterns of a Pseudochiral Point-Source Antenna," Microwave and Optical Technology Letters, vol. 7, no. 5, April 5 1994, pp. 247-250.

[9] A. Toscano and L. Vegni, "Spectral Electric Green's Dyad for a Grounded Bianisotropic Slab Fed by a Three-Dimensional Point-Source," Microwave and Optical Technology Letters, vol. 7, no. 10, July 1994, pp. 448-450.

[10] A. Toscano and L. Vegni, "Electromagnetic Waves in Planar Integrated Pseudochiral Ω Structures," Progress in Electromagnetic Research, special issue on Wave Interaction with Chiral and Complex Media, vol. 9, December 1994, pp. 181-216.

[11] B. Popovski, A. Toscano and L. Vegni, "Radial and Asymptotic Closed Form Representation of the Spatial Microstrip Dyadic Green's Function," Journal of Electromagnetic Waves and Applications, vol. 9, no. 1/2, January 1995, pp. 97-126.

[12] B. Popovski, A. Toscano and L. Vegni, "Asymptotic Closed Form Representation of the Spatial Microstrip Dyadic Green's Function," Microwave and Optical Technology Letters, vol. 8, no. 2, February 1995, pp. 103-106.

[13] D. W. Tank and J. Hopfield, "Simple Neural Optimization Networks," IEEE Trans. CAS, vol. 33, 1986, pp. 533-541.

[14] M. Kennedy and L. Chua, "Neural Networks for non Linear Programming," IEEE Trans. CAS, vol. 35, 1988, pp. 554-562.

[15] S. Haykin, eds., Neural Networks, IEEE Press, USA, 1994.
Fig. 3. Typical error surface graph.
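As a rough illustration of how an error surface of this kind can be tabulated, the sketch below evaluates the sum squared error of a one-neuron model, o = v * sigmoid(w * x), over a grid of its two weights. The model, the function name `error_surface`, the grid ranges, and the synthetic data are all assumptions made for illustration; they are not the network or the data behind Fig. 3.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))


def error_surface(x, y, w_values, v_values):
    """Sum squared error of o = v * sigmoid(w * x) on a (w, v) grid.

    x, y     : 1-D arrays of training inputs and targets (hypothetical data)
    w_values : 1-D array of hidden-weight values to scan
    v_values : 1-D array of output-weight values to scan
    Returns an array of shape (len(w_values), len(v_values)).
    """
    W, V = np.meshgrid(w_values, v_values, indexing="ij")
    # Broadcast so that outputs[i, j, p] is the model output for
    # grid point (i, j) and training exemplar p.
    outputs = V[..., None] * sigmoid(W[..., None] * x)
    return np.sum((y - outputs) ** 2, axis=-1)


if __name__ == "__main__":
    x = np.linspace(-2.0, 2.0, 9)
    y = 1.5 * sigmoid(2.0 * x)  # synthetic targets generated with w = 2, v = 1.5
    w_values = np.linspace(-5.0, 5.0, 41)
    v_values = np.linspace(-3.0, 3.0, 25)
    S = error_surface(x, y, w_values, v_values)
    i, j = np.unravel_index(np.argmin(S), S.shape)
    print(f"grid minimum at w = {w_values[i]:.2f}, v = {v_values[j]:.2f}")
```

The returned array can be passed to any surface or contour plotting routine. Because the sigmoid enters nonlinearly, the surface is not a simple quadratic bowl in w, which is the property that makes gradient descent on larger networks sensitive to its starting point.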
Fig. 4. Network error with underfitting.
Fig. 5. Network error without underfitting.
(Horizontal axis: iterations.)
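The four training phases of Section III, together with the momentum term of (19), can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' Mathematica code: it assumes a sigmoid hidden layer and a linear output layer, adds bias terms by appending a constant 1 to the input and hidden vectors, and trains on a toy mapping (y = x^2) rather than on the moment-method system. The names `forward`, `sum_squared_error`, `train`, and all parameter values are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(Wh, Wo, x):
    """Phase 1: propagate one input through both layers."""
    xa = np.append(x, 1.0)                  # input plus bias entry (assumption)
    i = np.append(sigmoid(Wh @ xa), 1.0)    # hidden outputs plus bias entry
    return xa, i, Wo @ i                    # linear output layer


def sum_squared_error(Wh, Wo, X, Y):
    """Sum of squared output errors over all training exemplars."""
    return sum(float(np.sum((y - forward(Wh, Wo, x)[2]) ** 2))
               for x, y in zip(X, Y))


def train(X, Y, n_hidden=8, eta=0.02, alpha=0.8,
          error_goal=1e-3, max_iter=2000, seed=0):
    """Backpropagation with momentum; returns weights and error history."""
    rng = np.random.default_rng(seed)
    Wh = rng.normal(scale=0.5, size=(n_hidden, X.shape[1] + 1))
    Wo = rng.normal(scale=0.5, size=(Y.shape[1], n_hidden + 1))
    dWh = np.zeros_like(Wh)                 # previous hidden-layer change
    dWo = np.zeros_like(Wo)                 # previous output-layer change
    history = [sum_squared_error(Wh, Wo, X, Y)]
    for _ in range(max_iter):
        # Phase 2 (stopping test): error goal reached for all exemplars?
        if history[-1] < error_goal:
            break
        for x, y in zip(X, Y):
            xa, i, o = forward(Wh, Wo, x)   # phase 1
            delta_o = y - o                 # phase 3: linear output, slope 1
            delta_h = i[:-1] * (1.0 - i[:-1]) * (Wo[:, :-1].T @ delta_o)
            # Phase 4: updates of the form (17)-(18) plus a momentum term
            dWo = eta * np.outer(delta_o, i) + alpha * dWo
            dWh = eta * np.outer(delta_h, xa) + alpha * dWh
            Wo += dWo
            Wh += dWh
        history.append(sum_squared_error(Wh, Wo, X, Y))
    return Wh, Wo, history


if __name__ == "__main__":
    X = np.linspace(-1.0, 1.0, 11).reshape(-1, 1)
    Y = X ** 2
    Wh, Wo, history = train(X, Y)
    print(f"initial SSE = {history[0]:.4f}, final SSE = {history[-1]:.4f}")
```

The `history` list holds the network error after each pass over the training set, which is the kind of error-versus-iterations curve shown in Figs. 4 and 5. Lowering `n_hidden` gives a quick way to observe the underfitting behaviour discussed in Section III.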