ISA Transactions 37 (1998) 41–52

Process modeling and optimization using focused attention neural networks

James D. Keeler*, Eric Hartman, Stephen Piché

Pavilion Technologies, Inc., 11100 Metric Blvd., #700, Austin, TX 78758, U.S.A.

* Corresponding author. Tel: 1-800-880-5432; e-mail: [email protected]

Abstract

Neural networks have been shown to be very useful for modeling and optimization of nonlinear and even chaotic processes. However, in using standard neural network approaches to modeling and optimization of processes in the presence of unmeasured disturbances, a dilemma arises between achieving the accurate predictions needed for modeling and computing the correct gains required for optimization. As shown in this paper, the Focused Attention Neural Network (FANN) provides a solution to this dilemma. Unmeasured disturbances are prevalent in process industry plants and frequently have significant effects on process outputs. In such cases, process outputs often cannot be accurately predicted from the independent process input variables alone. To enhance prediction accuracy, a common neural network modeling practice is to include other dependent process output variables as model inputs. The inclusion of such variables almost invariably benefits prediction accuracy, and is benign if the model is used for prediction alone. However, the process gains, necessary for optimization, sensitivity analysis and other process characterizations, are almost always incorrect in such models. We describe a neural network architecture, the FANN, which obtains accuracy in both predictions and gains in the presence of unmeasured disturbances. The FANN architecture uses dependent process variables to perform feed-forward estimation of unmeasured disturbances, and uses these estimates together with the independent variables as model inputs. Process gains are then calculated correctly as a function of the estimated disturbances and the independent variables. Steady-state optimization solutions thus include compensation for unmeasured disturbances. The effectiveness of the FANN architecture is illustrated using a model of a process with two unmeasured disturbances and using a model of the chaotic Belousov–Zhabotinski chemical reaction. © 1998 Elsevier Science Ltd. All rights reserved.

Keywords: Neural networks; Steady-state optimization; Disturbance rejection; Process modeling

1. Introduction

Artificial neural networks represent a set of powerful mathematical techniques for modeling, control, and optimization, in which models "learn" process behavior directly from process data. Examples of the extensive literature on neural networks include papers on basic algorithms [1,2], variations on basic algorithms [3,4], theoretical proofs of universal function approximation properties of neural networks [5,6], and applications to problem domains, including prediction, optimization, and control of industrial processes [7–12].

Given data of reasonable quality, building a neural network model that simply predicts accurately is relatively straightforward [1,13]. However, when modeling industrial processes for
purposes such as process understanding, sensitivity analysis, and optimization, accuracy in the process gains (derivatives of outputs with respect to inputs) is also essential. With standard neural network approaches to modeling processes subject to unmeasured disturbances, obtaining accuracy in both the predictions and gains is often not possible. To understand why this is so, consider the following example of a distillation column. Typical process variables in a distillation column are:

• Manipulated variables: reboil steam, reflux.
• Measured disturbance variables: feed flow, column pressure.
• Output (controlled) variables: top and bottom compositions.

In addition, distillation columns are typically instrumented to monitor a number of other dependent variables (which the operators are not interested in controlling directly):

• Dependent (not controlled) variables: overhead temperature, bottom section temperature, reflux temperature.

These process variable categories are shown in Fig. 1.

Unmeasured disturbances such as feed composition, weather, catalyst degradation in reactors, and plant wear can cause outputs to vary despite fixed independent variable settings [14]. If the effects of such disturbances are at all significant, a neural network model containing only the independent process variables as inputs (the manipulated and measured disturbance variables, which contain no information about the disturbances) will be unable to accurately predict the output values. For instance, in the distillation column example, significant unmeasured disturbances would make it impossible to accurately predict the top and bottom compositions as a function of reboil steam, reflux, feed flow, and column pressure alone.

To improve prediction accuracy in such situations, a common neural network modeling strategy is to include dependent variables along with the independent variables as model inputs. In the distillation column example, this would mean adding the temperatures as model inputs in addition to the manipulated variables in order to aid prediction of the top and bottom compositions. Because dependent variables (e.g., the temperatures) reflect the effects of unmeasured disturbances (e.g., feed composition), including dependent variables as inputs to a model ordinarily does improve prediction accuracy of the output. However, because the functional relationship of the independent to the dependent variables is not represented in the model, and because the independent and dependent variables are usually highly correlated, the gains in such models are almost certain to be incorrect. In our example, this means that adding the temperatures to the model inputs will cause the gains of the top and bottom compositions with respect to the manipulated variables to be wrong. Consequently, while models containing dependent variables as inputs may have good ability to predict the output variables, optimization settings and sensitivity analysis are typically inaccurate due to incorrect gains. The Focused Attention Neural Network (FANN) allows steady-state neural network models to obtain accurate predictions and gains in the presence of unmeasured disturbances.

Fig. 1. Classification of process variables for modeling a plant.


In Section 2, we illustrate the problem in detail with three simple cases. In Section 3, we describe the FANN solution. In Section 4, we present two case studies, and we close with conclusions in Section 5.

2. The problem

In short, the problem addressed by the FANN architecture is the following dilemma:

1. Information about unmeasured influences which is reflected in dependent process output variables is frequently necessary for neural network process models to have the required prediction accuracy. However,
2. adding dependent variables as model inputs causes the gains for the manipulated variables to be inaccurate.

This dilemma is illustrated in the following example, which is considered in three cases.

2.1. Case 1: no unmeasured disturbances

Assume a noiseless, linear plant¹ with manipulated input u, dependent variable s, and output variable y:

y = a₁u + s
s = a₂u

Eliminating s we have:

y = a₁u + a₂u = (a₁ + a₂)u    (1)

from which we can compute the process gain of y with respect to u:

∂y/∂u = a₁ + a₂    (2)

Since there are no unmeasured disturbances affecting the process, y can be modeled as a function of u only, as shown in Fig. 2.

Fig. 2. Case 1: no disturbances. A model with the independent variable u alone as input can obtain both accurate predictions and the correct gain.

Given sufficient data, the trained neural network model of Fig. 2 will match the process Eq. (1) and yield correct predictions:

ŷ = (a₁ + a₂)u    (3)

The gain of the model is also correct, matching the process gain Eq. (2):

∂ŷ/∂u = a₁ + a₂

¹ A linear plant is chosen for simplicity to illustrate the dilemma. It is shown later how a nonlinear system is modeled.

2.2. Case 2: unmeasured disturbances, independent model inputs only

We now alter the process equations of Case 1 by adding a disturbance d to the state s:

y = a₁u + s
s = a₂u + d

Eliminating s we have:

y = a₁u + a₂u + d = (a₁ + a₂)u + d    (4)

which differs from the Case 1 process Eq. (1) by the addition of the disturbance d. The gain equation for this process, on the other hand, is the same as the Case 1 gain Eq. (2):

∂y/∂u = a₁ + a₂    (5)

We here consider the same model structure as in Case 1 (Fig. 3).

Fig. 3. Case 2: unmeasured disturbances exist, and y is modeled as a function of u only. The predictions for y are inaccurate by the amount d. However, the model gain is correct.

The presence of disturbances makes it impossible to accurately model y as a function of u only. Assuming the disturbance d is zero-mean, the trained neural network model will be identical to
that of Case 1, Eq. (3):

ŷ = (a₁ + a₂)u

Comparing this to the present process Eq. (4), we see that the predictions produced by this model will each be incorrect by the amount of the disturbance d. The gain of this model, on the other hand, is correct, matching the process gain Eq. (5):

∂ŷ/∂u = a₁ + a₂

2.3. Case 3: unmeasured disturbances, independent and dependent model inputs

Consider the same process as in Case 2:

y = a₁u + s    (6)
s = a₂u + d

Eliminating s gives the same results as in Case 2:

y = a₁u + a₂u + d = (a₁ + a₂)u + d    (4)

∂y/∂u = a₁ + a₂    (5)

In Case 2, the predictions were inaccurate because the model had no knowledge of the disturbance d. Because the dependent variable s contains the disturbance d, the model can be made to predict accurately by including s as an input to the model, as shown in Fig. 4.

Fig. 4. Case 3: unmeasured disturbances exist, and y is modeled as a function of u and s. With this model structure, the predictions for y are accurate, but the gain for u is incorrect.

A trained neural network model of this structure (again, given sufficient data) will produce a correct model, matching the process Eq. (6):

ŷ = a₁u + s

However, the functional dependency of s on u is not represented in this model; that is, u and s are both treated as independent variables. Hence, in this model, the gain for u calculates as:

∂ŷ/∂u = a₁

which differs by a₂ from the true gain Eq. (5).

2.4. Summary of the above cases

A summary of the above cases appears in Table 1. The FANN modeling structure makes it possible to build industrial process neural network models whose predictions and gains are both accurate, despite the presence of unmeasured disturbances. How this is accomplished is the subject of the next section.

Table 1
Summary of standard neural network approaches

                                           Independent inputs only   Independent and dependent inputs
No unmeasured disturbances   Predictions:  Correct                   Correct (*)
                             Gains:        Correct                   Incorrect (*)
Unmeasured disturbances      Predictions:  Incorrect                 Correct
                             Gains:        Correct                   Incorrect

Summary of the models considered in Section 2, illustrating the dilemma faced by standard neural network approaches when used to model processes subject to unmeasured disturbances.
(*) This case was not considered because a correct model can be obtained with only independent inputs.
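The dilemma summarized in Table 1 is easy to reproduce numerically. The following sketch is an illustration, not from the paper: ordinary least squares stands in for the trained neural networks, with assumed coefficients a₁ = 1 and a₂ = 2.

```python
# Numerical sketch of the Section 2 dilemma; linear least squares stands in
# for the neural networks of Cases 2 and 3 (assumed a1 = 1, a2 = 2).
import numpy as np

rng = np.random.default_rng(0)
a1, a2 = 1.0, 2.0
u = rng.uniform(0.0, 1.0, 5000)
d = rng.normal(0.0, 0.5, 5000)      # zero-mean unmeasured disturbance
s = a2 * u + d                      # dependent variable
y = a1 * u + s                      # output

# Case 2: y modeled from u only. The gain is correct (a1 + a2 = 3),
# but every prediction is off by the unmeasured d.
gain_case2 = np.polyfit(u, y, 1)[0]

# Case 3: y modeled from u and s jointly. Predictions are exact,
# but the coefficient on u (the gain an optimizer would use) is only a1.
X = np.column_stack([u, s, np.ones_like(u)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"Case 2 gain dy/du: {gain_case2:.3f} (true gain {a1 + a2:.0f})")
print(f"Case 3 gain dy/du: {coef[0]:.3f} (misses a2)")
```

The Case 3 fit reproduces y exactly, yet its coefficient on u is a₁ rather than a₁ + a₂, which is precisely the incorrect gain recorded in Table 1.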

3. The FANN solution

We reiterate the basic dilemma, described in the previous section, which is addressed by the FANN architecture:

1. Information about unmeasured influences which is reflected in dependent process output variables is frequently necessary for neural network process models to have the required prediction accuracy (Case 2 above). However,
2. adding dependent variables as model inputs causes the gains for the manipulated variables to be inaccurate (Case 3 above).

The FANN structure solves the dilemma by modeling the functional relationships among the process variables, estimating unmeasured disturbances, and thereby providing accurate predictions and gains. This structure is now described.

3.1. The FANN structure

First, we assume that the vector of outputs y is given by a function, in general nonlinear, of the vector of manipulated variables u and the vector of unmeasured disturbances d:

y = F(u, d)    (7)

(If there were no unmeasured disturbances, then a Case 1 model would suffice.) Because we do not have a direct measurement of d, we consider an approximation to Eq. (7) that uses the vector of measured dependent process variables s in place of d:

y ≈ J(u, s)    (8)

The accuracy of the approximation depends upon the dependent variables s reflecting the effects of the unmeasured disturbances d.

Consider now a model trained to represent Eq. (8). For clarity we denote this model as:

ŷ = Ĵ(u, s)    (9)

A model of this form was considered in Case 3 of Section 2, which illustrated the fact that such a neural network model results (given sufficient data) in accurate predictions of y, but in incorrect gains. We now describe how the gains may be made correct without sacrificing prediction accuracy.

We first assume that the dependent process variables s are, like the outputs y, functions of u and d:

s = f(u, d)

As in the case of y, because disturbances d are present, s cannot be predicted from u alone. We proceed by expanding f as follows:

s = f(u, d) = g(u) + h(d) + l(u, d)    (10)

where g, h, and l are in general nonlinear functions. Assuming that the effects of coupling between the manipulated variables and the disturbance variables are weak, s may be approximated as:

s ≈ g(u) + h(d) = g(u) + D    (11)

where we have defined, for convenience, the (uncoupled) effects of the unmeasured disturbances as:

D = h(d)

Using these relations we may eliminate s from Eq. (8):

y ≈ J(u, s) = J(u, g(u) + D) = Z(u, D)    (12)

Eqs. (11) and (12) now have the desired functional dependencies: the dependent process variables s and the outputs y are expressed as functions of the independent variables u and D. We denote a model representing Eq. (11) as:

ŝ = ĝ(u) + D̂    (13)
Fig. 5. The FANN architecture, described by Eqs. (13) and (14). Model 1 and Model 2 are neural network models. This structure properly models the dependent variables s and outputs y as functions of the two types of independent variables, u (manipulated) and D̂ (estimated effects of unmeasured disturbances).

Thus, estimating s simply requires summing the estimates ĝ(u) and D̂. As is discussed below, ĝ(u) is obtained by training a model from u to s. The computation of D̂ is discussed in the next section.

Equations (9) and (13) together define the FANN model structure. Combining Eqs. (9) and (13) in the manner of Eq. (12), the FANN architecture can be expressed as:

ŷ = Ẑ(u, D̂)    (14)

The FANN model is shown in Fig. 5, where both Model 1 and Model 2 are neural network models. Model 1 is trained to represent ĝ(u) by training a model with s as outputs and u as inputs. This is analogous to the Case 2 model of Section 2, with s as outputs instead of y. Model 1 will be inaccurate in its predictions of s to the degree that unmeasured disturbances are present, but (given sufficient data) its gains of s with respect to u will be correct. This is precisely what is desired. Model 2 is trained with y as outputs and u and s as inputs (in the training mode, ŝ is identical to s, by definition of D̂, described in the next section).

Thus, the FANN modeling structure correctly treats both u and D̂ as independent variables, and represents the functional dependence of s and y upon those independent variables. During optimization, D̂ is held fixed, and u is adjusted to achieve the desired setpoint for y. (This assumes that the unmeasured disturbances are varying slowly.) By fixing D̂, the optimization takes into account the effects of the unmeasured disturbances.

3.2. Computing D̂

The question remains as to how to compute D̂. Rearranging Eq. (11), we see that D can be given by

D ≈ s − g(u)    (15)

To estimate D with a model D̂, we note that the dependent variable s is measured and that g(u) may be modeled by ĝ(u) as described above. Therefore, D̂ may be computed simply as:

D̂ = s − ĝ(u)    (16)

This relationship is shown in Fig. 6.

Fig. 6. Estimation of D in the FANN architecture, corresponding to Eq. (16). Model 1 is the same as in Fig. 5.

Estimation of the effects of unmeasured disturbances allows for accurate predictions as well as correct calculation of process gains as a function of the independent variables and the unmeasured disturbances. These factors enable compensation for unmeasured disturbances during optimization.
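The two-model structure of Fig. 5, together with the disturbance estimate of Eq. (16), can be summarized in a short sketch. This is not the authors' implementation: scikit-learn MLPs are used here as stand-ins for Model 1 and Model 2, and the arrays u, s, and y are assumed to hold historical values of the manipulated variables, dependent variables, and outputs.

```python
# A structural sketch of the FANN of Fig. 5, Eqs. (13), (14), and (16).
# scikit-learn MLPs stand in for the paper's neural networks.
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_fann(u, s, y):
    """Model 1: u -> s.  Model 2: (u, s) -> y (in training, s_hat equals s)."""
    model1 = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000).fit(u, s)
    model2 = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000).fit(
        np.column_stack([u, s]), y)
    return model1, model2

def estimate_D(model1, u, s_measured):
    """Eq. (16): D_hat = s - g_hat(u)."""
    return s_measured - model1.predict(u)

def fann_predict(model1, model2, u, D_hat):
    """Eq. (14): y_hat = Z_hat(u, D_hat), with D_hat held fixed."""
    s_hat = model1.predict(u) + D_hat        # Eq. (13)
    return model2.predict(np.column_stack([u, s_hat]))
```

During optimization, fann_predict is the function handed to the nonlinear programming routine: D̂ stays fixed while u varies, so the gain ∂ŷ/∂u correctly includes the dependence of s on u through Model 1.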
3.3. The FANN model applied to the Section 2 example

We now apply the FANN model to the example process of Section 2 (with disturbance, Cases 2 and 3):

y = a₁u + s    (6)
s = a₂u + d

Again, with s eliminated we have:

y = a₁u + a₂u + d = (a₁ + a₂)u + d    (4)

∂y/∂u = a₁ + a₂    (5)

First we apply the structure of Fig. 6 to estimate D. The trained neural network Model 1 will yield (given sufficient data):

ĝ(u) = a₂u

The subtraction in Fig. 6 gives:

D̂ = s − ĝ(u) = a₂u + d − a₂u = d

The addition in Fig. 5 gives:

ŝ = ĝ(u) + D̂ = a₂u + d = s

The predictions for y, given by the trained Model 2 in Fig. 5, will be correct (again, given sufficient data), matching the process Eq. (6):

ŷ = a₁u + s

With respect to predictions, then, the FANN model is identical to the Case 3 model (Fig. 4). With respect to the gains, however, the models are not alike. As shown in Fig. 5, s is not treated as an independent variable in a FANN model; instead, s is correctly modeled as a dependent variable. Therefore, the gain is calculated according to Eq. (4), with s eliminated, and we see that the gain is correct, matching Eq. (5):

∂ŷ/∂u = a₁ + a₂

Thus, the FANN architecture correctly computes both the predictions and the gains in the presence of unmeasured disturbances.

3.4. Summary

Table 2 extends Table 1 to include the FANN architecture. In summary, the FANN architecture provides correct predictions and correct gains in models of processes subject to unmeasured disturbances.

It should be noted that the FANN structure differs significantly from that of adding a bias to a model to compensate for model mismatch. The purpose of the FANN structure is to compensate for unmeasured disturbances, not for model mismatch. The FANN architecture performs feed-forward estimation of unmeasured disturbances and compensates for them during optimization. Such compensation cannot be achieved by model biasing, which does not change the model gains.

As with any model, model biasing may be used with FANN models to compensate for bias mismatch. If model mismatch is due to non-stationary behavior in the plant, bias adjustment may be useful in the short term. For the longer term, mismatch should be handled via on-line model retraining. Using these tools coupled with the FANN architecture provides the full capabilities required to achieve effective optimization in linear

and nonlinear production units affected by measured and unmeasured disturbances.

Table 2
Summary of neural network approaches including FANN

                                           Independent inputs only   Independent and dependent inputs   FANN architecture
No unmeasured disturbances   Predictions:  Correct                   Correct (*)                        Correct (*)
                             Gains:        Correct                   Incorrect (*)                      Correct (*)
Unmeasured disturbances      Predictions:  Incorrect                 Correct                            Correct
                             Gains:        Correct                   Incorrect                          Correct

The FANN architecture compared to the models considered in Section 2. Only the FANN architecture obtains both correct predictions and correct gains in the presence of unmeasured disturbances.
(*) These cases were not considered because a correct model can be obtained with only independent inputs.

4. Case studies

4.1. An example with two disturbances

Our first example demonstrates the ability of a FANN to accurately model both the output and the gains of a process, and to estimate and compensate for unmeasured disturbances to compute optimization setpoints. The process is modeled by the following set of equations:

d₁ = sin(ω₁t) + cos(ω₂t)
d₂ = sin(ω₃t) + cos(ω₄t)    (17)

s₁ = u₁ + u₂ + 0.0u₃ + 0.5d₁
s₂ = 4u₁² + d₂ + 2    (18)

y = s₁² + s₂

To generate data, the three independent variables u₁, u₂, and u₃ were chosen from a random uniform distribution on [0,1]. Only u₁ and u₂ affect the process, but all three variables were included in the models. The unmeasured disturbance variables d₁ and d₂ vary slowly compared to the independent variables; for d₁ and d₂, the frequencies of the cosine terms were about 1/3 those of the sine terms. As the equations indicate, the dependent variables s₁ and s₂ are functions both of the independent variables u₁ and u₂ and of the unmeasured disturbances d₁ and d₂, and the output variable y is a function of the dependent variables s₁ and s₂.

Fig. 7. A portion of the time series of the variables in this example.
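The data for this example can be generated as in the following sketch. The particular frequency values, series length, and random seed are assumptions; the paper specifies only the functional forms of Eqs. (17) and (18), the roughly 1/3 frequency ratio, and the uniform sampling of u₁, u₂, u₃ on [0,1].

```python
# Data generation sketch for Eqs. (17) and (18); omega values are assumed.
import numpy as np

rng = np.random.default_rng(1)
N = 10_000
t = np.arange(N)

# Slow unmeasured disturbances; cosine frequencies ~1/3 of the sine frequencies.
w1, w3 = 0.003, 0.005                       # assumed values
d1 = np.sin(w1 * t) + np.cos(w1 / 3 * t)
d2 = np.sin(w3 * t) + np.cos(w3 / 3 * t)

# Independent variables: uniform on [0, 1]; u3 is a distractor (coefficient 0.0).
u1, u2, u3 = rng.uniform(0.0, 1.0, (3, N))

s1 = u1 + u2 + 0.0 * u3 + 0.5 * d1          # Eq. (17)
s2 = 4.0 * u1**2 + d2 + 2.0                 # Eq. (18)
y = s1**2 + s2
```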


The time series of the variables are shown in Fig. 7. The process was modeled in a FANN architecture. Using the trained model, y was given a setpoint equal to its average value (4.7), and, at each time step, optimized values for the independent variables were computed using a nonlinear programming algorithm. The resulting RMS deviation of y from its setpoint over the entire 10,000 data points was 0.008. Such accuracy in feed-forward optimization is only achievable with a model whose predictions and gains are both accurate. The optimization accuracy achieved in this example demonstrates that the predictions and gains in the FANN model are both highly accurate despite the presence of unmeasured disturbances in the process.

We now look in detail at the ability of the FANN model to estimate the effects of unmeasured disturbances. Recall from Eq. (16) that disturbance effects are estimated using Model 1 of Fig. 6 along with the initial values of the independent and dependent variables. Denote these estimates by d₁′ and d₂′. The scatterplots of Figs. 8 and 9 show the accuracy of the estimated disturbances compared to the actual disturbances d₁ and d₂.

Fig. 8. Actual versus FANN model estimates of the unmeasured disturbance d₁.

Fig. 9. Actual versus FANN model estimates of the unmeasured disturbance d₂.

This accurate estimation of the disturbances allows the FANN model to compensate for their effects and achieve the optimization accuracy noted above.

We now examine the accuracy of the FANN model in predicting the dependent variables when the independent variables are set to their optimized values. Recall that the disturbances are fixed during optimization. Let u₁ and u₂ denote the optimized settings of the independent variables, and let s₁ and s₂ denote the values of the dependent variables given by Eqs. (17) and (18) for the optimized settings u₁ and u₂:

s₁ = u₁ + u₂ + 0.5d₁
s₂ = 4u₁² + d₂ + 2

Let s₁′ and s₂′ denote the FANN model estimates of the dependent variables as given by Eq. (13) for the optimized settings u₁ and u₂. The scatterplots of Figs. 10 and 11 show the accuracy of these estimates compared to the actual values given by the above equations.

Fig. 10. Actual versus FANN model estimates of the optimized values of the dependent variable s₁.

Fig. 11. Actual versus FANN model estimates of the optimized values of the dependent variable s₂.
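A sketch of the per-time-step optimization described above, assuming the fit_fann models from the earlier sketch and using scipy's SLSQP routine as a stand-in for the paper's unnamed nonlinear programming algorithm:

```python
# Steady-state setpoint optimization with the disturbance estimate held fixed.
import numpy as np
from scipy.optimize import minimize

def optimize_inputs(model1, model2, u_init, s_measured, setpoint=4.7):
    # Fix D_hat at the current operating point (Eq. (16)).
    D_hat = s_measured - model1.predict(u_init.reshape(1, -1))

    def objective(u):
        u = u.reshape(1, -1)
        s_hat = model1.predict(u) + D_hat                      # Eq. (13)
        y_hat = model2.predict(np.column_stack([u, s_hat]))    # Eq. (14)
        return float((y_hat - setpoint) ** 2)

    res = minimize(objective, u_init, method="SLSQP",
                   bounds=[(0.0, 1.0)] * u_init.size)
    return res.x                 # optimized settings of u1, u2, u3
```

Because D̂ is held fixed, the optimizer moves u using gains that correctly include the dependence of s₁ and s₂ on u₁ and u₂, which is what makes the small RMS deviation reported above attainable.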
4.2. The Belousov–Zhabotinski reaction

In this example we model and control a deceptively simple-appearing yet extremely difficult process known as the Belousov–Zhabotinski (BZ) chemical reaction. This reaction has been shown to display chaotic oscillations in the bromine concentration in a continuous stirred tank reactor [15]. Chaotic reactions are difficult to control due to their inherent sensitive dependence on initial conditions and on disturbances.

The BZ reaction goes through period doubling to chaos via successive pitchfork bifurcations under certain operating conditions. Several dozen active chemical species are involved, but the dynamics can be approximated by a Poincaré surface-of-section return map that is a very simple relation [7]. We simplify the dynamical equations governing the BZ reaction significantly and assume that the essential dynamics can be described near the bifurcation region by a quadratic return map of the form:

y(t+1) = s(t)y(t)(1 − y(t))

where y(t) is the concentration at time t, s(t) is the bifurcation parameter that is modulated over time, and y(t+1) is the value of the concentration on the next step of the return map. This simple iterated map retains the chaotic characteristics of the original dynamics. We take y(t) to be the output of this process and s(t) to be a dependent variable. As usual, we let s(t) be a function of a manipulated variable u(t) and an unmeasured disturbance d(t):

s(t) = a₁cos(w₁u(t)) + a₂cos(w₂d(t))

Here w₂ is small compared to w₁, and s(t) is scaled to range from 2.0 to 4.0, which acts to modulate the process in and out of the chaotic regime. The time series of s(t) and y(t) are shown in Fig. 12.

Fig. 12. A portion of the time series of the dependent variable, s, and the output variable, y, from the modulated BZ reaction map. The system displays bursts of chaotic activity as s increases above 3.4.
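The modulated return map is straightforward to simulate. In the sketch below, the constants a₁, a₂, w₁, w₂, the disturbance trajectory d(t), and the rescaling of s(t) into [2.0, 4.0] are all assumptions; the paper gives only the functional forms, the condition w₂ ≪ w₁, and the [2.0, 4.0] range.

```python
# Simulation sketch of the modulated BZ return map.
import numpy as np

N = 2000
t = np.arange(N, dtype=float)
u = np.random.default_rng(2).uniform(0.0, 1.0, N)   # manipulated variable
d = t                                               # slow disturbance phase (assumed)

w1, w2 = 3.0, 0.01                                  # w2 << w1 (assumed values)
a1, a2 = 1.0, 1.0
raw = a1 * np.cos(w1 * u) + a2 * np.cos(w2 * d)
s = 2.0 + 2.0 * (raw - raw.min()) / (raw.max() - raw.min())   # scale to [2, 4]

y = np.empty(N)
y[0] = 0.5
for k in range(N - 1):
    y[k + 1] = s[k] * y[k] * (1.0 - y[k])           # quadratic return map
```

With s(t) reaching above 3.4, the series shows the bursts of chaotic activity visible in Fig. 12.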
The task is not only to predict but also to maintain y(t+1) at its setpoint given u(t), s(t), and y(t), as shown in Fig. 13.

Fig. 13. (a) The mathematical structure of the simulation of the BZ plant. The dependent variable s(t) is a function of the manipulated variable u(t) and the unmeasured disturbance d(t). The output at the next time step, y(t+1), depends explicitly only on s(t) and y(t). (b) The FANN model does not receive the unmeasured disturbance d(t) as an input, and only u(t) is modifiable. The system computes optimal values for u(t) using a nonlinear programming algorithm which maintains the output y(t+1) at the setpoint P.

First, we trained a Case 2 neural network (Fig. 3) to model the BZ reaction. As expected, due to the unmeasured disturbance d(t), the model was unable to accurately predict the output y(t), and the accuracy of the model was only r² = 0.90. This indicates that the unmeasured disturbance is significant and needs to be estimated.

Next, we trained a Case 3 neural network (Fig. 4) to model the process, which included the dependent variable s(t) as a model input. As expected, the prediction accuracy improved to a high r² = 0.99, but the gain for u(t) was inaccurate (0.0025 times the size of the gain for s(t)), and hence the model was ineffective when used to control the BZ reaction. Fig. 14 shows the extremely poor results of one-step-ahead control to a setpoint of p = 0.63 using the Case 3 model.

Lastly, we trained a FANN model on the BZ reaction. The prediction accuracy was identical to that of the Case 3 model (r² = 0.99). As Fig. 14 shows, control by the FANN model is dramatically better than control by the Case 3 model (65% smaller RMS error overall). Because the FANN model correctly represents the behavior of the dependent variable s(t) and the output y(t) in response to changes in the manipulated variable, the gain function for u(t) is correct, and the optimized settings maintain the output y(t) at its setpoint despite the presence of fluctuating unmeasured disturbances.

Fig. 14. BZ process dynamics under the one-step-ahead control of the Case 3 neural network model (Fig. 4) and the FANN model (Figs. 5 and 6). For both cases, the setpoint is 0.63, and the control method is a standard nonlinear programming code. The process controlled by the Case 3 model differs only slightly from the original uncontrolled process (not shown), even though the manipulated variable (also not shown) takes on its extreme values. This failure is due to the wrong gain for the manipulated variable resulting from the inappropriate model structure. Using the FANN model, in which the gains are correct, results in the controlled output deviating from the setpoint only slightly in the most chaotic region, a vast improvement over both the original dynamics and the Case 3 model control.
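One-step-ahead control of the map can be sketched as follows, again assuming trained FANN models as in the earlier sketches (here Model 2 takes u(t), s(t), and y(t) as inputs) and using scipy's bounded scalar minimizer in place of the nonlinear programming code:

```python
# One-step-ahead control sketch: choose u(t) so the predicted y(t+1) hits p.
from scipy.optimize import minimize_scalar

def control_step(model1, model2, u_now, s_measured, y_now, p=0.63):
    # Disturbance effect at the current operating point, held fixed (Eq. (16)).
    D_hat = s_measured - model1.predict([[u_now]])[0]

    def objective(u):
        s_hat = model1.predict([[u]])[0] + D_hat            # Eq. (13)
        y_next = model2.predict([[u, s_hat, y_now]])[0]     # predicted y(t+1)
        return (y_next - p) ** 2

    return minimize_scalar(objective, bounds=(0.0, 1.0),
                           method="bounded").x
```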
5. Conclusions

We have described and demonstrated a neural network modeling structure, the FANN, which, in contrast to standard neural network modeling approaches, is able to provide both accurate predictions and correct gains in models of processes subject to unmeasured disturbances. The FANN architecture makes use of dependent process output variables to obtain predictive accuracy, while properly representing the functional relationships among the variables necessary to obtain accurate gains. We demonstrated in two case studies, including an extremely difficult chaotic reaction problem, that the FANN architecture provides feed-forward compensation for unmeasured disturbances during optimization. Because unmeasured disturbances are prevalent in process industry plants, such compensation is often essential for effective optimization.

The FANN structure differs significantly from that of adding a bias to a model to compensate for model mismatch. The purpose of the FANN structure is to compensate for unmeasured disturbances, not for model mismatch. The FANN architecture performs feed-forward estimation of unmeasured disturbances and compensates for them during optimization.

Other important aspects of optimization that were not addressed in this paper, such as dead times and optimization constraints, are straightforward to incorporate and implement in the FANN paradigm.

References

[1] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning internal representations by error propagation, in: D.E. Rumelhart, J.L. McClelland and the PDP Research Group (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations, MIT Press/Bradford, Cambridge, MA, 1986, 318–362.
[2] J. Moody, C. Darken, Fast learning in networks of locally-tuned processing units, Neural Computation 1 (1989) 281–294.
[3] A.S. Weigend, B.A. Huberman, D.E. Rumelhart, Predicting the future: a connectionist approach, International Journal of Neural Systems 1 (1990) 193.
[4] E. Hartman, J.D. Keeler, Predicting the future: advantages of semi-local units, Neural Computation 3 (1991) 566–579.
[5] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989) 359–366.
[6] E. Hartman, J.D. Keeler, J. Kowalski, Layered neural networks with Gaussian hidden units as universal approximators, Neural Computation 2 (1990) 210–215.
[7] J.D. Keeler, Prediction and control of chaotic chemical reactions via neural network models, Conference on Artificial Intelligence in Petroleum Exploration and Production, Plano, TX, 1993, 31–38.
[8] W.T. Miller, R.S. Sutton, P.J. Werbos (Eds.), Neural Networks for Control, MIT Press, Cambridge, MA, 1990.
[9] K.S. Narendra, K. Parthasarathy, Identification and control of dynamic systems using neural networks, IEEE Transactions on Neural Networks 1 (1990) 4–27.
[10] E.D. Sontag, Feedback stabilization using two-hidden-layer nets, IEEE Transactions on Neural Networks 3 (1992) 981–990.
[11] L. Ungar, E. Hartman, J. Keeler, G. Martin, Process modeling and control using neural networks, in: G. Stephanopoulos, V. Venkatasubramanian, J. Davis (Eds.), Proceedings of Intelligent Systems in Process Engineering, ISPE '95, Snowmass, CO, AIChE Symposium Series, 1995, 57–67.
[12] P.J. Werbos, Neurocontrol and fuzzy logic: connections and designs, Proceedings of the 2nd Joint Technology Workshop on Neural Networks and Fuzzy Logic, April 1990, NASA Conference Publication 10061; IJAR 6 (2) 185–192.
[13] J.D. Keeler, R.B. Ferguson, Commercial applications of soft sensors (TM), International Forum for Process Analytical Chemistry (IFPAC) Conference, Orlando, FL, 1996, 81–88.
[14] R. Weber, C. Brosilow, The use of secondary measurements to improve control, AIChE Journal 18 (1972) 614–623.
[15] J.C. Roux, Experimental studies of bifurcations leading to chaos in the Belousov–Zhabotinsky reaction, Physica 7D (1983) 57–68.
