ARTICLE INFO

Article history:
Received 2 February 2010
Received in revised form 4 November 2010
Accepted 13 December 2010
Available online 23 December 2010

Keywords:
Adaptive network-based fuzzy inference system
Wastewater treatment
Prediction
Principal component analysis

ABSTRACT

Advanced neuro-fuzzy modeling, namely an adaptive network-based fuzzy inference system (ANFIS), was employed to develop models for the prediction of suspended solids (SS) and chemical oxygen demand (COD) removal of a full-scale wastewater treatment plant treating process wastewaters from a paper mill. In order to improve the network performance, fuzzy subtractive clustering was used to identify the model's architecture and optimize the fuzzy rules, while principal component analysis (PCA) was applied to reduce the input variable dimensionality. Input variables were reduced from six to four for the COD and SS models by considering PCA results and linear correlation matrices among input and output variables. The results indicate that reasonable forecasting and control performances have been achieved through the developed system. Minimum mean absolute percentage errors of 1.003% and 0.5161% for CODeff and SSeff could be achieved using ANFIS. The maximum correlation coefficient values for CODeff and SSeff were 0.9912 and 0.9882, respectively. Minimum mean square errors of 1.2883 and 0.0342, and minimum RMSEs of 1.135 and 0.1849 for CODeff and SSeff could also be achieved.
Crown Copyright © 2010 Published by Elsevier B.V. All rights reserved.
1568-4946/$ – see front matter.
doi:10.1016/j.asoc.2010.12.026
J. Wan et al. / Applied Soft Computing 11 (2011) 3238–3246 3239
applying it to forecast the residual chlorine in the drinking water tank and distribution system of the city of Sainte-Foy. Holubar et al. [9] applied feed-forward neural networks trained with the back-propagation algorithm to model and control methane production in anaerobic digesters. The model was trained using data generated from four anaerobic continuous stirred tank reactors operating at steady state.

Although ANNs can predict the effluent from WWTPs successfully, traditional neural network schemes still have several limitations, which result from the possibility of getting trapped in local minima and from the choice of model architecture. If the prediction performance can be improved further, a better operation strategy can be formed. To overcome these limitations of traditional ANNs and to increase their reliability, many new training algorithms have been proposed, such as the adaptive neuro-fuzzy inference system (ANFIS) [10]. ANFIS combines fuzzy logic control (FLC) with an artificial neural network (ANN) and realizes fuzzy logic through a fuzzy neural network. Meanwhile, the controller can acquire fuzzy rules and optimize its membership functions online through the self-learning ability of the neural network. Applying fuzzy neural networks to wastewater treatment can therefore be expected to give better results.

Recently, active research has been carried out in fuzzy-neural control [11–13]. Tay and Zhang [14] integrated fuzzy systems and neural networks in modeling the complex process of anaerobic biological treatment of wastewater. They illustrated the power of the technique in two case studies of an upflow anaerobic sludge blanket and an anaerobic fluidized bed reactor. The fuzzy-neural model simulated the system performance well and provided satisfactory prediction results based on observed past information, although a disadvantage of the model was its high dependence on the quality of the training data. Steyer et al. [15] used fuzzy logic and artificial neural networks to examine and analyze on-line the problems that arose when processing 120 L grape wine wastewater in an anaerobic digestion fluidized bed reactor. The primary data included pH, temperature, backflow quantity and influent flow. Using fuzzy logic, the characteristic vector was assigned to predefined categories. The process condition was then classified by the artificial neural networks. Chen and Chang [16] integrated fuzzy systems and neural networks in modeling the complex process of aeration in a submerged biofilm wastewater treatment process.

As a multivariate statistical data analysis technique, principal component analysis (PCA) has become increasingly popular for online process monitoring over the last few decades [17]. The purpose of PCA is to identify linear correlations between correlated variables, aiming at data dimensionality reduction. The method generates a new set of variables, called principal components. Each principal component is a linear combination of the original variables. All the principal components are orthogonal to each other, so there is no redundant information. The principal components as a whole form an orthogonal basis for the space of the data [18]. In recent years, the PCA approach has been successfully applied to wastewater treatment process monitoring and control [19–22].

Taking the wastewater treatment project of DongGuan Papermaking as a case study, the main objective of this work was to evaluate ANFIS modeling as a valid input–output model to predict the treatment performance of a paper mill wastewater treatment plant. In order to improve the network performance, fuzzy subtractive clustering and principal component analysis (PCA) were used to identify the model's architecture and optimize the fuzzy rules. For comparison, an ANN was also employed to predict the effluent in this study.

2. Materials and methods

2.1. Wastewater treatment plant

A paper mill wastewater treatment plant (Fig. 1), located in Dongguan city, Guangdong province, was used as a demonstration site for assessing the application of this hybrid fuzzy controller. In this WWTP, the treatment processes comprised a bar rack, an equalization tank, a primary settling tank, a high-efficiency reactor (researched and developed by South China University of Technology), hydrolysis and anaerobic tanks, an aerated submerged biofilm and a secondary settling tank. The flow rate was 30,000 cubic meters per day (CMD). The annual wastewater discharge from the mill was 1.812 × 10^7 tons, including 23,696.56 tons of COD, 3695.48 tons of BOD and 6.04 tons of volatile phenol.

The monitoring and control system is based on probes from HACH® and cards and interfaces from Advantech®. The plant is equipped with DO-temperature (D53) and pH (DRD1P5) probes, and COD (CODmax) and NH4+ (Amtax compact) on-line monitoring instruments. The signals, filtered in a transmitter, are captured by a data acquisition card (ADAM4017, Advantech, China). The control is conducted using a power relay output board (ADAM4024,
Advantech, China), which allowed optimal equipment functioning. The software consisted of user-friendly interfaces and was able to repeat over time a previously defined operation cycle by controlling pumps, mixing device and air supply. All chemical analytical methods used in this study followed standard methods.

2.2. Adaptive neuro-fuzzy inference system (ANFIS)

ANFIS is a multilayer feed-forward network which uses neural network learning algorithms and fuzzy reasoning to map inputs into an output. It is a fuzzy inference system (FIS) implemented in the framework of adaptive neural networks. Fig. 2 shows the architecture of a typical ANFIS with two inputs, four rules and one output for the first-order Sugeno fuzzy model, where each input is assumed to have two associated membership functions (MFs).

Fig. 2. ANFIS structure for a two-input Sugeno model with four rules.

For a first-order Sugeno fuzzy model [23], a typical rule set with four fuzzy if–then rules can be expressed as

Rule 1: If x is A1 and y is B1 then f11 = p11 x + q11 y + r11
Rule 2: If x is A1 and y is B2 then f12 = p12 x + q12 y + r12
Rule 3: If x is A2 and y is B1 then f21 = p21 x + q21 y + r21
Rule 4: If x is A2 and y is B2 then f22 = p22 x + q22 y + r22

where A1, A2, B1 and B2 are the MFs for the inputs x and y, respectively, and pij, qij and rij (i, j = 1, 2) are consequent parameters [10,24].

As can be seen from Fig. 2, the architecture of a typical ANFIS consists of five layers, which perform different actions in the ANFIS and are detailed below.

Layer 1: All the nodes in this layer are adaptive nodes. They generate membership grades of the inputs. The outputs of this layer are given by

$$O^1_{A_i} = \mu_{A_i}(x), \quad i = 1, 2; \qquad O^1_{B_j} = \mu_{B_j}(y), \quad j = 1, 2 \tag{1}$$

where x and y are crisp inputs, and Ai and Bj are fuzzy sets such as low, medium and high, characterized by appropriate MFs, which could be triangular, trapezoidal, Gaussian or other shapes. In this study, the generalized bell-shaped MFs defined in Eq. (2) are utilized:

$$\mu_{A_i}(x) = \frac{1}{1 + \left((x - c_i)/a_i\right)^{2 b_i}}, \quad i = 1, 2; \qquad \mu_{B_j}(y) = \frac{1}{1 + \left((y - c_j)/a_j\right)^{2 b_j}}, \quad j = 1, 2 \tag{2}$$

Layer 2: The nodes in this layer are fixed nodes that multiply the incoming membership grades. The outputs of this layer are

$$O^2_{ij} = w_{ij} = \mu_{A_i}(x)\,\mu_{B_j}(y), \quad i, j = 1, 2 \tag{3}$$

which represents the firing strength of each rule. The firing strength means the degree to which the antecedent part of the rule is satisfied.

Layer 3: The nodes in this layer are also fixed nodes, labeled N, indicating that they play a normalization role in the network. The outputs of this layer can be represented as

$$O^3_{ij} = \bar{w}_{ij} = \frac{w_{ij}}{w_{11} + w_{12} + w_{21} + w_{22}}, \quad i, j = 1, 2 \tag{4}$$

which are called normalized firing strengths.

Layer 4: Each node in this layer is an adaptive node, whose output is simply the product of the normalized firing strength and a first-order polynomial (for a first-order Sugeno model). Thus, the outputs of this layer are given by Eq. (5):

$$O^4_{ij} = \bar{w}_{ij} f_{ij} = \bar{w}_{ij}\,(p_{ij} x + q_{ij} y + r_{ij}), \quad i, j = 1, 2 \tag{5}$$

Parameters in this layer are referred to as consequent parameters.

Layer 5: The single node in this layer is a fixed node labeled Σ, which computes the overall output as the summation of all incoming signals, i.e.,

$$z = O^5_1 = \sum_{i=1}^{2}\sum_{j=1}^{2} \bar{w}_{ij} f_{ij} = \sum_{i=1}^{2}\sum_{j=1}^{2} \bar{w}_{ij}\,(p_{ij} x + q_{ij} y + r_{ij}) = \sum_{i=1}^{2}\sum_{j=1}^{2} \left[(\bar{w}_{ij} x)\,p_{ij} + (\bar{w}_{ij} y)\,q_{ij} + \bar{w}_{ij}\,r_{ij}\right] \tag{6}$$

which is a linear combination of the consequent parameters when the values of the premise parameters are fixed. It can be observed that the ANFIS architecture has two adaptive layers: Layers 1 and 4. Layer 1 has modifiable parameters {ai, bi, ci} and {aj, bj, cj} related to the input MFs. Layer 4 has modifiable parameters {pij, qij, rij} pertaining to the first-order polynomial. The task of the learning algorithm for this ANFIS architecture is to tune all the modifiable parameters to make the ANFIS output match the training data. Learning or adjusting these modifiable parameters is a two-step process, which is known as the hybrid learning algorithm. In the forward pass of the hybrid learning algorithm, the premise parameters are held fixed, node outputs go forward until Layer 4 and the consequent parameters are identified by the least squares method. In the backward pass, the consequent parameters are held fixed, the error signals propagate backward and the premise parameters are updated by the gradient descent method. The detailed algorithm and mathematical background of the hybrid learning algorithm can be found in Jang [10].

2.3. The index

Mean square error (MSE), root mean square normalized error (RMSE), mean absolute percentage error (MAPE) and correlation coefficient (R) are used as performance indices to evaluate the prediction capability of ANFIS and ANN trained by each data set.

The MSE performance index was defined as

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2 \tag{7}$$
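The five-layer computation of Section 2.2 and the least-squares step of the hybrid learning algorithm can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameter layout and the example values are hypothetical, the firing strength is assumed to be the product of the two membership grades, and the absolute value inside the bell function is an added safeguard so that non-integer b values stay real-valued.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell MF in the spirit of Eq. (2); abs() is an added safeguard."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

def anfis_forward(x, y, premise_x, premise_y, consequent):
    """Forward pass of a two-input, four-rule first-order Sugeno ANFIS.

    premise_x, premise_y: two (a, b, c) tuples each, for A1, A2 and B1, B2.
    consequent: array of shape (2, 2, 3) holding (p_ij, q_ij, r_ij).
    Returns the crisp output z and the normalized firing strengths.
    """
    # Layer 1: membership grades of the crisp inputs
    mu_a = np.array([gbell(x, *p) for p in premise_x])
    mu_b = np.array([gbell(y, *p) for p in premise_y])
    # Layer 2: firing strengths w_ij (assumed product of the grades)
    w = np.outer(mu_a, mu_b)
    # Layer 3: normalized firing strengths, Eq. (4)
    w_bar = w / w.sum()
    # Layer 4: first-order polynomial of each rule, Eq. (5)
    f = consequent[..., 0] * x + consequent[..., 1] * y + consequent[..., 2]
    # Layer 5: overall output as the weighted sum, Eq. (6)
    return float((w_bar * f).sum()), w_bar

def fit_consequents(X, Y, targets, premise_x, premise_y):
    """Least-squares estimate of (p_ij, q_ij, r_ij) with the premise parameters fixed."""
    rows = []
    dummy = np.zeros((2, 2, 3))
    for x, y in zip(X, Y):
        _, w_bar = anfis_forward(x, y, premise_x, premise_y, dummy)
        # Eq. (6): z is linear in the consequents with regressors
        # (w_bar*x, w_bar*y, w_bar) for each of the four rules
        rows.append(np.stack([w_bar * x, w_bar * y, w_bar], axis=-1).ravel())
    theta, *_ = np.linalg.lstsq(np.array(rows), np.asarray(targets, dtype=float), rcond=None)
    return theta.reshape(2, 2, 3)

if __name__ == "__main__":
    # Hypothetical premise parameters and synthetic data, for illustration only
    premise_x = [(1.0, 2.0, 0.0), (1.0, 2.0, 1.0)]
    premise_y = [(1.0, 2.0, 0.0), (1.0, 2.0, 1.0)]
    rng = np.random.default_rng(0)
    X, Y = rng.uniform(0, 1, 50), rng.uniform(0, 1, 50)
    true_c = rng.normal(size=(2, 2, 3))
    targets = [anfis_forward(x, y, premise_x, premise_y, true_c)[0] for x, y in zip(X, Y)]
    fitted = fit_consequents(X, Y, targets, premise_x, premise_y)
    print(anfis_forward(0.3, 0.7, premise_x, premise_y, fitted)[0])
```

Only the forward (least-squares) half of the hybrid algorithm is sketched; the backward gradient-descent update of the premise parameters is omitted for brevity.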
whereas the validation error of the normal batch operations was comparable to the training error.

Mean absolute percentage error (MAPE):

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n} \left|\frac{A_t - F_t}{A_t}\right| \times 100 \tag{9}$$

The smaller RMSE and MAPE mean better performance.

Correlation coefficient (R):

$$R = \frac{\sum_{t=1}^{n} (A_t - \bar{A})(F_t - \bar{F})}{\sqrt{\sum_{t=1}^{n} (A_t - \bar{A})^2 \cdot \sum_{t=1}^{n} (F_t - \bar{F})^2}} \tag{10}$$

where A_t and F_t are the observed and predicted values, and $\bar{A} = \frac{1}{n}\sum_{t=1}^{n} A_t$ and $\bar{F} = \frac{1}{n}\sum_{t=1}^{n} F_t$ are the average values of A_t and F_t over the training or testing dataset.

3. Model architecture and model components

The schematic architecture of the neural fuzzy model is depicted in Fig. 3. It consists of five key components: inputs and outputs, a database and preprocessor, a fuzzy system generator, a fuzzy inference system, and an adaptive neural network representing the fuzzy system. The fuzzy inference system and its associated adaptive network are a Sugeno fuzzy inference system and an adaptive network-based fuzzy inference system (ANFIS) [10]. Input and output variables are selected or generated from the variables commonly used for system description. A database that contains system performance information is a prerequisite for model development. Generally, it is developed by collecting regularly monitored parameters. The quality of the training database is critical for the model to produce correct information about the system. In order for the model to describe the system accurately, the database should con-

4. Results and discussion

4.1. Data collection and preprocessing

The data from 15 March 2006 to 21 December 2006 were obtained from the plant and used to develop two separate ANFIS models, as shown in Fig. 4. Daily composite samples were collected from the treatment plant and analyzed every two to three days, giving 150 samples in total. Of these, 120 were used for training and 30 for testing (predicting). There was also a validation data set of 20 samples (see Fig. 4). In China, the effluent regulation limits for SSeff and CODeff were 30 mg/L and 100 mg/L, respectively. The effluent quality from this WWTP met the Effluent Standard of China.

Fig. 4. Variations of wastewater parameters measured for the ANFIS and ANN models.

The main objective of the data preprocessing is to determine the suitable locations for the data acquisition required for the modeling activities. This is a standard procedure for network data preparation. The main objective here is to ensure that the statistical distribution of the values for the net input and output is roughly uniform. The data sets are often scaled so that they always fall within a specified range, or they are normalized so that they have zero mean and unit variance. These data were normalized by

$$S(i) = \frac{s(i) - \min(s)}{\max(s) - \min(s)} \tag{11}$$

4.2. Analysis of historical process data

PCA was performed to clarify and evaluate the relationships among model variables. The percentage of process variance explained as a function of the number of principal components is shown in Fig. 5(a) and Tables 1 and 2. As can be noted from this figure, six PCs were extracted from the PCA. The transformation matrix of PC was defined as: pcs = [pcs1 pcs2 . . . pcs6] X. So the
Table 3
Determination of the appropriate ANFIS and ANN models.
ANFIS ANN
Fig. 6. Relation figure of five primary GUIs and fuzzy inference systems.
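The min-max normalization of Eq. (11) and the PCA step of Section 4.2 can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the data are synthetic rather than the plant measurements, and PCA is assumed to be computed from a singular value decomposition of the mean-centered, normalized data, which the paper does not specify.

```python
import numpy as np

def min_max_normalize(s):
    """Column-wise scaling to [0, 1] as in Eq. (11).

    Assumes no column is constant (otherwise max - min would be zero).
    """
    s = np.asarray(s, dtype=float)
    return (s - s.min(axis=0)) / (s.max(axis=0) - s.min(axis=0))

def pca_explained_variance(X):
    """Return the principal directions (rows) and the variance fraction of each PC."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of vt are orthonormal loading vectors
    _, sing, vt = np.linalg.svd(Xc, full_matrices=False)
    var = sing ** 2 / (X.shape[0] - 1)
    return vt, var / var.sum()

def project(X, components, k):
    """Scores of the observations on the first k principal components."""
    return (X - X.mean(axis=0)) @ components[:k].T

if __name__ == "__main__":
    # Synthetic stand-in for 150 samples of six input variables
    rng = np.random.default_rng(0)
    raw = rng.normal(size=(150, 6))
    S = min_max_normalize(raw)
    comps, ratio = pca_explained_variance(S)
    print(np.cumsum(ratio))          # cumulative fraction of variance explained
    print(project(S, comps, 4).shape)
```

The cumulative variance fractions are the kind of quantity that a plot such as Fig. 5(a) would display; how many components or variables to retain remains a modeling decision.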
After the model was trained, the inference was performed in accordance with 16 fuzzy linguistic rules and 15 fuzzy linguistic rules for ANFIS COD and ANFIS SS, respectively (see Fig. 6(b)). Those rules were obtained after the network was trained. Some other rules were also included heuristically by comparing output values with input values. In addition, defuzzified results and graphical outputs can be derived. Fig. 7(a)–(c) illustrates an example of the Surface Viewer screen obtained from the Fuzzy Logic Toolbox. Two- or three-dimensional graphic results of variables can be plotted and compared. Fig. 6(c) shows the results of applied rules and their corresponding outputs according to the mass center of variables. Using the interface, defuzzified values for output variables can be derived by changing input values manually. Different output values can be obtained from the Rule Viewer according to the given input values. Obtaining defuzzified output values for all the real input values through the interface is not flexible. For that reason, a program was written in Matlab to derive defuzzified output results in accordance with real input values.

4.4.1. Simulation of CODeff

All MAPE, R, MSE, and RMSE values for CODeff are also shown in Table 4. When training, the MAPE between the predicted and observed values of CODeff was 1.003% using ANFIS, but it was 1.7934% using ANN. When validating, the MAPE was 2.8145% using ANFIS, but it was 8.0072% using ANN. When predicting, the MAPE was 3.2812% using ANFIS, but it was 5.7356% using ANN. When training, the R value was 0.9912 using ANFIS, but it was 0.9758 using ANN. When validating, the R value was 0.9459 using ANFIS, but it was 0.5371 using ANN. When predicting, the R value was 0.9093 using ANFIS, but it was 0.7719 using ANN. MSE and RMSE values also showed that the predicting performance of ANFIS prevailed. When training, the MSE value of 1.2883 using ANFIS was lower than that of 3.5151 using ANN. When validating, the MSE value of 4.8231 using ANFIS was lower than that of 57.053 using ANN. When predicting, the MSE value of 8.4034 using ANFIS was also lower than that of 31.639 using ANN.
Table 4
Predicting performance using ANFIS and ANN.

CODeff
MAPE (%)   Train      1.003    0.026     1.7934   1.2016
           Validate   2.8145   9.3703    8.0072   7.9522
           Predict    3.2812   10.04     5.7356   11.877
R          Train      0.9912   0.9999    0.9758   0.9889
           Validate   0.9459   0.4517    0.5371   0.5074
           Predict    0.9093   0.4487    0.7719   0.3809
MSE        Train      1.2883   0.0012    3.5151   1.6196
           Validate   4.8231   64.411    57.053   53.475
           Predict    8.4034   71.929    31.639   112.81
RMSE       Train      1.135    0.0347    1.8749   1.2726
           Validate   2.1962   8.0256    7.5534   7.3127
           Predict    2.8989   8.4811    5.6249   10.621

SSeff
MAPE (%)   Train      0.5161   0.0135    1.2616   0.6063
           Validate   1.0458   2.6906    2.2147   4.0616
           Predict    1.9726   4.6121    3.6844   5.2797
R          Train      0.9882   0.9999    0.9413   0.9882
           Validate   0.9292   0.5010    0.6669   0.3661
           Predict    0.9023   0.5951    0.7725   0.5315
MSE        Train      0.0342   2.69e−5   0.1655   0.0338
           Validate   0.0927   0.6221    0.3812   1.4269
           Predict    0.3207   1.4369    0.9878   2.1139
RMSE       Train      0.1849   0.0052    0.4608   0.1841
           Validate   0.3045   0.7887    0.6174   1.1945
           Predict    0.5663   1.1987    0.9939   1.4539
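The indices reported in Table 4 can be computed along the following lines. This is a hedged sketch rather than the authors' code: the function name and the example numbers are hypothetical, and RMSE is assumed to be the square root of MSE, which is consistent with the tabulated values (for example, 1.135 squared is approximately 1.2883).

```python
import numpy as np

def performance_indices(observed, predicted):
    """MSE, RMSE, MAPE (%) and correlation coefficient R for paired series."""
    a = np.asarray(observed, dtype=float)   # A_t, observed values
    f = np.asarray(predicted, dtype=float)  # F_t, predicted values
    mse = np.mean((f - a) ** 2)                      # Eq. (7)
    rmse = np.sqrt(mse)                              # assumed sqrt of MSE
    mape = np.mean(np.abs((a - f) / a)) * 100.0      # Eq. (9); needs a != 0
    da, df = a - a.mean(), f - f.mean()
    r = np.sum(da * df) / np.sqrt(np.sum(da ** 2) * np.sum(df ** 2))  # Eq. (10)
    return {"MSE": mse, "RMSE": rmse, "MAPE": mape, "R": r}

if __name__ == "__main__":
    # Hypothetical effluent COD values in mg/L, for illustration only
    obs = [100.0, 200.0, 400.0]
    pred = [110.0, 190.0, 400.0]
    for name, value in performance_indices(obs, pred).items():
        print(f"{name}: {value:.4f}")
```

Because MAPE divides by the observed value, it is undefined for zero observations; that is not an issue for effluent concentrations that stay strictly positive.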
When training and validating, the RMSE values of 1.135 and 2.1962 using ANFIS were lower than those of 1.8749 and 7.5534 using ANN. When predicting, the RMSE value of 2.8989 using ANFIS was also lower than that of 5.6249 using ANN. Fig. 8(a) and (b) shows the training and predicting results using ANFIS and ANN, respectively.

For SSeff (Table 4), when predicting, the MAPE was 1.9726% using ANFIS, but it was 3.6844% using ANN. When training, the R value was 0.9882 using ANFIS, but it was 0.9413 using ANN. When validating, the R value was 0.9292 using ANFIS, but it was 0.6669 using ANN. When predicting, the R value was 0.9023 using ANFIS, but it was 0.7725 using ANN. MSE and RMSE values also showed that the predicting performance of ANFIS prevailed. When training, the MSE value of 0.0342 using ANFIS was lower than that of 0.1655 using ANN. When validating, the MSE value of 0.0927 using ANFIS was lower than that of 0.3812 using ANN. When predicting, the MSE value of 0.3207 using ANFIS was lower than that of 0.9878 using ANN.
Acknowledgements
[16] J.C. Chen, N.B. Chang, Mining the fuzzy control rules of aeration in a submerged biofilm wastewater treatment process, Engineering Applications of Artificial Intelligence 20 (2007) 959–969.
[17] J. Lennox, C. Rosen, Adaptive multiscale principal components analysis for online monitoring of wastewater treatment, Water Science and Technology 45 (2002) 227–235.
[18] I.T. Jolliffe, Principal Component Analysis, 2nd ed., Springer, New York, 2002.
[19] G. Civelekoglu, A. Perendeci, N.O. Yigit, M. Kitis, Modeling carbon and nitrogen removal in an industrial wastewater treatment plant using an adaptive network-based fuzzy inference system, Clean 35 (2007) 617–625.
[20] C. Rosen, M. Larsson, U. Jeppsson, Z. Yuan, A framework for extreme-event control in wastewater treatment, Water Science and Technology 45 (2002) 299–308.
[21] D.J. Choi, H. Park, A hybrid artificial neural network as a software sensor for optimal control of a wastewater treatment process, Water Research 35 (2001) 3959–3967.
[22] C.K. Yoo, P.A. Vanrolleghem, I.B. Lee, Nonlinear modeling and adaptive monitoring with fuzzy and multivariate statistical methods in biological wastewater treatment plants, Journal of Biotechnology 105 (2003) 135–163.
[23] T. Takagi, M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Transactions on Systems Man and Cybernetics 15 (1985) 116–132.
[24] Y.M. Wang, T.M.S. Elhag, An adaptive neuro-fuzzy inference system for bridge risk assessment, Expert Systems with Applications 34 (2008) 3099–3106.
[25] J.H. Tay, X. Zhang, Neural fuzzy modeling of anaerobic biological wastewater treatment systems, Journal of Environmental Engineering – ASCE 125 (1999) 1149–1159.