Short-Term Air Quality Prediction Using A Case-Based Classifier
Received 16 March 2000; received in revised form 30 July 2000; accepted 28 September 2000
Abstract
In the frame of air quality monitoring of urban areas, the short-term prediction of key-pollutant concentrations is a daily
activity of major importance. Automation of this process is desirable, but the development of reliable predictive models with good
performance to support this task on an operational basis presents many difficulties. In this paper we present and discuss the NEMO
prototype, which has been built to support short-term prediction of maximum NO2 concentration levels in Athens, Greece.
NEMO is based on a case-based reasoning approach combining heuristic and statistical techniques. The process of development of
the system, its architecture and its performance are described in this paper. NEMO performance is compared with that of a
back-propagation neural network and a decision tree. The overall performance of NEMO makes it a good candidate to support air
pollution experts in operational conditions. © 2001 Elsevier Science Ltd. All rights reserved.
Keywords: Short-term NO2 concentration prediction; Case-based reasoning (CBR); Urban air quality; Athens; Air monitoring operational data
modelling; Air Quality Management Operational Centre
1364-8152/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved.
PII: S1364-8152(00)00072-4
264 E. Kalapanidas, N. Avouris / Environmental Modelling & Software 16 (2001) 263–272
The problem of predicting the concentration of NOx and other major pollutants, using heuristics and other AI techniques, has been the aim of many researchers in recent years. Interest has intensified since urban air pollution has become an issue in many metropolitan areas of the world, such as Los Angeles, Mexico City and Athens (Breiling and Alcamo, 1992). The tools used so far on an operational basis are based on various mathematical and computational modelling techniques (Bartzis, 1995), which, however, usually require complex input data and considerable computational resources, so they are not suitable for the fast prediction task. On the other hand, statistical forecasting techniques (e.g. regression models or neural networks) as well as expert systems (Simon et al., 1995; Avouris, 1995) have been proposed as fast prediction tools. In particular, several research papers have been published discussing the role that neural networks and AI techniques can play in predicting photochemical pollution: Lee (1995) tried predicting atmospheric ozone levels using neural networks. This has also been the aim of Perantonis et al. (1994) on an experimental basis. Both papers reported good results. A similar approach was that of Ruiz-Suarez, applied to Mexico City atmospheric data (Ruiz-Suarez et al., 1994, 1995), and that of Yi and Prybutok (1996) over an industrialised urban area, who also conducted research on short-term ozone prediction. Mlakar and Boznar (1994) and Boznar and Mlakar (1995) have presented a study on the use of neural networks for short-term air pollution prediction, while Neagu et al. (2000) proposed a hybrid fuzzy neural network combined with an explicit domain model for the same problem. Related work also exists on forecasting the daily maximum temperature using machine learning approaches (Abdel-Aal and Elhadidy, 1996). However, these approaches are hard-wired to the air pollution conditions of model building time, since they do not contain automatic mechanisms for adjusting the developed computational models to the long-term changes which characterise air pollution of a typical urban area, so it is expected that their performance will deteriorate in the long run.

The approach reported in this paper is that of a case-based reasoning (CBR) system, which uses examples of previous similar incidents in order to classify the expected levels of maximum concentration of the current day. A characteristic of such a system is that it is constantly fed with new cases, so it is expected to adapt its behaviour to the long-term changes of air pollution.

For development of the system a long data modelling process was necessary, followed by testing using data sets from the AQOC. The originality of the experiments described is mainly attributed to the fact that the data sets contain commonly available data coming from monitoring networks, and therefore the conducted experiments aim at testing use of the techniques in operational conditions.

In the next section we discuss data modelling and data transformation issues, which are a key preliminary phase for building any CBR system.

2. Operational air pollution data modelling

At the AQOC, two main predictive tasks are undertaken daily. These relate to prediction of daily peak concentrations of NO2 and O3 for the same day (12 h forecast) and the next day (24 h forecast). Until now, human experts have exclusively performed these predictive tasks. The predictive decisions were based upon the most recent meteorological data, as well as upon other relevant information, such as traffic in the area and the behaviour of emission sources (the latter constituting a so-called social factor).

As for NO2, the AQOC authorities have defined four levels of pollution, based on the maximum mean hourly measurement of at least one monitoring station in the area, shown in Table 1.

Table 1
Pollution levels based on NO2 concentrations for the Athens area

Level 1 (low)       0–200 µg/m3
Level 2 (medium)    200–350 µg/m3
Level 3 (high)      350–500 µg/m3
Level 4 (alarm)     Over 500 µg/m3

In the frame of the reported research, a prototype has been developed to predict these indices, based exclusively on the data currently available daily in the AQOC. The Air Quality Monitoring Authority for the area has been our main source of data. This public organisation maintains an air-quality monitoring network consisting of 12 monitoring stations in the Athens basin, evenly distributed throughout the area. Measurements recorded by this network are the hourly mean values of NO, NO2, CO, SO2 and O3 concentrations, wind speed, wind direction, temperature and humidity. Additional data have come from the National Meteorological Service (NMS), which provides the area meteorological forecast, plus information related to conditions in the upper atmosphere, temperature inversion information etc.

We extracted data concerning future wind speed and direction, temperature and temperature inversion below 150 m over the surface from NMS prediction bulletins. To these data, the recorded precipitation and solar radiation levels, gathered by the National Observatory of Athens, have been added. These correspond to data that at run time are inserted as observations of the experts related to general meteorological conditions in the area.

These data were inserted into a pollution data warehouse, covering a 2-year time period. Several problems regarding the validity and the integrity of these data still existed, though. In order to make a useful data set from
this database, pollution experts were contacted in order to build adequate abstractions of the raw monitoring data. Using this approach, incomplete or noisy data are handled as well. The objective of this stage was to produce a final working data set, which had to drive the prediction algorithms through the training and testing phases, giving the best possible results. In the final data model only the features that were more relevant to NO2 pollution levels were included, as described in the following.

The features finally included in the data model were the most recent NO and NO2 mean hourly concentration measurements prior to prediction time (morning measurements between 7 a.m. and 10 a.m.), and a set of meteorological attributes. The latter were the wind factor (a function of the wind speed and wind direction), the temperature inversion factor, the precipitation factor and the solar radiation factor. These factors were computed according to heuristic functions that express a measure of how “favourably” each feature contributed to the NO2 episode evolution. The heuristics used for defining these factors were based on experts’ advice and were stored in the Knowledge Base of our system. Each factor definition was also backed by a statistical analysis, the results of which confirmed or adjusted the field experts’ advice. The defined factors can often be complementary in nature. This is because measurements of the same feature, coming from various sources, are often used in operational conditions. For instance, meteorological data can come from Forecast Bulletins, experts’ observations, the pollution monitoring network etc. For this reason, in the following discussion reference is made to the data origin, where this is the case.

The definitions of the main abstractions contained in the pollution data model are described next.

2.1. The wind factor

There are two sources of wind-related data: the field measurements made by the Pollution Monitoring Network, and the predictions made by the NMS. The wind factor coming from the actual measurements is a synthesis of wind speed and wind direction. Two different heuristic functions have been defined, shown in Fig. 1, relating the wind speed and direction to the wind factor (wf). As shown in Fig. 1, the first curve concerns winds of eastern, northern and north-eastern directions, and the other concerns winds of southern, western and south-western directions. The reason for this discrimination is that the latter category of winds is more favourable towards a NO2 episode. Those winds are most often associated with the transportation of hot air masses from northern Africa at high altitude and thus can lead to temperature inversion and, correspondingly, to higher pollution.

The wind factor coming from the NMS meteorological forecast is also classified according to wind direction. The wind forecast provided by NMS contains qualitative values concerning direction and speed of winds, as they evolve in the time window of the corresponding weather bulletin. The corresponding wind factor is deduced according to Table 2.

2.2. The temperature inversion factor

The temperature inversion plays a favourable role in the evolution of air pollution episodes. This phenomenon causes high concentrations of photochemical pollutants, especially when no wind blows at the same time. The lower the temperature inversion height, the more favourable the conditions are for an episode to occur. The temperature inversion factor is a function of the temperature difference and the inversion height, following the definition contained in Table 3.

2.3. Rain factor

Rain and other phenomena are described according to a heuristic rain factor. There are two sources of data on which this factor can be based: the NMS forecast bulletins and the National Observatory measurements. Both produce a binary-valued rain factor.

As for the latter, two precipitation measurements were available, concerning the duration and the height of the precipitation. The rule according to which the rain factor has been computed is the following:

If (rain height / rain duration) > 0.5 mm/h then (rain factor) = 0, else (rain factor) = 1,

where 0 stands for rain and 1 stands for no rain. The factor produced out of the NMS forecast data has been computed according to Table 4.

The above factors were combined with other pollution measurements into a Pollution Operational Data Model, the schema of which is shown here.

1) Date
2) Code of measurement station
3) NO hourly concentration at 8 a.m. (measurement)
4) NO hourly concentration at 9 a.m. (measurement)
5) NO hourly concentration at 10 a.m. (measurement)
6) NO maximum hourly concentration after 10 a.m.
7) Hour that the NO maximum hourly concentration occurred
8) NO2 hourly concentration at 8 a.m. (measurement)
9) NO2 hourly concentration at 9 a.m. (measurement)
10) NO2 hourly concentration at 10 a.m. (measurement)
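As an illustration only, the schema items listed above (1 to 10), the National Observatory rain-factor rule of Section 2.3 and the levels of Table 1 could be represented as in the following minimal Python sketch. All field and function names are hypothetical and are not taken from NEMO; the handling of a zero rain duration, the treatment of the Table 1 boundary values (upper bounds taken as inclusive) and the use of the largest of the three morning readings as the morning NO2 maximum are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PollutionRecord:
    """Schema items 1-10 of the data model (field names are hypothetical)."""
    day: date
    station_code: str
    no_8am: float
    no_9am: float
    no_10am: float
    no_max_after_10am: float
    no_max_hour: int
    no2_8am: float
    no2_9am: float
    no2_10am: float

    def morning_no2_max(self) -> float:
        # Assumption: the morning NO2 maximum is the largest of the
        # three morning readings stored in the record.
        return max(self.no2_8am, self.no2_9am, self.no2_10am)


def rain_factor(rain_height_mm: float, rain_duration_h: float) -> int:
    """Binary rain factor from measurements: 0 = rain, 1 = no rain."""
    if rain_duration_h <= 0:
        # Assumption: no recorded duration is treated as "no rain".
        return 1
    return 0 if rain_height_mm / rain_duration_h > 0.5 else 1


def no2_level(concentration_ug_m3: float) -> int:
    """Map a maximum hourly NO2 concentration to the levels of Table 1.

    Assumption: each upper bound (200, 350, 500) belongs to the lower level.
    """
    if concentration_ug_m3 <= 200:
        return 1
    if concentration_ug_m3 <= 350:
        return 2
    if concentration_ug_m3 <= 500:
        return 3
    return 4
```

For example, a rain height of 2 mm over 2 h gives an intensity of 1 mm/h and therefore a rain factor of 0 (rain), while a concentration of 420 µg/m3 maps to level 3.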
systematically recorded repository of past cases of the problem to be solved. When the system is presented with a new problem, it retrieves the most similar past cases from the database, which are subsequently adapted to the present conditions in order to provide the new solution (Aamodt and Plaza, 1994; Kolodner, 1993; Watson and Marir, 1994). NEMO development has followed this principle and was based on our past experience of a previous, more research-oriented prototype, AIRQUAP (Lekkas et al., 1994), developed for the same application domain. Fig. 2 shows a block diagram of NEMO. All metrics (modification heuristics, similarity metrics) have been replaced by the previously described “knowledge base”. Due to the nature of the problem, the “test solution” phase is replaced with measurement of success of the prediction when compared with real measurements. At the top level, NEMO consists of three main modules. Two of them deal with retrieval and filtering of cases similar to the new case, while the third one adapts the solutions proposed by the remaining similar cases to form the proposed solution for the new case. Since the attributes of each case are numerical, the indexes are of a statistical nature rather than based on a symbolic index, like a vocabulary of words as in other CBR systems. Therefore the retrieval strategy is based on aggregate matching among the cases rather than dimensional matching. According to the latter, the attributes of each case are matched one by one, while the former implies the computation of a numerical evaluation function that combines the degree of match along each dimension with a value representing the importance of the dimension (Kolodner, 1993). A typical flat memory was used to store the system cases, while the search method implemented was a serial search on the case indexes. Due to the lack of internal case explanation, human intervention is requested during the presentation of the similar cases, to discard irrelevant cases that the system has retrieved. While this possibility is foreseen in the algorithm, it has not been used in the experiments described here, in order to concentrate our tests on measuring the algorithm performance without any user intervention.

In order to narrow the search space, some heuristic knowledge has been included in the form of rules. Examples of such rules are: “when strong winds are forecast for the day under prediction, it is known that the main mass of the photochemical products such as the NO2 and O3 would be transported out of the area”. Also, “when there is a coincidence of high solar radiation and high ambient temperature, there is a significant possibility of NO2 episode development”.

NEMO in its present version can predict the maximum NO2 concentration of a previously specified measuring station area. When searching for past cases, the case base consists of the filtered part of the whole database that refers to the specified station.

For each pair of a new case and a past case, a vector made up from the attribute differences is produced. From the differences that are related to the meteorological attributes, the weather index is computed, while from the pollutant emissions related differences, the respective pollution index is computed, as seen in Fig. 3.

A past case is considered relevant if the sum of these two major indexes is greater than a pre-defined threshold value. Nevertheless the user, who should be a pollution expert, has the power to alter the list of similar cases proposed by the system, removing a case from the list of similar cases found, or adding a new one that is considered to be relevant in predicting the maximum pollutant concentration. Fig. 4 presents a part of the system’s user interface available during the inspection of a new case.

Adaptation of a similar case is based on parameter adjustment, where changes in parameters in an old solution are made in response to differences between problem specifications in an old and a new case. This approach is similar to the one used in PERSUADER (Sycara, 1988). The adaptation in NEMO is handled by a statistically driven formula, which exploits the fact that the daily NO2 peak is usually an evolution in time of the morning (until 10 a.m.) NO2 maximum.

Fig. 2. CBR System (NEMO) block diagram.

Fig. 3. NEMO: temporal relation of the prediction factors involved.
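The retrieval and filtering step described above (attribute differences, a weather index and a pollution index, and a threshold on their sum, applied in a serial search over a flat case memory) can be sketched as follows. This is a minimal sketch under stated assumptions, not the NEMO implementation: the text does not give the functions that turn attribute differences into indexes, so the `closeness_index` function below (mean of one minus the range-normalised absolute difference) is purely hypothetical, as are all names, attribute ranges and the threshold value.

```python
from typing import Dict, List, Sequence, Tuple

Case = Dict[str, float]


def closeness_index(new: Case, past: Case,
                    attributes: Sequence[str],
                    ranges: Dict[str, float]) -> float:
    """Hypothetical index in [0, 1]: 1.0 means identical attribute values.

    Assumption: each attribute difference is normalised by an expected
    value range and the per-attribute scores are averaged.
    """
    scores = []
    for name in attributes:
        diff = abs(new[name] - past[name])
        scores.append(1.0 - min(diff / ranges[name], 1.0))
    return sum(scores) / len(scores)


def retrieve_similar(new_case: Case,
                     case_base: Sequence[Case],
                     weather_attrs: Sequence[str],
                     pollution_attrs: Sequence[str],
                     ranges: Dict[str, float],
                     threshold: float) -> List[Tuple[Case, float, float]]:
    """Serial search: keep past cases whose index sum exceeds the threshold.

    Returns (case, pollution index dp, weather index dw) triples.
    """
    selected = []
    for past in case_base:
        dw = closeness_index(new_case, past, weather_attrs, ranges)
        dp = closeness_index(new_case, past, pollution_attrs, ranges)
        if dp + dw > threshold:
            selected.append((past, dp, dw))
    return selected
```

A caller would pass only the cases of the specified station as `case_base`, mirroring the filtering by station described in the text; an expert could then remove or add cases in the returned list before adaptation.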
Fig. 4. NEMO: user interface — wind speed and wind direction presented on the area map.
Let the jth case in the case base be written as Xj, where Xj is the vector (x1, x2,…, xl,…, xm, dp, dw, y)j and x1j, x2j,…, xmj are the m input attributes for the jth case, xlj is the attribute that indicates the maximum morning NO2 concentration, and yj is the real value that is to be predicted. If in the process the similar case selector and the filter have retrieved k similar cases for the nth case under prediction, then the value yn is computed as follows:

$$
y_n = x_{ln} + \frac{\displaystyle\sum_{i=1}^{k} (y_i - x_{li}) \cdot \frac{d_p w_p + d_w w_w}{w_p + w_w}}{\displaystyle\sum_{i=1}^{k} \frac{d_p w_p + d_w w_w}{w_p + w_w}} \qquad (1)
$$

where dp is the pollution index between the nth day and the ith of its similar days in the case base, dw is the weather (meteorological) index between those two days, and wp and ww represent the predefined weights for the pollution index and the weather index respectively. The factor

$$
\frac{d_p w_p + d_w w_w}{w_p + w_w}
$$

is used as a closeness measure between the nth and the ith days and behaves as a weighted average, favouring close days over more distant ones. Formula (1) is a typical similarity function of a CBR system, derived after long experimentation with the test data and taking into consideration the physical meaning of the attributes involved.

The result of experimentation reported here is an evolution of the simpler technique described in Lekkas et al. (1994), resulting in better performance of the NEMO system than the original AIRQUAP CBR system.

3.1. Fine-tuning the case-based classifier

The NEMO system was trained over the above-described training set. Since the CBR methodology does not include any learning cycle, the training set was used only for parameter adjustment and general data calibration. This involved consecutive runs of the system, during which the system parameters were fine-tuned, until acceptable predictions were obtained. Then the final system was tested using the testing set. Examples of these parameters are Cmax and Cmin, that is, the maximum and minimum number of past cases retrieved, the attribute weights Wi, the high and low attribute thresholds, the similarity criteria, the adaptation methods used, etc.

The methodology followed involved tuning one parameter at a time. In Table 5 the values of several variables are presented along with their description, as they have been set at the end of the fine-tuning process. This has been a long, tedious process that determines optimal settings for the NEMO system for the given training data set.

4. Classifier performance

First a description of the testing set is provided. The testing set contains 240 cases (30% of the available data set). The distribution of the NO2 concentration levels in this data set is shown in Table 6.

It should be stressed at this point that the success criterion for the system should contain a bias towards accurate prediction of the rarer pollution episodes (level 4) in order for the system to have any operational value. In other words, the predictive accuracy at levels 3 and 4 is more desirable than that at levels 1 and 2. Table 7 shows the NEMO system performance.
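Formula (1) of Section 3 lends itself to a short worked sketch. The Python function below computes the adapted prediction from the morning NO2 maximum of the new case and the retrieved similar cases; the function and parameter names are hypothetical, the default weights of 1.0 are placeholders (the tuned values of Table 5 are not reproduced in this text), and returning `None` when no usable similar case exists is an assumption about how a "no prediction" outcome could be represented.

```python
from typing import Optional, Sequence, Tuple

# One retrieved case: (y_i, x_li, dp_i, dw_i), i.e. the observed daily peak,
# the morning NO2 maximum, and the pollution and weather indexes of that
# case relative to the new case.
SimilarCase = Tuple[float, float, float, float]


def closeness(dp: float, dw: float, wp: float, ww: float) -> float:
    """Closeness factor (dp*wp + dw*ww) / (wp + ww) used in Formula (1)."""
    return (dp * wp + dw * ww) / (wp + ww)


def adapt_prediction(x_ln: float,
                     similar: Sequence[SimilarCase],
                     wp: float = 1.0,
                     ww: float = 1.0) -> Optional[float]:
    """Formula (1): morning maximum plus closeness-weighted mean rise."""
    weights = [closeness(dp, dw, wp, ww) for _, _, dp, dw in similar]
    total = sum(weights)
    if not similar or total == 0:
        # Assumption: without usable similar cases no prediction is made.
        return None
    numerator = sum((y - x_l) * c
                    for (y, x_l, _, _), c in zip(similar, weights))
    return x_ln + numerator / total
```

As a worked example with equal weights, a new case with a morning maximum of 100 and two similar cases (150, 100, 1.0, 1.0) and (130, 110, 0.5, 0.5) has closeness factors 1.0 and 0.5, so the weighted mean rise is (50·1.0 + 20·0.5)/1.5 = 40 and the prediction is 140.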
Table 5
Parameters settings for NEMO

According to the results shown in Table 7, the system was able to make a prediction for 236 cases (98.3%) and made no prediction in four cases (1.7%). It predicted correctly the pollution level in 169 cases (70%). However, relaxing the success criterion by including the adjacent classes as correct predictions, it missed only three cases (level 2 values were predicted as level 4) or 1.2%. Finally according to the level 4 criterion, that is, correct predictions of level 4 values in Table 7 only four

Table 8
Comparative system performance of NEMO, DT and ANN

Criterion                  NEMO           DT             ANN
Overall success            169 (70%)      183 (76.2%)    166 (69.1%)
Relaxed overall success    233 (97.1%)    236 (98.3%)    232 (96.7%)
Level 4 success            4 (50%)a       8 (100%)       4 (50%)

a Four missed cases that NEMO was unable to predict.
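The three success criteria reported in Table 8 (overall success, relaxed overall success with adjacent levels counted as correct, and level 4 success) can be computed from paired actual and predicted levels as in the sketch below. The function name, the dictionary keys and the use of `None` for a case in which no prediction was made are assumptions made for illustration; the sample data in the usage note are invented and are not the paper's results.

```python
from typing import Dict, Optional, Sequence


def score_predictions(actual: Sequence[int],
                      predicted: Sequence[Optional[int]]) -> Dict[str, int]:
    """Count strict, relaxed and level 4 successes for levels 1..4.

    A prediction of None stands for "no prediction made" (assumption).
    """
    strict = relaxed = unpredicted = level4_hits = 0
    for a, p in zip(actual, predicted):
        if p is None:
            unpredicted += 1
            continue
        if p == a:
            strict += 1
            if a == 4:
                level4_hits += 1
        if abs(p - a) <= 1:
            # Relaxed criterion: the exact level or an adjacent one.
            relaxed += 1
    return {
        "strict": strict,
        "relaxed": relaxed,
        "unpredicted": unpredicted,
        "level4_hits": level4_hits,
    }
```

For instance, with actual levels [1, 2, 2, 4, 4, 3] and predictions [1, 1, 4, 4, None, 3], three cases are strictly correct, four are correct under the relaxed criterion, one has no prediction and one level 4 case is hit.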
to have any general validity. However, the CB classifier NEMO presents an important advantage over the other two competing systems: it is designed to maintain this performance over time, and perhaps improve, as the number of stored cases increases. In contrast, the other two techniques present no adaptive behaviour over time, so their performance can be expected to deteriorate as the air pollution conditions change. It should be added that a difficulty in predicting the second level NO2 peaks was observed for all three models. This was due to the fact that the algorithms could not distinguish clearly between a first level case and a second level one, which implies that the causing circumstances and the related attributes were very similar for these two classes.

The comparative tests of the three systems were run on a standard personal computer (Pentium II, 64 MB RAM). It is interesting to compare the computational resources used for training the developed modules and producing the prediction. The most important phase for all three systems is the training and fine-tuning phase. This is an interactive phase, as described in Section 3.1, which in the case of our experiments lasted 2 h for the DT, 1 day for the neural network and 3 days for the NEMO system. On the other hand, the run time performance of the three systems was very similar. All three systems were able to produce their prediction in <1 min. Only in very few cases, when the number of past cases was very high, did NEMO prediction time go up to 2 min. This performance meets the operational requirements of the AQOC and makes the described system a good candidate for fast prediction, which can be issued frequently, for example, on an hourly basis.

5. Dynamic performance

During the development of NEMO, the assumption was made that as more cases are processed by the system the response should improve. This is because, as the Case Base is filled with more past cases, the likelihood increases that the system will find cases with similar characteristics and produce a more accurate prediction. This assumption leads to the expectation of a measurable improvement of performance. In order to test this assumption the time series of the data set was divided into several consecutive periods. If the hypothesis were true, then the mean of squares of the difference between the prediction and the actual NO2 maximum concentration of every period should follow a decreasing slope as the period number increases.

If the pair (xi,j, yi,j) is the jth case of the ith time-period, where xi,j is the prediction that NEMO gave and yi,j is the actual outcome, and time-periods i and i+1 contain mi and mi+1 cases, respectively, then:

$$
\frac{\displaystyle\sum_{j=1}^{m_i} (x_{i,j} - y_{i,j})^2}{m_i} > \frac{\displaystyle\sum_{j=1}^{m_{i+1}} (x_{i+1,j} - y_{i+1,j})^2}{m_{i+1}}
$$

should in general be true.

During three runs of NEMO, each with the same but randomly shuffled data set, three different series of (xi,j, yi,j) pairs have been produced. We partitioned them into 18 time-periods, containing 44 cases each. For every period, the mean of the squared difference (xi,j − yi,j)2 was computed, as well as the mean per level of yi,j, and the error at a 95% confidence interval for the mean. These indices are shown graphically in Fig. 5.

From Fig. 5 one can deduce a decreasing trend in the time evolution of the average error of selected time periods, following a downward slope that is sinusoidal in shape. From this one can conclude that the assumption holds true, especially in the training set comprising the first 14 periods. However, a larger data set is needed for more decisive proof. Although certain periods seem to interrupt the pattern, which can be attributed to abrupt changes in the patterns of meteorology and air pollution indices, as is the case for instance with seasonal changes, the charts show a tendency towards decrease of the mean error. So in general it can be observed that even during the limited time represented by the testing data there seem to be indications of system performance improvement.

All cases in the data set used during the reported experiments were derived from one of the principal air pollution monitoring stations of central Athens (Patission st.). This has been the station where most NO2 episodes were monitored during the past years and where some of the higher NO2 concentrations have been measured. So NEMO related its prediction to a specific monitoring station, for which it then gradually developed a Case Base. A future application of the predictive NEMO algorithm for a network of monitoring stations means deployment of a set of NEMO modules, each one of which is tied to a specific monitoring station. These modules should be controlled by a unit that subsequently correlates their results according to the distribution of the hourly concentrations over that area. In this way an overall prediction of the NO2 levels in the area will be produced.

We assume that a larger data set would present a better distribution of cases, so that more representative episode cases would exist, closer to the episode occurrence.

6. Conclusions

By interpreting the results of the performed experiments, a conclusion can be reached that the NEMO prototype presents characteristics that make it a good
candidate for deployment in an Air Quality Monitoring Centre under operational conditions. The experiments conducted showed that in complex and multi-dimensional domains of real world problems, as in the case of air quality prediction, tree-induction classifiers also perform well. While the NEMO CBR algorithm performed well for low NO2 emissions predictions, it was unable to produce a prediction in some of the high emission cases. However, even in this case the prototype did not make false classifications.

The fact that the algorithm has not been tested on a historical database containing a great number of air pollution episodes (which is not available) limits its role to that of a decision support system. The overall performance of the algorithm, though, can be described as significant, considering that the human experts at the Greek AQOC do not exceed these accuracy levels. This fact makes the system proposed here a useful tool for operational conditions.

The described technique can be applied in many similar air quality monitoring problems. While the prototype developed has been fine-tuned through the described process for the specific case of Athens, the process can be repeated in new problems. Some parts of the developed model, like the wind factor and the temperature inversion factor, are fundamental concepts of the photochemical phenomena involved and thus of general applicability; however, the values of these factors are directly related to the special conditions of the area, like for instance the prevailing winds, and therefore they
should be re-calculated in the case of a different setting. Also the long, tedious parameter-setting phase described above should be repeated in the case of a new data set.

The proposed model is complementary to more traditional modelling software, often used in AQOCs. The NEMO classifier can be used as a fast predictor of the likelihood of an episode. In the case of a positive prediction for a specific area, a trigger for the run of a more accurate dispersion model can be set, which can lead to a more detailed picture of the situation, to proposals for adequate counter-measures etc. Also, more frequent fast predictions can be produced this way, even during the run of the mathematical models, facilitating and adjusting their convergence.

The adaptive nature of CBR algorithms, an intrinsic characteristic of the method, which was demonstrated even in the case of the data set used, is also an important feature of the proposed hybrid architecture.

The described system can be easily implemented on a low-cost computing platform. It delivers prediction at run time and, unlike other modelling techniques, does not require a dense monitoring network and can deal with noisy data or uncertainties, since the defined abstractions in the data model hide the noise of the raw data. On the other hand, NEMO cannot contribute to the understanding of the phenomenon; it produces only a qualitative indication and is unable to produce a solid explanation of its response. The system, contrary to other machine learning techniques such as ANN and DT, can be adapted to time-evolving problems without needing frequent adjustments of the system.

Acknowledgements

The data used in this study have been supplied by the Athens Air Quality Monitoring Centre (PERPA), the National Observatory of Athens (NOA) and the National Meteorology Service of Greece. Special thanks are due to Dr L. Viras, air pollution expert at PERPA, for his continuing advice and support during this ongoing research effort. Financial support has been provided by the Greek Ministry for the Environment and Public Works (EPPER4 Project - Specifications of the Athens Air Quality Monitoring Centre) and General Secretary of Research and Technology (Project PENED).

References

Aamodt, A., Plaza, E., 1994. Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Communications 7(1).
Abdel-Aal, R.E., Elhadidy, M.A., 1996. Modelling and forecasting the daily maximum temperature using abductive machine learning. Oceanographic Literature Review 43(1).
Avouris, N.M., 1995. Co-operating knowledge-based systems for environmental decision support. Knowledge-Based Systems 8 (1), 39–54.
Bartzis, J.G., 1995. Environmental monitoring and simulation, environmental informatics - methodology and applications of environmental information processing. In: Avouris, N.M., Page, B. (Eds.), Environmental Informatics. Kluwer Academic, Dordrecht, pp. 237–255.
Boznar, M., Mlakar, P., 1995. Neural networks - a new mathematical tool for air pollution modelling. In: Air Pollution Theory and Simulation International Conference on Air Pollution. Proceedings vol. 1. Computational Mechanics, Billerica, MA, pp. 259–266.
Breiling, M., Alcamo, J., 1992. Emergency Air Protection: A Survey of Smog Alarm Systems. IIASA technical report of Leader project, Laxenburg, Austria.
Kalapanidas, E., Avouris, N., 1999. Applying machine learning techniques in air quality prediction. In: Proc. ACAI 99, Chania, pp. 58–64.
Kolodner, J.L., 1993. In: Case-based Reasoning. Morgan Kaufmann, San Mateo, CA, pp. 545–555.
Lee, S.S., 1995. Predicting atmospheric ozone using neural networks as compared to some statistical methods. In: Proceedings of the 1995 IEEE Technical Applications Conference and Workshops, NORTHCON’95. Portland OR, USA.
Lekkas, G.P., Avouris, N.M., Viras, L.G., 1994. Case-based reasoning in environmental monitoring applications. Appl. Artificial Intelligence 8 (3), 359–376.
Mlakar, P., Boznar, M., 1994. Short-term air pollution prediction on the basis of artificial neural networks. In: International Conference on Air Pollution, Proceedings 1. Computational Mechanics, Southampton, pp. 545–552.
Neagu, C.D., Kalapanidas, E., Avouris, N., Palade, V., 2000. Neuro-symbolic integration in a knowledge-based system for air quality prediction. Applied Intelligence (in press).
Perantonis, S.J., Vassilas, N., Amanatidis, G.T., Varoufakis, S.J., Bartzis, J.G., 1994. Neural network techniques for SO2 episode prediction. In: Grying, S.-E., Milan, M.M. (Eds.), Proceedings of the 20th Int. Tech. Meeting on Air Pollution Modelling and its Applications (Valencia, Spain, Nov. 1993). Plenum, New York, pp. 305–313.
Ruiz-Suarez, J.C., Mayora-Ibara, O.A., Smith-Perez, R., Ruiz-Suarez, L.G., 1994. Neural network-based prediction model of ozone for Mexico City. In: International Conference on Air Pollution, Proceedings 1. Computational Mechanics, Southampton, pp. 343–400.
Ruiz-Suarez, J.C., Mayora-Ibara, O.A., Torres-Jimenez, J., Ruiz-Suarez, L.G., 1995. Short-term ozone forecasting by artificial neural networks. Advances in Engineering Software 23 (3), 143–149.
Simon, K.H., Jaeschke, A., Manche, A., 1995. Environmental applications of expert systems technology. In: Avouris, N.M., Page, B. (Eds.), Environmental Informatics. Kluwer Academic, Dordrecht, pp. 93–109.
Sycara, E.P., 1988. Using case-based reasoning for plan adaptation and repair. In: Proceedings: Workshop on Case-based Reasoning (DARPA), Clearwater, Florida. Morgan Kaufmann, San Mateo.
Watson, I., Marir, F., 1994. Case-based reasoning: a review. Knowledge Engineering Review 9 (4), 327–354.
Yi, J., Prybutok, V.R., 1996. A neural network model for the prediction of daily maximum ozone concentration in an industrialised urban area. Environmental Pollution 92 (3), 349–357.