Adaptive Deep Learning-Based Air Quality Prediction Model Using The Most Relevant Spatial-Temporal Relations
ABSTRACT Air pollution has become an extremely serious problem, with particulate matter having a
significantly greater impact on human health than other contaminants. The small diameter of fine particulate
matter (PM2.5) allows it to penetrate deep into the alveoli as far as the bronchioles, interfering with
gas exchange within the lungs. Long-term exposure to particulate matter has been shown to cause
cardiovascular and respiratory disease and to increase the risk of lung cancer. Therefore, forecasting air
quality has become important to help guide individual actions. This paper aims to forecast air quality
for up to 48 h using a combination of multiple neural networks, including an artificial neural network,
a convolutional neural network, and a long short-term memory network, to extract spatial-temporal relations. The
proposed predictive model considers various meteorology data from the previous few hours as well as
information related to the elevation space to extract terrain impact on air quality. The model includes trends
from multiple locations, extracted from correlations between adjacent locations, and among similar locations
in the temporal domain. Experiments employing Taiwan and Beijing data sets show that the proposed model
achieves excellent performance and outperforms current state-of-the-art methods.
I. INTRODUCTION
Increasing attention has been given to air quality degeneration, with particulate matter (PM) having a significant adverse impact on human health. The small diameter of fine particulate matter (PM2.5) allows it to penetrate deep into the alveoli as far as the bronchioles, interfering with gas exchange within the lungs. Xing et al. showed that long-term exposure to particulate matter increased the risk of cardiovascular disease, respiratory disease, and lung cancer [1]. With increasing public health consciousness, many cities have established air quality monitoring locations. However, most services only show the current air quality and do not forecast it. Air quality prediction is essential to help guide individual actions that limit PM2.5 exposure, e.g., choosing outdoor or indoor activities.
However, accurate air quality forecasting is hindered by a complex array of factors [2]–[4], including emissions, traffic patterns, and meteorological conditions. Meteorologists remain substantially limited in their ability to provide reliable wind pattern predictions, which can vary considerably in direction and strength every hour [5], and there are insufficient sensors deployed to provide emission data from factories or vehicles. Recent studies have shown it is critical that time and space be explicitly considered to analyze air quality [6]–[8]. Particulate matter has high cyclicality and is easily affected by spatial factors, stagnating or diffusing to pollute surrounding environments. If PM is analyzed only in the time domain, impacts and relationships between regions may be neglected, whereas considering only spatial relationships may omit PM diffusion over time. Therefore, temporal and spatial relations must be considered simultaneously to accurately model PM diffusion.
Data mining provides new methods to analyze air quality in the absence of physical models [9]–[11], and may identify hidden information in the collected data. Furthermore, once a model is trained, prediction is far quicker than for traditional physical models. Therefore, we propose a model to provide 48-hour air quality index (AQI) predictions every hour at every monitoring location. As shown in Figure 1, we forecast 48-hour predictions from the current time, $t_c$,
2169-3536
2018 IEEE. Translations and content mining are permitted for academic research only.
38186 Personal use is also permitted, but republication/redistribution requires IEEE permission. VOLUME 6, 2018
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
P.-W. Soh et al.: Adaptive Deep Learning-Based Air Quality Prediction Model
series, hence we applied DTW to calculate location temporal distances and used k-nearest neighbor (kNN) to identify locations with the most similar temporal behavior [19]. This differs from the approach of Zheng et al. [5] in that we were seeking to enhance prediction sensitivity over short durations rather than longer periods. Experimentally, kNN-DTWD outperformed kNN-ED on average. However, for some special cases, e.g., flat areas with numerous locations, the kNN-ED method proved advantageous. The kNN-DTWD method was well suited to selecting locations with similar behavior, and the kNN-ED method was best avoided where there was a mountain between the locations.
This work combines kNN-ED benefits for featureless landscapes with kNN-DTWD benefits for complex landscapes, allowing the model to derive optimal combinations based on the training data, as detailed in the following sections.

III. PROBLEM DEFINITION
We first identify the locations with the most influential spatial-temporal relationships to the target location and then predict sequences for the target location based on time sequence features that include spatial information. Hence, the locations are fixed, and the time sequences vary according to their positions. Spatial features could also impact the sequences, e.g., a mountain between two locations. Therefore, we use both temporal and spatial relationships in the predictive model. To extract the features for prediction, we first define related spatial and temporal relationship parameters.
Suppose we have a set of locations $L = \{l_1, l_2, \ldots, l_n\}$ and a set of features $F = \{f_1, f_2, \ldots, f_m\}$. Each location has geographical information, such as latitude and longitude, hence we define the location coordinate (LC) as
Definition 1: Location Coordinate.
$$Lc_i = (l_i, x_i, y_i), \quad l_i \in L \tag{1}$$
where $x_i$ and $y_i$ are the latitude and longitude at location $l_i$, respectively. Since related location features could improve prediction, we define the distance between two location coordinates as
$$Ds_{q,c} = \mathrm{dist}_{location}(Lc_q, Lc_c) = \mathrm{dist}_{location}((l_q, x_q, y_q), (l_c, x_c, y_c)), \quad l_q, l_c \in L,\ q \neq c \tag{2}$$
to find the most closely related locations in the spatial domain; and the spatial relationships sequence (SRS) set as
Definition 2: Spatial Relations Sequence Set.
$$SRS = \{Ds_{1,2}, Ds_{1,3}, \ldots, Ds_{n-1,n}\}, \quad Ds_{i,i} = 0, \quad 0 < i < n + 1 \tag{3}$$
where the matrix elements are calculated from (2); $n$ is the number of locations; and the diagonal elements, $Ds_{i,i}$, are all zero. The most relevant locations are SRS_cand($l_i$, $k$), the set of $k$ locations with the smallest spatial distance to $l_i$.
We consider the features of these relevant locations in the spatial domain, since locations with similar feature sequences may help to improve prediction. We define the feature sequence interval (FSI) for a specific location as
Definition 3: Feature Sequence Interval with Location.
$$S(l_i, f_j, t_{st,ft}) = \{e(l_i, f_j, t_{st}), e(l_i, f_j, t_{st+1}), \ldots, e(l_i, f_j, t_{ft})\}, \quad l_i \in L,\ f_j \in F,\ st < ft \tag{4}$$
where $l_i$ has feature $f_j$ that varies from start time, $st$, to finish time, $ft$ ($st < ft$); and $e(l_i, f_j, t_k)$ represents the measured value of $f_j$ at $t_k$.
We can express the distance between feature sequences for any two locations as
$$Dt_{q,c,t_{st,ft}} = \mathrm{dist}_{sequence}(S(l_q, f_{target}, t_{st,ft}), S(l_c, f_{target}, t_{st,ft})), \quad l_q, l_c \in L,\ q \neq c \tag{5}$$
to obtain the most related locations in the temporal domain. Before using the measure, we must select a feature, $f_{target}$, as the target prediction sequence: this study chose PM2.5, but other targets could be employed as required. Then we can use (5) to calculate the temporal relations sequence (TRS) set,
Definition 4: Temporal Relations Sequence Set.
$$TRS_{t_{st,ft}} = \{Dt_{1,2,t_{st,ft}}, Dt_{1,3,t_{st,ft}}, \ldots, Dt_{n-1,n,t_{st,ft}}\} \tag{6}$$
We then select candidates TRS_cand($l_i$, $k$), the set of $k$ locations with the least difference from location $l_i$. To consider both relationships simultaneously, we define the spatial-temporal relations (STR) set, i.e., the set of locations most strongly related to $l_i$, as
Definition 5: Spatial-Temporal Relations Set.
$$STRS\_cand(l_i, k) = SRS\_cand(l_i, k) \cup TRS\_cand(l_i, k), \quad l_i \in L \tag{7}$$
where we use the union of SRS_cand($l_i$, $k$) and TRS_cand($l_i$, $k$) rather than the intersection, to provide a larger number of relationships for the model to learn; and since some location behaviors may differ from adjacent locations, the intersection would have fewer candidates (or none), resulting in the loss of useful target features. We define the spatial-temporal predictor (STP) using STRS_cand($l_i$, $k$) as
Definition 6: Spatial-Temporal Sequence Prediction.
$$M(STRS\_cand(l_i, k))_{[t_{lb}, t_c]} = S(l_i, f_{target}, t_{st',ft'}), \quad t_{lb} < t_c < st' \leq ft' \tag{8}$$
to build the model $M$ to predict a target feature sequence, where $M$ returns a sequence set, $S$, of the target features for the period $t_{st'}$ to $t_{ft'}$. $S$ is generated from the most similar time series compared over $t_{lb}$ to $t_c$, where $t_{lb}$ is the lookback time. Section IV proposes the solution method for this prediction problem.

IV. PREDICTION MODEL FRAMEWORK
To find the most relevant relationships between locations we apply spatial-temporal analysis to explore sequence delays and interactions between locations using historical
Algorithm 1 Geographical Relationship Set Generator (kNN-ED)
Input: Target station l_i; set of location coordinates Lc, where l_i ∉ Lc; number of candidates k
Output: Set of locations SRS_cand(l_i, k)
Let SRS_cand ← ∅
for each lc ∈ Lc do
    Calculate the distance between l_i and lc: ED(l_i, lc)
    SRS_cand ← SRS_cand ∪ {(lc, ED(l_i, lc))}
end
Sort SRS_cand by ED(l_i, lc)
if k ≤ size of SRS_cand then
    SRS_cand(l_i, k) ← first k elements of SRS_cand
else
    SRS_cand(l_i, k) ← SRS_cand
end

FIGURE 5. Example PM2.5 time series data (seven locations).

DTW, hence eliminating time shift and scaling effects, and identified TRS_cand($l_i$, $k$) candidates, i.e., those with the strongest temporal relationship to the target location.
Unfortunately, this method cannot calculate the degree of similarity between time series with missing data. We tackle this problem as follows. (i) The two selected sequences are divided into multiple common interval sequences, and a minimum interval threshold, $l_{min}$, is used to filter out short, i.e., not meaningful, sequences. (ii) We then apply DTW to the filtered sequences and convert the calculated values to unit similarity as shown in Figure 6. (iii) Finally, TRS_cand($l_i$, $k$) for the predictive model is generated using Algorithm 2.

Algorithm 2 Temporal Similarity Set Generator (kNN-DTWD)
Input: Target station l_i; set of locations L, where l_i ∉ L; number of candidates k; target feature f_target; time interval t_{st,ft}
Output: Set of locations TRS_cand(l_i, k)
Let TRS_cand ← ∅
for each l ∈ L do
    Calculate the similarity between l_i and l:
    dist_dtw ← DTWDsim(S(l_i, f_target, t_{st,ft}), S(l, f_target, t_{st,ft}), l_min)
    TRS_cand ← TRS_cand ∪ {(l, dist_dtw)}
end
Sort TRS_cand by dist_dtw
if k ≤ size of TRS_cand then
    TRS_cand(l_i, k) ← first k elements of TRS_cand
else
    TRS_cand(l_i, k) ← TRS_cand
end

B. PREDICTION MODEL DESIGN
The proposed ST-DNN model combines target location temporal information with related location spatial-temporal and terrain information (see Figure 2). The data flow includes target and related location historical data, i.e., pollutants, meteorological conditions, and target features and their trends over the previous few hours. These data were input to the LSTM, adaptive temporal extractor (ASE), and ANN. We used a matrix of 121 square sections for terrain data, i.e., 11×11 coordinate lines at 500 m intervals, where the central square in the grid represents the local location.
Thus, there are 120 unobserved points, whose AQI is calculated by inverse distance weighting (IDW) [22]. We convolve the relative elevation with the unobserved point AQIs to reduce AQI impact at higher elevation and provide this matrix as input to the CNN. CNN inputs can be adjusted later to increase the resolution, e.g., using LASS open source data.
We set $l_{min}$ = 6 hours in kNN-DTWD [23], [24], i.e., the minimum time interval for meteorological forecasts. Pollutants, meteorological conditions, and target feature(s) of locations with high similarity (determined using kNN-ED and kNN-DTWD) were input to the LSTM and ASE without pretraining. A two-layer ANN was employed to combine TRE, SRE, and TE. The final prediction was the deviation between the target feature value at $t_c$ and at some future time $t_{c+h}$, where $t_{c+h} < t_{ft}$.
Figure 9 shows the model structure. Air quality and meteorological condition data sources are input to the LSTM and ASE, and terrain related data are input to the CNN. The models are merged via side by side concatenation, and the variables are passed to the following layer. The model is trained for each hour over the subsequent 48 hours, since the effect of the current status varies across future time intervals. Thus, we pair the inputs with target feature deviations in the various time intervals to train multiple models with the same structure corresponding to the different time intervals. The advantage of this structure is that the input sizes are constant, regardless of the location and time interval.

1) TEMPORAL RELATIONS
The TRE uses historical target location features as inputs to predict the future time series. The input time series for PM2.5 and other concentrations are continuous and coherent, and can be divided into low frequency (trends) and high frequency (rapid changes) information.

2) SPATIAL-TEMPORAL RELATIONS
Pollutant dispersal means that air quality at one location can be spatially correlated with that at other sites. The SRE uses historical features of spatial-temporal neighborhood locations as inputs, since air quality at a given location is affected by local emissions as well as emissions in surrounding areas. Therefore, we devised an SRE to predict target location air quality based on AQIs and meteorological data from other locations. Partitioning spaces into regions using circles of various diameters overlooks terrain impacts, e.g., a mountain between locations.
Partitioned region data are often mean or mode values, which can be highly inaccurate, particularly in areas with few locations. Thus, the SRE for a location requires data mining from locations in the spatial-temporal neighborhood using kNN-ED and kNN-DTWD, including AQIs and meteorological conditions (wind speed and direction) for the previous 6 hours. Similar to the target location, spatial-temporal neighborhood time series features are also continuous and coherent. Thus, we included the ANN SRE to increase model sensitivity by considering spatial-temporal neighborhood impacts.

3) TERRAIN EXTRACTOR
The relationships between locations vary due to various barriers and altitude differences. Therefore, terrain related data were included to enhance location correlations. Terrain data in the vicinity of the locations were captured using a matrix of 121 square sections, i.e., 11×11 coordinate lines at 500 m intervals. We adopted the approach of Ferrero et al. [25] to define the relationships between terrain and PM2.5. The elevation of each point, $elev$, was normalized as
$$H_s = \frac{elev - elev_{st}}{elev_{st}} \tag{10}$$
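The terrain input described above (an 11×11 grid at 500 m spacing, AQI at unobserved points filled by IDW, and elevation normalized as in (10)) can be illustrated with a minimal sketch. Several details are assumptions made only for illustration: coordinates are treated as planar metres, the IDW power is set to 2, `elev_st` is taken to be a non-zero reference (station) elevation, and the function names `idw_grid` and `relative_elevation` are hypothetical rather than taken from the paper.

```python
import numpy as np

def idw_grid(station_xy, station_aqi, center_xy, size=11, spacing=500.0, power=2.0):
    """Interpolate AQI onto a size x size grid centred on center_xy using IDW.

    station_xy: (n, 2) array of planar coordinates in metres (assumption).
    station_aqi: (n,) array of observed AQI values.
    power: IDW exponent; 2 is a common default, not a value from the paper.
    """
    half = (size - 1) / 2
    offsets = (np.arange(size) - half) * spacing
    gx, gy = np.meshgrid(center_xy[0] + offsets, center_xy[1] + offsets)
    grid = np.empty((size, size))
    for r in range(size):
        for c in range(size):
            d = np.hypot(station_xy[:, 0] - gx[r, c], station_xy[:, 1] - gy[r, c])
            if np.any(d == 0):
                # Grid point coincides with a station: use the observation directly.
                grid[r, c] = station_aqi[np.argmin(d)]
            else:
                w = 1.0 / d ** power
                grid[r, c] = np.sum(w * station_aqi) / np.sum(w)
    return grid

def relative_elevation(elev, elev_st):
    """Eq. (10): H_s = (elev - elev_st) / elev_st, with elev_st assumed non-zero."""
    return (np.asarray(elev, dtype=float) - elev_st) / elev_st

# Hypothetical toy data: two stations 1 km apart, grid centred on the first one.
stations = np.array([[0.0, 0.0], [1000.0, 0.0]])
aqi = np.array([10.0, 30.0])
aqi_grid = idw_grid(stations, aqi, center_xy=(0.0, 0.0))
h_s = relative_elevation([110.0, 100.0, 90.0], elev_st=100.0)
```

In this toy setup the grid point midway between the two stations receives the average of their AQIs, and every interpolated value stays within the observed range, which is the expected behavior of IDW. How the relative elevation and the interpolated AQI are combined before entering the CNN is not reproduced here.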
FIGURE 12. TWEPA PM2.5 index for the Taiwan dataset [26].
FIGURE 13. Influence of past h hours (x-axis: hour (s), y-axis: µg/m3 ).
as the minimum time interval for meteorological forecasting [23], [24]. We also used an independent LSTM for each feature with h = 6 hours, and chose k = 3 for kNN-ED and kNN-DTWD in the proposed ST-DNN model. We used a 5×5 filter for the CNN with one convolutional layer. The CNN did not include a max pooling layer, since we sought the most strongly related concentrations rather than the highest concentrations. We chose a linear activation function to allow for negative correlations. The ST-DNN models for the Taiwan dataset used a total of 4276 parameters.

D. PERFORMANCE OF PREDICTIONS
1) TAIWAN DATASET
We first checked if the input components to the ST-DNN model were significant, examining all Adaptive ANN (A), LSTM (L), and CNN (C) combinations to identify the best models, as shown in Figure 14. For the first hour prediction, we found that ST-DNN models with all components (A+L+C) outperformed all other models, and the CNN only model outperformed all other combinations for 2-6 hour predictions. Therefore, we include only A+L+C and C models in further discussions.

for further hours. The Zheng model [5] is superior to the other considered models except ST-DNN(C) for 4-6 hour predictions, since that model partitions regions at 30, 100, and 300 km, which helps detect diffusion from distant locations. The proposed ST-DNN methods choose nearby or recently similar locations that have insufficient diffusion data. Although ST-DNN(C) has less information, it surpasses all other model performances (see Figure 15). In contrast, feeding all the data into ANN and Linear Regression models provided poorer prediction performance due to interference between locations. Generally, kNN-DTWD based models show superior performance to those based on kNN-ED, and Adaptive methods perform similarly to kNN-DTWD based models.
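The CNN configuration reported in the setup above (a single convolutional layer with a 5×5 filter, a linear activation, and no max pooling) can be illustrated with a small NumPy sketch applied to an 11×11 terrain grid. This is only an illustration under stated assumptions: stride 1 and no padding are assumed because the text does not specify them, the helper `conv2d_valid` is hypothetical, and the kernel values are placeholders rather than learned weights.

```python
import numpy as np

def conv2d_valid(x, kernel, bias=0.0):
    """Single-channel 2-D convolution (cross-correlation, as in most deep
    learning libraries) with stride 1, no padding, and a linear activation."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            # Linear activation: the weighted sum is passed through unchanged,
            # so negative responses (negative correlations) are preserved.
            out[r, c] = np.sum(x[r:r + kh, c:c + kw] * kernel) + bias
    return out

# Hypothetical 11x11 input grid and an averaging 5x5 kernel (placeholder weights).
terrain_grid = np.ones((11, 11))
kernel = np.full((5, 5), 1.0 / 25.0)
feature_map = conv2d_valid(terrain_grid, kernel)
negative_map = conv2d_valid(terrain_grid, -kernel)
```

Under these assumptions an 11×11 input yields a 7×7 feature map. Because no pooling or rectification follows, a negative kernel yields negative outputs rather than zeros, which matches the stated motivation for the linear activation.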
FIGURE 16. Eastern area short period (1-6 hour) prediction performance.
FIGURE 21. Basin city short period (1-6 hour) prediction performance.
FIGURE 22. Mountain city short period (1-6 hour) prediction performance.
FIGURE 19. Flatland city short period (1-6 hour) prediction performance.
than combining inputs. The proposed models also outperform the Zheng model, confirming the importance of observations at individual locations, rather than using neighborhood averages. Thus, the proposed ST-DNN(C) model is a suitable approach for further study.

2) BEIJING DATASET
We modified the models somewhat for the Beijing dataset to ensure robust prediction. We neglected forecast data, since these were based on physical models, and omitted the CNN component from the proposed ST-DNN model since elevations were not available for Beijing.
FIGURE 24. Short period (1-6 hour) prediction performance for the Beijing dataset.
FIGURE 25. Extended time interval prediction performance for the Beijing dataset.
Figure 25 shows that overall model performances are similar to short time performances (Figure 24).

E. BEHAVIOUR OF PROPOSED MODEL
1) ANALYSIS OF LSTM AND CNN
We used the Tainan training data from the Taiwan dataset to investigate patterns between different delays and identify LSTM prediction improvements, comparing PM2.5 variations from $t_{c+0}$ versus $t_{c+1}$, $t_{c+0}$ versus $t_{c+2}$, and $t_{c+0}$ versus $t_{c+3}$, as shown in Figures 26–28, respectively. The very short-term prediction (Figure 26) exhibits an almost linear relationship, followed by a sharp rise and a subsequent sharp drop, whereas this linear relationship disperses for longer prediction times (Figures 27 and 28). Thus, LSTM only helps improve first hour prediction.
Limitations of DTW mean that kNN-DTWD chooses the most similar candidate locations but neglects long time delays. However, the delay interval is important for long term predictions. The proposed method could be improved by shifting
2) INFLUENCE OF k
Figure 29 compares kNN-ED and kNN-DTWD with ANN
to investigate the performance effects of k. kNN-ED
MAE increases as k increases, whereas kNN-DTWD MAE
decreases. Thus, kNN-DTWD outperforms kNN-ED, which
only considers spatially adjacent locations and omits terrain,
whereas kNN-DTWD chooses temporally similar locations
with similar responses, relaxing geographical impact
restrictions.
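The two candidate-selection strategies compared above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses a textbook DTW distance, omits the missing-data handling with the minimum interval threshold and the conversion to unit similarity described for kNN-DTWD, and all station names, coordinates, and series are hypothetical toy data. The final step takes the union of the spatial and temporal candidate sets, as in Definition 5.

```python
import numpy as np

def euclidean(p, q):
    """Straight-line distance between two (x, y) coordinates (kNN-ED metric)."""
    return float(np.hypot(p[0] - q[0], p[1] - q[1]))

def dtw_distance(a, b):
    """Textbook dynamic time warping distance with |x - y| as the local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def knn_candidates(target, others, dist, k):
    """Return the names of the k entries in `others` closest to `target`.

    Slicing handles k larger than the number of locations, mirroring the
    if/else branch at the end of Algorithms 1 and 2.
    """
    scored = sorted((dist(target, value), name) for name, value in others.items())
    return [name for _, name in scored[:k]]

# Hypothetical toy data: station D is far away but behaves like the target.
target_xy = (0.5, 0.0)
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 5.0), "D": (9.0, 9.0)}
target_pm = [1, 2, 3, 4, 3, 2]
series = {
    "A": [10, 10, 10, 10, 10, 10],
    "B": [1, 1, 2, 3, 4, 3],   # same shape as the target, delayed by one step
    "C": [5, 5, 5, 5, 5, 5],
    "D": [1, 2, 3, 4, 3, 2],
}

srs_cand = knn_candidates(target_xy, coords, euclidean, k=2)     # kNN-ED
trs_cand = knn_candidates(target_pm, series, dtw_distance, k=2)  # kNN-DTWD
strs_cand = sorted(set(srs_cand) | set(trs_cand))                # Definition 5
print(srs_cand, trs_cand, strs_cand)
```

With this toy data, kNN-ED selects the two nearest stations (A and B), while kNN-DTWD selects D and B, so the union contains A, B, and D. The distant station D enters only through temporal similarity, and the delayed series B still scores well because DTW tolerates small time shifts.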
VI. REAL APPLICATIONS FOR THE PROPOSED MODEL
Figure 30 shows the web user interface of the proposed system [16], where icons on the map represent monitoring stations and the number associated with an icon denotes its PM2.5 concentration. The color of the icons indicates the air quality at that location based on the TWEPA PM2.5 index.
FIGURE 31. Dashboard of QQ air quality.
Figure 31 shows the dashboard interface with the real-time PM2.5 and the PM2.5 predictions for the next 1-6 hours. The dashboard automatically updates its information every hour. In addition, users can choose any specific station on demand. This service aims to help people avoid exposure to unhealthy air.
FIGURE 32. Facebook chatbot of QQ air quality.
In addition, we developed a Facebook chatbot to send PM2.5 forecasts to users who subscribe to the daily report for specific stations, as shown in Figure 32. Users can query the current PM2.5 by typing the station name or sending their current GPS location. The service is easy to use on a mobile phone or PC and reminds people of the air quality before they go outside.

VII. CONCLUSION
This study proposed an air quality forecasting system using data driven models, ST-DNN, to predict PM2.5 over 48 hours. The proposed method is also generally applicable to other pollutants.
The proposed ST-DNN shows that including an LSTM module enhanced first hour predictions, with CNN module inclusion being more useful for longer time frame predictions, since the CNN can extract the temporal delay factor from surrounding target features by learning spatial information.
We evaluated the proposed models using real-world Taiwan and Beijing datasets. Relevant location selection was verified to be important, with inclusion of all locations causing increased model noise and hence poorer prediction performance. The proposed methods outperformed all baselines and comparative models considered.
Future research will improve the proposed model performances, and consider specific Airbox sensor source models as features to tune and mitigate noise due to machine differences [33], [34]. We will also consider more chemical features that affect PM2.5 components [35]. For long period predictions, we will consider concentric circles for different distance partitions or clusters to emphasize air pollution propagation delay effects.
Ultimately, we intend to detect air pollution sources, including domestic and transboundary pollution. To control air pollution, we must first understand how it is generated and propagated. Only then can we devise effective solutions to reduce pollution sources. Therefore, future pollution studies will be greatly dependent on the proposed model.

ACKNOWLEDGMENT
The authors are grateful to the Taiwan Environmental Protection Administration for providing the monitoring data used in this study.

REFERENCES
[1] Y.-F. Xing, Y.-H. Xu, M.-H. Shi, and Y.-X. Lian, "The impact of PM2.5 on the human respiratory system," J. Thoracic Disease, vol. 8, no. 1, pp. E69–E74, 2016.
[2] H.-J. Chu, C.-Y. Lin, C.-J. Liau, and Y.-M. Kuo, "Identifying controlling factors of ground-level ozone levels over southwestern Taiwan using a decision tree," Atmos. Environ., vol. 60, pp. 142–152, Dec. 2012.
[3] A. P. K. Tai, L. J. Mickley, and D. J. Jacob, "Correlations between fine particulate matter (PM2.5) and meteorological variables in the United States: Implications for the sensitivity of PM2.5 to climate change," Atmos. Environ., vol. 44, no. 32, pp. 3976–3984, 2010.
[4] C.-M. Liu, C.-Y. Young, and Y.-C. Lee, "Influence of Asian dust storms on air quality in Taiwan," Sci. Total Environ., vol. 368, nos. 2–3, pp. 884–897, 2006.
[5] Y. Zheng et al., "Forecasting fine-grained air quality based on big data," in Proc. 21th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, New York, NY, USA, 2015, pp. 2267–2276.
[6] Y. Hwa-Lung and W. Chih-Hsin, "Retrospective prediction of intraurban spatiotemporal distribution of PM2.5 in Taipei," Atmos. Environ., vol. 44, no. 25, pp. 3053–3065, 2010.
[7] B. S. Beckerman et al., "A hybrid approach to estimating national scale spatiotemporal variability of PM2.5 in the contiguous United States," Environ. Sci. Technol., vol. 47, no. 13, pp. 7233–7241, 2013.
[8] H.-J. Chu, H.-L. Yu, and Y.-M. Kuo, "Identifying spatial mixture distributions of PM2.5 and PM10 in Taiwan during and after a dust storm," Atmos. Environ., vol. 54, pp. 728–737, Jul. 2012.
[9] H.-W. Chen, C.-T. Tsai, C.-W. She, Y.-C. Lin, and C.-F. Chiang, "Exploring the background features of acidic and basic air pollutants around an industrial complex using data mining approach," Chemosphere, vol. 81, no. 10, pp. 1358–1367, 2010.
[10] A. Kurt and A. B. Oktay, "Forecasting air pollutant indicator levels with geographic models 3 days in advance using neural networks," Expert Syst. Appl., vol. 37, no. 12, pp. 7986–7992, 2010.
[11] Y. Zheng, F. Liu, and H.-P. Hsieh, "U-air: When urban air quality inference meets big data," in Proc. 19th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2013, pp. 1436–1444.
[12] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[13] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[14] Air Quality Index Historical Data. Accessed: Aug. 22, 2016. [Online]. Available: http://taqm.epa.gov.tw/taqm/tw/YearlyDataDownload.aspx
[15] Y. Zheng et al. (2015). Forecasting Fine-Grained Air Quality Based on Big Data. [Online]. Available: http://research.microsoft.com/apps/pubs/?id=246398
[16] QQ Air Quality (QQAQ). Accessed: Jun. 6, 2018. [Online]. Available: https://qqaq.ee.ncku.edu.tw
[17] QQAQ Facebook Fan Page. Accessed: Jun. 6, 2018. [Online]. Available: https://www.facebook.com/QQAirQuality/
[18] S. Qin, F. Liu, C. Wang, Y. Song, and J. Qu, "Spatial-temporal analysis and projection of extreme particulate matter (PM10 and PM2.5) levels using association rules: A case study of the Jing-Jin-Ji region, China," Atmos. Environ., vol. 120, pp. 339–350, Nov. 2015.
[19] P.-W. Soh, K.-H. Chen, J.-W. Huang, and H.-J. Chu, "Spatial-temporal pattern analysis and prediction of air quality in Taiwan," in Proc. Int. Conf. Ubi-Media Comput. (UMedia), Aug. 2017, pp. 1–6.
[20] T. Rakthanmanon et al., "Searching and mining trillions of time series subsequences under dynamic time warping," in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 262–270.
[21] E. Keogh and C. A. Ratanamahatana, "Exact indexing of dynamic time warping," Knowl. Inf. Syst., vol. 7, no. 3, pp. 358–386, 2005.
[22] D. Shepard, "A two-dimensional interpolation function for irregularly-spaced data," in Proc. 23rd ACM Nat. Conf., 1968, pp. 517–524.
[23] METAPP. Accessed: Jun. 6, 2018. [Online]. Available: http://www.metapp.org.tw/index.php/weatherknowledge/37-typhoon/83-2009-01-22-08-04-48
[24] H. L. Chang, "Evaluation and application of the short-range (0-6hr) pqpfs from an ensemble prediction system based on laps," Ph.D. dissertation, Graduate Inst. Atmos. Phys., Nat. Central Univ., Taoyuan, Taiwan, 2014.
[25] L. Ferrero, G. Mocnik, B. S. Ferrini, M. G. Perrone, G. Sangiorgi, and E. Bolzacchini, "Vertical profiles of aerosol absorption coefficient from micro-Aethalometer data and Mie calculation over Milan," Sci. Total Environ., vol. 409, no. 14, pp. 2824–2837, 2011.
[26] PM2.5 Index. Accessed: Aug. 22, 2016. [Online]. Available: http://taqm.epa.gov.tw/taqm/en/fpmi.htm
[27] TWEPA Instruments. Accessed: Jun. 6, 2018. [Online]. Available: https://taqm.epa.gov.tw/taqm/tw/b0102-3.aspx
[28] Y. Pan, Y. Liu, B. Xu, and H. Yu, "Hybrid feedback feedforward: An efficient design of adaptive neural network control," Neural Netw., vol. 76, pp. 122–134, Apr. 2016.
[29] J. de Jesús Rubio, "Stable Kalman filter and neural network for the chaotic systems identification," J. Franklin Inst., vol. 354, no. 16, pp. 7444–7462, 2017.
[30] Y. Pan and H. Yu, "Biomimetic hybrid feedback feedforward neural-network learning control," IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 6, pp. 1481–1487, Jun. 2017.
[31] J. de Jesús Rubio, "Error convergence analysis of the SUFIN and CSUFIN," Appl. Soft Comput., to be published. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1568494618301881
[32] S.-J. Lu, D. Wang, X.-B. Li, Z. Wang, Y. Gao, and Z.-R. Peng, "Three-dimensional distribution of fine particulate matter concentrations and synchronous meteorological data measured by an unmanned aerial vehicle (UAV) in Yangtze River Delta, China," Atmos. Meas. Techn. Discuss., pp. 1–19, Mar. 2016. [Online]. Available: https://www.atmos-meas-tech-discuss.net/amt-2016-57/
[33] L.-J. Chen, Y.-H. Ho, H.-H. Hsieh, S.-T. Huang, H.-C. Lee, and S. Mahajan, "ADF: An anomaly detection framework for large-scale PM2.5 sensing systems," IEEE Internet Things J., vol. 52, no. 2, pp. 559–570, Aug. 2018.
[34] N. Moustafa, G. Creech, E. Sitnikova, and M. Keshk. (2017). "Collaborative anomaly detection framework for handling big data of cloud computing." [Online]. Available: https://arxiv.org/abs/1711.02829
[35] W. M. Hodan and W. R. Barnard, "Evaluating the contribution of PM2.5 precursor gases and re-entrained road emissions to mobile source PM2.5 particulate matter emissions," MACTEC Federal Programs, Research Triangle Park, NC, USA, 2004.

PING-WEI SOH received the B.S. degree in electrical engineering from National Cheng Kung University, Tainan, Taiwan, in 2015, and the M.S. degree from the Institute of Computer and Communication Engineering, National Cheng Kung University, in 2018. His research interests include spatial-temporal and air pollution issues.

JIA-WEI CHANG received the Ph.D. degree from the Department of Engineering Science, National Cheng Kung University in 2017. He is currently an Adjunct Assistant Professor with the Department of Engineering Science and a Post-Doctoral Fellow Researcher with the Department of Electrical Engineering, National Cheng Kung University. His research interests include natural language processing, artificial intelligence, data mining, and e-learning technologies.

JEN-WEI HUANG received the B.S. and Ph.D. degrees in electrical engineering from National Taiwan University, Taiwan, in 2002 and 2009, respectively. He was a Visiting Scholar with the IBM Almaden Research Center from 2008 to 2009, an Assistant Professor with Yuan Ze University from 2009 to 2012, and a Visiting Scholar with the University of Chicago in 2016. He is currently an Assistant Professor with the Department of Electrical Engineering, National Cheng Kung University, Taiwan. He majors in computer science and is familiar with data mining. His research interests include data mining, mobile computing, and bioinformatics. Among these, social network analysis, spatial-temporal data mining, and multimedia information retrieval are his special interests. In addition, some of his research is on data broadcasting, privacy preserving data mining, e-learning, and Fin-tech.