Adaptive Deep Learning-Based Air Quality Prediction Model Using The Most Relevant Spatial-Temporal Relations

This paper proposes an adaptive deep learning model called spatial-temporal deep neural network (ST-DNN) to forecast air quality up to 48 hours in the future. The model uses historical data from multiple monitoring locations to extract spatial-temporal relationships using convolutional neural networks and long short-term memory. It considers factors like PM2.5, temperature, wind and elevation to predict pollution propagation between locations over time.


Received May 6, 2018, accepted May 29, 2018, date of publication June 22, 2018, date of current version July 30, 2018.

Digital Object Identifier 10.1109/ACCESS.2018.2849820

Adaptive Deep Learning-Based Air Quality Prediction Model Using the Most Relevant Spatial-Temporal Relations

PING-WEI SOH (1), JIA-WEI CHANG (2), AND JEN-WEI HUANG (2)
(1) Institute of Computer and Communication Engineering, National Cheng Kung University, Tainan 701, Taiwan
(2) Department of Electrical Engineering, National Cheng Kung University, Tainan 701, Taiwan

Corresponding author: Jen-Wei Huang ([email protected])

ABSTRACT Air pollution has become an extremely serious problem, with particulate matter having a significantly greater impact on human health than other contaminants. The small diameter of fine particulate matter (PM2.5) allows it to penetrate deep into the lungs, through the bronchioles to the alveoli, interfering with gas exchange. Long-term exposure to particulate matter has been shown to cause cardiovascular disease and respiratory disease, and to increase the risk of lung cancer. Forecasting air quality has therefore become important to help guide individual actions. This paper aims to forecast air quality for up to 48 h using a combination of multiple neural networks, including an artificial neural network, a convolutional neural network, and a long short-term memory network, to extract spatial-temporal relations. The proposed predictive model considers various meteorological data from the previous few hours as well as elevation information to extract terrain impact on air quality. The model includes trends from multiple locations, extracted from correlations between adjacent locations and among similar locations in the temporal domain. Experiments employing Taiwan and Beijing data sets show that the proposed model achieves excellent performance and outperforms current state-of-the-art methods.

INDEX TERMS Dynamic time warping (DTW), convolutional neural network (CNN), long short-term memory (LSTM), spatio-temporal analysis, big data, air quality forecast.

2169-3536 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. VOLUME 6, 2018.
P.-W. Soh et al.: Adaptive Deep Learning-Based Air Quality Prediction Model

I. INTRODUCTION

Increasing attention has been given to deteriorating air quality, with particulate matter (PM) having a significant adverse impact on human health. The small diameter of fine particulate matter (PM2.5) allows it to penetrate deep into the lungs, through the bronchioles to the alveoli, interfering with gas exchange. Xing showed that long-term exposure to particulate matter increased the risk of cardiovascular disease, respiratory disease, and lung cancer [1]. With increasing public health consciousness, many cities have established air quality monitoring locations. However, most services only show the current air quality and do not forecast it. Air quality prediction is essential to help guide individual actions that limit PM2.5 exposure, e.g., choosing outdoor or indoor activities.

However, accurate air quality forecasting is hindered by a complex array of factors [2]–[4], including emissions, traffic patterns, and meteorological conditions. Meteorologists are still substantially limited in their ability to provide reliable wind pattern predictions, which can vary considerably in direction and strength every hour [5], and there are insufficient sensors deployed to provide emission data from factories or vehicles. Recent studies have shown it is critical that time and space be explicitly considered when analyzing air quality [6]–[8]. Particulate matter is highly cyclical and is easily affected by space, stagnating or diffusing to pollute surrounding environments. If PM is analyzed only in the time domain, impacts from and relationships with other regions may be neglected, whereas considering only spatial relationships may omit PM diffusion over time. Therefore, temporal and spatial relations must be considered simultaneously to accurately model PM diffusion.

Data mining provides new methods to analyze air quality in the absence of physical models [9]–[11], and may identify hidden information in the collected data. Furthermore, once a model is trained, prediction is far quicker than with traditional physical models. Therefore, we propose a model to provide 48-hour air quality index (AQI) predictions every hour at every monitoring location. As shown in Figure 1, we forecast 48 hours of predictions from the current time, $t_c$, using historical data. In particular, the peaks and valleys are the most important segments for the future predictions.

FIGURE 1. Air quality index (AQI) forecast format.

This study proposes a general predictive model for air quality forecasts called the spatial-temporal deep neural network (ST-DNN), which incorporates various information from monitoring locations, including PM2.5, PM10, temperature, wind speed, wind direction, average wind speed, average wind direction, relative humidity, and data related to elevation. The model was trained using air pollutant and meteorological condition data from the current and previous few hours. The proposed scheme does not consider forecast data from external sources. We developed a method to integrate the relevant data based on geographical and temporal correlations among monitoring locations. We first found the most relevant spatial-temporal relations among locations, then combined multiple neural network architectures, using a convolutional neural network (CNN) [12] and long short-term memory (LSTM) [13]. Spatial-temporal features of the target and similar locations were used to increase the sensitivity of the predictive model, and terrain impacts on pollutant propagation were considered explicitly. Thus, the proposed model uses (i) temporal information based on target location historical data, (ii) spatial relationships based on related locations' data, i.e., locations with high spatial or temporal similarity, and (iii) terrain information for the area around the locations.

To validate the proposed model, we performed experiments using two real-world datasets: 76 locations in 23 cities in Taiwan [14] and locations from the Beijing dataset [15]. The experimental results confirm that the proposed methods achieve excellent performance, superior to current baselines and several state-of-the-art methods. The main contributions of this study are as follows.
• We propose a framework to mine spatial-temporal data for a given location to provide a predictive model.
• We develop a deep learning model combining multiple neural networks to incorporate air quality correlations among similar locations and temporal dependency at a given location. Spatial and temporal predictions are combined dynamically based on the trained neural network.
• The proposed system has been deployed throughout Taiwan, providing access to fine-grained information regarding air quality via a public website [16] and Facebook chatbot [17].

The remainder of this paper is organized as follows. Section II describes related works, and Section III defines the problem to be addressed. Section IV presents an overview of the proposed system, describes the proposed method to mine spatial-temporal relationships, and provides the proposed spatial-temporal prediction model framework and details. Section V compares the proposed model performance with previous state-of-the-art algorithms based on real-world datasets, and Section VI introduces real applications developed to provide convenient and publicly accessible PM2.5 forecasts. Finally, Section VII summarizes and concludes the paper.

II. RELATED WORKS

Qin et al. [18] proposed mining environmental spatial-temporal relationships using an a priori pattern mining algorithm. They shifted one of the time sequences to create specific gaps in each time sequence to generate high-frequency candidates for rule generation. The resulting rules reveal the appearance of pollutants with delays at different locations. However, this method requires repeatedly running the rule generation process through numerous combinations, which is time consuming.

Zheng et al. [5] proposed a framework that considered temporal as well as spatial relationships. The framework was divided into four components: temporal predictor, spatial predictor, prediction aggregator, and inflection predictor. The temporal predictor considered only historical data of the target location using linear regression. The spatial predictor considered global data using the mean and median in the region around a location, using an artificial neural network (ANN). The ANN excluded information from other locations, hence the results were insensitive to surrounding conditions and global trends. The prediction aggregator used a regression tree with three inputs to combine the temporal and spatial predictors with local meteorological data. The inflection predictor identified sudden drops in the target feature value, i.e., PM2.5, by finding situations where specific feature thresholds were surpassed. The framework pre-trained the temporal predictor and spatial predictor, and then trained the prediction aggregator to combine the results. However, this can over-fit the data, since the same features are adopted as delimiters in the prediction aggregator; and the spatial predictor considers the mean and median values in a large region, which can reduce sensitivity. Zheng also separated the wind directions into 8 classes and did not consider terrain information, although PM2.5 diffusion depends strongly on wind, which is greatly influenced by terrain. Therefore, Zheng's model may not be suitable for undulating terrain.

Soh et al. [19] previously proposed a k-nearest neighbor by DTW distance (kNN-DTWD) method to consider time sequence similarities for different locations and then compared surrounding locations using the k-nearest neighbor by Euclidean distance (kNN-ED). DTW [20], [21] is a well-known method to calculate similarity between two time
series, hence we applied DTW to calculate location temporal distances and used k-nearest neighbor (kNN) to identify locations with the most similar temporal behavior [19]. This differs from the approach of Zheng et al. [5] in that we were seeking to enhance prediction sensitivity over short durations rather than longer periods. Experimentally, kNN-DTWD outperformed kNN-ED on average. However, for some special cases, e.g., flat areas with numerous locations, the kNN-ED method proved advantageous. The kNN-DTWD method was well suited to selecting locations with similar behavior, and the kNN-ED method was best avoided where there was a mountain between the locations.

This work combines kNN-ED benefits for featureless landscapes with kNN-DTWD benefits for complex landscapes, allowing the model to derive optimal combinations of the two based on the training data, as detailed in the following sections.

III. PROBLEM DEFINITION

We first identify the locations with the most influential spatial-temporal relationships to the target location and then predict sequences for the target location based on time sequence features that include spatial information. Hence, the locations are fixed, and the time sequences vary according to their positions. Spatial features could also impact the sequences, e.g., a mountain between two locations. Therefore, we use both temporal and spatial relationships in the predictive model. To extract the features for prediction, we first define related spatial and temporal relationship parameters.

Suppose we have a set of locations, $L = \{l_1, l_2, \ldots, l_n\}$, and a set of features, $F = \{f_1, f_2, \ldots, f_m\}$. Each location has geographical information, such as latitude and longitude, hence we define the location coordinate (LC) as

Definition 1: Location Coordinate.
$$Lc_i = (l_i, x_i, y_i), \quad l_i \in L \tag{1}$$
where $x_i$ and $y_i$ are the latitude and longitude at location $l_i$, respectively. Since related location features could improve prediction, we define the distance between two location coordinates as
$$Ds_{q,c} = \mathrm{dist}_{location}(Lc_q, Lc_c) = \mathrm{dist}_{location}((l_q, x_q, y_q), (l_c, x_c, y_c)), \quad l_q, l_c \in L,\ q \neq c \tag{2}$$
to find the most closely related locations in the spatial domain; and the spatial relationships sequence (SRS) set as

Definition 2: Spatial Relations Sequence Set.
$$SRS = \{Ds_{1,2}, Ds_{1,3}, \ldots, Ds_{n-1,n}\}, \quad Ds_{i,i} = 0,\ 0 < i < n + 1 \tag{3}$$
where the matrix elements are calculated from (2); $n$ is the number of locations; and the diagonal elements, $Ds_{i,i}$, are all zero. The most relevant locations are SRS_cand($l_i$, $k$), the set of $k$ locations with the smallest spatial distance to $l_i$.

We consider the features of these relevant locations in the spatial domain, since locations with similar feature sequences may help to improve prediction. We define the feature sequence interval (FSI) for a specific location as

Definition 3: Feature Sequence Interval with Location.
$$S(l_i, f_j, t_{st,ft}) = \{e(l_i, f_j, t_{st}), e(l_i, f_j, t_{st+1}), \ldots, e(l_i, f_j, t_{ft})\}, \quad l_i \in L,\ f_j \in F,\ st < ft \tag{4}$$
where $l_i$ has feature $f_j$ that varies from start time, $st$, to finish time, $ft$ ($st < ft$); and $e(l_i, f_j, t_k)$ represents the measured value of $f_j$ at $t_k$.

We can express the distance between feature sequences for any two locations as
$$Dt_{q,c,t_{st,ft}} = \mathrm{dist}_{sequence}(S(l_q, f_{target}, t_{st,ft}), S(l_c, f_{target}, t_{st,ft})), \quad l_q, l_c \in L,\ q \neq c \tag{5}$$
to obtain the most related locations in the temporal domain. Before using the measure, we must select a feature, $f_{target}$, as the target prediction sequence: this study chose PM2.5, but other targets could be employed as required. Then we can use (5) to calculate the temporal relations sequence (TRS) set,

Definition 4: Temporal Relations Sequence Set.
$$TRS_{t_{st,ft}} = \{Dt_{1,2,t_{st,ft}}, Dt_{1,3,t_{st,ft}}, \ldots, Dt_{n-1,n,t_{st,ft}}\} \tag{6}$$
We then select candidates TRS_cand($l_i$, $k$), the set of $k$ locations with the least difference from location $l_i$. To consider both relationships simultaneously, we define the spatial-temporal relations (STR) set, i.e., the set of locations most strongly related to $l_i$, as

Definition 5: Spatial-Temporal Relations Set.
$$STRS\_cand(l_i, k) = SRS\_cand(l_i, k) \cup TRS\_cand(l_i, k), \quad l_i \in L \tag{7}$$
where we use the union of SRS_cand($l_i$, $k$) and TRS_cand($l_i$, $k$) rather than the intersection, to provide a larger number of relationships for the model to learn; and since some location behaviors may differ from adjacent locations, the intersection would have fewer candidates (or none), resulting in the loss of useful target features. We define the spatial-temporal predictor (STP) using STRS_cand($l_i$, $k$) as

Definition 6: Spatial-Temporal Sequence Prediction.
$$M(STRS\_cand(l_i, k))_{[t_{lb}, t_c]} = S(l_i, f_{target}, t_{st',ft'}), \quad t_{lb} < t_c < st' \leq ft' \tag{8}$$
to build the model $M$ to predict a target feature sequence, where $M$ returns a sequence set, $S$, of the target features for the period $t_{st'}$ to $t_{ft'}$. $S$ is generated from the most similar time series compared with $t_{lb}$ to $t_c$, where $t_{lb}$ is the lookback time. Section IV proposes the solution method for this prediction problem.

IV. PREDICTION MODEL FRAMEWORK

To find the most relevant relationships between locations we apply spatial-temporal analysis to explore sequence delays and interactions between locations using historical
temporal patterns, and consider location feature trends for potential factors. We consider adjacent locations or locations with similar temporal patterns because they have high correlations with the target location. After importing processed data into the system, we determine the most related locations to the target location using the proposed kNN-ED and kNN-DTWD relationship extractors, and then generate training datasets from the top k related locations. Finally, we train the deep learning based model and compare prediction performances. Figure 2 shows the proposed predictive model framework, which comprises four main components as follows.
• The temporal relationships extractor (TRE) obtains air quality features from the target location meteorological data over the previous few hours using an LSTM model.
• The spatial-temporal relationships extractor (SRE) uses an ANN model to obtain air quality feature data from related locations selected by kNN-ED or kNN-DTWD.
• The terrain extractor (TE) obtains terrain information in the vicinity of the target location and uses a CNN to extract interactions between terrain and air quality features.
• The merge layer provides the STP to combine the discrete component outcomes. In some cases, predictions based on historical target location data are more relevant, whereas in other cases, such as windy days, spatial data should be given a higher weight. The fully connected ANN layer can learn the weights from the training data.

FIGURE 2. Predictive model (ST-DNN) framework.

The following sections introduce the methods to mine spatial-temporal relationships and the ST-DNN predictive model.

A. MINING SPATIAL-TEMPORAL RELATIONSHIPS FROM RELATED LOCATIONS

1) k-NEAREST NEIGHBOR BY EUCLIDEAN DISTANCE (kNN-ED)

We calculated Euclidean distance between the locations using their geographical coordinates, as shown in Figure 3. The 20 selected candidates, shown in green in Figure 3, are denoted as SRS_cand($l_i$, $k$) and used to train the predictive model. The procedure is shown in Algorithm 1.

FIGURE 3. Example derivation (kNN-ED, k = 5).

2) k-NEAREST NEIGHBOR BY DTW DISTANCE (kNN-DTWD)

We used the DTW algorithm to calculate the distance between two sequences by minimizing the errors in shifting and scaling between the sequences, as shown in Figure 4. Although the sequences are dissimilar using Euclidean distance, DTW can restore sequence distortions by mapping the data points to corresponding intervals. Thus, DTW identified the most strongly related temporal relationships to the target location and calculated the time series feature distances between locations. The distances were sorted and the top k most similar locations chosen as candidates to predict the target location sequence. We refer to this method as kNN-DTWD. Figure 5 shows that the PM2.5 time series exhibit various shift and scale differences. S7 (purple) has delays compared with S1 (red) and S2 (orange), as highlighted within the red circle. We calculated the degree of similarity between members of the candidate set, TRS_cand($l_i$, $k$), using conventional
DTW, hence eliminating time shift and scaling effects, and identified TRS_cand($l_i$, $k$) candidates, i.e., those with the strongest temporal relationship to the target location.

Unfortunately, this method cannot calculate the degree of similarity between time series with missing data. We tackle this problem as follows. (i) Two selected sequences are divided into a plurality of common interval sequences, choosing the shortest interval threshold, $l_{min}$, to filter short, i.e., not meaningful, sequences. (ii) We then apply DTW to the filtered sequences and convert the calculated values to unit similarity as shown in Figure 6. (iii) Finally, TRS_cand($l_i$, $k$) for the predictive model is generated using Algorithm 2.

Algorithm 1 Geographical Relationship Set Generator (kNN-ED)
Input: Target station l_i; set of location coordinates Lc, where l_i ∉ Lc; number of candidates k
Output: Set of locations SRS_cand(l_i, k)
  Let SRS_cand ← ∅
  for each lc ∈ Lc do
    Calculate the distance between l_i and lc: ED(l_i, lc)
    SRS_cand ← SRS_cand ∪ {(lc, ED(l_i, lc))}
  end
  Sort SRS_cand by ED(l_i, lc)
  if k ≤ size of SRS_cand then
    SRS_cand(l_i, k) ← first k elements of SRS_cand
  else
    SRS_cand(l_i, k) ← SRS_cand
  end

Algorithm 2 Temporal Similarity Set Generator (kNN-DTWD)
Input: Target station l_i; set of locations L, where l_i ∉ L; number of candidates k; target feature f_target; time interval t_{st,ft}
Output: Set of locations TRS_cand(l_i, k)
  Let TRS_cand ← ∅
  for each l ∈ L do
    Calculate the similarity between l_i and l:
    dist_dtw ← DTWDsim(S(l_i, f_target, t_{st,ft}), S(l, f_target, t_{st,ft}), l_min)
    TRS_cand ← TRS_cand ∪ {(l, dist_dtw)}
  end
  Sort TRS_cand by dist_dtw
  if k ≤ size of TRS_cand then
    TRS_cand(l_i, k) ← first k elements of TRS_cand
  else
    TRS_cand(l_i, k) ← TRS_cand
  end

FIGURE 4. Euclidean and DTW matching.

FIGURE 5. Example PM2.5 time series data (seven locations).

Figure 6 is explained in detail as follows. When two time series have non-missing values simultaneously and the common interval length exceeds $l_{min}$ (i.e., $l_a > l_{min}$, Figure 7), the common interval is included in the DTW similarity. In contrast, $l_b < l_{min}$ (Figure 7), hence that interval is ignored in the DTW calculation. Although ignoring short intervals causes some loss of information, this method effectively removes most noise-related errors.

The average unit distance is defined as
$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{\sum_{i=1}^{n} l_i} \tag{9}$$
where $d_i$ is the distance in a common interval, and $l_i$ is the length of that interval. This combines multiple fragment sequences to facilitate overall similarity identification. Figure 8 shows two interval distances, $d_1$ and $d_2$, calculated by DTW, and their common interval lengths, $l_1$ and $l_2$, respectively, recorded for calculating the average unit distance.

B. PREDICTION MODEL DESIGN

The proposed ST-DNN model combines target location temporal information, and related location spatial-temporal and terrain information (see Figure 2). The data flow includes target and related location historical data, i.e., pollutants, meteorological conditions, and target features and their trends over the previous few hours. These data were input to the LSTM, adaptive temporal extractor (ASE), and ANN. We used a matrix of 121 square sections for terrain data, i.e., 11×11 coordinate lines at 500 m intervals, where the central square in the grid represents the local location.
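
As an illustration of the kNN-DTWD similarity measure of Section IV-A (steps (i)-(iii), Algorithm 2, and Eq. (9)), the following is a minimal Python sketch, not the authors' implementation. It assumes two hourly series of equal length aligned in time, with missing values encoded as NaN, and an absolute-difference cost inside DTW; the function names `dtw_distance`, `common_intervals`, `average_unit_distance`, and `knn_dtwd` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost (assumed)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def common_intervals(a, b, l_min):
    """Yield (start, end) ranges where both series are observed for more than l_min steps."""
    start = None
    for t in range(len(a) + 1):
        observed = t < len(a) and not (math.isnan(a[t]) or math.isnan(b[t]))
        if observed and start is None:
            start = t
        elif not observed and start is not None:
            if t - start > l_min:  # keep only intervals longer than the threshold
                yield start, t
            start = None


def average_unit_distance(a, b, l_min):
    """Eq. (9): sum of per-interval DTW distances divided by the sum of interval lengths."""
    total_d, total_l = 0.0, 0
    for s, e in common_intervals(a, b, l_min):
        total_d += dtw_distance(a[s:e], b[s:e])
        total_l += e - s
    # Assumption: series with no usable common interval are treated as infinitely distant.
    return total_d / total_l if total_l else float("inf")


def knn_dtwd(target, others, k, l_min):
    """Return the names of the k series with the smallest average unit distance to target."""
    scored = sorted((average_unit_distance(target, s, l_min), name) for name, s in others.items())
    return [name for _, name in scored[:k]]
```

Dividing by the total common length, rather than averaging the per-interval distances, keeps a long well-matched interval from being outweighed by a short noisy one, which appears to be the intent of Eq. (9).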

FIGURE 6. Similarity measure procedure.

FIGURE 7. Shortest interval threshold.

FIGURE 8. Average unit distance calculation example.

Thus, there are 120 unobserved points, whose AQIs are calculated by inverse distance weighting (IDW) [22]. We convolve the relative elevation with the unobserved point AQIs to reduce AQI impact at higher elevation and provide this matrix as input to the CNN. CNN inputs can be adjusted later to increase the resolution, e.g., using LASS open source data.

We set $l_{min}$ = 6 hours in kNN-DTWD [23], [24], i.e., the minimum time interval for meteorological forecasts. Pollutants, meteorological conditions, and target feature(s) of locations with high similarity (determined using kNN-ED and kNN-DTWD) were input to the LSTM and ASE without pretraining. A two-layer ANN was employed to combine TRE, SRE, and TE. The final prediction was the deviation between the target feature value at $t_c$ and that at some future time $t_{c+h}$, where $t_{c+h} < ft$.

Figure 9 shows the model structure. Air quality and meteorological condition data sources are input to the LSTM and ASE, and terrain related data are input to the CNN. The models are merged via side by side concatenation, and the variables are passed to the following layer. The model is trained separately for each hour over the subsequent 48 hours, since the effect of the current status varies across future time intervals. Thus, we pair the inputs with target feature deviations in the various time intervals to train multiple models with the same structure corresponding to the different time intervals. The advantage of this structure is that the input sizes are constant, regardless of the location and time interval.

FIGURE 9. Proposed prediction model (ST-DNN) structure.

1) TEMPORAL RELATIONS

The TRE uses historical target location features as inputs to predict the future time series. The input time series for PM2.5 and other concentrations are continuous and coherent, and can be divided into low frequency (trends) and high frequency (rapid changes) information. Since LSTM models historical time series behavior, we use the TRE LSTM to obtain target location time series trends, whereas the ANN uses current data only and hence is sensitive to rapid changes. Thus, the LSTM and ANN provide low and high frequency information, respectively, from the sequences.

The LSTM TRE models trends for PM2.5 and PM10 concentrations as well as local meteorological data (wind speed, wind direction, humidity, and temperature) over the previous six hours; and the ASE TRE increases the model sensitivity using the same features as the LSTM TRE. Previous studies have verified the relevance of these features with regard to air quality [11].

2) SPATIAL-TEMPORAL RELATIONS

Pollutant dispersal means that air quality at one location can be spatially correlated with that at other sites. The SRE uses historical features of locations in the spatial-temporal neighborhood as inputs, since air quality at a given location is affected by local emissions as well as emissions in surrounding areas. Therefore, we devised an SRE to predict target location air quality based on AQIs and meteorological data from other locations. Partitioning spaces into regions using circles of various diameters overlooks terrain impacts, e.g., a mountain between locations.

Partitioned region data are often mean or mode values, which can be highly inaccurate, particularly in areas with few locations. Thus, the SRE for a location requires data mining from locations in the spatial-temporal neighborhood using kNN-ED and kNN-DTWD, including AQIs and meteorological conditions (wind speed and direction) for the previous 6 hours. Similar to the target location, spatial-temporal neighborhood time series features are also continuous and coherent. Thus, we included the ANN SRE to increase model sensitivity by considering spatial-temporal neighborhood impacts.

3) TERRAIN EXTRACTOR

The relationships between locations vary due to various barriers and altitude differences. Therefore, terrain related data were included to enhance location correlations. Terrain data in the vicinity of the locations were captured using a matrix of 121 square sections, i.e., 11×11 coordinate lines at 500 m intervals. We adopted the approach of Ferrero et al. [25] to define the relationships between terrain and PM2.5. The elevation of each point, $elev$, was normalized as
$$H_s = \frac{elev - elev_{st}}{elev_{st}} \tag{10}$$
and transformed to the relative elevation,
$$elev_{rel} = \frac{1}{e^{H_s}} \tag{11}$$
where $H_s$ is the standardized elevation, to decrease the impact of higher altitudes, as shown in Figure 10. Figure 11 shows that PM2.5 distribution is strongly related to elevation.
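
To make Eqs. (10) and (11) and the 11×11 terrain matrix concrete, the following is a minimal Python sketch under stated assumptions rather than the authors' code. It assumes that $elev_{st}$ is the (non-zero) elevation of the monitoring station, that IDW uses a power of 2, that station positions are planar offsets in metres from the target location at the grid centre, and that the IDW AQI and the relative elevation are combined by element-wise multiplication. The names `relative_elevation`, `idw`, and `terrain_input` are hypothetical.

```python
import math


def relative_elevation(elev, elev_st):
    """Eqs. (10)-(11): H_s = (elev - elev_st) / elev_st, then elev_rel = 1 / e^{H_s}."""
    h_s = (elev - elev_st) / elev_st
    return math.exp(-h_s)


def idw(x, y, stations, power=2.0):
    """Inverse distance weighting of station AQI values at point (x, y).

    stations is a list of (x, y, aqi) tuples; power=2 is an assumed default.
    """
    num = den = 0.0
    for sx, sy, aqi in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return aqi  # the point coincides with a station
        w = 1.0 / d ** power
        num += w * aqi
        den += w
    return num / den


def terrain_input(stations, elevation, elev_st, size=11, spacing=500.0):
    """Build the size x size matrix: IDW AQI scaled by relative elevation (assumed product)."""
    half = size // 2
    grid = []
    for r in range(size):
        row = []
        for c in range(size):
            x, y = (c - half) * spacing, (r - half) * spacing
            row.append(idw(x, y, stations) * relative_elevation(elevation[r][c], elev_st))
        grid.append(row)
    return grid
```

With this form, a grid point at the station elevation keeps its interpolated AQI unchanged (factor 1), while a point at twice the station elevation is scaled by $e^{-1} \approx 0.37$.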

FIGURE 10. Relative elevation function design [25].

FIGURE 11. Relative elevation function of PM2.5 distribution.

We then calculated AQIs for each location using IDW and multiplied this by $elev_{rel}$ to reduce location impact at higher elevations. Thus, we could extract relationships between locations that would otherwise have been obscured, such as the impact of wind direction and wind speed for locations adjacent to mountains. The CNN was designed to extract useful information, particularly in the spatial domain. We used the CNN to include spatial correlations between locations and extract the obscured terrain relationships.

4) MERGE LAYER

We concatenated the TRE, SRE, and TE outcomes, and passed these to the ANN. Thus, the model applied local and global inputs for prediction. In some cases, local information is more relevant than global, e.g., when air circulation between locations is weak. On the other hand, global dispersion may be a major factor determining air quality when wind speed is high. Thus, we looked for meteorological condition trends at a given location, such as wind speed, wind direction, humidity, temperature, etc., to weight prediction calculations provided by the three components.

V. EXPERIMENTS

A. DATASETS

We chose PM2.5 as the predictive feature because it is the most widely reported metric and also the most difficult air pollutant to predict. Similar architectures can be applied to predict other pollutants.

1) TAIWAN DATASET

We collected air quality and meteorological data every hour from 76 locations in 23 Taiwanese cities. Each location recorded PM2.5 and PM10, but not all locations
recorded levels of other pollutants or meteorological data. More than 28 million instances were collected from January 2014 to September 2017 by the Taiwan Environmental Protection Administration (TWEPA). PM2.5 was measured using a Met-One BAM-1020, and the other measurement instruments are shown in [27]. The data were partitioned into a training set (Jan. 2014 to Sep. 2016) and a testing set (Oct. 2016 to Sep. 2017) at an approximately 2:1 ratio, based on seasonal cycles in Taiwan, where the testing set covers all four seasons.

Figure 12 shows the TWEPA PM2.5 index over the study period; TWEPA forecast data were not included. The proposed PM2.5 prediction used PM2.5, PM10, wind speed, wind direction, temperature, and humidity features.

FIGURE 12. TWEPA PM2.5 index for the Taiwan dataset [26].

2) BEIJING DATASET

The Beijing dataset focused on Beijing city [15], excluding weather forecasts because they were based on physical models. We modified the proposed model to predict min-max outputs to ensure a fair comparison. The dataset contained records from May 2014 to Apr. 2015. Months 5, 6, 8, 9, 11, 12 of 2014 and 2, 3 of 2015 were adopted as training data and the balance as testing data, covering the last month of each season, i.e., 7, 10 of 2014 and 1, 4 of 2015.

B. METRICS AND GROUND TRUTH

Air quality predictions were compared with ground truth results obtained at each location, and the mean absolute error (MAE) [28]–[31] was adopted to evaluate prediction performance,
$$e = \frac{\sum_{i=1}^{n} |\hat{y}_i - y_i|}{n} \tag{12}$$
where $\hat{y}_i$ and $y_i$ are the prediction and ground truth for the $i$th hour, respectively; and $n$ is the number of measurements within a time interval.

We calculated MAE for 1-6, 7-12, 13-24, and 25-48 hours, which are common time frames in conventional weather forecasting. Lower absolute error indicates higher prediction accuracy.

C. COMPARATIVE MODELS AND PARAMETER SETTINGS

We compared the proposed ST-DNN method predictions with a number of baselines.
1) Baselines that feed all features into a single model, e.g., linear regression (LR_ALL) and neural network (ANN_ALL), without treating various features differently.
2) Baselines using kNN-ED, kNN-DTWD, and both methods (referred to as adaptive baselines), to identify the top k related locations.
3) Zheng et al. [5] proposed a predictive model that considered local spatial data similar to the approach adopted in the current study but used average neighbor values over a specified region.

FIGURE 13. Influence of past h hours (x-axis: hour(s), y-axis: µg/m3).

To determine a suitable lookback time, $h$, for the models, we evaluated different $h$ for predicting the next hour, $t + 1$, as shown in Figure 13. Hence, we chose $h$ = 6 hours
VOLUME 6, 2018 38193

as the minimum time interval for meteorological forecasting [23], [24]. We also used an independent LSTM for each feature with h = 6 hours, and chose k = 3 for kNN-ED and kNN-DTWD in the proposed ST-DNN model. We used a 5×5 filter for the CNN with one convolutional layer. The CNN did not include a max pooling layer since we did not extract the highest concentrations but the most strongly related concentrations. We chose a linear activation function to consider negative correlations. We used a total of 4276 parameters for the ST-DNN models for the Taiwan dataset.

D. PERFORMANCE OF PREDICTIONS
1) TAIWAN DATASET
We first checked if the input components to the ST-DNN model were significant, examining all Adaptive ANN (A), LSTM (L), and CNN (C) combinations to identify the best models, as shown in Figure 14. For the first hour prediction, we found that the ST-DNN model with all components (A+L+C) outperformed all other models, and the CNN-only model outperformed all other combinations for 2-6 hour predictions. Therefore, we include only the A+L+C and C models in further discussions.

FIGURE 14. Proposed ST-DNN prediction model with different component combinations for the Taiwan dataset.

FIGURE 15. Short period (1-6 hour) prediction comparisons for the Taiwan dataset.

Figure 15 compares overall model performance at all locations. The proposed ST-DNN(A+L+C) model exhibits superior performance to all models when predicting the next hour, whereas ST-DNN(C) has superior prediction performance for further hours. The Zheng model [5] is superior to the other considered models except ST-DNN(C) for 4-6 hour predictions, since that model partitions regions at 30, 100, and 300 km, which helps detect diffusion from distant locations. The proposed ST-DNN methods choose nearby or recently similar locations that have insufficient diffusion data. Although ST-DNN(C) has less information, it surpasses all other model performances (see Figure 15). In contrast, feeding all the data into ANN and linear regression models provided poorer prediction performance due to interference between locations. Generally, kNN-DTWD based models show superior performance to those based on kNN-ED, and Adaptive methods perform similarly to kNN-DTWD based models.

FIGURE 16. Eastern area short period (1-6 hour) prediction performance.

FIGURE 17. Northeastern area short period (1-6 hour) prediction performance.

Figure 16 and Figure 17 compare overall model performance in different cities. Most model trends are similar, with ST-DNN(C) outperforming all other models in some cities, including Hua-Tung (Figure 16) and ILan (Figure 17), which have sparse locations and complicated terrain. CNN extracted neighborhood elevation and determined diffusion delays (direction and time) for target features. For example, when predicting t + 3, higher weight filters were farther from the target location. Thus, CNN may be a good approach for further study.
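
The CNN setting reported for these experiments (one convolutional layer, a 5×5 filter, linear activation, no max pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: the gridded input, the zero padding, and the names `conv5x5_linear`, `grid`, and `kernel` are assumptions made only for illustration.

```python
from typing import List

Grid = List[List[float]]


def conv5x5_linear(grid: Grid, kernel: Grid, bias: float = 0.0) -> Grid:
    """Apply one 5x5 filter with linear (identity) activation and zero padding.

    No pooling is applied, so the output keeps the input resolution, and
    negative weights (negative correlations) pass through unchanged.
    The operation is the cross-correlation form used by most DL libraries.
    """
    if len(kernel) != 5 or any(len(row) != 5 for row in kernel):
        raise ValueError("kernel must be 5x5")
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = bias
            for dr in range(-2, 3):
                for dc in range(-2, 3):
                    rr, cc = r + dr, c + dc
                    # Cells outside the grid contribute nothing (zero padding).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[dr + 2][dc + 2] * grid[rr][cc]
            out[r][c] = acc  # linear activation: no ReLU, no clipping
    return out


# Hypothetical 5x5 grid of interpolated PM2.5 values around a target cell.
grid = [[float(r * 5 + c) for c in range(5)] for r in range(5)]

# Hypothetical filter: positive weight on the cell two columns west of the
# target and a negative weight on the target cell itself.
kernel = [[0.0] * 5 for _ in range(5)]
kernel[2][0] = 1.0
kernel[2][2] = -0.5

center = conv5x5_linear(grid, kernel)[2][2]
```

A filter whose largest weights sit away from the centre, as in this toy kernel, corresponds to the observation above that longer horizons weight cells farther from the target location.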

However, CNN also has limitations where reliable data for each coordinate are unavailable. This study used inverse distance weighting (IDW) to interpolate location data for each coordinate. The Zheng model also showed poor performance in these areas due to the complex terrain, with the mean and mode in partitions leading to inaccurate estimation. However, the Zheng model exhibits superior performance in the southern area, particularly for 3-6 hour predictions. The southern area is somewhat flatter, as shown in Figure 18, and hence a given location may be significantly affected by dispersion from other cities.
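
The IDW interpolation mentioned above can be sketched as follows. The power parameter `p = 2`, the planar Euclidean distance, and the names `idw` and `stations` are assumptions for illustration; the text does not state which settings were used.

```python
import math
from typing import Sequence, Tuple

Station = Tuple[float, float, float]  # (x, y, observed value)


def idw(x: float, y: float, stations: Sequence[Station], p: float = 2.0) -> float:
    """Inverse distance weighted estimate at (x, y) from station observations."""
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # query point coincides with a station
        w = 1.0 / d ** p  # closer stations receive larger weights
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("at least one station is required")
    return num / den


# Hypothetical stations on a plane (km) with PM2.5 readings.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint = idw(1.0, 0.0, stations)    # equidistant, so the plain average
near_first = idw(0.5, 0.0, stations)  # pulled toward the first station
```

Because the weights depend only on distance, a sketch like this ignores terrain between a station and the grid cell, which is consistent with the limitation noted for sparsely monitored areas.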

FIGURE 18. Southern area short period (1-6 hour) prediction performance.

FIGURE 19. Flatland city short period (1-6 hour) prediction performance.

FIGURE 20. Flatland suburbs short period (1-6 hour) prediction performance.

FIGURE 21. Basin city short period (1-6 hour) prediction performance.

FIGURE 22. Mountain city short period (1-6 hour) prediction performance.

FIGURE 23. Island short period (1-6 hour) prediction performance.

Figures 19–23 group the Taiwan dataset prediction results by different terrain: flatland city, flatland suburb, basin city, mountain city, and island, respectively. ST-DNN(C) exhibits the best flatland performance (Figures 19 and 20) for all locations, with the Zheng model superior to other models for 5-6 hour, except ST-DNN(C), due to the lack of distant location data for the other models. However, the Zheng model performs poorly for complex terrain (Figures 21–23).

Overall, ST-DNN(A+L+C) provides superior prediction for the immediate next hour, whereas ST-DNN(C) provides superior prediction for 2-6 hour. This dataset shows that different places should use different models considering the impact of distances, delays, and terrains among locations and surroundings. The proposed approach always provides superior performance to adaptive baseline models verifying that considering individual feature trends is more appropriate
than combining inputs. The proposed models also outperform the Zheng model, confirming the importance of observations at individual locations, rather than using neighborhood averages. Thus, the proposed ST-DNN(C) model is a suitable approach for further study.

2) BEIJING DATASET
We modified the models somewhat for the Beijing dataset to ensure robust prediction. We neglected forecast data, since these were based on physical models, and omitted the CNN component from the proposed ST-DNN model since elevations were not available in Beijing.

FIGURE 24. Short period (1-6 hour) prediction performance for the Beijing dataset.

Figure 24 shows that the proposed Adaptive_LSTM, kNN-DTWD_ANN, and Adaptive_ANN models outperform all other models. Linear regression models exhibit the poorest prediction among the considered models. The Zheng model exhibits intermediate performance between ANN and linear regression, because using TRE with region mean or mode results in loss of sensitivity to other locations. There are also mountainous regions in the north, northwest, and west of Beijing City, and partitioning across mountain regions also reduces precision.

FIGURE 25. Extended time interval prediction performance for the Beijing dataset.

Figure 25 shows that overall model performances are similar to short time performances (Figure 24).

E. BEHAVIOUR OF PROPOSED MODEL
1) ANALYSIS OF LSTM AND CNN
We used the Taiwan dataset Tainan training data to investigate patterns between different delays and identify LSTM prediction improvements, comparing PM2.5 variations from t_{c+0} versus t_{c+1}, t_{c+0} versus t_{c+2}, and t_{c+0} versus t_{c+3}, as shown in Figures 26–28, respectively. Very short prediction (Figure 26) exhibits an almost linear relationship, then a sharp rise and subsequent sharp drop, whereas this linear relationship disperses for longer prediction times (Figures 27 and 28). Thus, LSTM only helps improve first hour prediction.

FIGURE 26. Example relationships between t_{c+0} and t_{c+1}.

FIGURE 27. Example relationships between t_{c+0} and t_{c+2}.

FIGURE 28. Example relationships between t_{c+0} and t_{c+3}.

Limitations of DTW mean that kNN-DTWD chooses the most similar candidate locations but neglects long time delay. However, the delay interval is important for long term predictions. The proposed method could be improved by shifting
delayed sequences for DTW identified candidates and choosing the most relevant locations. Comparing different component combinations in ST-DNN, CNN always improves model performance.
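
The suggested improvement, shifting a candidate sequence by a delay before comparing it with the target, can be sketched as below. This is an illustrative assumption rather than the authors' method: the plain O(nm) DTW, the absolute-difference cost, and the names `dtw_distance` and `best_delay` are hypothetical, and the DTWD measure used in the paper may differ.

```python
from typing import List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def best_delay(target: Sequence[float], candidate: Sequence[float],
               h: int, max_delay: int) -> Tuple[int, float]:
    """Return (delay, distance) minimising DTW between the last h target hours
    and the candidate window that ended `delay` hours earlier."""
    recent = target[-h:]
    results: List[Tuple[float, int]] = []
    for delay in range(max_delay + 1):
        end = len(candidate) - delay
        if end - h < 0:
            break  # not enough candidate history for this delay
        results.append((dtw_distance(recent, candidate[end - h:end]), delay))
    if not results:
        raise ValueError("candidate series is shorter than h")
    dist, delay = min(results)
    return delay, dist


# Hypothetical example: the target repeats the candidate's peak 3 hours later.
candidate = [5.0, 5.0, 5.0, 10.0, 30.0, 50.0, 30.0, 10.0, 5.0, 5.0, 5.0, 5.0]
target = [5.0, 5.0, 5.0] + candidate[:-3]
delay, dist = best_delay(target, candidate, h=6, max_delay=6)
```

Ranking candidate locations by the distance at their best delay, instead of at zero delay, is one way a long propagation delay could be taken into account when choosing the k most relevant locations.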

TABLE 1. Proposed model with CNN (ST-DNN(C)) with and without elevation.

Including relative elevation helps reduce location interference and provides excellent prediction performance for 1-6 hour with ST-DNN(C). Table 1 compares performances with and without relative elevations, confirming the observed improvement. Lu et al. [32] showed that fine particulate matter concentrations decrease at higher altitude, hence relative elevation is important and should be considered in further studies. CNN can also extract the delay factor from surrounding target features, improving prediction performance.

In particular, Taiwan includes many mountainous and hilly terrain regions. Since CNN with terrain factors can learn the propagation patterns from PM2.5 variations of the specific location and its neighbor locations, the proposed ST-DNN(C) model was superior to LSTM, which only considered time series variation at the given location. Hence, the experimental results confirm that CNN provides superior prediction performance in complex terrain.

Thus, the proposed ST-DNN(A+L+C) model provided superior performance for first hour predictions, but the proposed ST-DNN(C) model provides superior longer time frame predictions for 2-6 hour. Overall, ST-DNN has two main characteristics: selecting spatial-temporal candidates using kNN-ED and kNN-DTWD increased model sensitivity; and including terrain information using CNN incorporates elevation impacts.

2) INFLUENCE OF k
Figure 29 compares kNN-ED and kNN-DTWD with ANN to investigate the performance effects of k. kNN-ED MAE increases as k increases, whereas kNN-DTWD MAE decreases. Thus, kNN-DTWD outperforms kNN-ED, which only considers spatially adjacent locations, omitting terrain, whereas kNN-DTWD chooses temporally similar locations, with similar responses, relaxing geographical impact restrictions.

FIGURE 29. Nearest neighbor (k) effect on prediction performance.

VI. REAL APPLICATIONS FOR THE PROPOSED MODEL
Figure 30 shows the web user interface of the proposed system [16], where icons on the map represent monitoring stations and the number associated with an icon denotes its PM2.5 concentration. The color of the icons indicates the air quality at that location based on TWEPA PM2.5 standards [26]. In addition, users can see the predictions of PM2.5 for the next 1-6 hour. More details can be shown by clicking on a specific station, which opens a pop-up chart showing recent trends in the air quality and meteorological conditions. Clicking on the analysis of historical data allows users to select results for the past 1, 3, or 7 days. Clicking on the replay icon opens a timeline, which allows users to check the air quality at any time in a particular location to observe the PM2.5 diffusion.

FIGURE 30. Website of QQ air quality.

FIGURE 31. Dashboard of QQ air quality.

Figure 31 shows the dashboard interface of the real-time PM2.5 and the PM2.5 predictions for the next 1-6 hour. The dashboard will automatically update information every hour. In addition, users can choose any specific station according to
their demands. This service aims to help people avoid exposure to unhealthy air.

FIGURE 32. Facebook chatbot of QQ air quality.

In addition, we developed a Facebook chatbot to send PM2.5 forecasts to users who subscribed to the daily report of specific stations, as shown in Figure 32. Users can query the current PM2.5 by typing the station name or sending their current GPS location. The service is easy to use on a mobile phone or PC and reminds people of the air quality before they go outside.

VII. CONCLUSION
This study proposed an air quality forecasting system using data driven models, ST-DNN, to predict PM2.5 over 48 hours. The proposed method is also generally applicable to other pollutants.

The proposed ST-DNN shows that including an LSTM module enhanced first hour predictions, with CNN module inclusion being more useful for longer time frame predictions, since CNN can extract the temporal delay factor from surrounding target features by learning spatial information.

We evaluated the proposed models using real-world Taiwan and Beijing datasets. Relevant location selection was verified to be important, with inclusion of all locations causing increased model noise and hence poorer prediction performance. The proposed methods outperformed all baselines and comparative models considered.

Future research will improve the proposed model performances, and consider specific Airbox sensor source models as features to tune and mitigate noise due to machine differences [33], [34]. We will also consider more chemical features that affect PM2.5 components [35]. For long period predictions, we will consider concentric circles for different distance partitions or clusters to emphasize air pollution propagation delay effects.

Ultimately, we intend to detect air pollution sources, including domestic and transboundary pollution. To control air pollution, we must first understand how it is generated and propagated. Only then can we devise effective solutions to reduce pollution sources. Therefore, future pollution studies will be greatly dependent on the proposed model.

ACKNOWLEDGMENT
The authors are grateful to the Taiwan Environmental Protection Administration for providing the monitoring data used in this study.

REFERENCES
[1] Y.-F. Xing, Y.-H. Xu, M.-H. Shi, and Y.-X. Lian, “The impact of PM2.5 on the human respiratory system,” J. Thoracic Disease, vol. 8, no. 1, pp. E69–E74, 2016.
[2] H.-J. Chu, C.-Y. Lin, C.-J. Liau, and Y.-M. Kuo, “Identifying controlling factors of ground-level ozone levels over southwestern Taiwan using a decision tree,” Atmos. Environ., vol. 60, pp. 142–152, Dec. 2012.
[3] A. P. K. Tai, L. J. Mickley, and D. J. Jacob, “Correlations between fine particulate matter (PM2.5) and meteorological variables in the United States: Implications for the sensitivity of PM2.5 to climate change,” Atmos. Environ., vol. 44, no. 32, pp. 3976–3984, 2010.
[4] C.-M. Liu, C.-Y. Young, and Y.-C. Lee, “Influence of Asian dust storms on air quality in Taiwan,” Sci. Total Environ., vol. 368, nos. 2–3, pp. 884–897, 2006.
[5] Y. Zheng et al., “Forecasting fine-grained air quality based on big data,” in Proc. 21th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, New York, NY, USA, 2015, pp. 2267–2276.
[6] Y. Hwa-Lung and W. Chih-Hsin, “Retrospective prediction of intraurban spatiotemporal distribution of PM2.5 in Taipei,” Atmos. Environ., vol. 44, no. 25, pp. 3053–3065, 2010.
[7] B. S. Beckerman et al., “A hybrid approach to estimating national scale spatiotemporal variability of PM2.5 in the contiguous United States,” Environ. Sci. Technol., vol. 47, no. 13, pp. 7233–7241, 2013.
[8] H.-J. Chu, H.-L. Yu, and Y.-M. Kuo, “Identifying spatial mixture distributions of PM2.5 and PM10 in Taiwan during and after a dust storm,” Atmos. Environ., vol. 54, pp. 728–737, Jul. 2012.
[9] H.-W. Chen, C.-T. Tsai, C.-W. She, Y.-C. Lin, and C.-F. Chiang, “Exploring the background features of acidic and basic air pollutants around an industrial complex using data mining approach,” Chemosphere, vol. 81, no. 10, pp. 1358–1367, 2010.
[10] A. Kurt and A. B. Oktay, “Forecasting air pollutant indicator levels with geographic models 3 days in advance using neural networks,” Expert Syst. Appl., vol. 37, no. 12, pp. 7986–7992, 2010.
[11] Y. Zheng, F. Liu, and H.-P. Hsieh, “U-air: When urban air quality inference meets big data,” in Proc. 19th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2013, pp. 1436–1444.
[12] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[13] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[14] Air Quality Index Historical Data. Accessed: Aug. 22, 2016. [Online]. Available: http://taqm.epa.gov.tw/taqm/tw/YearlyDataDownload.aspx
[15] Y. Zheng et al. (2015). Forecasting Fine-Grained Air Quality Based on Big Data. [Online]. Available: http://research.microsoft.com/apps/pubs/?id=246398
[16] QQ Air Quality (QQAQ). Accessed: Jun. 6, 2018. [Online]. Available: https://qqaq.ee.ncku.edu.tw
[17] QQAQ Facebook Fan Page. Accessed: Jun. 6, 2018. [Online]. Available: https://www.facebook.com/QQAirQuality/
[18] S. Qin, F. Liu, C. Wang, Y. Song, and J. Qu, “Spatial-temporal analysis and projection of extreme particulate matter (PM10 and PM2.5) levels using association rules: A case study of the Jing-Jin-Ji region, China,” Atmos. Environ., vol. 120, pp. 339–350, Nov. 2015.
[19] P.-W. Soh, K.-H. Chen, J.-W. Huang, and H.-J. Chu, “Spatial-temporal pattern analysis and prediction of air quality in Taiwan,” in Proc. Int. Conf. Ubi-Media Comput. (UMedia), Aug. 2017, pp. 1–6.
[20] T. Rakthanmanon et al., “Searching and mining trillions of time series subsequences under dynamic time warping,” in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 262–270.
[21] E. Keogh and C. A. Ratanamahatana, “Exact indexing of dynamic time warping,” Knowl. Inf. Syst., vol. 7, no. 3, pp. 358–386, 2005.
[22] D. Shepard, “A two-dimensional interpolation function for irregularly-spaced data,” in Proc. 23rd ACM Nat. Conf., 1968, pp. 517–524.
[23] METAPP. Accessed: Jun. 6, 2018. [Online]. Available: http://www.metapp.org.tw/index.php/weatherknowledge/37-typhoon/83-2009-01-22-08-04-48
[24] H. L. Chang, “Evaluation and application of the short-range (0-6hr) pqpfs from an ensemble prediction system based on laps,” Ph.D. dissertation, Graduate Inst. Atmos. Phys., Nat. Central Univ., Taoyuan, Taiwan, 2014.
[25] L. Ferrero, G. Mocnik, B. S. Ferrini, M. G. Perrone, G. Sangiorgi, and E. Bolzacchini, “Vertical profiles of aerosol absorption coefficient from micro-Aethalometer data and Mie calculation over Milan,” Sci. Total Environ., vol. 409, no. 14, pp. 2824–2837, 2011.
[26] PM2.5 Index. Accessed: Aug. 22, 2016. [Online]. Available: http://taqm.epa.gov.tw/taqm/en/fpmi.htm
[27] TWEPA Instruments. Accessed: Jun. 6, 2018. [Online]. Available: https://taqm.epa.gov.tw/taqm/tw/b0102-3.aspx
[28] Y. Pan, Y. Liu, B. Xu, and H. Yu, “Hybrid feedback feedforward: An efficient design of adaptive neural network control,” Neural Netw., vol. 76, pp. 122–134, Apr. 2016.
[29] J. de Jesús Rubio, “Stable Kalman filter and neural network for the chaotic systems identification,” J. Franklin Inst., vol. 354, no. 16, pp. 7444–7462, 2017.
[30] Y. Pan and H. Yu, “Biomimetic hybrid feedback feedforward neural-network learning control,” IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 6, pp. 1481–1487, Jun. 2017.
[31] J. de Jesús Rubio, “Error convergence analysis of the SUFIN and CSUFIN,” Appl. Soft Comput., to be published. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1568494618301881
[32] S.-J. Lu, D. Wang, X.-B. Li, Z. Wang, Y. Gao, and Z.-R. Peng, “Three-dimensional distribution of fine particulate matter concentrations and synchronous meteorological data measured by an unmanned aerial vehicle (UAV) in Yangtze River Delta, China,” Atmos. Meas. Techn. Discuss., pp. 1–19, Mar. 2016. [Online]. Available: https://www.atmos-meas-tech-discuss.net/amt-2016-57/
[33] L.-J. Chen, Y.-H. Ho, H.-H. Hsieh, S.-T. Huang, H.-C. Lee, and S. Mahajan, “ADF: An anomaly detection framework for large-scale PM2.5 sensing systems,” IEEE Internet Things J., vol. 52, no. 2, pp. 559–570, Aug. 2018.
[34] N. Moustafa, G. Creech, E. Sitnikova, and M. Keshk. (2017). “Collaborative anomaly detection framework for handling big data of cloud computing.” [Online]. Available: https://arxiv.org/abs/1711.02829
[35] W. M. Hodan and W. R. Barnard, “Evaluating the contribution of PM2.5 precursor gases and re-entrained road emissions to mobile source PM2.5 particulate matter emissions,” MACTEC Federal Programs, Research Triangle Park, NC, USA, 2004.

PING-WEI SOH received the B.S. degree in electrical engineering from National Cheng Kung University, Tainan, Taiwan, in 2015, and the M.S. degree from the Institute of Computer and Communication Engineering, National Cheng Kung University, in 2018. His research interests include spatial-temporal and air pollution issues.

JIA-WEI CHANG received the Ph.D. degree from the Department of Engineering Science, National Cheng Kung University in 2017. He is currently an Adjunct Assistant Professor with the Department of Engineering Science and a Post-Doctoral Fellow Researcher with the Department of Electrical Engineering, National Cheng Kung University. His research interests include natural language processing, artificial intelligence, data mining, and e-learning technologies.

JEN-WEI HUANG received the B.S. and Ph.D. degrees in electrical engineering from National Taiwan University, Taiwan, in 2002 and 2009, respectively. He was a Visiting Scholar with the IBM Almaden Research Center from 2008 to 2009, an Assistant Professor with Yuan Ze University from 2009 to 2012, and a Visiting Scholar with the University of Chicago in 2016. He is currently an Assistant Professor with the Department of Electrical Engineering, National Cheng Kung University, Taiwan. He majors in computer science and is familiar with data mining. His research interests include data mining, mobile computing, and bioinformatics. Among these, social network analysis, spatial-temporal data mining, and multimedia information retrieval are his special interests. In addition, some of his research is on data broadcasting, privacy preserving data mining, e-learning, and Fin-tech.

100% (2)
Novel Watchful Eyes Part1
17 pages
03 WSC 13
No ratings yet
03 WSC 13
36 pages
Introduction To MATLAB 3rd Ed Etter Problems
No ratings yet
Introduction To MATLAB 3rd Ed Etter Problems
11 pages
Broiler MGMT Guide 2008
100% (2)
Broiler MGMT Guide 2008
72 pages
A Study On Natural Disaster Management
No ratings yet
A Study On Natural Disaster Management
8 pages
06-C-Oma-I.01, R.01 (30 Aug 24)
No ratings yet
06-C-Oma-I.01, R.01 (30 Aug 24)
522 pages
Cruz, Kyrene Kesia B. 11 STEM-H
No ratings yet
Cruz, Kyrene Kesia B. 11 STEM-H
5 pages
MATH2
No ratings yet
MATH2
7 pages
Developing English Power
No ratings yet
Developing English Power
20 pages
Wood Pellet Heating Systems - The Earthscan Expert Handbook On Planning, Design and Installation - Dilwyn Jenkins (2010)
No ratings yet
Wood Pellet Heating Systems - The Earthscan Expert Handbook On Planning, Design and Installation - Dilwyn Jenkins (2010)
142 pages
Revision of DNV Standard For Offshore Wind Turbine PDF
100% (1)
Revision of DNV Standard For Offshore Wind Turbine PDF
8 pages

Adaptive Deep Learning-Based Air Quality Prediction Model Using The Most Relevant Spatial-Temporal Relations

Received May 6, 2018, accepted May 29, 2018, date of publication June 22, 2018, date of current version

July 30, 2018.

Adaptive Deep Learning-Based Air Quality

Corresponding author: Jen-Wei Huang ([email protected])

INDEX TERMS Dynamic time warping (DTW), convolutional neural network (CNN), long-short-term

The remainder of this paper is organized as follows.

VOLUME 6, 2018 38187

FIGURE 2. Predictive model (ST-DNN) framework.

temporal patterns, and consider location feature trends for

Figure 6 is explained in detail as follows. When two time

FIGURE 6. Similarity measure procedure.

Since LSTM models historical time series behavior,

FIGURE 9. Proposed prediction model (ST-DNN) structure.

and transformed to the relative elevation,

FIGURE 11. Relative elevation function of PM2.5 distribution.

global inputs for prediction. In some cases, local informa-

recorded levels of other pollutants or meteorological data. performance,

FIGURE 14. Proposed ST-DNN prediction model with different component

FIGURE 17. Northeastern area short period (1-6 hour) prediction

Figure 16 and Figure 17 compare overall model perfor-

However, CNN also has limitations where reliable data

FIGURE 18. Southern area short period (1-6 hour) prediction

FIGURE 23. Island short period (1-6 hour) prediction performance.

mountain city, and island, respectively. ST-DNN(C) exhibits

FIGURE 26. Example relationships between $t_{c+0}$ and $t_{c+1}$.

Figure 24 shows that the proposed Adaptive_LSTM, kNN-

FIGURE 28. Example relationships between $t_{c+0}$ and $t_{c+3}$.

delayed sequences for DTW identified candidates and choos-

TABLE 1. Proposed model with CNN (ST-DNN(C)) with and without

FIGURE 29. Nearest neighbor (k) effect on prediction performance.
