A Deep Learning Model Integrating a Wind Direction-based Dynamic Graph Network for Ozone Prediction
ARTICLE INFO

Editor: Meng Gao

Keywords: Ozone prediction; Dynamic graph structure; Graph neural network; Deep learning

ABSTRACT

Ozone pollution is an important environmental issue in many countries. Accurate forecasting of ozone concentration enables relevant authorities to enact timely policies to mitigate adverse impacts. This study develops a novel hybrid deep learning model, named wind direction-based dynamic spatio-temporal graph network (WDDSTG-Net), for hourly ozone concentration prediction. The model uses a dynamic directed graph structure based on hourly changing wind direction data to capture evolving spatial relationships between air quality monitoring stations. It applies the graph attention mechanism to compute dynamic weights between connected stations, thereby aggregating neighborhood information adaptively. For temporal modeling, it uses a sequence-to-sequence model with an attention mechanism to extract long-range temporal dependencies. Additionally, it integrates meteorological predictions to guide the ozone forecasting. The model achieves mean absolute errors of 6.69 μg/m³ and 18.63 μg/m³ for 1-h and 24-h prediction, respectively, outperforming several classic models. The model's IAQI prediction accuracy is above 75 % at all stations, with a maximum of 81.74 %. It also exhibits strong capabilities in predicting severe ozone pollution events, with a 24-h true positive rate of 0.77. Compared to traditional static graph models, WDDSTG-Net demonstrates the importance of incorporating short-term wind fluctuations and transport dynamics for data-driven air quality modeling. In principle, it may serve as an effective data-driven approach for the concentration prediction of other airborne pollutants.

* Corresponding author at: College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, China.
E-mail address: [email protected] (Y. He).
1 These authors contributed equally to this work.
https://doi.org/10.1016/j.scitotenv.2024.174229
Received 5 February 2024; Received in revised form 11 June 2024; Accepted 21 June 2024
Available online 23 June 2024
0048-9697/© 2024 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
S. Wang et al. Science of the Total Environment 946 (2024) 174229
1. Introduction

In recent decades, ozone pollution has emerged as an urgent environmental problem in many countries (Gao et al., 2017). Prolonged exposure to elevated ozone concentrations has been associated with adverse cardiovascular and respiratory impacts, along with decreased crop yields (Cao et al., 2020; Wang et al., 2017a). Since 2013, China has established >1300 air quality monitoring stations, enabling real-time monitoring of major air pollutants (Wang et al., 2022a). With the continuous efforts of the government, five of the six major air pollutants (PM2.5, PM10, SO2, NO2, and CO) have been controlled to a certain extent. However, tropospheric ozone concentrations persistently remain high (Chen et al., 2020; Fan et al., 2020; Maji et al., 2019). Therefore, it is necessary to develop reliable methods for predicting ozone concentration so that timely measures can be taken to mitigate its pollution.

Ozone prediction has long been a challenging task, owing to the complex photochemical reaction mechanism involved in O3 formation, the intricate non-linear relationships between O3 and its precursors NOx and volatile organic compounds (VOCs), and the involvement of multiple meteorological conditions in the reaction process. Consequently, O3 exhibits complex dynamic spatiotemporal processes. Currently, models for ozone prediction are mainly divided into two types: numerically driven models and data-driven models. Numerically driven models, which have been widely used in air pollutant prediction for the past few decades, include the Community Multi-scale Air Quality (CMAQ) model (Foley et al., 2010) and the Weather Research and Forecasting Model with Chemistry (WRF-Chem) (Chuang et al., 2011; Wang et al., 2020b). However, numerical models depend heavily on parameter settings, and their prediction accuracy can be greatly influenced by the incompleteness of the dataset and insufficient understanding of ozone-related chemical formation mechanisms (Chen et al., 2021; Wang et al., 2020a; Zhan et al., 2018). Data-driven models can achieve predictions without substantial prior knowledge and can be divided into three subcategories: statistical models, shallow machine learning (SML) models, and deep learning (DL) models (Zhang et al., 2022). Statistical models, such as the autoregressive moving average (ARMA) model (Zhu and Lu, 2016), the autoregressive integrated moving average (ARIMA) model (Wang et al., 2017b), and the multiple linear regression (MLR) model (Chelani, 2019), can capture linear relationships but cannot handle complex nonlinear problems. Owing to their inherent nonlinear mapping capabilities, SML models such as random forest (RF) (Feng et al., 2019; Song et al., 2021), extreme gradient boosting (XGBoost) (Ma et al., 2020), and support vector machine (SVM) (He et al., 2018) have experienced exponential development. However, as the volume and dimensions of data have increased substantially, SML models have struggled to perform effectively in handling enormous datasets and high dimensions. As an emerging form of SML models, DL models have quickly become an efficient way to process complex multi-dimensional data and have exhibited state-of-the-art performance in air quality prediction (Kim et al., 2021; Li et al., 2019). Examples include the recurrent neural network (RNN) (Freeman et al., 2018), long short-term memory recurrent network (LSTM) (Li et al., 2017), gated recurrent network (GRU) (Cheng et al., 2021), sequence-to-sequence (Seq2Seq) (Cho et al., 2014), and convolutional neural network (CNN) (Eslami et al., 2020). Spatiotemporal hybrid models, which can extract both temporal and spatial features (Chen et al., 2022; Dun et al., 2022; Le et al., 2020; Mao et al., 2022; Wang et al., 2022a, 2022b, 2022c), have also achieved favorable prediction performance. Hu et al. constructed a CNN-BiLSTM-GRU model to forecast the hourly ozone concentration at stations in Beijing. The predictive performance of the model was superior to that of standalone models such as BiLSTM (Hu et al., 2023). Pak et al. employed a CNN-LSTM model, utilizing CNNs and LSTMs to extract spatial and temporal dependencies, respectively. They used data from 12 meteorological sites and air quality monitoring sites in Beijing as model inputs to predict the next day's 8-h average O3 concentration. The results showed that the root mean square error (RMSE) was reduced by approximately 35 % compared to the LSTM model (Pak et al., 2018). Wang et al. adopted a DNN and an attention-based Seq2Seq model for spatial modeling of geographic information and temporal modeling of historical pollutant and meteorological information, respectively, to predict the 24-h O3 concentration in Beijing. The results demonstrated that the mean absolute error (MAE) was reduced by at least 5.8 % relative to the compared models (Wang et al., 2020a). Although the above CNN-RNN-based models can extract the spatiotemporal dependencies between historical data from monitoring stations, two significant problems remain. Firstly, RNN variants have certain limitations in processing long-term input data (Hao et al., 2019). Secondly, CNNs are only suitable for Euclidean space, whereas the monitoring network belongs to non-Euclidean space and has complex topological relationships. Using CNNs may result in the loss of topological information and a decrease in prediction accuracy (Wei et al., 2020; Zhang et al., 2020).

Artificial intelligence technology has developed continuously and achieved milestone results in many fields, and a variety of emerging methods can learn key information from various types of data, thus promoting scientific research (Xu et al., 2023). Graph neural networks (GNN) have emerged as a novel model for extracting and mapping intricate topological relationships in non-Euclidean spaces (Lin et al., 2018; Ouyang et al., 2021; Qi et al., 2019; Wu et al., 2021). Yu et al. proposed the Graph Interpolation Attention Recursive Network (GinAR), which employs interpolation attention and adaptive graph convolution to accurately reconstruct spatiotemporal dependencies, thereby achieving prediction and advancing the development of GNN to a certain extent (Yu et al., 2024). Liu et al. utilized GCN-LSTM and GCN-GRU as main predictors and employed a Q-learning algorithm to realize the ensemble of the two predictors, ultimately developing the GCN-LSTM-GRU-Q model, which demonstrated excellent performance in pollutant forecasting (Liu et al., 2021). Wu et al. employed a residual neural network (ResNet) to adaptively learn the deep spatial correlations among monitoring sites, utilized a graph convolutional network (GCN) to capture the topological information of the entire monitoring site network, and used a BiLSTM to extract the temporal correlations of auxiliary information and meteorological data. They achieved hourly O3 concentration prediction for Shanghai, demonstrating improved prediction performance compared to other models (Wu et al., 2023). In current GNN-based research, the spatial relationships among graph nodes are computed and constructed based on distances between stations, so the graph structure remains static. In reality, ozone is more likely to spread downwind and more difficult to diffuse upwind, leading to anisotropic ozone behavior under wind effects. Considering the horizontal transport and diffusion caused by wind speed and wind direction, ozone concentration is dynamically influenced by the transport of pollutants from neighboring stations. Therefore, it is necessary to construct a dynamic graph structure based on changing wind direction, instead of a static graph structure, for dynamically modeling the monitoring network, which could in theory improve prediction performance. However, to the best of our knowledge, no studies have considered a dynamic graph structure in ozone prediction.

To this end, a novel spatiotemporal hybrid deep learning model called wind direction-based dynamic spatio-temporal graph network (WDDSTG-Net) was proposed to predict hourly ozone concentration.
Specifically, in the spatial feature extraction module, a dynamic directed graph structure was constructed for the monitoring network based on changing wind direction to achieve dynamic evolution processing, and the dynamic weights between stations were calculated with the graph attention mechanism, which allowed for comprehensive extraction and aggregation of dynamic spatial neighborhood information. For temporal modeling, a sequence-to-sequence model with attention was used to focus on the crucial information and extract long-range dependencies by assigning different weights to different moments when processing time series. Additionally, meteorological predictions were incorporated to guide the forecasting.

2. Materials and method

2.1. Study area and dataset description

Hangzhou was selected as the study area in this work. Hangzhou ranges from 29.18° to 30.55° N and 118.35° to 120.50° E. It is located in East China, on the lower reaches of the Qiantang River in northern Zhejiang. Fig. 1 shows the geographical location of the study area and the distribution of monitoring stations, including 10 air quality monitoring stations and 1 meteorological monitoring station. The latitude and longitude coordinates of each station are shown in Table S1. In 2012, after the Ministry of Environmental Protection of the People's Republic of China (MEP) published new air quality standards, various cities across the country gradually established environmental monitoring stations. The air pollution dataset was derived from the China National Environmental Monitoring Center (CNEMC) for the 10 stations from 2016 to 2022. Additionally, the meteorological dataset for the same period was obtained from the U.S. National Climate Data Center. The specific data format and data description of the two datasets are detailed in Table S2. Histograms representing the data distribution of key pollutant features (Fig. S1) and meteorological features (Fig. S2) are also provided, facilitating a visual inspection of the distribution of the original data. It should be noted that the Zhaohuiwuqu station has had no data records since January 1, 2021, giving an hourly ozone concentration missing rate as high as 31.3 %. Therefore, Zhaohuiwuqu station is not considered. For meteorological data, there is only one meteorological station in Hangzhou. In fact, it is common that there are no matching meteorological stations near many air quality monitoring stations. This is acceptable because the air quality monitoring stations are close to one another, making meteorological data from nearby stations highly similar. Therefore, the Xiaoshan meteorological station is selected to represent the meteorological features of the entire study area.

2.2. Data preprocessing process

2.2.1. Outlier processing

Outliers can arise for various reasons, including machine abnormalities, sudden power outages, or manual counting errors. Boxplots of each pollutant were used to remove outliers to prevent adverse effects. A boxplot typically includes five values which, from bottom to top, are the lower limit, the first quartile, the median, the third quartile, and the upper limit. Data above the upper limit or below the lower limit are considered outliers and are treated as null values.

2.2.2. Missing value processing

Different strategies were used for missing value processing based on the duration of continuous data gaps. If the value is missing continuously for <3 h, linear interpolation is used. If the value is missing continuously for >3 h, cubic spline interpolation is used. The whole interpolation process ensures that the interpolated data lie between the upper threshold and the lower threshold.

2.2.3. Normalization

Because different features differ markedly in physical meaning and numerical range, normalization is essential before the data are input into the model. Normalized data accelerate model convergence and improve model prediction performance. This study uses linear normalization, mapping the data into a numerical space ranging from 0 to 1, as given in Eq. (1):

$$x_{scaled} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{1}$$

where $x_{scaled}$ is the normalized value, $x$ is the original data, $x_{min}$ is the minimum value in the original data, and $x_{max}$ is the maximum value in the original data.
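The three preprocessing steps of Section 2.2 can be illustrated with a minimal sketch using only the Python standard library. It is not the authors' code. The function names, the 1.5 × IQR whisker rule for the boxplot limits, and the choice of `max_gap=2` (gaps shorter than 3 h) are assumptions made for illustration; longer gaps are simply left empty here, where the paper applies cubic spline interpolation.

```python
import statistics

def boxplot_limits(values, whisker=1.5):
    """Return (lower, upper) boxplot limits; the 1.5*IQR whisker is an assumption."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr

def mask_outliers(values, lower, upper):
    """Replace readings outside [lower, upper] with None (treated as null values)."""
    return [v if v is not None and lower <= v <= upper else None for v in values]

def fill_short_gaps(values, max_gap=2):
    """Linearly interpolate interior gaps of at most max_gap consecutive hours."""
    filled = list(values)
    i, n = 0, len(filled)
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        gap = i - start
        # Only fill gaps that have a valid reading on both sides.
        if start > 0 and i < n and gap <= max_gap:
            left, right = filled[start - 1], filled[i]
            for k in range(gap):
                filled[start + k] = left + (right - left) * (k + 1) / (gap + 1)
    return filled

def min_max_scale(values):
    """Eq. (1): map the observed values linearly onto [0, 1]."""
    observed = [v for v in values if v is not None]
    x_min, x_max = min(observed), max(observed)
    return [None if v is None else (v - x_min) / (x_max - x_min) for v in values]

# Hypothetical hourly ozone readings with one missing value and one spike.
series = [40, 42, None, 46, 48, 500, 50, 52, 54, 56]
lower, upper = boxplot_limits(series)
cleaned = fill_short_gaps(mask_outliers(series, lower, upper))
scaled = min_max_scale(cleaned)
```

In this toy series the spike of 500 falls above the upper limit, is set to null, and is then filled together with the original missing hour by linear interpolation before scaling.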
Fig. 3. Two dynamic directed graph construction strategies. (a) Cartesian directed graph. (b) Angle-based directed graph.
aligned with the north. Consequently, only station E, located in the first quadrant, theoretically will not exert an influence on the target station A, and stations in other quadrants have the potential to influence station A. Therefore, directed paths are created from all stations except station E towards station A, enabling the construction of the spatial relationship

$$\langle v_c, v_a \rangle = \begin{cases} \text{exists}, & \theta \in (-90^\circ, 90^\circ) \\ \text{not exists}, & \text{otherwise} \end{cases} \tag{19}$$

where $\theta$ can be calculated from the wind direction angle $\alpha$ and the azimuth angle $\beta_{c,a}$. The condition $\theta \in (-90^\circ, 90^\circ)$ can be converted into
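The edge rule of Eq. (19) can be illustrated with a short standard-library sketch. Several points are assumptions rather than details taken from the paper: the wind direction $\alpha$ is read in the usual meteorological sense (the direction the wind blows from), $\theta$ is taken as the signed angle between the direction of air travel and the azimuth $\beta_{c,a}$ from candidate station c to target station a, and the azimuth uses a flat-earth approximation. All function names and coordinates are hypothetical.

```python
import math

def azimuth_deg(src, dst):
    """Bearing from src to dst, degrees clockwise from north (flat-earth approximation)."""
    (lat1, lon1), (lat2, lon2) = src, dst
    d_north = lat2 - lat1
    d_east = (lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    return math.degrees(math.atan2(d_east, d_north)) % 360.0

def signed_angle_deg(a, b):
    """Signed difference a - b wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def directed_edges(stations, wind_from_deg):
    """Return edges (c, a) for which theta lies in (-90, 90) degrees, as in Eq. (19)."""
    travel = (wind_from_deg + 180.0) % 360.0  # assumed: direction the air moves toward
    edges = []
    for c, pos_c in stations.items():
        for a, pos_a in stations.items():
            if c == a:
                continue
            theta = signed_angle_deg(travel, azimuth_deg(pos_c, pos_a))
            if -90.0 < theta < 90.0:
                edges.append((c, a))
    return edges

# Hypothetical stations on a north-south line, given as (latitude, longitude).
stations = {"A": (30.0, 120.0), "N": (30.1, 120.0), "S": (29.9, 120.0)}
edges_north_wind = directed_edges(stations, wind_from_deg=0.0)
```

With a northerly wind the sketch keeps only the edges that point downwind (N to A, N to S, A to S). Rebuilding the edge list for every hourly wind reading is what makes the graph dynamic.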
where $e_{i,j}^t$ represents the correlation coefficient between station $i$ and station $j$ at time $t$, $a \in \mathbb{R}^{2M'}$ is a single-layer feedforward neural network, $W \in \mathbb{R}^{M' \times M}$ is a trainable weight matrix enabling a linear transformation from input feature dimension $M$ to output feature dimension $M'$, $N_i^t$ represents the set of all stations connected to station $i$ at time $t$, and $\|$ represents the concatenation operation. To facilitate computation and comparison of attention coefficients for the same station, the softmax function is employed for normalization.

$$\alpha_{i,j}^t = \frac{\exp\left(\mathrm{LeakyReLU}\left(e_{i,j}^t\right)\right)}{\sum_{k \in N_i^t} \exp\left(\mathrm{LeakyReLU}\left(e_{i,k}^t\right)\right)} \tag{24}$$

where $\mathrm{LeakyReLU}(\cdot)$ represents an activation function, and the obtained $\alpha_{i,j}^t$ represents the degree of influence of neighbor station $j$ on target station $i$ at time $t$.

The above process is applied to all neighboring stations of the target station to obtain their respective normalized attention coefficients. Subsequently, these coefficients are used to aggregate all neighborhood spatial features, followed by concatenation with the target station's initial information to facilitate the fusion process. This results in the update of the target station's internal features.

$$c_i^t = \left( \mathrm{FC}\left( \sum_{j \in N_i^t} \alpha_{i,j}^t \cdot W x_j^t \right) \Big\| \; x_i^t \right) \tag{25}$$

where $c_i^t$ represents the updated state feature of node $i$ after utilizing the graph attention mechanism to aggregate neighborhood information. Since the node states (variable feature states) of the neighboring stations and the target station change at each time step, the attention weight coefficients also vary with the node states of the stations, which reflects the dynamism of each station. This process effectively distinguishes the degree of influence of each station and reflects the dynamic influence from all neighboring stations. These weights are used to aggregate neighborhood information and update target station states, allowing the model to adaptively extract dynamically varying spatial characteristics.

2.4.4. Temporal feature extraction module

The Seq2Seq model with attention mechanism is used for temporal modeling. GRU is chosen as the decoder; like LSTM, it mitigates vanishing gradients, but it uses fewer parameters. Bi-LSTM is selected as the encoder. As an extension of LSTM, Bi-LSTM uses two LSTM networks to process time sequences in both forward and backward directions. This allows the model to leverage information from both directions simultaneously, demonstrating exceptional performance in simulating long-term sequences with temporal dependencies (Bahdanau et al., 2016). For Bi-LSTM, the hidden state obtained from the forward LSTM cell is the forward hidden state $\overrightarrow{h_t}$, while the hidden state obtained from the backward LSTM cell is referred to as the backward hidden state $\overleftarrow{h_t}$. The concatenation of these two components constitutes the entire hidden state of the Bi-LSTM at time $t$, denoted as:

$$h_t = \left[ \overrightarrow{h_t}, \overleftarrow{h_t} \right] \tag{26}$$

After a dynamic graph neural network is used to extract spatial dependencies from the time series, the sequence corresponding to the target station is concatenated and input into the Bi-LSTM as the encoder. The main work of the encoder is to iteratively extract the temporal characteristics of the input sequence, compressing spatiotemporal information into an intermediate variable $s_0$, while also generating hidden states at each historical time step. The GRU, as the decoder, receives the intermediate variable $s_0$ as the initial input, uses the attention mechanism in the decoder to adaptively select the hidden states generated in the encoder, and models the dynamic temporal dependences of the source time series and the generated sequence. This process, combined with weather data predicted in the meteorological prediction module, leads to the output. As an example, the future time $t'$ is taken to introduce the calculation process of attention.

$$e_{t't} = v_e^{T} \tanh\left(W_1 s_{t'-1} + W_2 h_t + b\right) \tag{27}$$

$$\alpha_{t't} = \frac{\exp\left(e_{t't}\right)}{\sum_{t=1}^{r} \exp\left(e_{t't}\right)} \tag{28}$$

where $W_1$, $W_2$, $v_e$ and $b$ are trainable parameters, $r$ represents the input time step, $e_{t't}$ represents the correlation between the previous decoder hidden state $s_{t'-1}$ and the encoder's hidden state $h_t$ at time $t$, and $\alpha_{t't}$ represents the normalized relevance weight derived from $e_{t't}$. The context vector at time $t'$ can be obtained by the weighted summation of $\alpha_{t't}$.

$$c_{t'} = \sum_{t=1}^{r} \alpha_{t't} h_t \tag{29}$$

Further, the next hidden state and the output of this time step can be calculated.

$$s_{t'} = \mathrm{GRU}\left(s_{t'-1}, m_{t'}, c_{t'}\right) \tag{30}$$

$$y_{t'} = \tanh\left(\mathrm{FC}\left(s_{t'}, m_{t'}, c_{t'}\right)\right) \tag{31}$$

where $m_{t'}$ represents the meteorological data predicted by the meteorological prediction module at time $t'$, and $y_{t'}$ represents the output at time $t'$.

2.4.5. Fusion module

After spatial and temporal dependencies are extracted from the time series, the output is concatenated with the meteorological data output by the meteorological prediction module. The concatenated data is then input into a fully connected layer:

$$y^i = \delta\left(W\left[O_1 : O_2\right] + b\right) \tag{32}$$

where $W$ and $b$ are trainable parameters, $:$ represents the concatenation operation, $\delta$ is a non-linear activation function, and $O_1$ and $O_2$ represent the outputs of the meteorological prediction module and the temporal feature extraction module, respectively. Finally, through the collaboration and guidance of the above modules, the predicted hourly ozone concentrations for the target station are obtained as a sequence $y^i = \left[y_1^i, y_2^i, \ldots, y_n^i\right]$.

2.5. Process details

Based on the specific model architecture of WDDSTG-Net described above, the detailed procedure for ozone prediction is summarized (Fig. S9). Generally, it contains the following parts.

(1) Data collection and preprocessing. Meteorological data and air quality data in Hangzhou from 2016 to 2022 were collected. Boxplots were used to set thresholds for outlier removal, missing data were filled in according to the duration of continuous gaps, and the dataset was segmented to construct input-output pairs for prediction, completing the data cleansing process.
(2) Feature selection. Data analysis techniques such as autocorrelation analysis, Pearson correlation analysis, and mutual information were used to select the combinations of input features.
(3) Model training and parameter tuning. The WDDSTG-Net model was constructed and trained on the training set. The hyperparameters of the model were fine-tuned using data from the validation set to achieve the optimal model structure.
(4) The optimally trained model was tested using the data from the testing set. The results were compared with other models based on various evaluation metrics to complete the ozone prediction.

2.6. Experimental settings

PyTorch is used as the deep learning framework to build the models. The time range of data in the dataset is from 0:00 on January 1, 2016, to 23:00 on December 31, 2022. The dataset is divided into 70 % for training, 15 % for validation, and 15 % for testing to ensure that each subset contains at least one complete year of data. The input time step is set to 24, which is optimized for better model performance (Fig. S10). The learning rate is 0.00001 (Fig. S11), and the batch size is set to 64 (Table S3). Models are trained for 200 epochs, and the early stopping strategy is used to prevent overfitting and improve training efficiency (patience is set to 5). Adaptive Moment Estimation (Adam) is selected as the optimizer and HuberLoss is selected as the loss function.

Three model evaluation metrics, including root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²), are used to evaluate the ozone prediction performance of models. The specific formulas are as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Obs_i - Pre_i \right| \tag{33}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( Obs_i - Pre_i \right)^2} \tag{34}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( Obs_i - Pre_i \right)^2}{\sum_{i=1}^{n} \left( Obs_i - \overline{Obs} \right)^2} \tag{35}$$

where $IAQI_{pred}$ represents the predicted IAQI for ozone and $IAQI_{true}$ represents the true IAQI for ozone. If the above conditions are not met, the prediction is considered inaccurate. Furthermore, based on the magnitude relationship between $IAQI_{pred}$ and $IAQI_{true}$, inaccurate predictions are further categorized into two types: overestimation or underestimation of IAQI. The accuracy, underestimation rate, and overestimation rate are calculated using the following formula, where $k$ represents the number of accurate predictions, underestimated predictions, and overestimated predictions, respectively, and $N$ represents the total number of samples.

$$\mathrm{Percent} = \frac{k}{N} \times 100\% \tag{38}$$

To further analyze the predictive ability of the model for high ozone concentrations, three evaluation metrics are introduced for an in-depth evaluation. These metrics are the true positive rate (TPR), false acceptance rate (FAR), and false positive rate (FPR). When using these metrics, a threshold needs to be set. Values above the threshold are regarded as positive samples of high ozone concentration, while values below the threshold are considered negative samples of low ozone concentration. TPR, also known as recall rate, reflects the ability of the model to correctly predict positive samples, representing the proportion of correctly predicted positive samples in the total positive samples. FAR, also known as false alarm rate, reflects the model's tendency to incorrectly identify predictions as positive samples, representing the proportion of negative samples incorrectly predicted as positive samples among all predicted positive samples. FPR, also known as the false recognition rate, reflects the model's tendency to misclassify negative samples, signifying the proportion of negative samples incorrectly predicted as positive samples among all actual negative samples. The relevant formulas are shown below:

$$\mathrm{TPR} = 1 - \mathrm{Rate}_{miss} = \frac{TP}{TP + FN} \tag{39}$$
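As an illustration, the evaluation metrics of this section can be computed with a few lines of standard-library Python. MAE, RMSE, R² and TPR follow Eqs. (33), (34), (35) and (39). The expressions used for FAR and FPR are derived here from the verbal definitions above, as FP/(TP + FP) and FP/(FP + TN), and are an interpretation rather than formulas quoted from the paper. The function names and the threshold value in the example are hypothetical.

```python
import math

def mae(obs, pre):
    """Eq. (33): mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pre)) / len(obs)

def rmse(obs, pre):
    """Eq. (34): root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pre)) / len(obs))

def r2(obs, pre):
    """Eq. (35): coefficient of determination."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pre))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def event_rates(obs, pre, threshold):
    """TPR (Eq. 39) plus FAR and FPR as interpreted from the text definitions."""
    tp = fp = fn = tn = 0
    for o, p in zip(obs, pre):
        actual, predicted = o > threshold, p > threshold
        if actual and predicted:
            tp += 1
        elif actual:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return {
        "TPR": _ratio(tp, tp + fn),
        "FAR": _ratio(fp, tp + fp),
        "FPR": _ratio(fp, fp + tn),
    }

# Hypothetical observations and predictions in ug/m3, with an assumed threshold of 160.
obs = [100, 180, 200, 90]
pre = [110, 170, 150, 170]
rates = event_rates(obs, pre, threshold=160)
```

Returning NaN for an empty denominator is a design choice of this sketch, so that a period with no high-ozone hours does not raise an error.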
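Returning to the spatial feature extraction module, the attention normalization of Eq. (24) and the neighborhood aggregation of Eq. (25) can be sketched in plain Python. This is a simplified illustration, not the WDDSTG-Net implementation: the raw coefficients $e_{i,j}^t$ and the transformed neighbor features $W x_j^t$ are passed in as inputs, the FC layer of Eq. (25) is omitted (treated as the identity), and the LeakyReLU slope of 0.2 is an assumed value. All names are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU activation; the negative slope of 0.2 is an assumption."""
    return x if x > 0 else slope * x

def attention_weights(scores):
    """Eq. (24): softmax over LeakyReLU(e_ij) across the neighbours of one station."""
    activated = {j: leaky_relu(e) for j, e in scores.items()}
    peak = max(activated.values())  # subtract the maximum for numerical stability
    exps = {j: math.exp(v - peak) for j, v in activated.items()}
    total = sum(exps.values())
    return {j: v / total for j, v in exps.items()}

def aggregate(x_i, transformed_neighbours, scores):
    """Eq. (25) without the FC layer: weighted neighbour sum concatenated with x_i."""
    alpha = attention_weights(scores)
    dim = len(next(iter(transformed_neighbours.values())))
    pooled = [
        sum(alpha[j] * transformed_neighbours[j][d] for j in alpha)
        for d in range(dim)
    ]
    return pooled + list(x_i)

# Target station with two upwind neighbours B and C at one time step (toy values).
scores = {"B": 0.0, "C": 0.0}                     # raw coefficients e_ij
neighbours = {"B": [2.0, 4.0], "C": [4.0, 8.0]}   # already transformed, W x_j
updated = aggregate([1.0], neighbours, scores)
```

Because the neighbour set comes from the wind-dependent edge list and the coefficients depend on the current node states, both the graph and the weights change from hour to hour.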
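The decoder attention of the temporal feature extraction module, Eqs. (27) to (29), can likewise be sketched without any deep learning framework. The sketch below scores each encoder hidden state against the previous decoder state, normalizes the scores with softmax, and forms the context vector. Parameters are plain nested lists rather than trained tensors, and all names and toy values are hypothetical.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def attention_context(s_prev, hidden_states, w1, w2, b, v_e):
    """Return (alpha, context) for one decoding step, following Eqs. (27)-(29)."""
    ws = matvec(w1, s_prev)
    scores = []
    for h in hidden_states:
        wh = matvec(w2, h)
        # Eq. (27): e = v_e^T tanh(W1 s_prev + W2 h + b)
        scores.append(
            sum(v * math.tanh(p + q + c) for v, p, q, c in zip(v_e, ws, wh, b))
        )
    # Eq. (28): softmax over the encoder time steps
    peak = max(scores)
    exps = [math.exp(e - peak) for e in scores]
    total = sum(exps)
    alpha = [x / total for x in exps]
    # Eq. (29): context vector as the weighted sum of encoder hidden states
    dim = len(hidden_states[0])
    context = [sum(a * h[d] for a, h in zip(alpha, hidden_states)) for d in range(dim)]
    return alpha, context

# Toy example: two encoder steps, zero weights, so attention is uniform.
hidden = [[1.0, 0.0], [3.0, 2.0]]
alpha, context = attention_context(
    s_prev=[0.5, -0.5],
    hidden_states=hidden,
    w1=[[0.0, 0.0]],
    w2=[[0.0, 0.0]],
    b=[0.0],
    v_e=[1.0],
)
```

In the full model the context vector would then enter the GRU update of Eq. (30) together with the predicted meteorological data; that step is not reproduced here.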
Table 1
Comparison of results of two directed construction strategies.
Strategy: Cartesian directed graph | Angle-based directed graph
Columns: Monitoring station | R², MAE, RMSE (Cartesian directed graph) | R², MAE, RMSE (Angle-based directed graph)
The bold text indicates better results for the same metric.
Fig. 5. The MAE and RMSE of different models in 24-h prediction. (a) Chengxiangzhen station. (b) Zhedanongda station.
stronger applicability in this study area, and also has better performance in longer-term predictions. Therefore, the angle-based directed graph strategy is used to construct a unified dynamic graph structure for the monitoring network with topological relationships.

3.3. Model prediction performance for ozone concentration

The WDDSTG-Net was used to predict the ozone concentration for each station in the next hour, and the results reported in the literature review (Zhang et al., 2022) were summarized as baselines (Table S6). A comparative analysis reveals that WDDSTG-Net exhibits significantly superior performance in all metrics compared to statistical models and deterministic models. It also closely approaches the performance of DL models, which were reported as the best performing in the literature review, and achieves lower MAE in the predictions for most stations. Such performance is commendable and promising.

Furthermore, prediction results from other studies were collected and compared with the proposed model. Taking Xixi station as an example, the detailed results are summarized in Table S7. For 1-h prediction, the WDDSTG-Net exhibited relatively better fitting and prediction performance than Seq2Seq (Jia et al., 2021) and Res-GCN-BiLSTM (Wu et al., 2023), with higher R² and lower MAE and RMSE, suggesting its potential advantage and prospects in short-term ozone forecasting to some extent. Based on the results, the performance of all monitoring stations was summarized (Fig. S13). It is evident that the trend lines fitted for each station closely align with y = x, visualizing the proximity and strong correlation between predicted and observed values and highlighting the quality of the prediction results.

In order to afford the government more policy response time in addressing potentially severe ozone-related problems, long-term prediction of ozone concentrations is crucial. Therefore, the prediction horizon was extended from the next hour to the next natural day (24 h). Taking Chengxiangzhen station and Zhedanongda station as examples, the 24-h prediction results are shown in Fig. 5. For all models, as the time step increases, both MAE and RMSE continuously increase and then level off, indicating that long-term prediction is more challenging and less accurate than short-term prediction. This suggests an accumulation of prediction errors over time. For Chengxiangzhen station, WDDSTG-Net performs similarly to other models for the first two hours. After 2 h, its MAE and RMSE are the lowest among all models, and MLP performs the worst, indicating its limitations in capturing inherent nonlinear relationships. For Zhedanongda station, WDDSTG-Net consistently outperforms the strong time-series models at all time steps, indicating its superior predictive performance. It can also be observed that WDDSTG-Net consistently outperforms DSSTG-Net at each time step, indicating that the proposed wind-direction-based dynamic directed graph structure exhibits superior predictive performance compared to the traditional static graph structure based on geographical adjacency relationships. Such performance shows that the WDDSTG-Net model can more effectively capture complex nonlinear relationships in longer-term ozone prediction, and has the potential to play a more important role in long-term ozone prediction.

The results of the 24-h prediction for all stations are shown in Table S8. MLP performs the worst at the majority of stations, confirming its weaker ability to capture nonlinear relationships and process time series. Both GRU and LSTM exhibit comparable and relatively good predictive performance across all stations, reflecting a relatively thorough exploration of deep temporal information within the time series. However, CNN-LSTM, which acts as a spatiotemporal hybrid model, does not show significantly better predictive performance than models such as GRU and LSTM, which only extract temporal features. This might be attributed to its complex internal structure and potential overfitting due to the relatively small dataset size. It is also possible that
Fig. 6. (a) Distribution of R2 of different models at each station. (b) Distribution of mean value of ozone at each monitoring station in the study area and distribution
of MAE of different models.
conventional deep learning models in high ozone areas, making it challenging to effectively capture ozone peaks. For WDDSTG-Net, the predictions remain relatively stable in high ozone areas. This demonstrates the model’s robust predictive capability at high ozone concentrations.
Ozone variations exhibit not only daily periodicity but also distinct seasonality. Therefore, a detailed analysis was conducted from a monthly perspective. Fig. 7(a) presents the ozone concentrations in different months. Months with higher ozone concentrations mainly span from April to September, peaking in August with a mean ozone concentration of 95.80 μg/m3. From October to March, ozone levels are lower, reaching their minimum in December with a median concentration of 28.99 μg/m3. Ozone concentrations vary widely, with generally high concentrations in summer and autumn and low concentrations in spring and winter. This variation is attributed to seasonal differences in meteorological factors such as temperature, solar radiation, and precipitation. Typically, summer and autumn have more sunlight and longer daylight hours, conditions conducive to the photochemical reactions that form secondary pollutants like ozone. Thus, the climate characteristics of summer and autumn determine the prevalence of elevated ozone levels to a certain extent. In Fig. 7(b) and Fig. 7(c), the worst result occurred in July, a month within the high ozone concentration range, with median MAE and RMSE of 25.47 μg/m3 and 33.98 μg/m3, respectively. The best fit was observed in December, the month with the lowest ozone concentration, with median MAE and RMSE of 13.17 μg/m3 and 17.53 μg/m3, respectively. The results indicate that high ozone concentrations are often associated with lower prediction accuracy, reflected in relatively higher MAE and RMSE.
Taking the runtime of the GRU model as a benchmark, the computational cost of multiple models including WDDSTG-Net was computed. The specific results and efficiency analysis are summarized in Table S10.
In summary, the proposed WDDSTG-Net demonstrates significant advantages over other models in predicting short-term and longer-term ozone concentrations. It is sensitive to extreme concentrations and detects them well. During periods of severe ozone pollution, the model’s predictive capacity remains strong, and it captures much of the complex diurnal cycle and seasonal variation of ozone.

3.4. Ablation study

Ablation studies were conducted to verify the rationality of each part in the proposed WDDSTG-Net structure, including the meteorological prediction module, the spatial feature extraction module, and the temporal feature extraction module, as detailed in Table S11. Removing any of the three modules reduces the model’s predictive accuracy to some degree, which means that each module plays an important role in mining specific deep feature relationships. The modules interact with each other and are coupled internally in certain dimensions, compensating for the key information that a single model would miss. Together they account for the overall performance of the WDDSTG-Net model. These results provide insights into the architectural design of spatiotemporal hybrid models for ozone concentration prediction, contributing to the development of more efficient and accurate prediction models for data with spatiotemporal features.

3.5. Analysis of IAQI accuracy and other metrics

The models were also evaluated at a practical level, extending the evaluation from numerical proximity to the ozone pollution levels defined in policy documents, thereby measuring the model’s practical significance. According to the IAQI calculation formula, the predicted IAQI was calculated from the 24-h ozone concentration prediction results, and the resulting IAQI accuracy is shown in Fig. 8. The IAQI accuracy of all stations exceeds 75 %, and the accuracy of S2 and S8 exceeds 80 %. This shows that WDDSTG-Net can efficiently quantify ozone pollution levels and has promising practical value. Additionally, the underestimation rates for each station are higher than the overestimation rates, with underestimation rates between 10 % and 14 % and overestimation rates between 7 % and 11 %. This suggests that WDDSTG-Net tends to underestimate the ozone pollution level when the prediction is not accurate enough. The model’s ability to detect extreme weather still needs to be further strengthened.
A detailed analysis of the IAQI accuracy at each hour was conducted for the Chengxiangzhen station and Zhedanongda station (Fig. S14). MLP gives the worst prediction for almost all hours. The changes in accuracy for LSTM and GRU are remarkably similar, while CNN-LSTM has overall higher accuracy than both LSTM and GRU, although it slightly underperforms for specific hours. The accuracy of DSSTG-Net at each time step is slightly higher than that of CNN-LSTM. Among all models, WDDSTG-Net achieves the best prediction results. For the initial two hours, WDDSTG-Net achieves accuracy rates exceeding 90 %, while achieving over 80 % accuracy for the first six
hours. Even for the station with the worst accuracy, it maintains rates above 77 % and 73 %, respectively.
Other metrics were also calculated to further inspect the performance of the model in predicting severe ozone pollution or extreme peaks, setting 100 μg/m3 as the critical threshold for high ozone concentration. Fig. 9 shows the various metrics at each station for the 24-h prediction. TPR consistently remains above 0.69, reaching a maximum of 0.77, indicating a reasonable ability of the model to predict high ozone concentrations. FAR remains mostly below 0.28, indicating a low tendency for false alarms. FPR is below 0.06, suggesting a minimal likelihood of predicting high ozone when none occurs. Taken together, FAR and FPR indicate that the model does not tend to overestimate actual concentrations. Based on these indicators, WDDSTG-Net is sensitive to high ozone concentrations while keeping false warnings at a low level. However, there is still room for improvement in predicting extreme scenarios.

4. Conclusion

A hybrid model, WDDSTG-Net, was proposed and used to predict hourly ozone concentration. For both 1-h prediction and 24-h prediction, this model outperformed several classic data-driven models used as benchmarks in the Hangzhou study. The MAE of WDDSTG-Net was 6.69 μg/m3 and 18.63 μg/m3, respectively. Daily change analysis, specific station analysis and monthly analysis showed that the model can be effectively applied at different concentration levels, time scales and spatial scales. Furthermore, ablation studies at monitoring stations supported the rationality of the overall structure of the model. The IAQI accuracy calculated for each station reaches a maximum of 81.74 %. In addition, for high ozone concentrations, TPR reaches 0.77, and FAR and FPR are as low as 0.21 and 0.05, showing the ability of the model to achieve accurate peak predictions.
The model demonstrated the importance of considering three major components simultaneously: dynamic spatial correlation, temporal correlation and meteorological predictions. The key to implementing the dynamic spatial correlation was to build directed graphs dynamically from wind direction by applying the angle-based strategy. In addition, the graph attention mechanism was used to assign dynamic weights to each station in the dynamic directed graphs. Moreover, LSTM was used for meteorological prediction and attained a higher R2 in the prediction of the three meteorological factors than other classic methods such as MLP, GRU and BiLSTM. Overall, the proposed model provides an effective data-driven approach for hourly ozone concentration prediction. It also shows potential for the prediction of other airborne pollutants.

CRediT authorship contribution statement

Shiyi Wang: Writing – original draft, Visualization, Methodology, Data curation, Conceptualization. Yiming Sun: Software, Investigation. Haonan Gu: Validation. Xiaoyong Cao: Formal analysis. Yao Shi: Resources. Yi He: Writing – review & editing, Supervision, Project administration.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgements

This work is supported by the National Key Research and Development Program of China (grant number 2022YFE0106100), and the National Natural Science Foundation of China (grant numbers 22178299, 51933009).

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.scitotenv.2024.174229.
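As a supplementary illustration of the threshold-based metrics discussed in Section 3.5 (TPR, FAR and FPR with a 100 μg/m3 threshold), the following minimal sketch shows one common way to compute them. It is not the authors' code. It assumes the usual confusion-matrix definitions, TPR = TP/(TP+FN), FAR = FP/(TP+FP) and FPR = FP/(FP+TN), and a strict "greater than" comparison against the threshold, neither of which is spelled out in this section. The sample values are made up.

```python
import numpy as np

THRESHOLD = 100.0  # ug/m3, the high-ozone threshold stated in Section 3.5

def detection_metrics(observed, predicted, threshold=THRESHOLD):
    """Compute TPR, FAR and FPR for exceedances of `threshold`."""
    # Assumption: an hour counts as "high ozone" when the value is strictly above the threshold.
    obs_high = np.asarray(observed, dtype=float) > threshold
    pred_high = np.asarray(predicted, dtype=float) > threshold

    tp = int(np.sum(obs_high & pred_high))    # high ozone correctly predicted
    fp = int(np.sum(~obs_high & pred_high))   # false warning
    fn = int(np.sum(obs_high & ~pred_high))   # missed high-ozone hour
    tn = int(np.sum(~obs_high & ~pred_high))  # correctly predicted as not high

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "TPR": ratio(tp, tp + fn),  # share of high-ozone hours that were detected
        "FAR": ratio(fp, tp + fp),  # share of warnings that were false
        "FPR": ratio(fp, fp + tn),  # share of non-high hours flagged as high
    }

# Hypothetical observed and predicted concentrations (ug/m3).
observed = [120, 90, 150, 80, 60, 110, 95, 130]
predicted = [115, 105, 140, 70, 65, 90, 85, 125]
metrics = detection_metrics(observed, predicted)
print(metrics)
```

In this sketch a lower FAR and FPR mean fewer false warnings, while a higher TPR means fewer missed high-ozone hours, which matches how the three metrics are interpreted in the text.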
13
S. Wang et al. Science of the Total Environment 946 (2024) 174229
Chen, S., Wang, H., Lu, K., Zeng, L., Hu, M., Zhang, Y., 2020. The trend of surface ozone convolutional network. Atmospheric. Pollut. Res. 12, 101197 https://doi.org/
in Beijing from 2013 to 2019: indications of the persisting strong atmospheric 10.1016/j.apr.2021.101197.
oxidation capacity. Atmos. Environ. 242, 117801 https://doi.org/10.1016/j. Ma, J., Ding, Y., Cheng, J.C.P., Jiang, F., Tan, Y., Gan, V.J.L., Wan, Z., 2020.
atmosenv.2020.117801. Identification of high impact factors of air quality on a national scale using big data
Chen, D., Wang, G., Xinyue, Z., Liu, Q., Liu, X., 2021. A hybrid CNN-LSTM model for and machine learning techniques. J. Clean. Prod. 244, 118955 https://doi.org/
predicting PM2.5 in Beijing based on spatiotemporal correlation. Environ. Ecol. Stat. 10.1016/j.jclepro.2019.118955.
28, 503–522. https://doi.org/10.1007/s10651-021-00501-8. Maji, K.J., Ye, W.-F., Arora, M., Nagendra, S.M.S., 2019. Ozone pollution in Chinese
Chen, Y., Chen, X., Xu, A., Sun, Q., Peng, X., 2022. A hybrid CNN-transformer model for cities: assessment of seasonal variation, health effects and economic burden.
ozone concentration prediction. Air Qual. Atmos. Health 15, 1533–1546. https:// Environ. Pollut. 247, 792–801. https://doi.org/10.1016/j.envpol.2019.01.049.
doi.org/10.1007/s11869-022-01197-w. Mao, W., Jiao, L., Wang, W., 2022. Long time series ozone prediction in China: a novel
Cheng, Y., He, L.-Y., Huang, X.-F., 2021. Development of a high-performance machine dynamic spatiotemporal deep learning approach. Build. Environ. 218, 109087
learning model to predict ground ozone pollution in typical cities of China. https://doi.org/10.1016/j.buildenv.2022.109087.
J. Environ. Manage. 299, 113670 https://doi.org/10.1016/j.jenvman.2021.113670. Ouyang, X., Yang, Y., Zhang, Y., Zhou, W., 2021. Spatial-temporal dynamic graph
Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., convolution neural network for air quality prediction, in: 2021 international joint
Bengio, Y., 2014. Learning Phrase Representations using RNN Encoder-Decoder for conference on neural networks (IJCNN). In: Presented at the 2021 International
Statistical Machine Translation. https://doi.org/10.48550/arXiv.1406.1078. Joint Conference on Neural Networks (IJCNN), pp. 1–8. https://doi.org/10.1109/
Chuang, M.-T., Zhang, Y., Kang, D., 2011. Application of WRF/Chem-MADRID for real- IJCNN52387.2021.9534167.
time air quality forecasting over the southeastern United States. Atmos. Environ. 45, Pak, U., Kim, C., Ryu, U., Sok, K., Pak, S., 2018. A hybrid model based on convolutional
6241–6250. https://doi.org/10.1016/j.atmosenv.2011.06.071. neural networks and long short-term memory for ozone concentration prediction.
Dun, A., Yang, Y., Lei, F., 2022. A novel hybrid model based on spatiotemporal Air Qual. Atmos. Health 11, 883–895. https://doi.org/10.1007/s11869-018-0585-1.
correlation for air quality prediction. Mob. Inf. Syst. 2022, e9759988 https://doi. Qi, Y., Li, Q., Karimian, H., Liu, D., 2019. A hybrid model for spatiotemporal forecasting
org/10.1155/2022/9759988. of PM2.5 based on graph convolutional neural network and long short-term memory.
Eslami, E., Choi, Y., Lops, Y., Sayeed, A., 2020. A real-time hourly ozone prediction Sci. Total Environ. 664, 1–10. https://doi.org/10.1016/j.scitotenv.2019.01.333.
system using deep convolutional neural network. Neural Comput. & Applic. 32, Russo, A., Raischel, F., Lind, P.G., 2013. Air quality prediction using optimal neural
8783–8797. https://doi.org/10.1007/s00521-019-04282-x. networks with stochastic variables. Atmos. Environ. 79, 822–830. https://doi.org/
Fan, Y., Ding, X., Hang, J., Ge, J., 2020. Characteristics of urban air pollution in different 10.1016/j.atmosenv.2013.07.072.
regions of China between 2015 and 2019. Build. Environ. 180, 107048 https://doi. Song, X.-Y., Gao, Y., Peng, Y., Huang, S., Liu, C., Peng, Z.-R., 2021. A machine learning
org/10.1016/j.buildenv.2020.107048. approach to modelling the spatial variations in the daily fine particulate matter
Feng, R., Zheng, H., Gao, H., Zhang, A., Huang, C., Zhang, J., Luo, K., Fan, J., 2019. (PM2.5) and nitrogen dioxide (NO2) of Shanghai, China. Environment and Planning
Recurrent neural network and random forest for analysis and accurate forecast of B: Urban Analytics and City Science 48, 467–483. https://doi.org/10.1177/
atmospheric pollutants: a case study in Hangzhou, China. J. Clean. Prod. 231, 2399808320975031.
1005–1015. https://doi.org/10.1016/j.jclepro.2019.05.319. Wang, T., Xue, L., Brimblecombe, P., Lam, Y.F., Li, L., Zhang, L., 2017a. Ozone pollution
Foley, K.M., Roselle, S.J., Appel, K.W., Bhave, P.V., Pleim, J.E., Otte, T.L., Mathur, R., in China: a review of concentrations, meteorological influences, chemical precursors,
Sarwar, G., Young, J.O., Gilliam, R.C., Nolte, C.G., Kelly, J.T., Gilliland, A.B., Bash, J. and effects. Sci. Total Environ. 575, 1582–1596. https://doi.org/10.1016/j.
O., 2010. Incremental testing of the community multiscale air quality (CMAQ) scitotenv.2016.10.081.
modeling system version 4.7. Geosci. Model Dev. 3, 205–226. https://doi.org/ Wang, P., Zhang, H., Qin, Z., Zhang, G., 2017b. A novel hybrid-Garch model based on
10.5194/gmd-3-205-2010. ARIMA and SVM for PM2.5 concentrations forecasting. Atmos. Pollut. Res. 8,
Freeman, B.S., Taylor, G., Gharabaghi, B., Thé, J., 2018. Forecasting air quality time 850–860. https://doi.org/10.1016/j.apr.2017.01.003.
series using deep learning. J. Air Waste Manage. Assoc. 68, 866–886. https://doi. Wang, H.-W., Li, X.-B., Wang, D., Zhao, J., He, H., Peng, Z.-R., 2020a. Regional
org/10.1080/10962247.2018.1459956. prediction of ground-level ozone using a hybrid sequence-to-sequence deep learning
Gao, J., Woodward, A., Vardoulakis, S., Kovats, S., Wilkinson, P., Li, L., Xu, L., Li, J., approach. J. Clean. Prod. 253, 119841 https://doi.org/10.1016/j.
Yang, J., Li, J., Cao, L., Liu, X., Wu, H., Liu, Q., 2017. Haze, public health and jclepro.2019.119841.
mitigation measures in China: a review of the current evidence for further policy Wang, P., Qiao, X., Zhang, H., 2020b. Modeling PM2.5 and O3 with aerosol feedbacks
response. Sci. Total Environ. 578, 148–157. https://doi.org/10.1016/j. using WRF/Chem over the Sichuan Basin, southwestern China. Chemosphere 254,
scitotenv.2016.10.231. 126735. https://doi.org/10.1016/j.chemosphere.2020.126735.
Hao, S., Lee, D.-H., Zhao, D., 2019. Sequence to sequence learning with attention Wang, Sichen, Huo, Y., Mu, X., Jiang, P., Xun, S., He, B., Wu, W., Liu, L., Wang, Y.,
mechanism for short-term passenger flow prediction in large-scale metro system. 2022a. A high-performance convolutional neural network for ground-level ozone
Transportation Research Part C: Emerging Technologies 107, 287–300. https://doi. estimation in eastern China. Remote Sens. (Basel) 14, 1640. https://doi.org/
org/10.1016/j.trc.2019.08.005. 10.3390/rs14071640.
He, H., Li, M., Wang, W., Wang, Z., Xue, Y., 2018. Prediction of PM2.5 concentration Wang, Shun, Qiao, L., Fang, W., Jing, G., Sheng, V., Zhang, Y., 2022b. Air pollution
based on the similarity in air quality monitoring network. Build. Environ. 137, prediction via graph attention network and gated recurrent unit. CMC 73, 673–687.
11–17. https://doi.org/10.1016/j.buildenv.2018.03.058. https://doi.org/10.32604/cmc.2022.028411.
Hong, F., Ji, C., Rao, J., Chen, C., Sun, W., 2023. Hourly ozone level prediction based on Wang, Dongsheng, Wang, H.-W., Lu, K.-F., Peng, Z.-R., Zhao, J., 2022c. Regional
the characterization of its periodic behavior via deep learning. Process Saf. Environ. prediction of ozone and fine particulate matter using diffusion convolutional
Prot. 174, 28–38. https://doi.org/10.1016/j.psep.2023.03.059. recurrent neural network. Int. J. Environ. Res. Public Health 19, 3988. https://doi.
Hu, J., Chen, Y., Wang, W., Zhang, S., Cui, C., Ding, W., Fang, Y., 2023. An optimized org/10.3390/ijerph19073988.
hybrid deep learning model for PM2.5 and O3 concentration prediction. Air Qual. Wei, X., Yu, R., Sun, J., 2020. View-GCN: view-based graph convolutional network for 3D
Atmos. Health 16, 857–871. https://doi.org/10.1007/s11869-023-01317-0. shape analysis. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern
Jia, P., Cao, N., Yang, S., 2021. Real-time hourly ozone prediction system for Yangtze Recognition (CVPR). Presented at the 2020 IEEE/CVF Conference on Computer
River Delta area using attention based on a sequence to sequence model. Atmos. Vision and Pattern Recognition (CVPR), pp. 1847–1856. https://doi.org/10.1109/
Environ. 244, 117917 https://doi.org/10.1016/j.atmosenv.2020.117917. CVPR42600.2020.00192.
Kim, J., Wang, X., Kang, C., Yu, J., Li, P., 2021. Forecasting air pollutant concentration Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Yu, P.S., 2021. A comprehensive survey on
using a novel spatiotemporal deep learning model based on clustering, feature graph neural networks. IEEE Transactions on Neural Networks and Learning Systems
selection and empirical wavelet transform. Sci. Total Environ. 801, 149654 https:// 32, 4–24. https://doi.org/10.1109/TNNLS.2020.2978386.
doi.org/10.1016/j.scitotenv.2021.149654. Wu, C., He, H., Song, R., Zhu, X., Peng, Z., Fu, Q., Pan, J., 2023. A hybrid deep learning
Le, V.-D., Bui, T.-C., Cha, S.-K., 2020. Spatiotemporal deep learning model for citywide model for regional O3 and NO2 concentrations prediction based on spatiotemporal
air pollution interpolation and prediction. In: 2020 IEEE International Conference on dependencies in air quality monitoring network. Environ. Pollut. 320, 121075
Big Data and Smart Computing (BigComp). Presented at the 2020 IEEE International https://doi.org/10.1016/j.envpol.2023.121075.
Conference on Big Data and Smart Computing (BigComp), pp. 55–62. https://doi. Xu, Y., Wang, F., An, Z., Wang, Q., Zhang, Z., 2023. Artificial intelligence for
org/10.1109/BigComp48618.2020.00-99. science—bridging data to wisdom. The Innovation 4, 100525. https://doi.org/
Li, X., Peng, L., Yao, X., Cui, S., Hu, Y., You, C., Chi, T., 2017. Long short-term memory 10.1016/j.xinn.2023.100525.
neural network for air pollutant concentration predictions: method development and Yu, C., Wang, F., Shao, Z., Qian, T., Zhang, Z., Wei, W., Xu, Y., 2024. GinAR: An end-to-
evaluation. Environ. Pollut. 231, 997–1004. https://doi.org/10.1016/j. end multivariate time series forecasting model suitable for variable missing. Doi:1
envpol.2017.08.114. 0.48550/arXiv.2405.11333.
Li, L.-L., Wen, S.-Y., Tseng, M.-L., Wang, C.-S., 2019. Renewable energy prediction: a Zang, Z., Guo, Y., Jiang, Y., Zuo, C., Li, D., Shi, W., Yan, X., 2021. Tree-based ensemble
novel short-term prediction model of photovoltaic output power. J. Clean. Prod. 228, deep learning model for spatiotemporal surface ozone (O3) prediction and
359–375. https://doi.org/10.1016/j.jclepro.2019.04.331. interpretation. Int. J. Appl. Earth Obs. Geoinf. 103, 102516 https://doi.org/
Lin, Y., Mago, N., Gao, Y., Li, Y., Chiang, Y.-Y., Shahabi, C., Ambite, J.L., 2018. 10.1016/j.jag.2021.102516.
Exploiting Spatiotemporal Patterns for Accurate Air Quality Forecasting Using Deep Zhan, Y., Luo, Y., Deng, X., Grieneisen, M.L., Zhang, M., Di, B., 2018. Spatiotemporal
Learning, in: Proceedings of the 26th ACM SIGSPATIAL International Conference on prediction of daily ambient ozone levels across China using random forest for human
Advances in Geographic Information Systems, SIGSPATIAL ‘18. Association for exposure assessment. Environ. Pollut. 233, 464–473. https://doi.org/10.1016/j.
Computing Machinery, New York, NY, USA, pp. 359–368. https://doi.org/10.1145/ envpol.2017.10.029.
3274895.3274907.
Liu, X., Qin, M., He, Y., Mi, X., Yu, C., 2021. A new multi-data-driven spatiotemporal
PM2.5 forecasting model based on an ensemble graph reinforcement learning
14
S. Wang et al. Science of the Total Environment 946 (2024) 174229
Zhang, J., Chen, F., Guo, Y., Li, X., 2020. Multi-graph convolutional network for short- Zhu, H., Lu, X., 2016. The prediction of PM2.5 value based on ARMA and improved BP
term passenger flow forecasting in urban rail transit. IET Intell. Transp. Syst. 14, neural network model. In: 2016 International Conference on Intelligent Networking
1210–1217. https://doi.org/10.1049/iet-its.2019.0873. and Collaborative Systems (INCoS). Presented at the 2016 International Conference
Zhang, B., Rong, Y., Yong, R., Qin, D., Li, M., Zou, G., Pan, J., 2022. Deep learning for air on Intelligent Networking and Collaborative Systems (INCoS), pp. 515–517. https://
pollutant concentration prediction: a review. Atmos. Environ. 290, 119347 https:// doi.org/10.1109/INCoS.2016.81.
doi.org/10.1016/j.atmosenv.2022.119347.
15