Article
An Ensemble Model-Based Estimation of Nitrogen Dioxide in
a Southeastern Coastal Region of China
Sicong He 1 , Heng Dong 1,2 , Zili Zhang 3,4 and Yanbin Yuan 1, *
1 School of Resources and Environment Engineering, Wuhan University of Technology, Wuhan 430070, China;
[email protected] (S.H.); [email protected] (H.D.)
2 Zhejiang Spatiotemporal Sophon Bigdata Co., Ltd., Ningbo 315101, China
3 Ecological Environment Monitoring Center of Zhejiang, Hangzhou 310012, China; [email protected]
4 Zhejiang Key Laboratory of Ecological Environment Monitoring,
Early Warning and Quality Control Research, Hangzhou 310012, China
* Correspondence: [email protected]
Abstract: NO2 (nitrogen dioxide) is a common pollutant in the atmosphere that can have serious
adverse effects on the health of residents. However, the existing satellite and ground observation
methods are not enough to effectively monitor the spatiotemporal heterogeneity of near-surface
NO2 concentrations, which limits the development of pollutant remediation work and medical
health research. Based on TROPOMI-NO2 tropospheric column concentration data, supplemented
by meteorological data, atmospheric condition reanalysis data and other geographic parameters,
combined with classic machine learning models and deep learning networks, we constructed an
ensemble model that produced daily average near-surface NO2 exposure estimates at a spatial resolution of 0.03°. In this article, a
meteorological hysteretic effects term and a spatiotemporal term were designed, which considerably
improved the performance of the model. Overall, our ensemble model performed better, with a
10-fold CV R2 of 0.89, an RMSE of 5.62 µg/m3 , and an MAE of 4.04 µg/m3 . The model also had
good temporal and spatial generalization capability, with a temporal prediction R2 and a spatial
prediction R2 of 0.71 and 0.81, respectively, which can be applied to a wider range of time and
space. Finally, we used an ensemble model to estimate the spatiotemporal distribution of NO2 in
a coastal region of southeastern China from May 2018 to December 2020. Compared with satellite
observations, the model output results showed richer details of the spatiotemporal heterogeneity
of NO2 concentrations. Due to the advantages of using multi-source data, this model framework
has the potential to output products with a higher spatial resolution and can provide a reference for
downscaling work on other pollutants.
Keywords: nitrogen dioxide; air pollution; high-spatial-resolution estimation model; machine learning
Citation: He, S.; Dong, H.; Zhang, Z.; Yuan, Y. An Ensemble Model-Based Estimation of Nitrogen Dioxide in a
Southeastern Coastal Region of China. Remote Sens. 2022, 14, 2807.
https://doi.org/10.3390/rs14122807
Academic Editors: Jun Wang, Zhanqing Li, Jing Wei and Lin Sun
Received: 6 May 2022
Accepted: 9 June 2022
Published: 11 June 2022
At the end of 2012, China established an air pollutant monitoring network covering
major cities across the country [10], which provided basic support for monitoring air
pollutants. However, due to the sparse distribution of the stations and the inherent defects
of point source monitoring, it is difficult for ground sites to achieve full coverage of regional
NO2 exposure. In order to solve the shortcomings of long-term observations on the ground,
satellites are widely used to monitor the spatial and temporal distribution of atmospheric
NO2 concentrations by virtue of their advantages such as coverage of a large area and
a long access period [11,12]. Currently, the mainstream NO2 monitoring satellites in the
world include MetOp-A(B) [13] with the Global Ozone Monitoring Experiment (GOME-2),
Aura [14] with the Ozone Monitoring Instrument (OMI), and Sentinel-5p [15] with the
TROPOspheric Monitoring Instrument (TROPOMI). These instruments acquire backscattered
light spectra and use inversion algorithms to estimate NO2 column concentrations. Among
these, TROPOMI, as the latest and best NO2 monitoring sensor, has provided a stable
global perspective of the spatial distribution of atmospheric NO2 concentrations since 2018,
with a resolution of about 7 km.
However, NO2 in cities mainly comes from mobile source emissions and industrial
combustion [16], and it varies strongly within a day [17]. These complex multi-source
emission characteristics give urban NO2 pronounced spatial and temporal variability.
The fixed overpass time of satellites and their kilometer-scale spatial resolution (about
7 km or finer) are likely to smooth out this spatiotemporal heterogeneity, and the fine-scale
features lost in this way are often of great significance to the scientific prevention and
control of NO2 and to health research.
Using data from site and satellite observations, a number of high-temporal- and spatial-
resolution NO2 exposure assessment models have been developed and have continued
to evolve over the decades. The existing high-spatial-resolution NO2 estimation models
are mainly divided into statistical models and physical models. Physical models improve
the spatial resolution of target pollutants through high-precision models of the physical
area, such as the Community Multiscale Air Quality (CMAQ). However, these are limited
by the amount of computation and precursor data [18,19]. Statistical models are exposure
assessment methods that simulate the statistical relationship between pollutants and im-
pact factors. Earlier statistical models, such as geostatistical models [20,21] and land use
regression models [22,23], improved spatial accuracy at the expense of reduced temporal
resolution. As modeling techniques have developed, machine-learning- and deep-learning-based
methods have been applied in statistical models for NO2 estimation, and these are able to
improve the resolution in both time and space [24,25]. Recent studies have shown that
these methods and their variants (e.g., ensemble models) capture the nonlinear variation
characteristics of pollutants well [26] and are significantly better than traditional statistical
models in terms of modeling accuracy, as they can achieve a certain degree of accuracy in
capturing the spatiotemporal heterogeneity of the target gas. These models have become
more and more popular in modeling studies of various atmospheric pollutants, including
NO2 , at a high temporal and spatial resolution. For example, You et al. [27], Dou et al. [28],
and Liu [29] realized a fine-scale evaluation of NO2 in China based on machine learning
algorithms. However, high-precision modeling of NO2 at medium and small regional scales
in China has not been fully explored, and there is still room to improve model accuracy.
How to make full use of multi-source data, including satellite, station, and physical model
outputs; how to strengthen the ability of explanatory variables to characterize changes in
NO2 concentrations; and how to further improve the models' estimation accuracy all require
further consideration.
In order to improve the model performance and obtain high-precision, daily average
NO2 near-surface concentration distribution data with a high-spatial-resolution, we de-
signed a meteorological hysteretic effects term and a spatiotemporal term. At the same time,
by combining multi-source explanatory variables such as the tropospheric NO2 column
concentration observed by Sentinel-5p, atmospheric conditions, and human activities, an
ensemble “Classic Machine Learning + Deep Neural Network” model was constructed. On
this basis, the NO2 concentration of a coastal region of southeastern China was generated
with a high spatial resolution, and the accuracy and error of three classic machine learning
algorithms and the ensemble model were compared and analyzed. Finally, based on temporal,
spatial, and completely random-set partitioning strategies, the inferring power of
the ensemble model in time and space was comprehensively evaluated.
2. Materials and Methods
2.1. Data and Pretreatment
2.1.1. Ground-Level NO2 Observations
This study obtained the hourly NO2 observation data from May 2018 to December
2020 from the China Environmental Monitoring Center (CEMC). All sites are based on
the GB3095-2012 standard, and the near-ground NO2 concentration is measured by the
chemiluminescence method [30]. The study area is the coastal area (116–124° E, 25–33° N)
with the main cities of the Yangtze River Delta as the core. There are 258 monitoring
stations within the range, most of which are located in urban areas, and a few are located
in suburban areas (Figure 1). In order to build a high-precision NO2 exposure assessment
model, we averaged the hourly NO2 observation records to the day, eliminated daily
observation data of less than 18 h, and finally obtained 225,197 valid observation records.
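The daily aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "daily observation data of less than 18 h" means days with fewer than 18 valid hourly values, and the record layout `(site_id, date, hour, no2)` and the function name `daily_means` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

MIN_HOURS = 18  # threshold stated in the text


def daily_means(hourly_records, min_hours=MIN_HOURS):
    """Average hourly NO2 per (site, date), dropping days with too few hours.

    hourly_records: iterable of (site_id, date_str, hour, no2) tuples
    (hypothetical layout); no2 is None when the hourly value is missing.
    """
    by_day = defaultdict(list)
    for site_id, date_str, _hour, no2 in hourly_records:
        if no2 is not None:
            by_day[(site_id, date_str)].append(no2)
    # Keep only site-days with at least `min_hours` valid observations.
    return {
        key: mean(values)
        for key, values in by_day.items()
        if len(values) >= min_hours
    }


# Made-up example: site "A" has 20 valid hours, site "B" only 10.
records = [("A", "2019-01-01", h, 40.0 + h) for h in range(20)]
records += [("B", "2019-01-01", h, 30.0) for h in range(10)]
result = daily_means(records)
```

In this made-up example only site "A" survives the filter, because site "B" has fewer than 18 valid hours on that day.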
Figure 1. Distribution of the air quality stations.
2.1.2. TROPOMI NO2 Data
The TROPOspheric Monitoring Instrument (TROPOMI) can monitor the data of air
pollutants such as NO2, O3, SO2, and HCHO. The time when the satellite passes through
the study area (Beijing time) is about 13:00–14:00, and the spatial resolution is 7 km × 3.5 km
(30 April 2018 to 6 August 2019) and 5.5 km × 3.5 km (7 August 2019 to now), making it the
best atmospheric observation spectrometer at present. Under the processing framework of
the retrieval–assimilation–modeling system, the TROPOMI NO2 Level 2 product combines
the DOAS algorithm and the TM5 chemical transport model, and converts the measured
Level-1B radiance and irradiance spectra into NO2 vertical column concentrations, in
units of molec/cm2 [24]. For this article, the TROPOMI NO2 Level 2 product from NASA
(https://disc.gsfc.nasa.gov/, accessed on 7 July 2021) was obtained and the tropospheric
NO2 column concentration was taken from it as a modeling factor. To limit the influence of
cloud cover, snow-covered surfaces, and other dubious retrievals, we kept only the pixels
with a quality assurance (QA) value greater than 0.75. Data with more than 30% of values
missing were also removed, as too many missing values would increase uncertainty.
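The two screening rules above (QA > 0.75, and removal of data with more than 30% missing) can be sketched as below. The text does not say over which unit the 30% is computed, so this sketch assumes it is evaluated per daily scene over the study area; the function name `filter_scene` and the flat-list layout are hypothetical.

```python
QA_THRESHOLD = 0.75          # from the text: keep pixels with QA > 0.75
MAX_MISSING_FRACTION = 0.30  # from the text: drop data missing more than 30%


def filter_scene(no2_values, qa_values):
    """Mask low-quality pixels; return None if too much of the scene is missing.

    Assumption: the 30% rule is applied per scene (one list of pixels).
    """
    masked = [
        v if (v is not None and q is not None and q > QA_THRESHOLD) else None
        for v, q in zip(no2_values, qa_values)
    ]
    if not masked:
        return None
    missing = sum(1 for v in masked if v is None) / len(masked)
    return None if missing > MAX_MISSING_FRACTION else masked


# Made-up column values: one bad pixel (25% missing) keeps the scene,
# two bad pixels (50% missing) reject it.
scene_kept = filter_scene([1.0, 2.0, 3.0, 4.0], [0.9, 0.8, 0.5, 1.0])
scene_dropped = filter_scene([1.0, 2.0, 3.0, 4.0], [0.9, 0.5, 0.5, 1.0])
```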
of this standard. For data with a spatial resolution lower than 0.03◦ , the inverse distance
weighting method (IDW) was used to interpolate them into the mesh; for the data with a
spatial resolution higher than 0.03◦ , we used resampling technology to upscale it into the
standard mesh. By matching the geographic locations of the NO2 ground observation sites,
all the data finally formed a dataset for modeling and validation.
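For readers unfamiliar with inverse distance weighting, the interpolation step for coarse-resolution inputs can be sketched as follows. The distance exponent (`power=2`) and the use of planar distance in degrees are assumptions of this sketch; the text does not state either choice.

```python
import math


def idw(points, target, power=2.0):
    """Inverse distance weighting estimate at `target`.

    points: list of ((x, y), value) pairs from the coarse source data.
    target: (x, y) centre of a cell in the 0.03-degree mesh.
    power:  distance exponent (assumed; not given in the text).
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in points:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a source point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Two made-up source points; the midpoint gets the equal-weight average.
source = [((120.00, 30.00), 10.0), ((120.06, 30.00), 20.0)]
midpoint_value = idw(source, (120.03, 30.00))
```

Closer source points dominate the estimate, which is why IDW output is smooth between samples and cannot add detail that the coarse input lacks.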
2.2. Methodology
2.2.1. Hysteretic Effects Term
Meteorological conditions often have a continuous impact on atmospheric pollutants.
Over time, a change in the mechanism of influence or the accumulation of its effects
may increase the magnitude of that influence; this is usually called the hysteretic effect.
Examples include the chemical accumulation process of temperature-induced changes
in atmospheric pollutant concentrations [35] and the interaction mechanism between the
short-term wet deposition of trace gases by rainfall and the hydrolysis of pollutants such as
formaldehyde polymers in a humid atmospheric environment [36]. These physicochemical
processes create a certain hysteresis effect.
Therefore, in this study, we bundled the meteorological predictors observed at the
site on a given day with the records of the previous 2 days and used them as the
meteorological hysteretic effects term when constructing the NO2 exposure assessment
model. This addresses the problem that single-day variables cannot fully characterize
changes in NO2 concentration that arise from hysteretic impacts. The details are shown in
Equation (1):
$$V_{t_i} = \left(v_{t_i},\; v_{t_i-1},\; v_{t_i-2}\right) \qquad (1)$$
where $V_{t_i}$ is the meteorological hysteretic effects term of day $t_i$, and $v_{t_i}$, $v_{t_i-1}$ and $v_{t_i-2}$ are
the meteorological factors of day $t_i$ and the previous two days, respectively.
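Equation (1) amounts to stacking each day's value with the values of the two preceding days. A minimal sketch, assuming one meteorological factor at one site stored as a gap-free list of consecutive daily values (the function name `hysteretic_term` is hypothetical):

```python
def hysteretic_term(series, n_lags=2):
    """Build V_t = (v_t, v_{t-1}, v_{t-2}) for every day with full history.

    series: daily values of one meteorological factor at one site,
            ordered by consecutive day (assumed to have no gaps).
    Returns a dict mapping day index t to the tuple of current and lagged values.
    """
    return {
        t: tuple(series[t - k] for k in range(n_lags + 1))
        for t in range(n_lags, len(series))
    }


# Made-up temperatures for four consecutive days.
temperature = [12.0, 14.0, 13.0, 15.0]
lagged = hysteretic_term(temperature)
# lagged[2] == (13.0, 14.0, 12.0); lagged[3] == (15.0, 13.0, 14.0)
```

The first two days of any series have no complete history and are skipped in this sketch; how the study handled those days is not stated.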
Figure 2. Model framework: RF stands for Random Forest Model, EXT stands for Extreme Random
Forest Model, and XGB stands for XGBoost Model. The near-surface NO2 was the final output of the
ensemble model.
2.2.4. Model Validation
In the 10-fold cross-validation technique (10-fold CV), the dataset is randomly divided
into 10 data subsets and 1 data subset is selected in turn for model validation, with the
remaining 9 used for model establishment. The final evaluation of the 10-fold CV results is
the average of the 10 time validations, which can more realistically evaluate the model's
performance. To evaluate the fitting performance and generalization performance of
the ensemble model, we used the 10-fold cross-validation technique to calculate the R2
(goodness of fit), MAE (mean absolute error), and RMSE (root-mean-square error). The
details are shown in Equations (4)–(7).
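$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (4)$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \qquad (5)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \qquad (6)$$
$$\mathrm{CV}(p) = \frac{1}{10} \sum_{j=1}^{10} \mathrm{CV}_j(p), \quad p \in \{R^2, \mathrm{MAE}, \mathrm{RMSE}\} \qquad (7)$$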
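The metrics in Equations (4)–(6) and the fold averaging in Equation (7) can be sketched as below. To keep the example self-contained, a least-squares line (`fit_line`) stands in for the model; it is not the paper's ensemble, and the data are synthetic.

```python
import math
import random


def r2(y, y_hat):
    """Equation (4): 1 - SS_res / SS_tot."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def mae(y, y_hat):
    """Equation (5): mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Equation (6): root-mean-square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def fit_line(x, y):
    """Stand-in model (NOT the paper's ensemble): least-squares line."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda v: intercept + slope * v


def cross_validate(x, y, k=10, seed=0):
    """Equation (7): average each metric over k randomly assigned folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[j::k] for j in range(k)]
    scores = {"R2": [], "MAE": [], "RMSE": []}
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_line([x[i] for i in train], [y[i] for i in train])
        y_true = [y[i] for i in fold]
        y_pred = [model(x[i]) for i in fold]
        scores["R2"].append(r2(y_true, y_pred))
        scores["MAE"].append(mae(y_true, y_pred))
        scores["RMSE"].append(rmse(y_true, y_pred))
    return {name: sum(vals) / k for name, vals in scores.items()}


# Synthetic data: a line plus a small deterministic perturbation.
x = [float(i) for i in range(100)]
y = [2.0 * xi + 1.0 + ((i % 3) - 1) for i, xi in enumerate(x)]
cv = cross_validate(x, y)
```

Because every sample is held out exactly once, the averaged scores describe performance on data the model did not see during fitting, which is the sense in which the text calls the evaluation more realistic.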
enhancement factors were added separately (Plan 2 and Plan 3). In particular, when the
spatiotemporal term was added, the R2 of the RF, EXT, and XGB models reached 0.81, 0.82,
and 0.88, an increase of 0.04, 0.04, and 0.06 (4, 4, and 6 percentage points) over the basic
predictors, respectively. When the meteorological hysteretic effects term and the
spatiotemporal term were added at the same time, the performance of the RF and EXT models
still increased slightly with the increase in explanatory variables, while XGB remained at
0.88 to two decimal places, giving an R2 range of 0.82–0.88. Under the same scheme, the
performances of the three models were ranked in descending order as XGB, EXT, and RF.
Table 1. Evaluation of the performance of three machine learning models with different variable combinations. The predictand (y) in all schemes is the NO2 concentration monitored by the station.

| Scheme Name | Predictors (x) | RF (R2) | EXT (R2) | XGB (R2) |
|---|---|---|---|---|
| Plan 1 | Basic predictors | 0.77 | 0.78 | 0.82 |
| Plan 2 | Basic predictors + meteorological lag factors | 0.80 | 0.81 | 0.86 |
| Plan 3 | Basic predictors + spatiotemporal heterogeneity factor | 0.81 | 0.82 | 0.88 |
| Plan 4 | Basic predictors + spatiotemporal heterogeneity factor + meteorological lag factors | 0.82 | 0.84 | 0.88 |
Figure S3a–d show the relative importance of the independent variables for the four
schemes. When no model enhancement variables were introduced (Figure S3a), TROPOMI
NO2 was the most important factor (RF: 35%, EXT: 25%, XGB: 25%), followed by air pressure
(AP) and boundary layer height (BLH), which were two or three times less important
than TROPOMI NO2 concentrations. When the meteorological hysteretic effect term was
introduced into the model (Figure S3b,d), its relative importance in all models was 2–14%.
The relative importance of air pressure and land surface temperature from two days earlier was
higher than that of the same variables on the current day, which indicates that the meteorological
hysteretic effects term carries more information than same-day meteorological factors alone. When the spatiotemporal term was
introduced into the model (Figure S3c,d), the temporal heterogeneity factor became one
of the most important factors (7–20%), second only to TROPOMI NO2 . Overall, although
there were some differences in the relative importance of the variables in different models,
TROPOMI NO2 , boundary layer height, the meteorological hysteretic effects term, and the
spatiotemporal term had important effects in all models.
corrected the predictions of the different machine learning models through the deep learning
network and outperformed the individual machine learning models.
Figure 3. The 10-fold CV results of the model: (a) validation results of the random forest model;
(b) validation results of the extreme random forest model; (c) validation results of the XGBoost model;
(d) validation results of the ensemble model; (e) validation results of the multiple linear model;
(f) validation results of the SVM model. The units of RMSE and MAE were µg/m3.
Figure 4 shows the local R2 distribution information. Spatially, the local R2 of the
annual statistical scale and the full-time statistical scale have similar regional distribution
characteristics. The local R2 is high in the areas with a large population and densely
distributed monitoring stations, while the local R2 in the areas with a small population
and sparsely distributed stations is low. In the case of sparse sites, it is difficult for the
model to fully explore the statistical relationship between the NO2 concentration and the
explanatory variables. Across the entire study period, the range of local R2 is 0.13–0.94,
and 92% of the sites have a local R2 of >0.7. This result is consistent with the performance
results of local RMSE (Figure S4) and MAE (Figure S5), which means that the model has a
relatively stable fitting performance.
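A local R2 of this kind can be computed by grouping validation samples by station and evaluating R2 within each group. The sketch below assumes samples are available as `(site_id, observed, predicted)` triples; that layout, the function names, and the numbers are illustrative only.

```python
from collections import defaultdict


def local_r2(samples):
    """Per-site R2 from (site_id, observed, predicted) triples."""
    by_site = defaultdict(list)
    for site, obs, pred in samples:
        by_site[site].append((obs, pred))
    result = {}
    for site, pairs in by_site.items():
        obs_mean = sum(o for o, _ in pairs) / len(pairs)
        ss_tot = sum((o - obs_mean) ** 2 for o, _ in pairs)
        if ss_tot == 0:
            continue  # R2 is undefined when observations are constant
        ss_res = sum((o - p) ** 2 for o, p in pairs)
        result[site] = 1.0 - ss_res / ss_tot
    return result


def share_above(r2_by_site, threshold=0.7):
    """Fraction of sites whose local R2 exceeds `threshold`."""
    return sum(1 for v in r2_by_site.values() if v > threshold) / len(r2_by_site)


# Made-up values: site "A" is well predicted, site "B" gets a constant guess.
samples = [
    ("A", 10.0, 11.0), ("A", 20.0, 19.0), ("A", 30.0, 31.0),
    ("B", 10.0, 20.0), ("B", 20.0, 20.0), ("B", 30.0, 20.0),
]
site_r2 = local_r2(samples)
fraction_good = share_above(site_r2)
```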
Figure 4. Spatial distribution of the local R2 values of the ensemble model: (a) average annual
validation results of the ensemble model; (b) validation results of the ensemble model in 2018;
(c) validation results of the ensemble model in 2019; (d) validation results of the ensemble model
in 2020.
We also captured the model prediction results for different site locations and plotted
the trend of changes in the daily average NO2 concentration from May 2018 to December
2020 (Figure S6). In the time series, the predicted results are basically consistent with the
actual observed results (Figure S2), and there is no obvious deviation, indicating that the
model shows high stability for the time series.
Based on the ensemble model framework, three validation sets were designed in this
study: all samples in the last 3 months of 2020, 30% randomly selected samples from
all sites, and 30% randomly selected samples. These validation sets did not participate in
model training, and their validation results represent the temporal, spatial, and comprehensive
predictive capabilities of the model, respectively, as detailed in Figure 5. The ensemble
model had the strongest comprehensive prediction ability (Figure 5a), with an R2 of 0.86
between the predicted value and the true value, which means that the model explained
86% of the variance in the near-ground NO2 concentrations. The performance of the model for
temporal (Figure 5b) and spatial (Figure 5c) prediction was slightly weaker than its comprehensive
prediction ability, but it still showed relatively good spatiotemporal prediction
ability and applicability. The temporal and spatial R2 values of the model reached 0.71 and
0.81, respectively, and the corresponding RMSE and MAE were 10.24 µg/m3, 7.79 µg/m3,
7.39 µg/m3, and 5.50 µg/m3. Compared with the XGBoost model (Figure 5d–f), the comprehensive
prediction R2, temporal R2, and spatial R2 of the ensemble model were improved
by 1%, 2%, and 1%, respectively, and the corresponding RMSE was reduced by 0.23 µg/m3,
0.19 µg/m3, and 0.23 µg/m3.
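The three hold-out strategies can be sketched as below. The temporal split follows the text directly (October to December 2020). For the spatial split, this sketch assumes that whole stations are held out (30% of sites), which is one reading of "30% randomly selected samples from all sites"; the sample layout and function names are hypothetical.

```python
import random


def temporal_split(samples, start="2020-10-01"):
    """Hold out every sample dated on or after `start` (ISO dates sort as text)."""
    train = [s for s in samples if s["date"] < start]
    test = [s for s in samples if s["date"] >= start]
    return train, test


def spatial_split(samples, frac=0.3, seed=0):
    """Hold out all samples from a random `frac` of sites (assumed reading)."""
    sites = sorted({s["site"] for s in samples})
    held_out = set(random.Random(seed).sample(sites, round(frac * len(sites))))
    train = [s for s in samples if s["site"] not in held_out]
    test = [s for s in samples if s["site"] in held_out]
    return train, test


def random_split(samples, frac=0.3, seed=0):
    """Hold out a completely random `frac` of samples."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    test_idx = set(idx[: round(frac * len(samples))])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Made-up dataset: 10 sites with 4 dates each.
dates = ["2020-08-15", "2020-09-15", "2020-10-15", "2020-11-15"]
samples = [{"site": f"S{k}", "date": d} for k in range(10) for d in dates]
t_train, t_test = temporal_split(samples)
s_train, s_test = spatial_split(samples)
r_train, r_test = random_split(samples)
```

The random split lets the model see every station and every period during training, which is consistent with it yielding the highest R2 of the three; the temporal and spatial splits test extrapolation to unseen months or unseen locations.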
Figure 5. The prediction accuracy of the ensemble and XGBoost model: (a) validation results of
completely random selection of 30% of the samples as the validation set (ensemble model);
(b) validation results using the samples from the last three months of 2020 as the validation set
(ensemble model); (c) validation results of randomly selecting 30% of samples from all sites as the
validation set (ensemble model); (d) validation results of completely random selection of 30% of the
samples as the validation set (XGBoost model); (e) validation results using the samples from the
last three months of 2020 as the validation set (XGBoost model); (f) validation results of randomly
selecting 30% of samples from all sites as the validation set (XGBoost model). The units of RMSE and
MAE are µg/m3.
3.2. Analysis of the Fine-Scale Spatiotemporal Variation in NO2
To analyze the ability of the ensemble model to capture spatially heterogeneous
features, we interpolated the satellite-observed tropospheric NO2 vertical column
concentrations to a spatial resolution of 0.03° and compared them with the ensemble model's output
of the spatial distribution of the near-surface NO2 concentration (Figures S7 and S8). The
spatial distribution patterns of the model output and of the interpolated satellite
observations were basically consistent, which supports the plausibility of the model output
from another perspective. In detail, the interpolation results of the satellite observations
were relatively smooth. Although the spatial resolution after interpolation was con-
Figure 6. Seasonal distribution of near-surface NO2 concentrations output by the ensemble model:
(a) spatial distribution of NO2 concentrations in spring; (b) spatial distribution of NO2 concentrations
in summer; (c) spatial distribution of NO2 concentrations in autumn; (d) spatial distribution of NO2
concentrations in winter. The units are µg/m3.
Yangtze River Delta Plain has a large population base and complex human activities, and
is also one of the most important economic areas in China. In accordance with the national
ambient air quality standard GB 3095-2012, we calculated the NO2 air quality of major
cities in the Yangtze River Delta Plain with a daily average NO2 concentration of >80
µg/m3. If any grid in the city did not meet the national regulations, it was considered to
have exceeded the standard. The details are shown in Figure 7.
Between 28 April 2018 and 31 December 2020, Hefei was the city most seriously polluted
by NO2, with 95 days of air NO2 exceeding the standard. In addition, Suzhou, Wuxi,
Hangzhou, Shanghai, and Nanjing had more than 60 days exceeding the standard, and
the corresponding maximum NO2 concentration was also relatively high, indicating that
the air quality of these cities was relatively poor. It is worth noting that when NO2 air
pollution occurs, the near-surface NO2 concentration in Zhenjiang and Changzhou can
reach very high levels (greater than 135 µg/m3), although the number of days when this
occurs is not particularly great, indicating that the two cities need to be alert to the
occurrence of sudden NO2 air pollution incidents. Among the many cities in the Yangtze River
Delta Plain, only Ningbo City did not have the phenomenon of excessive NO2 concentrations
in the air.
All in all, there is still much room for improvement in the air quality of major cities in
the Yangtze River Delta Plain. Most of these places have relatively developed economies, a
large population, and low terrain. Anthropogenic emissions may be the main reason for
the formation of these high-value areas. In recent years, a series of effective measures for
environmental governance have reduced the average concentration of NO2 in the entire
region, but the effect has not been significant. It is still necessary to implement targeted
prevention and control measures for cities with more serious pollution.
4. Discussion
Focusing on the scientific issue of assessing near-surface NO2 concentrations, we
designed a model enhancement factor considering the influencing mechanism of multi-source
explanatory variables on the near-surface NO2 concentration and the spatial heterogeneity
in NO2 concentration itself. With an ensemble model composed of a deep neural network
and machine learning methods, a high-precision ground NO2 concentration exposure
assessment model was developed.
Compared with previous studies, our model exhibits certain advantages. The ensemble
model in this study has a better ability to evaluate the NO2 concentrations, and its R2 in
the 10-fold cross-validation results is 0.89, which is better than the performance of most
models, such as the annual-scale stepwise regression model (CV R2: 0.78) constructed by
Xu et al. [46], the seasonal-scale land use regression model (CV R2: 0.70) constructed by
Zhang et al. [23], the annual-scale land use regression model (CV R2: 0.67) constructed by
Andrew Larkin et al. [47], the monthly scale random forest model in China constructed
by You et al. [27] (CV R2: 0.85), the daily-scale random forest and extreme random forest
models constructed by Qin et al. [43] (CV R2: 0.70 and 0.72), the XGBoost daily-scale model
(CV R2: 0.83) constructed by Liu [29], and the random forest spatiotemporal kriging (RF-STK)
model (CV R2: 0.62) constructed by Zhan et al. [48]. At the same time, the model in this
article outputs daily-scale results with a spatial resolution of 0.03° (about 3 km), which can
better meet the needs of research such as popular medicine and environmental protection.
As far as we know, there are not many daily-scale NO2 exposure assessment models with a
spatial resolution better than 0.03° in China. Only the random forest model constructed
by Pan et al. [45] and the ensemble model (Machine Learning + GAM) constructed by
Huang et al. [26] and Di et al. [49] reached a spatial resolution of 1 km, but the performance
of these models (CV R2: 0.67–0.82) was slightly lower than that of the model in this
study. The prediction performance of our model was also good: The spatial prediction R2
and the temporal prediction R2 were 0.81 and 0.71 respectively, which is slightly weaker
than the model with the best temporal prediction ability of the abovementioned studies
(Di et al. [49], spatial prediction R2 : 0.84, temporal prediction R2 : 0.73).
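As a rough check on the "0.03° (about 3 km)" figure, the sketch below converts a grid resolution in degrees into an approximate cell size in kilometres. It assumes a spherical Earth with a mean radius of 6371 km and an illustrative latitude of 30° N. Neither value is taken from this article, and the function name `cell_size_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)

def cell_size_km(resolution_deg, latitude_deg):
    """Return the (north-south, east-west) extent in km of one grid cell."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = resolution_deg * km_per_degree
    # Meridians converge towards the poles, so the east-west extent shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

# 30 degrees N is an assumed, illustrative latitude, not a value from the article.
for resolution in (0.03, 0.25):
    ns, ew = cell_size_km(resolution, 30.0)
    print(f"{resolution} deg: {ns:.2f} km north-south, {ew:.2f} km east-west")
```

Under these assumptions a 0.03° cell is roughly 3 km on a side, and a 1 km product would need a grid spacing near 0.009°.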
Multi-source predictors and their effective utilization are the keys to improving model
performance. As in most studies [29,43], the TROPOMI tropospheric NO2 column concen-
tration was the most important predictor in this model because it has a consistent trend with
near-surface NO2 concentrations (Figure S2). Compared with OMI-NO2 (the NO2 observed
by the Ozone Monitoring Instrument), the NO2 exposure assessment models constructed
with TROPOMI-NO2 had better accuracy, which was demonstrated in the study of Liu
et al. [29]. We also designed a meteorological hysteretic effects term and a spatiotemporal
term. When used alone, they significantly improved the model’s performance (accuracy
improvement: 4–6%) and played an important role in the explanatory variable system
(Figure S1), which is consistent with previous studies in China [42,45]. However, when
the meteorological hysteretic effects term and the spatiotemporal term were used at the
same time, the performance of the model was not improved significantly, which may be
because the accuracy of the model itself had reached a high level, and there were systematic
errors in the explanatory variables themselves. In addition, the atmospheric condition
variables from the hourly climate reanalysis data and the other geographic factors also
contribute to varying degrees. Although the relative importance of these variables is not
high enough to significantly improve the model’s performance, they add spatial detail to
the model output. For example, the spatial patterns of DEM, city proportion, road network
density, and population are highly consistent with the predicted spatial distribution of
NO2 concentrations.
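The "consistent trend" between the satellite column and near-surface NO2 can be quantified with a Pearson correlation coefficient, the statistic named in the caption of Figure S1. A minimal sketch follows. The function name and the toy numbers are hypothetical and are not data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up daily values for one site, for illustration only.
column_no2 = [2.1, 3.4, 5.0, 4.2, 6.3]        # satellite column, arbitrary units
surface_no2 = [18.0, 25.0, 41.0, 30.0, 47.0]  # ground station, ug/m3
print(f"r = {pearson_r(column_no2, surface_no2):.2f}")
```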
The systematic and comprehensive multi-source explanatory variable system also
increases the interpretability and applicability of the model. It contains large-scale data
sources at fine spatial scales, giving the model the potential to produce
higher-spatial-resolution NO2 products (better than 1 km) and apply them to larger areas.
In future studies, the modeling approach is also expected to be applicable to exposure
assessment for other atmospheric pollutants that have both site and satellite observations.
However, our study still has some limitations. First, although the model
proposed in this article has good temporal prediction ability (R2: 0.71), this was only
tested for short-term prediction, and the stationarity of the model’s long-term predictions
needs to be further explored and verified. Second, the performance of the model and its
potential for application at finer temporal scales are limited by the coarse resolution
of the geographic data and atmospheric condition data. On the one hand, the spatial
resolution of the reanalysis data (ERA5) used by the model is 0.25°, and a single pixel
value cannot fully describe the atmospheric conditions at a site location,
which increases the uncertainty of the model. On the other hand, at a fine spatiotemporal
scale (hourly scales, 100 m level resolution), the information provided by TROPOMI NO2
and the meteorological data will not be sufficient to reflect the short-term changes in NO2
concentration caused by transportation, daily life, and industrial emissions. Variations in
human activity at a fine enough spatial and temporal resolution are needed to provide
relevant information. Currently, such data are difficult to obtain and are not publicly
available, which means that it is difficult for the model to improve in terms of temporal
resolution. Third, the meteorological hysteretic effects term proposed in this article only
considers the meteorological factors over 3 days. This assumption is too simple and may
limit the further improvement of the model’s performance. The time span involved in the
hysteretic effects may be more than 3 days and is not fixed. In future research, developing
a dynamic hysteretic effects term will help to improve the model.
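To make the idea of a fixed lag window concrete, the sketch below pairs each day's value with the values of the preceding days. This section does not describe how the hysteretic effects term is actually built, so the construction here (raw values from the previous `max_lag_days` days used as extra predictors) is an assumption, and `lagged_rows` is a hypothetical name.

```python
def lagged_rows(daily_values, max_lag_days=3):
    """Pair each day's value with the values of the preceding days.

    daily_values: one site's series, ordered by consecutive day.
    Returns a list of (value_today, [value_t-1, ..., value_t-max_lag_days]).
    Days without a full history are skipped.
    """
    if max_lag_days < 1:
        raise ValueError("max_lag_days must be at least 1")
    rows = []
    for t in range(max_lag_days, len(daily_values)):
        lags = [daily_values[t - k] for k in range(1, max_lag_days + 1)]
        rows.append((daily_values[t], lags))
    return rows

# Made-up daily wind speeds (m/s) for one site, for illustration only.
wind_speed = [2.0, 3.5, 1.0, 4.0, 2.5]
for today, lags in lagged_rows(wind_speed, max_lag_days=3):
    print(today, lags)
```

Keeping `max_lag_days` as a parameter is one simple way to experiment with longer windows. A truly dynamic term would go further and choose the window per site or per day.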
5. Conclusions
In this article, we combined the XGB, RF, EXT, and deep neural network models,
and used multi-source data to develop a high-precision ensemble model, which was
successfully applied to the central and southern coastal areas of China, generating a
near-ground daily average NO2 concentration distribution at a spatial resolution of 0.03°.
Compared with other studies, the ensemble model had better estimation accuracy, with
10-fold CV R2, RMSE, and MAE values of 0.89, 5.62 µg/m3, and 4.04 µg/m3, respectively.
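The three accuracy metrics quoted above, and the idea of combining several base models, can be sketched in a few lines. This section does not state how the base models are combined, so equal-weight averaging is used purely as an illustrative assumption. R2 is computed as the coefficient of determination, which is also an assumption about the convention used. All numbers are made up.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def average_ensemble(predictions_by_model):
    """Equal-weight average of the base models' predictions, sample by sample."""
    columns = list(predictions_by_model.values())
    return [sum(values) / len(values) for values in zip(*columns)]

# Made-up observations and base-model predictions (ug/m3), for illustration only.
observed = [20.0, 35.0, 50.0, 28.0]
base_predictions = {
    "xgb": [22.0, 33.0, 47.0, 30.0],
    "rf": [18.0, 36.0, 52.0, 27.0],
    "ext": [21.0, 37.0, 49.0, 26.0],
    "dnn": [19.0, 34.0, 53.0, 29.0],
}
ensemble = average_ensemble(base_predictions)
print(f"R2={r2(observed, ensemble):.3f} "
      f"RMSE={rmse(observed, ensemble):.3f} MAE={mae(observed, ensemble):.3f}")
```

In this toy case the base models' errors partly cancel, which is the usual motivation for an ensemble. Real gains depend on how correlated the base models' errors are.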
The model generalizes well in space and time, and its short-term daily average
NO2 predictions were stable and reliable, which suggests that the model can
be applied to larger areas and longer time series. Compared with satellite observations
and site observations, the ensemble model output provides rich and accurate details of the
spatiotemporal heterogeneity in NO2 , which will be helpful for air pollutant traceability
and urban health research in the future.
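Generalization in space and in time is usually tested by holding out whole monitoring sites or whole days, rather than random samples. The sketch below shows one way to build such splits. The validation protocol is not described in this section, so the grouping rule, the record layout, and the name `grouped_folds` are assumptions.

```python
def grouped_folds(records, key, n_folds):
    """Yield (train, test) splits in which every group named by `key` stays whole."""
    groups = sorted({record[key] for record in records})
    if n_folds < 2 or n_folds > len(groups):
        raise ValueError("n_folds must be between 2 and the number of groups")
    # Round-robin assignment of groups to folds.
    fold_of_group = {group: index % n_folds for index, group in enumerate(groups)}
    for fold in range(n_folds):
        train = [r for r in records if fold_of_group[r[key]] != fold]
        test = [r for r in records if fold_of_group[r[key]] == fold]
        yield train, test

# Made-up records: 4 sites observed on 3 days.
records = [{"site": s, "day": d} for s in ("A", "B", "C", "D") for d in (1, 2, 3)]

# Spatial hold-out: test sites never appear in the training data.
for train, test in grouped_folds(records, key="site", n_folds=2):
    print("held-out sites:", sorted({r["site"] for r in test}))

# Temporal hold-out: test days never appear in the training data.
for train, test in grouped_folds(records, key="day", n_folds=3):
    print("held-out days:", sorted({r["day"] for r in test}))
```

Round-robin assignment of days is a simplification. A stricter temporal test would train only on days that precede the held-out period.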
Supplementary Materials: The following supporting information can be downloaded at:
https://www.mdpi.com/article/10.3390/rs14122807/s1, Table S1: Modeling variable preselection table;
Figure S1: Pearson correlation between the near-ground NO2 concentrations and the explanatory
variables; Figure S2: Time variation trend of NO2 concentrations observed by TROPOMI and ground
stations; Figure S3: The relative importance of explanatory variables in the three machine learning
methods of XGB, EXT, and RF: (a) the relative importance of variables in Plan 1; (b) the relative
importance of variables in Plan 2; (c) the relative importance of variables in Plan 3; (d) the relative
importance of variables in Plan 4; Figure S4: Spatial distribution of the local RMSE values of the
ensemble model; Figure S5: Spatial distribution of the local MAE values of the ensemble model;
Figure S6: The daily average variation in NO2 concentrations outputted by the ensemble model;
Figure S7: The distribution of the daily average NO2 concentrations outputted by the ensemble
model; Figure S8: The distribution of daily average tropospheric NO2 column concentrations interpo-
lated from TROPOMI observations; Figure S9: The monthly and annual average variation in NO2
concentrations outputted by the ensemble model.
Author Contributions: Conceptualization, S.H., H.D. and Y.Y.; methodology, S.H.; software, S.H.;
validation, S.H., H.D. and Y.Y.; formal analysis, S.H. and H.D.; resources, H.D. and Y.Y.; data
curation, Z.Z.; writing—original draft preparation, S.H.; writing—review and editing, H.D. and Y.Y.;
visualization, Z.Z.; supervision, H.D. and Y.Y.; project administration, H.D.; funding acquisition, H.D.
All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (52079101),
the Zhejiang Ecological Environment Research and Achievement Promotion Project (2020HT0011),
and the Ningbo Science and Technology Plan Project (2021ZH1CXYD060013).
Acknowledgments: The authors thank NASA for providing free TROPOMI NO2 data and vege-
tation index data, the European Space Agency for providing ERA5 reanalysis data, the National
Meteorological Science Data Center for providing meteorological data, the China Environmental
Monitoring Center for providing ground-level NO2 observation data, the Climate Change Initiative
for providing land cover data, SRTM for providing DEM data, OSM for providing road network data,
and Worldpop for providing population data.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Kim, H.C.; Lee, P.; Judd, L.; Pan, L.; Lefer, B. OMI NO2 column densities over North American urban cities: The effect of satellite
footprint resolution. Geosci. Model Dev. 2016, 9, 1111–1123. [CrossRef]
2. Palmer, P.I.; Jacob, D.J.; Fiore, A.M.; Martin, R.V.; Chance, K.; Kurosu, T.P. Mapping isoprene emissions over North America using
formaldehyde column observations from space. J. Geophys. Res. Atmos. 2003, 108, 4180. [CrossRef]
3. Jacob, D.J.; Heikes, E.G.; Fan, S.M.; Logan, J.A.; Mauzerall, D.L.; Bradshaw, J.D.; Singh, H.B.; Gregory, G.L.; Talbot, R.W.; Blake,
D.R.; et al. Origin of ozone and NOx in the tropical troposphere: A photochemical analysis of aircraft observations over the South
Atlantic basin. J. Geophys. Res. Atmos. 1996, 101, 24235–24250. [CrossRef]
4. Gifford, F. Atmospheric Chemistry and Physics of Air Pollution. Eos Trans. Am. Geophys. Union 1987, 68, 1595. [CrossRef]
5. Fishman, J.; Crutzen, P.J. The origin of ozone in the troposphere. Nature 1978, 274, 855–858. [CrossRef]
6. Chen, R.; Samoli, E.; Wong, C.M.; Huang, W.; Wang, Z.; Chen, B.; Kan, H.; Group, C.C. Associations between short-term exposure
to nitrogen dioxide and mortality in 17 Chinese cities: The China Air Pollution and Health Effects Study (CAPES). Environ. Int.
2012, 45, 32–38. [CrossRef]
7. Gauderman, W.J.; McConnell, R.; Gilliland, F.; London, S.; Thomas, D.; Avol, E.; Vora, H.; Berhane, K.; Rappaport, E.B.; Lurmann,
F.; et al. Association between air pollution and lung function growth in southern California children. Am. J. Respir. Crit. Care Med.
2000, 162, 1383–1390. [CrossRef]
8. Faustini, A.; Rapp, R.; Forastiere, F. Nitrogen dioxide and mortality: Review and meta-analysis of long-term studies. Eur. Respir. J.
2014, 44, 744–753. [CrossRef]
9. Jerrett, M.; Burnett, R.T.; Beckerman, B.S.; Turner, M.C.; Krewski, D.; Thurston, G.; Martin, R.V.; van Donkelaar, A.; Hughes,
E.; Shi, Y.; et al. Spatial analysis of air pollution and mortality in California. Am. J. Respir. Crit. Care Med. 2013, 188, 593–599.
[CrossRef]
10. Wang, W.N.; Cheng, T.H.; Gu, X.F.; Chen, H.; Guo, H.; Wang, Y.; Bao, F.W.; Shi, S.Y.; Xu, B.R.; Zuo, X.; et al. Assessing Spatial and
Temporal Patterns of Observed Ground-level Ozone in China. Sci. Rep. 2017, 7, 3651. [CrossRef]
11. Sun, J.; Zhou, C.Y.; Zhang, Y.H.; Yang, X.Y.; Ge, L.; Liu, J.J. Spatio-temporal variation of tropospheric NO2 column density in
Shandong Province nearly five years. Environ. Sci. Technol. 2021, 44, 177–182. [CrossRef]
12. Martin, R.V. Evaluation of GOME satellite measurements of tropospheric NO2 and HCHO using regional data from aircraft
campaigns in the southeastern United States. J. Geophys. Res. 2004, 109, D24307. [CrossRef]
13. Wang, H.; Wei, W.; Che, H.; Tang, X.; Bian, J.; Yu, K.; Wang, W. Ground-Based MAX-DOAS Measurements of Tropospheric
Aerosols, NO2, and HCHO Distributions in the Urban Environment of Shanghai, China. Remote Sens. 2022, 14, 1726. [CrossRef]
14. Levelt, P.F.; Van Den Oord, G.H.J.; Dobber, M.R.; Malkki, A.; Visser, H.; De Vries, J.; Stammes, P.; Lundell, J.O.V.; Saari, H. The
ozone monitoring instrument. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1093–1101. [CrossRef]
15. Rabiei-Dastjerdi, H.; Mohammadi, S.; Saber, M.; Amini, S.; McArdle, G. Spatiotemporal Analysis of NO2 Production Using
TROPOMI Time-Series Images and Google Earth Engine in a Middle Eastern Country. Remote Sens. 2022, 14, 1725. [CrossRef]
16. van der A, R.J.; Mijling, B.; Ding, J.; Koukouli, M.E.; Liu, F.; Li, Q.; Mao, H.; Theys, N. Cleaning up the air: Effectiveness of air
quality policy for SO2 and NOx emissions in China. Atmos. Chem. Phys. 2017, 17, 1775–1789. [CrossRef]
17. Cyrys, J.; Eeftens, M.; Heinrich, J.; Ampe, C.; Armengaud, A.; Beelen, R.; Bellander, T.; Beregszaszi, T.; Birk, M.; Cesaroni, G.; et al.
Variation of NO2 and NOx concentrations between and within 36 European study areas: Results from the ESCAPE study. Atmos.
Environ. 2012, 62, 374–390. [CrossRef]
18. Kim, H.; Lee, S.-M.; Chai, T.; Ngan, F.; Pan, L.; Lee, P. A Conservative Downscaling of Satellite-Detected Chemical Compositions:
NO2 Column Densities of OMI, GOME-2, and CMAQ. Remote Sens. 2018, 10, 1001. [CrossRef]
19. Goldberg, D.L.; Lamsal, L.N.; Loughner, C.P.; Lu, Z.; Streets, D.G. A high-resolution and observationally constrained OMI NO2
satellite retrieval. Atmos. Chem. Phys. 2017, 17, 11403–11421. [CrossRef]
20. Cersosimo, A.; Serio, C.; Masiello, G. TROPOMI NO2 Tropospheric Column Data: Regridding to 1 km Grid-Resolution and
Assessment of their Consistency with In Situ Surface Observations. Remote Sens. 2020, 12, 2212. [CrossRef]
21. Beloconi, A.; Vounatsou, P. Bayesian geostatistical modelling of high-resolution NO2 exposure in Europe combining data from
monitors, satellites and chemical transport models. Environ. Int. 2020, 138, 105578. [CrossRef]
22. Novotny, E.V.; Bechle, M.J.; Millet, D.B.; Marshall, J.D. Correction to National Satellite-Based Land-Use Regression: NO2 in the
United States. Environ. Sci. Technol. 2011, 45, 8596. [CrossRef]
23. Zhang, L.; Yang, C.; Xiao, Q.; Geng, G.; Cai, J.; Chen, R.; Meng, X.; Kan, H. A Satellite-Based Land Use Regression Model of
Ambient NO2 with High Spatial Resolution in a Chinese City. Remote Sens. 2021, 13, 397. [CrossRef]
24. Yu, M.; Liu, Q. Deep learning-based downscaling of tropospheric nitrogen dioxide using ground-level and satellite observations.
Sci. Total Environ. 2021, 773, 145145. [CrossRef]
25. Chen, J.; de Hoogh, K.; Gulliver, J.; Hoffmann, B.; Hertel, O.; Ketzel, M.; Bauwelinck, M.; van Donkelaar, A.; Hvidtfeldt,
U.A.; Katsouyanni, K.; et al. A comparison of linear regression, regularization, and machine learning algorithms to develop
Europe-wide spatial models of fine particles and nitrogen dioxide. Environ. Int. 2019, 130, 104934. [CrossRef]
26. Huang, C.; Sun, K.; Hu, J.; Xue, T.; Xu, H.; Wang, M. Estimating 2013–2019 NO2 exposure with high spatiotemporal resolution in
China using an ensemble model. Environ. Pollut. 2022, 292, 118285. [CrossRef]
27. You, J.W.; Zou, B.; Zhao, X.G.; Xu, S.; He, R. Estimating ground-level NO2 concentrations across mainland China using random
forests regression modeling. China Environ. Sci. 2019, 39, 969–979. [CrossRef]
28. Dou, X.; Liao, C.; Wang, H.; Huang, Y.; Tu, Y.; Huang, X.; Peng, Y.; Zhu, B.; Tan, J.; Deng, Z.; et al. Estimates of daily ground-level
NO2 concentrations in China based on Random Forest model integrated K-means. Adv. Appl. Energy 2021, 2, 100017. [CrossRef]
29. Liu, J. Mapping high resolution national daily NO2 exposure across mainland China using an ensemble algorithm. Environ.
Pollut. 2021, 279, 116932. [CrossRef]
30. Wang, Y.; Ying, Q.; Hu, J.; Zhang, H. Spatial and temporal variations of six criteria air pollutants in 31 provincial capital cities in
China during 2013–2014. Environ. Int. 2014, 73, 413–422. [CrossRef]
31. Fenn, M.E.; Richard, H.; Tonnesen, G.S.; Baron, J.S.; Susanne, G.C.; Diane, H.; Jaffe, D.A.; Scott, C.; Linda, G.; Rueth, H.M.
Nitrogen Emissions, Deposition, and Monitoring in the Western United States. Bioscience 2003, 53, 391–403. [CrossRef]
32. Anttila, P.; Tuovinen, J.-P.; Niemi, J.V. Primary NO2 emissions and their role in the development of NO2 concentrations in a traffic
environment. Atmos. Environ. 2011, 45, 986–992. [CrossRef]
33. He, Q.; Huang, B. Satellite-based mapping of daily high-resolution ground PM2.5 in China via space-time regression modeling.
Remote Sens. Environ. 2018, 206, 72–83. [CrossRef]
34. Reid, C.E.; Jerrett, M.; Petersen, M.L.; Pfister, G.G.; Morefield, P.E.; Tager, I.B.; Raffuse, S.M.; Balmes, J.R. Spatiotemporal prediction
of fine particulate matter during the 2008 northern California wildfires using machine learning. Environ. Sci. Technol. 2015, 49,
3887–3896. [CrossRef] [PubMed]
35. Zhu, L.; Mickley, L.J.; Jacob, D.J.; Marais, E.A.; Sheng, J.; Hu, L.; Abad, G.G.; Chance, K. Long-term (2005–2014) trends in
formaldehyde (HCHO) columns across North America as seen by the OMI satellite instrument: Evidence of changing emissions
of volatile organic compounds. Geophys. Res. Lett. 2017, 44, 7079–7086. [CrossRef]
36. Pang, X.; Mu, Y.; Lee, X.; Zhang, Y.; Xu, Z. Influences of characteristic meteorological conditions on atmospheric carbonyls in
Beijing, China. Atmos. Res. 2009, 93, 913–919. [CrossRef]
37. Robinson, D.P.; Lloyd, C.D.; McKinley, J.M. Increasing the accuracy of nitrogen dioxide (NO2) pollution mapping using
geographically weighted regression (GWR) and geostatistics. Int. J. Appl. Earth Obs. Geoinf. 2013, 21, 374–383. [CrossRef]
38. Qin, K.; Rao, L.; Xu, J.; Bai, Y.; Zou, J.; Hao, N.; Li, S.; Yu, C. Estimating Ground Level NO2 Concentrations over Central-Eastern
China Using a Satellite-Based Geographically and Temporally Weighted Regression Model. Remote Sens. 2017, 9, 950. [CrossRef]
39. He, Q.; Huang, B. Satellite-based high-resolution PM2.5 estimation over the Beijing-Tianjin-Hebei region of China using an
improved geographically and temporally weighted regression model. Environ. Pollut. 2018, 236, 1027–1037. [CrossRef]
40. Karney, C.F.F. Algorithms for geodesics. J. Geod. 2012, 87, 43–55. [CrossRef]
41. Behrens, T.; Schmidt, K.; Viscarra Rossel, R.A.; Gries, P.; Scholten, T.; MacMillan, R.A. Spatial modelling with Euclidean distance
fields and machine learning. Eur. J. Soil Sci. 2018, 69, 757–770. [CrossRef]
42. Liu, R.; Ma, Z.; Liu, Y.; Shao, Y.; Zhao, W.; Bi, J. Spatiotemporal distributions of surface ozone levels in China from 2005 to 2017: A
machine learning approach. Environ. Int. 2020, 142, 105823. [CrossRef]
43. Qin, K.; Han, X.; Li, D.; Xu, J.; Loyola, D.; Xue, Y.; Zhou, X.; Li, D.; Zhang, K.; Yuan, L. Satellite-based estimation of surface
NO2 concentrations over east-central China: A comparison of POMINO and OMNO2d data. Atmos. Environ. 2020, 224, 117322.
[CrossRef]
44. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the KDD’16: 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, ACM, San Francisco, CA, USA, 13–17 August 2016. [CrossRef]
45. Pan, Y.; Zhao, C.; Liu, Z. Estimating the Daily NO2 Concentration with High Spatial Resolution in the Beijing–Tianjin–Hebei
Region Using an Ensemble Learning Model. Remote Sens. 2021, 13, 758. [CrossRef]
46. Xu, H.; Bechle, M.J.; Wang, M.; Szpiro, A.A.; Vedal, S.; Bai, Y.Q.; Marshall, J.D. National PM2.5 and NO2 exposure models
for China based on land use regression, satellite measurements, and universal kriging. Sci. Total Environ. 2018, 655, 423–433.
[CrossRef]
47. Larkin, A.; Geddes, J.A.; Martin, R.V.; Xiao, Q.; Liu, Y.; Marshall, J.D.; Brauer, M.; Hystad, P. Global Land Use Regression Model
for Nitrogen Dioxide Air Pollution. Environ. Sci. Technol. 2017, 51, 6957–6964. [CrossRef]
48. Zhan, Y.; Luo, Y.; Deng, X.; Zhang, K.; Zhang, M.; Grieneisen, M.L.; Di, B. Satellite-Based Estimates of Daily NO2 Exposure in
China Using Hybrid Random Forest and Spatiotemporal Kriging Model. Environ. Sci. Technol. 2018, 52, 4180–4189. [CrossRef]
49. Di, Q.; Amini, H.; Shi, L.; Kloog, I.; Silvern, R.; Kelly, J.; Sabath, M.B.; Choirat, C.; Koutrakis, P.; Lyapustin, A.; et al. Assessing
NO2 Concentration and Model Uncertainty with High Spatiotemporal Resolution across the Contiguous United States Using
Ensemble Model Averaging. Environ. Sci. Technol. 2020, 54, 1372–1384. [CrossRef]