A Survey On Air Quality Prediction Using Machine Learning
I. INTRODUCTION
regression and autoregression. Benzene concentration, in conjunction
with carbon monoxide, also plays a significant role in air pollution
assessment. The detrimental effects of air pollution on human health
range from minor irritations [6] to severe respiratory illnesses,
cancers, and fatal conditions. To combat this pressing issue, accurate
prediction of air quality is essential. Traditional methods fall short
in precision, necessitating the integration of Machine Learning, a
subset of Artificial Intelligence, in predicting the Air Quality Index
(AQI). Research efforts are ongoing to leverage Machine Learning
algorithms to enhance AQI measurement accuracy, with neural networks
emerging as a promising solution in this field [10]. Accurate
measurement of AQI is paramount in the battle against air pollution,
emphasizing the crucial role of advanced technologies in safeguarding
human health and environmental well-being.

III. BACKGROUND
Air quality degradation is a global issue with profound implications
for public health and environmental sustainability. The Air Quality
Index (AQI) serves as a vital tool in assessing and communicating air
quality information and in guiding decision-making processes [1]-[3]
and interventions to safeguard human health and the environment.
However, traditional AQI prediction models often overlook crucial
environmental factors, limiting their effectiveness in capturing the
complexities of air quality dynamics. To address this gap, this survey
paper proposes an innovative approach that integrates machine learning
techniques, specifically Support Vector Machine (SVM) and Random
Forest, with environmental considerations. By incorporating
comprehensive environmental datasets encompassing meteorological
parameters, land use patterns, and pollution sources, alongside
conventional AQI predictors, this approach aims to enhance the accuracy
and reliability of AQI forecasts. Through interdisciplinary
collaboration and stakeholder engagement, this research endeavor seeks
to contribute to the development of more sustainable and effective air
quality management strategies [2]. By adopting an environmentally
conscious approach to AQI prediction, we can strive towards a cleaner,
healthier, and more resilient environment for present and future
generations.

IV. LITERATURE REVIEW
The field of air quality monitoring and assessment has garnered
significant attention from researchers [11], policymakers, and
environmentalists worldwide. In this section, we provide an overview of
the existing literature on Air Quality Index (AQI) related work,
encompassing historical development, methodological approaches,
applications, and challenges.

Historical Development
The concept of an Air Quality Index (AQI) emerged in response to
growing concerns about air pollution and its effects on public health
and the surrounding ecosystem. Early attempts to quantify air pollution
levels date back to the mid-20th century, with the establishment of
rudimentary monitoring networks [8] in industrialized regions. However,
it wasn't until the late 20th century that standardized AQI systems
began to take shape, driven by advancements in atmospheric science,
environmental engineering, and public health research.

Components of AQI
The AQI calculation typically integrates measurements of key air
pollutants, including particulate matter (PM2.5, PM10), ozone, sulfur
dioxide, nitrogen dioxide, and carbon monoxide. Each pollutant's
concentration is converted to an individual index that reflects its
known health effects, and the overall AQI value is taken from the
pollutant with the highest individual index.

AQI Calculation Methods
Various methodologies have been developed for calculating AQI values,
ranging from simple arithmetic averaging to more complex statistical
models. Common approaches include the US Environmental Protection
Agency's (EPA) AQI formula, which employs breakpoints and concentration
thresholds to categorize air quality into different severity levels
(e.g., Good, Moderate, Unhealthy). Other models, such as the Air
Quality Health Index (AQHI) used in Canada, incorporate additional
factors like meteorological conditions and pollutant interactions.

Applications of AQI
AQI data serves as a vital tool for informing public health
interventions, urban planning decisions, and environmental policy
formulation. By providing real-time information on air quality
conditions, AQI systems enable authorities to issue advisories,
implement pollution control measures, and allocate resources more
effectively. Furthermore, AQI indices play a crucial role in raising
public awareness about air pollution-related risks and promoting
behavioral changes to mitigate exposure.

AQI Monitoring Technologies
Advancements in sensor technology, data analytics, and remote sensing
have revolutionized AQI monitoring capabilities in recent years.
Traditional ground-based monitoring stations have been supplemented
with satellite-based sensors, mobile monitoring platforms, and
IoT-enabled devices, improving both the spatial coverage and the
temporal resolution of air quality data. These technological
innovations hold promise for enhancing the accuracy, reliability, and
accessibility of AQI information across diverse geographical regions.

Regional Variances
Despite the widespread adoption of AQI systems globally, significant
disparities exist in regulatory standards, monitoring infrastructure,
and data reporting practices among different regions. Variations in
pollutant thresholds, monitoring methodologies [2], and interpretation
criteria pose challenges for harmonizing AQI assessments and fostering
cross-border collaboration. Efforts to standardize AQI protocols and
harmonize regulatory frameworks at the international level are
essential for ensuring consistency and comparability of air quality
data across geopolitical boundaries.
Our proposed methodology involves collecting comprehensive
environmental data, including meteorological parameters, land use
patterns, and pollution sources, in addition to conventional AQI
predictors. SVM and Random Forest algorithms are then trained on this
augmented dataset to develop robust AQI prediction models.

VI. DATA COLLECTION
Environmental data collection involves leveraging remote sensing, IoT
devices, and government databases to gather real-time and historical
information on the factors that influence air quality. Data
preprocessing techniques, including normalization, feature engineering,
and outlier detection, are applied to ensure data quality and model
performance.

VII. MODEL DEVELOPMENT AND EVALUATION
SVM and Random Forest models are trained on the augmented dataset, and
assessment relies on standard performance criteria such as accuracy,
precision, recall, and F1-score. Techniques like cross-validation are
used to gauge how well the model generalizes to unseen data.

The Individual AQI ($\mathrm{IAQI}_i$) for each pollutant is calculated
as follows:

$$\mathrm{IAQI}_i = \frac{(I_{HI} - I_{LO}) \times (C_i - C_{LO})}{C_{HI} - C_{LO}} + I_{LO} \tag{2}$$

Where:
• $C_i$ = Concentration of pollutant i

Once the Individual AQI values for all pollutants are calculated, the
overall AQI is determined by selecting the highest individual AQI value
among all pollutants.
The results vary depending on the specific pollutants being analyzed.
Researchers select methods based on the type of pollutants and the
location, considering whether it is an urban or rural area [7]. They
then compute accuracy and error to determine how closely the predicted
values align with the observed values. This process allows for a
comprehensive assessment of the effectiveness of the methods in
different environmental contexts.
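As a worked illustration of Eq. (2) and the maximum rule, the following minimal Python sketch may help. It rests on two assumptions that go beyond the text: $C_{LO}$ and $C_{HI}$ are read as the breakpoint concentrations that bracket $C_i$, with $I_{LO}$ and $I_{HI}$ as the index values at those breakpoints, and the breakpoint table is made up for the example rather than taken from any agency's standard. The function names are likewise hypothetical.

```python
# Illustrative breakpoint bands per pollutant: (C_LO, C_HI, I_LO, I_HI).
# ASSUMPTION: these numbers are placeholders for the sketch, not an
# official breakpoint table.
BREAKPOINTS = {
    "PM2.5": [(0.0, 50.0, 0, 100), (50.0, 150.0, 100, 200), (150.0, 350.0, 200, 300)],
    "NO2": [(0.0, 100.0, 0, 100), (100.0, 400.0, 100, 200), (400.0, 1000.0, 200, 300)],
}


def individual_aqi(pollutant, concentration):
    """Apply Eq. (2): linear interpolation within the band containing C_i."""
    for c_lo, c_hi, i_lo, i_hi in BREAKPOINTS[pollutant]:
        if c_lo <= concentration <= c_hi:
            return (i_hi - i_lo) * (concentration - c_lo) / (c_hi - c_lo) + i_lo
    raise ValueError(f"{pollutant} concentration {concentration} is outside the table")


def overall_aqi(concentrations):
    """Return (overall AQI, dominant pollutant, all individual AQI values)."""
    sub_indices = {p: individual_aqi(p, c) for p, c in concentrations.items()}
    dominant = max(sub_indices, key=sub_indices.get)
    return sub_indices[dominant], dominant, sub_indices


value, dominant, parts = overall_aqi({"PM2.5": 75.0, "NO2": 80.0})
print(value, dominant)  # prints: 125.0 PM2.5
```

With these placeholder bands, PM2.5 at 75 falls a quarter of the way into the 50 to 150 band and maps to 125, while NO2 at 80 maps to 80, so PM2.5 determines the overall value.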
TABLE I
COMPARISON OF AIR QUALITY PREDICTION TECHNIQUES
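The evaluation criteria named in Section VII (accuracy, precision, recall, and F1-score, estimated with cross-validation) can be sketched in plain Python. This is only an illustration under stated assumptions: a single-feature threshold rule stands in for the SVM and Random Forest models, the toy data are invented, and all function names are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_threshold(xs, ys):
    """Stand-in model (ASSUMPTION): midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cross_validate(xs, ys, k=3):
    """Return one (accuracy, precision, recall, f1) tuple per fold."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        preds = [1 if xs[i] > threshold else 0 for i in test]
        truth = [ys[i] for i in test]
        accuracy = sum(p == t for p, t in zip(preds, truth)) / len(test)
        scores.append((accuracy, *precision_recall_f1(truth, preds)))
    return scores


# Invented toy data: one pollutant concentration per sample, label 1 = "unhealthy".
concentrations = [10, 80, 20, 90, 30, 70]
labels = [0, 1, 0, 1, 0, 1]
fold_scores = cross_validate(concentrations, labels, k=3)
```

Each fold is held out once while the stand-in model is fitted on the remaining samples, so every score reflects data the model did not see during fitting.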
XI. CONCLUSION
This survey synthesizes current research and data on air
pollution, highlighting the urgency of tackling air pollution
through enhanced technological interventions. The focus is
directed towards how machine learning can revolutionize air
quality assessments, contributing to more effective
environmental health management. This paper aims to offer
a thorough examination of the current status of air quality
degradation, its ramifications, and the progressive steps being
taken to mitigate it through advances in technology.
REFERENCES
[1] CR Aditya, Chandana R Deshmukh, DK Nayana, and Praveen Gandhi
Vidyavastu. Detection and prediction of air pollution using machine
learning models. International Journal of Engineering Trends and
Technology (IJETT), 59(4):204–207, 2018.
[2] Timothy M Amado and Jennifer C Dela Cruz. Development of
machine learning-based predictive models for air quality monitoring and
characterization. In TENCON 2018-2018 IEEE Region 10 Conference,
pages 0668–0672. IEEE, 2018.
[3] Liuzhu Chen, Feiyue Mao, Jia Hong, Lin Zang, Jiangping Chen,
Yi Zhang, Yuan Gan, Wei Gong, and Houyou Xu. Improving PM2.5
predictions during COVID-19 lockdown by assimilating multi-source
observations and adjusting emissions. Environmental Pollution,
297:118783, 2022.
[4] Georg A Grell, Steven E Peckham, Rainer Schmitz, Stuart A McKeen,
Gregory Frost, William C Skamarock, and Brian Eder. Fully coupled
“online” chemistry within the WRF model. Atmospheric Environment,
39(37):6957–6975, 2005.
[5] Gaganjot Kaur Kang, Jerry Zeyu Gao, Sen Chiao, Shengqiang Lu,
and Gang Xie. Air quality prediction: Big data and machine learning
approaches. Int. J. Environ. Sci. Dev, 9(1):8–16, 2018.
[6] Huabing Ke, Sunling Gong, Jianjun He, Lei Zhang, Bin Cui, Yaqiang
Wang, Jingyue Mo, Yike Zhou, and Huan Zhang. Development and
application of an automated air quality forecasting system based on
machine learning. Science of The Total Environment, 806:151204, 2022.
[7] Savita Vivek Mohurle, Richa Purohit, and Manisha Patil. A study of
fuzzy clustering concept for measuring air pollution index. Int. J. Adv.
Sci, 3:43–45, 2018.
[8] Khaled Bashir Shaban, Abdullah Kadri, and Eman Rezk. Urban air
pollution monitoring system with forecasting models. IEEE Sensors
Journal, 16(8):2598–2606, 2016.
[9] Arwa Shawabkeh, Feda Al-Beqain, Ali Redan, and Maher Salem.
Benzene air pollution monitoring model using ANN and SVM. In 2018
Fifth HCT Information Technology Trends (ITT), pages 197–204. IEEE,
2018.
[10] Kostandina Veljanovska and Angel Dimoski. Air quality index
prediction using simple machine learning algorithms. International
Journal of Emerging Trends & Technology in Computer Science (IJETTCS),
7(1):025–030, 2018.
[11] An Wang, Junshi Xu, Ran Tu, Marc Saleh, and Marianne Hatzopoulou.
Potential of machine learning for prediction of traffic related air
pollution. Transportation Research Part D: Transport and Environment,
88:102599, 2020.
[12] Xiaosong Zhao, Rui Zhang, Jheng-Long Wu, and Pei-Chann Chang. A
deep recurrent neural network for air quality classification. J. Inf. Hiding
Multim. Signal Process., 9(2):346–354, 2018.