TIJER2306218
Keywords:
Machine Learning Algorithms, Air Pollutant, Pollution, Weka, Excel, Graph, Accuracy, Human Health, Sustainable
Environment.
1.Introduction
In this modern era of advancement and urbanization, air pollution is one of the most crucial problems facing society. Air pollution
is caused by any physical, chemical or biological agent that changes the natural characteristics of the atmosphere. It is a
pressing global issue that poses a significant risk to human health, ecosystems and the overall well-being of the planet. Household
combustion devices, automobile emissions, industries and forest fires are the most common sources of air pollution; they
release carbon monoxide, carbon dioxide, nitrogen dioxide, sulphur oxides, chlorofluorocarbons, particulate matter and other
pollutants into the environment. WHO data show that almost 99% of people breathe air that exceeds the
WHO guideline limits and are exposed to large amounts of pollutants. Low- and middle-income countries are found to be affected
the most. Over several billion years of chemical and biological evolution, the composition of Earth's atmosphere has changed.
Ambient air quality standards define the permissible exposure of all living and non-living things for 24 hours per day, 7 days per week.
Air pollution causes significant damage to both humans and the environment, so monitoring pollutant levels is crucial. This can be
done with the help of Machine Learning models. Machine Learning is a subset of Artificial Intelligence in which a computer
learns to build models from training data. It can inspect a wide range of data and recognize particular
trends and patterns. In other words, Machine Learning is the ability of a computer program to perform a task without being
explicitly programmed for it, using statistical and advanced mathematical algorithms. A machine can be considered to be
learning if it gains experience by doing certain tasks and improves its performance on similar tasks in the future. There are
essentially three types of Machine Learning: supervised learning, unsupervised learning and reinforcement learning. The four
main Machine Learning algorithms used for training on the dataset in this project are Linear Regression, Support Vector Machine
(SVM), Bagging and Random Forest. The pollutant levels at a location can be collected with the help of sensors, and the resulting
dataset can be used to train the Machine Learning model.
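As a rough illustration of two of the four algorithms named above, the sketch below fits a one-feature Linear Regression by ordinary least squares and then applies Bagging by averaging the predictions of lines fitted to bootstrap resamples. It is not the implementation used in this project, which relies on Weka; the function names `fit_line` and `bagged_predict`, the parameters `n_models` and `seed`, and the sample readings are all assumptions made for the example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bagged_predict(xs, ys, x_new, n_models=25, seed=0):
    """Average the predictions of lines fitted to bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_models:
        # Draw n training rows with replacement (one bootstrap sample).
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # a line cannot be fitted to a single distinct x value
        slope, intercept = fit_line(bx, by)
        preds.append(slope * x_new + intercept)
    return sum(preds) / len(preds)


# Hypothetical sensor readings, invented for illustration only:
# xs could be a predictor such as temperature, ys a pollutant level.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [7.0, 9.0, 11.0, 13.0, 15.0]
slope, intercept = fit_line(xs, ys)
estimate = bagged_predict(xs, ys, x_new=6.0)
```

Random Forest follows the same bagging idea but uses decision trees as the base learners and also samples a random subset of features at each split.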
2.Literature Survey
In [1], the authors reported that Machine Learning models show very good accuracy and efficiency in terms of training the
model. They argue that only Machine Learning models can handle and train on the rigorous datasets collected with advanced
techniques and sensors. The KNN algorithm achieved an accuracy of 99.1071% in their air pollution prediction.
In [2], the authors concluded that the concentration of air pollutants in ambient air is governed by
parameters such as wind speed, wind direction, relative humidity and temperature. The Air Quality Index (AQI) is used to measure the
quality of air. The proposed work is a supervised learning approach using algorithms such as LR, SVM, DT and RF. Their
analysis of the results showed that AQI predictions obtained through RF are promising.
In [3], the authors aim to develop models based on past data and use them to make future decisions; the future is evaluated or
forecast in accordance with the past. A time series adds an explicit time-order dependence among observations.
This dependency is both a source of information and a constraint. According to the authors of this review, the majority
of research has concentrated on evaluating or forecasting the AQI and pollutant concentration levels, which gives a precise
picture of air quality. Several researchers opt for Artificial Neural Network (ANN), ARIMA, Linear Regression and Logistic
Regression models for forecasting the AQI and air pollutant concentrations. When predicting the AQI or the subsequent concentration
levels of several pollutants, future work may need to take additional attributes into account, including meteorological parameters
and air contaminants. As the data change over particular periods of time, it is also possible to use real-time data analysis through
the cloud to get better outcomes and increased performance.
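The time-order dependence described in [3] can be made concrete with lagged inputs: each target reading is paired with the readings that immediately precede it, so a regression model can learn from the past. The sketch below is only one possible way to do this; the function name `make_lagged`, the parameter `n_lags` and the AQI values are assumptions for illustration and do not come from [3].

```python
def make_lagged(series, n_lags):
    """Turn a series into (inputs, targets), where each input holds the
    n_lags readings that immediately precede its target."""
    if n_lags < 1 or len(series) <= n_lags:
        raise ValueError("need at least one lag and more readings than lags")
    count = len(series) - n_lags
    inputs = [list(series[i:i + n_lags]) for i in range(count)]
    targets = [series[i + n_lags] for i in range(count)]
    return inputs, targets


# Hypothetical hourly AQI readings, invented for illustration only.
aqi = [80, 85, 90, 100, 95]
X, y = make_lagged(aqi, n_lags=2)
# Each row of X holds the two readings before the matching entry of y,
# kept in time order.
```

Shuffling the rows of such a dataset before splitting it into training and test sets would leak future readings into training, which is one reason the dependency acts as a constraint as well as a source of information.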
Analysing the above graphs, in the majority of cases Random Forest has the lowest error in terms of Mean Absolute Error (MAE), Root
Mean Squared Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE). Since a lower error
rate corresponds to higher accuracy, the Random Forest algorithm is more accurate than
Linear Regression, Bagging and Support Vector Machine (SVM). Therefore, we can conclude that Random Forest is the best
of the four algorithms for the collected dataset.
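For reference, the four error measures compared above can be computed as in the following sketch. It uses the usual textbook definitions, with the mean of the actual values as the baseline predictor for RAE and RRSE; this baseline is an assumption, and the numbers reported by Weka (which prints RAE and RRSE as percentages) may differ slightly. The function name `error_metrics` and the sample values are illustrative only.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Baseline errors: always predicting the mean of the actual values.
    # These are zero (and the ratios undefined) if all actual values are equal.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / base_abs,
        "RRSE": math.sqrt(sq_err / base_sq),
    }


# Hypothetical pollutant readings and predictions, invented for illustration.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = error_metrics(actual, predicted)
```

An RAE or RRSE below 1 (below 100% in Weka's output) means the model does better than simply predicting the mean, which makes these two measures convenient for comparing algorithms on the same dataset.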
8.Conclusion
In conclusion, our project demonstrates that machine learning models can be used to forecast air quality with a high degree of
accuracy. By analysing historical air quality data, we were able to develop a model that can predict air quality in real time. This
model has the potential to help individuals and organisations take informed action to reduce their exposure to harmful pollutants
and improve public health. Further research can be done to improve the accuracy of the model and to explore other applications of
machine learning in environmental science.
9.Reference
[1] Deepu B P, Dr. Ravindra P Rajput, “Air Pollution Prediction using Machine Learning”, International Research Journal of
Engineering and Technology (IRJET), Volume 09, Issue 07, July 2022.
[2] Madhuri VM, Samyama Gunjal GH, Savitha Kamalapurkar, “Air Pollution Prediction Using Machine Learning Supervised
Learning Approach”, International Journal of Scientific & Technology Research, Volume 9, Issue 04,
April 2020.
[3] Vidit Kumar, Sparsh Singh, Zaid Ahmed, Ms. Nikita Verma, “Air Pollution Prediction using Machine Learning Algorithms: A
Systematic Review”, International Journal of Engineering Research & Technology (IJERT), Vol. 11, Issue 12, December 2022.
[4] Shreyas Simu, Varsha Turkar, Rohit Martires, Vranda Asolkar, Swizel Monteiro, Vaylon Fernandes, and Vassant
Salgaoncar, “Air Pollution Prediction using Machine Learning”, ETC Department, Don Bosco College Engineering, Fatorda, Goa,
India.
[5] K. Rajakumari, V. Priyanka, “Air Pollution Prediction in Smart Cities by using Machine Learning Techniques”, 2020,
International Journal of Innovative Technology and Exploring Engineering (IJITEE), Volume 9, Issue 05.
[6] Ayele, Temesgen Walelign, and Rutvik Mehta. “Air pollution monitoring and prediction using IoT.” In 2018 Second
International Conference on Inventive Communication and Computational
Technologies (ICICCT), pp. 1741-1745. IEEE, 2018.
[7] Venkat Rao Pasupuleti, Uhasri, Pavan Kalyan, “Air Quality Prediction Of Data Log By Machine Learning”, 2020, IEEE.
[8] SriramKrishna Yarragunta, Mohammed Abdul Nabi, Jeyanthi.P, “Prediction of Air Pollutants Using Supervised Machine
Learning”, 2021, IEEE.
[9] Nadjet Djebbri and Mounira Rouainia. “Artificial neural networks based air pollution monitoring in industrial sites.” In 2017
International Conference on Engineering and Technology (ICET), pp. 1-5. IEEE, 2017.
[10] Jiang, Ningbo, and Matthew L. Riley. “Exploring the utility of the random forest method for forecasting ozone pollution in
SYDNEY.” Journal of Environment Protection and Sustainable Development 1.5 (2015): 245-254.