International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 05 | May 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 1855
Climate Visibility Prediction Using Machine Learning
Gaurav Kadam1, Aman Tobaria2, Sahil Arya3, Asst. Prof. Dr. Jyoti Kaushik(Guide)4
1Gaurav Kadam, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Delhi, India
2Aman Tobaria, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Delhi, India
3Sahil Arya, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Delhi, India
4Asst. Prof. Dr. Jyoti Kaushik, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of
Technology, Delhi, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Visibility distance prediction based on climatic
indicators plays a crucial role in ensuring safety and
efficiency in various sectors, including transportation,
aviation, and environmental monitoring. This research
paper presents a comprehensive analysis of a carefully
curated dataset encompassing diverse climatic indicators,
such as temperature, humidity, wind speed, precipitation,
atmospheric pressure, and corresponding visibility distance
measurements. By exploring the intricate relationships
between these indicators and visibility distance, a robust
regression model is developed using state-of-the-art
techniques. The model is trained and rigorously evaluated,
employing appropriate performance metrics and cross-
validation techniques. Additionally, feature selection
methods are applied to identify the most influential
indicators impacting visibility distance. The research
showcases the significance of regression modeling in
accurately estimating visibility distance, enabling
stakeholders to make informed decisions, mitigate risks, and
implement effective safety measures. The findings highlight
the practical applications of climatic indicator-based
visibility distance prediction and provide valuable insights
for optimizing operations across diverse domains.
Key Words: Machine Learning, Weather Visibility,
Decision Tree, XGBoost, K-Means Clustering
1. INTRODUCTION
The accurate prediction of visibility distance based on
climatic indicators is of paramount importance in various
sectors, including transportation, aviation, and
environmental monitoring. Visibility plays a crucial role in
determining the safety and efficiency of operations in
these domains. By developing a regression model that
leverages the relationships between different climatic
indicators and visibility distance, we can effectively
estimate visibility under diverse weather conditions.
In this research paper, our objective is to build a robust
regression model capable of predicting visibility distance
using a comprehensive dataset of climatic indicators.
These indicators may include temperature, humidity, wind
speed, precipitation, and atmospheric pressure, among
others. By analyzing the historical data and understanding
the complex interactions between these variables, we aim
to develop a model that provides accurate and reliable
predictions of visibility distance.
The outcomes of this research have significant
implications for various stakeholders. Meteorologists can
benefit from a deeper understanding of how climatic
indicators influence visibility distance, enabling them to
enhance weather forecasting and advisory services. In the
transportation sector, accurate visibility predictions can
help mitigate risks and improve safety measures for
drivers, pilots, and other operators. Moreover,
environmental monitoring agencies can use this
information to assess air quality and identify regions with
poor visibility due to weather-related factors.
By building a regression model that effectively captures
the relationships between climatic indicators and visibility
distance, we aim to contribute to the body of knowledge in
this field and provide a valuable tool for decision-making
processes and operational planning. Ultimately, this
research aims to enhance safety, efficiency, and
environmental awareness by accurately predicting
visibility distance based on diverse climatic indicators.
1.1 OBJECTIVE
The objective of this research is to develop a regression
model capable of predicting visibility distance using
various climatic indicators. Visibility distance plays a
crucial role in numerous applications such as
transportation, aviation, and safety. By understanding the
relationship between climatic factors and visibility,
accurate predictions can be made to improve decision-
making processes and enhance safety measures. The
proposed regression model will leverage a dataset
containing historical records of visibility distance along
with corresponding climatic indicators such as
temperature, humidity, wind speed, and atmospheric
pressure. These indicators serve as potential predictors
for visibility distance. The model development process
involves several steps. First, the dataset will be
preprocessed to handle missing values, outliers, and
perform any necessary feature engineering techniques to
extract meaningful information. Next, the dataset will be
divided into training and testing sets to evaluate the
model's performance. Various regression algorithms, such
as linear regression, decision trees, or ensemble methods
like XGBoost or Random Forest, will be considered for
modeling the relationship between the climatic indicators
and visibility distance. The model will be trained on the
training set, and its performance will be evaluated using
appropriate metrics such as mean squared error (MSE),
mean absolute error (MAE), or R-squared value. To
enhance the model's predictive capabilities, techniques
such as feature selection, regularization, or model
optimization may be applied. Additionally, cross-
validation techniques can be used to assess the model's
generalization ability. The ultimate goal of this research is
to build a regression model that accurately predicts
visibility distance based on the given climatic indicators.
The model can then be used to forecast visibility distance
in real-time or future scenarios, assisting in decision-
making processes related to transportation planning,
weather forecasting, and ensuring safety in various
domains.
2. DATASET
The dataset used in this research paper consists of a
comprehensive collection of climatic indicators and
corresponding visibility distance measurements. The data is
collected from a NOAA dataset containing hourly
observations of climate variables such as visibility,
temperature, wind speed and direction, humidity, dew
point, and pressure. The recorded measures include dry
bulb temperature, wet bulb temperature, wind speed, and
wind direction. It encompasses
a diverse range of variables, including temperature,
humidity, wind speed, precipitation, and atmospheric
pressure. The dataset is carefully curated to cover a
significant time period, capturing various weather
conditions and their impact on visibility. Each observation
in the dataset provides detailed information about the
climatic indicators at a specific location and time, along
with the corresponding measured visibility distance. The
dataset's size and quality enable in-depth analysis and
modeling, facilitating the development of a robust
regression model for predicting visibility distance based on
climatic indicators. The dataset's availability and reliability
ensure that the research outcomes are accurate and
applicable in real-world scenarios.
2.1 Data Description:
The dataset contains the following variables, which are
used to estimate the visibility distance:
1. VISIBILITY - Distance from which an object can be
seen.
2. DRYBULBTEMPF - Dry bulb temperature (degrees
Fahrenheit). Most commonly reported standard
temperature.
3. WETBULBTEMPF - Wet bulb temperature (degrees
Fahrenheit).
4. DewPointTempF - Dew point temperature (degrees
Fahrenheit).
5. RelativeHumidity - Relative humidity (percent).
6. WindSpeed - Wind speed (miles per hour).
7. WindDirection - Wind direction from true north using
compass directions.
8. StationPressure - Atmospheric pressure (inches of
mercury, 'in Hg').
9. SeaLevelPressure - Sea level pressure (in Hg).
10. Precip - Total precipitation in the past hour
(inches).
In addition to the training files, a "schema" file is
required from the client. It specifies all the necessary
details about the training files: the file names, the lengths
of the date and time values in the file names, the number
of columns, the column names, and the column datatypes.
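The schema check described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual validation code: the schema keys `NumberofColumns` and `ColName`, the three-column schema, and the function `validate_columns` are all hypothetical names chosen for the example, and datatype checks are left out for brevity.

```python
import csv
import io

# Hypothetical schema: the key names and the column subset are
# assumptions made for this sketch, not taken from the project.
SCHEMA = {
    "NumberofColumns": 3,
    "ColName": {
        "VISIBILITY": "float",
        "DRYBULBTEMPF": "int",
        "RelativeHumidity": "int",
    },
}


def validate_columns(csv_text, schema):
    """Return a list of problems found in the CSV header (empty if valid)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    problems = []
    # Check 1: the number of columns must match the schema.
    if len(header) != schema["NumberofColumns"]:
        problems.append(
            f"expected {schema['NumberofColumns']} columns, found {len(header)}"
        )
    # Check 2: every column named in the schema must be present.
    missing = [name for name in schema["ColName"] if name not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    return problems


good = "VISIBILITY,DRYBULBTEMPF,RelativeHumidity\n10.0,55,80\n"
bad = "VISIBILITY,DRYBULBTEMPF\n10.0,55\n"
print(validate_columns(good, SCHEMA))
print(validate_columns(bad, SCHEMA))
```

A file that fails either check would typically be set aside rather than passed on to training.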
3. METHODOLOGY
To build a regression model for predicting visibility
distance based on climatic indicators, a systematic
methodology is followed in this research. The steps
involved in the methodology include data collection, data
preprocessing, feature selection, model selection, model
training, model evaluation, and model tuning.
1. Dataset collection: First, a dataset is collected that
contains historical records of climatic indicators such
as temperature, humidity, wind speed, precipitation,
atmospheric pressure, and corresponding visibility
distance measurements. The dataset is carefully
curated to ensure an adequate number of observations
and a diverse range of climatic conditions.
2. Data validation: Data validation techniques for
predicting visibility distance based on climatic
indicators involve identifying and handling outliers,
addressing missing data, ensuring data consistency,
conducting cross-validation, performing sensitivity
analysis, and comparing predictions with ground truth
measurements. These steps help ensure the accuracy
and reliability of the dataset used for regression
modeling. The validation checks used here cover the
file name, the number of columns, the column names,
the column datatypes, and null values in the
columns.
3. Data preprocessing: Next, the dataset undergoes
preprocessing to handle missing values, outliers, and
inconsistencies. Techniques such as imputation, outlier
detection, and data normalization or standardization
are applied to ensure data quality and uniformity.
4. Feature selection: Feature selection techniques are then
employed to identify the most relevant climatic
indicators that have a significant impact on visibility
distance. Correlation analysis, feature importance
ranking, or other statistical methods are used to select
the optimal set of features.
5. Model training: Once the features are selected, a
suitable regression model is chosen based on the
nature of the problem and dataset. Linear regression,
decision tree regression, random forest regression, or
other regression algorithms are considered. The
chosen model is trained using the training set, where it
learns the relationship between the climatic indicators
and visibility distance.
6. Model evaluation: The trained model is then evaluated using appropriate
evaluation metrics such as mean squared error (MSE),
mean absolute error (MAE), or R-squared value. This
evaluation helps assess the model's performance and
identify areas for improvement.
7. Hyperparameter tuning: In the model tuning stage,
hyperparameter optimization techniques like grid
search or cross-validation are employed to fine-tune
the model and optimize its performance. By adjusting
hyperparameters such as regularization parameters,
tree depth, or kernel parameters, the model's
predictive capabilities are enhanced.
8. Prediction: The final regression model, after proper
training, evaluation, and tuning, can be utilized to
predict visibility distance based on given climatic
indicators. The methodology ensures a systematic and
rigorous approach to building an accurate regression
model for visibility distance prediction in diverse
weather conditions.
9. Deployment: The deployment of the regression model
for predicting visibility distance based on climatic
indicators involves packaging the trained model,
implementing software infrastructure, integrating data
sources, processing input data, performing model
inference, visualizing predictions, monitoring and
maintaining the system, ensuring user access and
security, considering scalability and integration, and
continuously evaluating and improving the model. This
process enables the model to be utilized in real-world
scenarios for making visibility distance predictions and
supporting data-driven decision-making.
10. Monitoring and maintenance: The model's performance
is monitored continuously over time and the model is
updated as needed. This ensures the model remains
accurate and reliable as new data becomes available.
This methodology provides a general framework, and the
specific implementation details may vary based on the
complexity of the visibility prediction problem, available
data, and chosen machine learning algorithms.
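The evaluation metrics named in the methodology (MSE, MAE, and R-squared) can be illustrated with a short sketch. It uses plain Python rather than any particular library, and the visibility values are invented purely for the example.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Invented visibility values (observed vs. predicted), for illustration only.
y_true = [10.0, 8.0, 6.0, 4.0]
y_pred = [9.0, 8.0, 7.0, 4.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), r_squared(y_true, y_pred))
```

Lower MSE and MAE indicate smaller errors, while an R-squared closer to 1 indicates that the model explains more of the variance in visibility.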
4. ARCHITECTURE
Fig -1: Architecture of Project
5. ALGORITHMS
5.1 Clustering Algorithms:
K-means clustering is used in the project as a supportive
technique for data exploration, feature engineering,
preprocessing, and visualization. It helps identify patterns,
group similar data points, handle outliers, and generate
cluster features for regression modeling. However, K-
means clustering itself does not directly predict visibility
distance, and the main prediction task still relies on a
regression model trained on the climatic indicators.
K-means clustering owes its effectiveness to the compact
clusters it produces. Choosing the right number of clusters
can be challenging, though. Several methods exist for
determining the optimal number of clusters; here we
concentrate on the elbow method.
The steps are explained below:
The elbow method is based on the within-cluster sum of
squares (WCSS), which for K clusters is defined as:
$$\mathrm{WCSS} = \sum_{k=1}^{K} \sum_{P_i \in \text{Cluster}_k} \operatorname{distance}(P_i, C_k)^2$$
For example, the term
$\sum_{P_i \in \text{Cluster}_1} \operatorname{distance}(P_i, C_1)^2$
is the sum of the squared distances between each data point
in cluster 1 and the centroid $C_1$ of that cluster. Any
measure, such as the Manhattan distance or the Euclidean
distance, can be used to calculate the distance between the
data points and the centroid. The elbow method carries out
the following steps to determine the optimal number of
clusters:
1. K-means clustering is run on the given dataset for
different values of K (ranging from 1 to 10).
2. The WCSS value is computed for each value of K.
3. A curve is plotted between the computed WCSS values and
the number of clusters K.
4. The point where the curve bends sharply, resembling an
elbow, is taken as the best value of K.
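The elbow procedure can be sketched in a few lines of plain Python. This is an illustration only: the four toy points are invented, the centroids are seeded with the first K points to keep the run deterministic (an assumption of this sketch; library implementations use more careful initialization), and the squared Euclidean distance is used.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squared distances to the assigned centroid."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Toy data: two obvious groups, near x = 0 and near x = 10.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 1.0), (10.0, 1.0)]
curve = []
for k in range(1, 5):
    labels, centroids = kmeans(points, k)
    curve.append(wcss(points, labels, centroids))
print(curve)
```

On this toy data the WCSS falls steeply from K = 1 to K = 2 and changes little afterwards, so the elbow sits at K = 2.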
5.2 XGBoost Algorithms:
XGBoost is an ensemble learning algorithm used to predict
visibility distance based on climatic indicators. It combines
the predictions of multiple decision tree models to improve
accuracy. The algorithm involves preparing the data,
training the XGBoost model, tuning hyperparameters for
optimal performance, and evaluating the model using
metrics like MSE or R-squared. XGBoost is known for its
ability to handle complex relationships and handle both
numerical and categorical features effectively.
An XGBoost tree for regression can be built using the
formulae shown below, where $\lambda$ (Lambda) is the
regularization parameter.
Step 1: Calculate the similarity score, which guides the
growth of the tree:
$$\text{Similarity Score} = \frac{(\text{Sum of residuals})^2}{\text{Number of residuals} + \lambda}$$
Step 2: Determine how to partition the data by calculating
the gain:
$$\text{Gain} = \text{Similarity}_{\text{left}} + \text{Similarity}_{\text{right}} - \text{Similarity}_{\text{root}}$$
Step 3: Prune the tree using the user-defined tree-
complexity parameter gamma ($\gamma$) by computing
$\text{Gain} - \gamma$. If the result is positive, do not
prune; if it is negative, prune and then subtract gamma
from the next Gain value up the tree.
Step 4: For the remaining leaves, calculate the output
value:
$$\text{Output value} = \frac{\text{Sum of residuals}}{\text{Number of residuals} + \lambda}$$
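A small worked example may help make Steps 1 to 4 concrete. The sketch below assumes a squared-error loss and uses invented residuals with Lambda = 1 and gamma = 10; it shows only the arithmetic of the steps and is not the XGBoost library's implementation.

```python
def similarity_score(residuals, lam):
    """Step 1: (sum of residuals)^2 / (number of residuals + lambda)."""
    return sum(residuals) ** 2 / (len(residuals) + lam)


def split_gain(left, right, lam):
    """Step 2: left similarity + right similarity - root similarity."""
    root = left + right
    return (
        similarity_score(left, lam)
        + similarity_score(right, lam)
        - similarity_score(root, lam)
    )


def keep_split(gain, gamma):
    """Step 3: the split survives pruning only if gain - gamma is positive."""
    return gain - gamma > 0


def leaf_output(residuals, lam):
    """Step 4: sum of residuals / (number of residuals + lambda)."""
    return sum(residuals) / (len(residuals) + lam)


# Invented residuals for one candidate split (illustration only).
left, right = [-2.0], [2.0, 4.0, 4.0]
lam, gamma = 1.0, 10.0

gain = split_gain(left, right, lam)  # 2.0 + 25.0 - 12.8
print(gain, keep_split(gain, gamma))
print(leaf_output(left, lam), leaf_output(right, lam))
```

Here the root similarity is 64 / 5 = 12.8, the left and right similarities are 4 / 2 = 2 and 100 / 4 = 25, so the gain is 14.2. Since 14.2 - 10 is positive, the split is kept, and the leaf outputs are -1.0 and 2.5.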
The loss function can be calculated as:
Fig -2: Regularization graph
5.3 Decision Tree Algorithms:
The decision tree algorithm is used to predict visibility
distance by constructing a tree-like structure based on
climatic indicators. It selects informative features and
determines optimal splits using criteria such as Gini
impurity or entropy for classification and mean squared
error for regression. Decision trees are interpretable and
can handle numerical and categorical features. However,
they can overfit the training data, so pruning techniques are
employed to enhance generalization. Decision trees are
widely used due to their simplicity, interpretability, and
ability to capture non-linear relationships.
Decision trees have applications in several industries and
can be used for both classification and regression
problems. As the name suggests, a decision tree uses a
flowchart that resembles a tree structure to represent the
predictions that result from a series of feature-based splits.
It starts at the root node and ends with a decision made at
the leaves.
The mean squared error measures how far the predictions
deviate from the actual values.
Root Node: The node at which the population first begins
to split based on different features.
Decision Nodes: The nodes that result from splitting the
root node.
Leaf Nodes: Also called terminal nodes, these are the nodes
where no further splitting is possible.
Sub-tree: Just as a small part of a graph is called a
sub-graph, a sub-section of the decision tree is called a
sub-tree.
Pruning: Removing certain nodes to prevent overfitting.
Fig -3: Decision Tree
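To illustrate how a regression tree picks a split, the sketch below searches one feature for the threshold that minimizes the total squared error of the two resulting groups, which is equivalent to minimizing the weighted mean squared error. The humidity and visibility values are invented for the example, and the function names are hypothetical; a full decision tree repeats this search over all features and recursively on each child node.

```python
def sum_squared_error(values):
    """Squared error of a group around its own mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost) of the best binary split on a single feature."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sum_squared_error(left) + sum_squared_error(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Invented data: relative humidity (percent) vs. visibility distance.
humidity = [30, 40, 50, 80, 90, 95]
visibility = [10, 10, 9, 3, 2, 2]
threshold, cost = best_split(humidity, visibility)
print(threshold, cost)
```

In this toy data the best threshold falls between the low-humidity, high-visibility readings and the high-humidity, low-visibility ones, which would become the root node's split.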
6. MODEL TRAINING
1. Data Export from DB - The data stored in the database
is exported as a CSV file to be used for model training.
2. Data Preprocessing
• Remove columns that do not help train the
model. These columns were identified during
the EDA.
• Replace invalid values with NumPy "nan" so
that the imputer can be run on them.
• Check the columns for null values. If any are
present, impute them using the KNN imputer.
• Scale the training and test sets of data
independently.
Fig -4: Correlation between the columns
Fig -5: Dropping the columns with high correlation
3. Clustering - The preprocessed data is clustered using
the KMeans algorithm. The elbow plot is used to
determine the optimal number of clusters, and the
"KneeLocator" function is used to find that number
dynamically. The idea behind clustering is to train
different algorithms on the data in different clusters.
4. Cluster Model Training - The KMeans model is trained
on the preprocessed data and then saved for later use
in prediction.
5. Model Selection - After the clusters have been formed,
we find the best model for each cluster. We use two
algorithms, the "XGBoost Regressor" and the "Decision
Tree Regressor". For each cluster, both algorithms are
trained with the best parameters obtained from
GridSearch. The R-squared scores of the two models
are computed, and the model with the higher score is
chosen. A model is selected in the same way for every
cluster, and the selected models are all saved for use in
prediction.
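The per-cluster model selection in step 5 can be sketched as follows. The sketch assumes that both candidate models have already been trained and have produced predictions for each cluster; the numbers, the model labels, and the function `select_models` are hypothetical and serve only to show the selection rule of keeping the model with the higher R-squared score.

```python
def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def select_models(y_true_by_cluster, preds_by_cluster):
    """For every cluster, keep the name of the model with the higher R-squared."""
    chosen = {}
    for cluster, y_true in y_true_by_cluster.items():
        scores = {
            name: r_squared(y_true, y_pred)
            for name, y_pred in preds_by_cluster[cluster].items()
        }
        chosen[cluster] = max(scores, key=scores.get)
    return chosen


# Invented test-set values and predictions for two clusters.
y_true_by_cluster = {0: [10.0, 8.0, 6.0], 1: [2.0, 4.0, 6.0]}
preds_by_cluster = {
    0: {"xgboost": [10.0, 8.0, 7.0], "decision_tree": [9.0, 8.0, 8.0]},
    1: {"xgboost": [3.0, 4.0, 4.0], "decision_tree": [2.0, 4.0, 5.0]},
}
chosen = select_models(y_true_by_cluster, preds_by_cluster)
print(chosen)
```

At prediction time, a new observation would first be assigned to a cluster by the saved KMeans model and then passed to the model chosen for that cluster.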
7. RESULT
Machine Learning models analyze historical weather data,
such as temperature, humidity, wind speed, and air
pollution levels, to estimate visibility conditions. By
training the model on past visibility data and
corresponding weather variables, it can learn patterns and
relationships to predict visibility in the future. The results
of such predictions can provide insights into potential
changes in visibility due to climate factors, allowing for
better planning and mitigation strategies.
The Flask web application provides a URL for the user
interface used to predict the visibility distance.
Fig -6: URL generated for predictions
Fig -7: Graphical User Interface
Predictions made through the Flask web application or URL
are saved in a .csv file. The Predictions.csv file contains
the predictions for different climatological conditions.
Fig -8: Prediction data
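Saving predictions alongside their inputs could look roughly like the sketch below. It uses only Python's standard csv module and writes to an in-memory buffer; the output column name `Predicted_VISIBILITY`, the function `write_predictions`, and the sample row are assumptions made for illustration, since the layout of the actual Predictions.csv file is not specified here.

```python
import csv
import io


def write_predictions(rows, predictions, out):
    """Write each input row with its predicted visibility as CSV to `out`."""
    fieldnames = list(rows[0].keys()) + ["Predicted_VISIBILITY"]
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row, pred in zip(rows, predictions):
        # Round the prediction to two decimals for readability.
        writer.writerow({**row, "Predicted_VISIBILITY": round(pred, 2)})


# Invented input row and prediction, for illustration only.
rows = [{"DRYBULBTEMPF": 55, "RelativeHumidity": 80}]
buffer = io.StringIO()
write_predictions(rows, [7.456], buffer)
print(buffer.getvalue())
```

In a deployment, `out` would be an open file handle for the predictions file rather than an in-memory buffer.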
8. CONCLUSION
The prediction of visibility distance based on climatic
indicators is a significant research area with practical
implications in various domains such as transportation,
aviation, and safety. Through the analysis of climatic data
and the application of regression models, researchers have
made notable advancements in understanding the
relationship between climatic indicators and visibility
distance. The literature survey has provided valuable
insights into the selection of relevant indicators, the use of
regression algorithms, feature engineering techniques, and
model evaluation methods. However, there are still
opportunities for further research and improvement.
Future studies can explore advanced machine learning
techniques, incorporate additional data sources, consider
spatial and temporal variability, and develop real-time
prediction systems. Furthermore, the integration of
uncertainty estimation, application-specific studies, and
benchmarking efforts can enhance the accuracy, reliability,
and applicability of visibility distance prediction models.
Overall, continued research in this field has the potential to
improve safety measures, enhance decision-making
processes, and contribute to a better understanding of the
impact of climatic indicators on visibility conditions.
9. FUTURE SCOPE
The field of predicting visibility distance based on climatic
indicators holds significant potential for future research
and development. Advanced machine learning techniques,
such as deep learning and reinforcement learning, offer
promising avenues for improving prediction accuracy by
capturing complex relationships in the data. Additionally,
the integration of additional data sources, such as air
quality measurements and traffic data, can provide a more
comprehensive understanding of visibility conditions.
Spatial and temporal analysis can be explored to account
for localized variations and capture temporal trends.
Hybrid modeling approaches, combining different
regression models or ensemble methods, can enhance
prediction robustness. Real-time prediction systems that
leverage real-time data streams can be developed for
immediate decision-making. Incorporating uncertainty
estimation techniques can provide valuable insights for
risk assessment. Application-specific studies focused on
domains like autonomous driving or aviation can tailor
models to specific requirements. Validating and
benchmarking models using standardized datasets can
establish benchmarks and enable fair comparisons. By
pursuing these future research directions, the field can
advance, leading to more accurate and reliable visibility
distance prediction models that enhance safety and
decision-making in various applications.
10. ACKNOWLEDGEMENT
We would like to acknowledge the contribution of the
following people, without whose help and guidance this
research would not have been completed. We
acknowledge the counsel and support of the faculty of
Maharaja Agrasen Institute of Technology for providing us
a platform to research the topic “Visibility Prediction in
Different Climate”, and we would also like to thank our
HOD, Dr. Namita Gupta, for giving us the opportunity and
time to conduct research on our topic. This
acknowledgment would remain incomplete if we failed to
express our deep sense of obligation to our guide, Asst.
Prof. Dr. Jyoti Kaushik (CSE Department). We are indeed
fortunate and proud to have been supervised by them
during our research, which would have been difficult
without their motivation, constant support, and valuable
suggestions. We shall ever remain indebted to our parents
and friends for their support and encouragement during
the research. It would have been impossible without their
cooperation and support.

Dr. A. B. Shinde
 
Computer network Computer network Computer network Computer network
Shrikant317689
 
Decision support system in machine learning models for a face recognition-bas...
TELKOMNIKA JOURNAL
 
Kel.3_A_Review_on_Internet_of_Things_for_Defense_v3.pptx
Endang Saefullah
 
Clustering Algorithms - Kmeans,Min ALgorithm
Sharmila Chidaravalli
 
13th International Conference of Security, Privacy and Trust Management (SPTM...
ijcisjournal
 
Module - 5 Machine Learning-22ISE62.pdf
Dr. Shivashankar
 
June 2025 Top 10 Sites -Electrical and Electronics Engineering: An Internatio...
elelijjournal653
 
Diabetes diabetes diabetes diabetes jsnsmxndm
130SaniyaAbduNasir
 
Explore USA’s Best Structural And Non Structural Steel Detailing
Silicon Engineering Consultants LLC
 
輪読会資料_Miipher and Miipher2 .
NABLAS株式会社
 
Precooling and Refrigerated storage.pptx
ThongamSunita
 
13th International Conference on Artificial Intelligence, Soft Computing (AIS...
ijait
 
LLC CM NCP1399 SIMPLIS MODEL MANUAL.PDF
ssuser1be9ce
 
Engineering Geology Field Report to Malekhu .docx
justprashant567
 
Bayesian Learning - Naive Bayes Algorithm
Sharmila Chidaravalli
 
Tesia Dobrydnia - An Avid Hiker And Backpacker
Tesia Dobrydnia
 
Ad

Climate Visibility Prediction Using Machine Learning

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 05 | May 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 1855

Gaurav Kadam¹, Aman Tobaria², Sahil Arya³, Asst. Prof. Dr. Jyoti Kaushik (Guide)⁴
¹ Gaurav Kadam, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Delhi, India
² Aman Tobaria, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Delhi, India
³ Sahil Arya, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Delhi, India
⁴ Asst. Professor Dr. Jyoti Kaushik, Dept. of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Delhi, India

Abstract - Visibility distance prediction based on climatic indicators plays a crucial role in ensuring safety and efficiency in various sectors, including transportation, aviation, and environmental monitoring. This research paper presents a comprehensive analysis of a carefully curated dataset encompassing diverse climatic indicators, such as temperature, humidity, wind speed, precipitation, atmospheric pressure, and corresponding visibility distance measurements. By exploring the intricate relationships between these indicators and visibility distance, a robust regression model is developed using state-of-the-art techniques. The model is trained and rigorously evaluated, employing appropriate performance metrics and cross-validation techniques. Additionally, feature selection methods are applied to identify the most influential indicators impacting visibility distance.
The research showcases the significance of regression modeling in accurately estimating visibility distance, enabling stakeholders to make informed decisions, mitigate risks, and implement effective safety measures. The findings highlight the practical applications of climatic indicator-based visibility distance prediction and provide valuable insights for optimizing operations across diverse domains.

Key Words: Machine Learning, Weather Visibility, Decision Tree, XGBoost, KNN-Clustering

1. INTRODUCTION

The accurate prediction of visibility distance based on climatic indicators is of paramount importance in various sectors, including transportation, aviation, and environmental monitoring. Visibility plays a crucial role in determining the safety and efficiency of operations in these domains. By developing a regression model that leverages the relationships between different climatic indicators and visibility distance, we can effectively estimate visibility under diverse weather conditions.

In this research paper, our objective is to build a robust regression model capable of predicting visibility distance using a comprehensive dataset of climatic indicators. These indicators may include temperature, humidity, wind speed, precipitation, and atmospheric pressure, among others. By analyzing the historical data and understanding the complex interactions between these variables, we aim to develop a model that provides accurate and reliable predictions of visibility distance.

The outcomes of this research have significant implications for various stakeholders. Meteorologists can benefit from a deeper understanding of how climatic indicators influence visibility distance, enabling them to enhance weather forecasting and advisory services. In the transportation sector, accurate visibility predictions can help mitigate risks and improve safety measures for drivers, pilots, and other operators.
Moreover, environmental monitoring agencies can use this information to assess air quality and identify regions with poor visibility due to weather-related factors. By building a regression model that effectively captures the relationships between climatic indicators and visibility distance, we aim to contribute to the body of knowledge in this field and provide a valuable tool for decision-making processes and operational planning. Ultimately, this research aims to enhance safety, efficiency, and environmental awareness by accurately predicting visibility distance based on diverse climatic indicators.

1.1 OBJECTIVE

The objective of this research is to develop a regression model capable of predicting visibility distance using various climatic indicators. Visibility distance plays a crucial role in numerous applications such as transportation, aviation, and safety. By understanding the relationship between climatic factors and visibility, accurate predictions can be made to improve decision-making processes and enhance safety measures.

The proposed regression model will leverage a dataset containing historical records of visibility distance along with corresponding climatic indicators such as temperature, humidity, wind speed, and atmospheric pressure. These indicators serve as potential predictors for visibility distance.

The model development process involves several steps. First, the dataset will be preprocessed to handle missing values and outliers and to apply any feature engineering needed to extract meaningful information. Next, the dataset will be divided into training and testing sets to evaluate the model's performance.

Various regression algorithms, such as linear regression, decision trees, or ensemble methods like XGBoost or Random Forest, will be considered for modeling the relationship between the climatic indicators and visibility distance. The model will be trained on the training set, and its performance will be evaluated using appropriate metrics such as mean squared error (MSE), mean absolute error (MAE), or R-squared value.

To enhance the model's predictive capabilities, techniques such as feature selection, regularization, or model optimization may be applied. Additionally, cross-validation techniques can be used to assess the model's generalization ability.

The ultimate goal of this research is to build a regression model that accurately predicts visibility distance based on the given climatic indicators. The model can then be used to forecast visibility distance in real-time or future scenarios, assisting in decision-making processes related to transportation planning, weather forecasting, and ensuring safety in various domains.

2. DATASET

The dataset used in this research paper consists of a comprehensive collection of climatic indicators and corresponding visibility distance measurements. The data are taken from a NOAA dataset containing hourly observations of climate variables such as visibility, temperature, wind speed and direction, humidity, dew point, and pressure. The recorded measurements include dry bulb temperature, wet bulb temperature, wind speed, and wind direction. The dataset encompasses a diverse range of variables, including temperature, humidity, wind speed, precipitation, and atmospheric pressure.
The dataset is carefully curated to cover a significant time period, capturing various weather conditions and their impact on visibility. Each observation in the dataset provides detailed information about the climatic indicators at a specific location and time, along with the corresponding measured visibility distance. The dataset's size and quality enable in-depth analysis and modeling, facilitating the development of a robust regression model for predicting visibility distance based on climatic indicators. The dataset's availability and reliability help make the research outcomes accurate and applicable in real-world scenarios.

2.1 Data Description:

The dataset contains the following variables, from which the visibility distance is estimated:

1. VISIBILITY - Distance from which an object can be seen.
2. DRYBULBTEMPF - Dry bulb temperature (degrees Fahrenheit). Most commonly reported standard temperature.
3. WETBULBTEMPF - Wet bulb temperature (degrees Fahrenheit).
4. DewPointTempF - Dew point temperature (degrees Fahrenheit).
5. RelativeHumidity - Relative humidity (percent).
6. WindSpeed - Wind speed (miles per hour).
7. WindDirection - Wind direction from true north using compass directions.
8. StationPressure - Atmospheric pressure (inches of mercury, or "in Hg").
9. SeaLevelPressure - Sea level pressure (in Hg).
10. Precip - Total precipitation in the past hour (in inches).

In addition to the training files, we require a "schema" file from the client that contains all the necessary details about the training files: the file names, the lengths of the date and time values in the file names, the number of columns, the column names, and the column datatypes.

3. METHODOLOGY

To build a regression model for predicting visibility distance based on climatic indicators, a systematic methodology is followed in this research.
The steps involved in the methodology include data collection, data preprocessing, feature selection, model selection, model training, model evaluation, and model tuning.

1. Dataset collection: First, a dataset is collected that contains historical records of climatic indicators such as temperature, humidity, wind speed, precipitation, atmospheric pressure, and corresponding visibility distance measurements. The dataset is carefully curated to ensure an adequate number of observations and a diverse range of climatic conditions.

2. Data validation: Data validation for visibility distance prediction involves identifying and handling outliers, addressing missing data, ensuring data consistency, conducting cross-validation, performing sensitivity analysis, and comparing predictions with ground truth measurements. These steps help ensure the accuracy and reliability of the dataset used for regression modeling. We applied the following validation checks: file name, number of columns, column names, column datatypes, and null values in columns.

3. Data preprocessing: Next, the dataset undergoes preprocessing to handle missing values, outliers, and inconsistencies. Techniques such as imputation, outlier detection, and data normalization or standardization are applied to ensure data quality and uniformity.

4. Feature selection: Feature selection techniques are then employed to identify the most relevant climatic indicators that have a significant impact on visibility distance. Correlation analysis, feature importance ranking, or other statistical methods are used to select the optimal set of features.

5. Model training: Once the features are selected, a suitable regression model is chosen based on the nature of the problem and dataset. Linear regression, decision tree regression, random forest regression, or other regression algorithms are considered. The chosen model is trained using the training set, where it learns the relationship between the climatic indicators and visibility distance.

6. Model evaluation: The trained model is then evaluated using appropriate evaluation metrics such as mean squared error (MSE), mean absolute error (MAE), or R-squared value. This evaluation helps assess the model's performance and identify areas for improvement.

7. Hyperparameter tuning: In the model tuning stage, hyperparameter optimization techniques such as grid search, combined with cross-validation, are employed to fine-tune the model and optimize its performance. By adjusting hyperparameters such as regularization parameters, tree depth, or kernel parameters, the model's predictive capabilities are enhanced.

8. Prediction: The final regression model, after proper training, evaluation, and tuning, can be utilized to predict visibility distance based on given climatic indicators. The methodology ensures a systematic and rigorous approach to building an accurate regression model for visibility distance prediction in diverse weather conditions.

9. Deployment: The deployment of the regression model involves packaging the trained model, implementing software infrastructure, integrating data sources, processing input data, performing model inference, visualizing predictions, monitoring and maintaining the system, ensuring user access and security, considering scalability and integration, and continuously evaluating and improving the model. This process enables the model to be utilized in real-world scenarios for making visibility distance predictions and supporting data-driven decision-making.

10. Monitoring and maintenance: The model's performance is monitored continuously over time and the model is updated as needed. This ensures the model remains accurate and reliable as new data becomes available.

This methodology provides a general framework, and the specific implementation details may vary based on the complexity of the visibility prediction problem, available data, and chosen machine learning algorithms.

4. ARCHITECTURE

Fig -1: Architecture of Project

5. ALGORITHMS

5.1 Clustering Algorithms:

K-means clustering is used in the project as a supportive technique for data exploration, feature engineering, preprocessing, and visualization. It helps identify patterns, group similar data points, handle outliers, and generate cluster features for regression modeling. However, K-means clustering itself does not directly predict visibility distance, and the main prediction task still relies on a regression model trained on the climatic indicators.
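The grouping performed by K-means, and the within-cluster sum of squares (WCSS) that it reduces, can be illustrated with a minimal standard-library sketch. The six (temperature, relative humidity) observations are hypothetical, and using the first K points as initial centroids is an assumption made only to keep the example reproducible; it is not the initialization used in the project.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def kmeans(points, k, iterations=20):
    """Plain K-means. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def wcss(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical observations: (dry bulb temperature in F, relative humidity in %).
points = [(30.0, 90.0), (75.0, 40.0), (32.0, 88.0),
          (78.0, 35.0), (31.0, 92.0), (76.0, 42.0)]

# WCSS for K = 1, 2, 3: the values an elbow plot would display.
wcss_by_k = {}
for k in range(1, 4):
    k_labels, k_centroids = kmeans(points, k)
    wcss_by_k[k] = wcss(points, k_labels, k_centroids)

# Cluster labels for K = 2, usable as an extra feature for a regressor.
labels, centroids = kmeans(points, 2)
print(labels)
print(wcss_by_k)
```

On this toy data the WCSS falls steeply from K = 1 to K = 2 and only slightly from K = 2 to K = 3, which is the kind of bend the elbow method looks for.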
The performance of the K-means clustering algorithm depends on the quality of the clusters it forms. Choosing the right number of clusters, however, can be challenging. Several methods exist for finding the optimal number of clusters; in this paper we concentrate on the elbow method, which is based on the within-cluster sum of squares (WCSS). The steps are explained below:
The WCSS for K clusters is

$$\mathrm{WCSS} = \sum_{k=1}^{K} \sum_{P_i \in \text{Cluster}_k} \operatorname{distance}(P_i, C_k)^2$$

In this formula, the term $\sum_{P_i \in \text{Cluster}_1} \operatorname{distance}(P_i, C_1)^2$ is the sum of the squared distances between each data point $P_i$ in cluster 1 and the centroid $C_1$ of that cluster; the other clusters contribute in the same way. Any distance measure, such as the Euclidean distance or the Manhattan distance, can be used to calculate the distance between the data points and the centroid.

The elbow method determines the ideal number of clusters as follows:

• K-means clustering is carried out on the dataset for various values of K (ranging from 1 to 10).
• The WCSS value is computed for each value of K.
• A curve is plotted of the computed WCSS values against the number of clusters K.
• The point where the curve bends sharply, like the elbow of an arm, is taken as the best value of K.

5.2 XGBoost Algorithms:

XGBoost is an ensemble learning algorithm used to predict visibility distance based on climatic indicators. It combines the predictions of multiple decision tree models to improve accuracy. The algorithm involves preparing the data, training the XGBoost model, tuning hyperparameters for optimal performance, and evaluating the model using metrics like MSE or R-squared. XGBoost is known for its ability to model complex relationships and to handle both numerical and categorical features effectively.

An XGBoost tree for regression may be built using the formulae shown below.

Step 1: Calculate the similarity scores, which guide the growth of the tree.

$$\text{Similarity Score} = \frac{(\text{Sum of residuals})^2}{\text{Number of residuals} + \lambda}$$

Step 2: Determine how to partition the data by calculating the gain.

$$\text{Gain} = \text{Similarity}_{\text{left}} + \text{Similarity}_{\text{right}} - \text{Similarity}_{\text{root}}$$
Step 3: Prune the tree using the user-defined tree-complexity parameter $\gamma$ by computing $\text{Gain} - \gamma$. If the result is positive, do not prune; if it is negative, prune and subtract $\gamma$ from the next Gain value up the tree.

Step 4: For the remaining leaves, determine the output value.

$$\text{Output value} = \frac{\text{Sum of residuals}}{\text{Number of residuals} + \lambda}$$

The loss function can be calculated as:

Fig -2: Regularization graph

5.3 Decision Tree Algorithms:

The decision tree algorithm is used to predict visibility distance by constructing a tree-like structure based on climatic indicators. It selects informative features and determines optimal splits using criteria such as Gini impurity or entropy for classification and mean squared error for regression. Decision trees are interpretable and can handle numerical and categorical features. However, they can overfit the training data, so pruning techniques are employed to enhance generalization. Decision trees are widely used due to their simplicity, interpretability, and ability to capture non-linear relationships.

Decision trees have applications in several industries and can be used for both classification and regression problems. As the name suggests, a decision tree uses a flowchart that resembles a tree structure to represent the predictions that result from a series of feature-based splits. The tree starts at the root node and ends with a decision made at the leaves.
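How a regression tree picks a split can be illustrated with a minimal sketch that tries every threshold on one feature and keeps the one with the lowest weighted mean squared error. The humidity and visibility values are hypothetical, and `best_split` is an illustrative helper, not a function from the project or from any library.

```python
def mse(values):
    """Mean squared error when every item is predicted by the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(feature, target):
    """Return (threshold, weighted_mse) for the best single split on one feature."""
    pairs = sorted(zip(feature, target))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # identical feature values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        # Weight each side's MSE by the number of samples it holds.
        score = (len(left) * mse(left) + len(right) * mse(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical data: relative humidity (%) and visibility (miles).
humidity = [40, 45, 50, 85, 90, 95]
visibility = [10.0, 10.0, 9.0, 3.0, 2.0, 1.0]

threshold, score = best_split(humidity, visibility)
print(threshold, round(score, 4))
```

Here the chosen threshold separates the dry, high-visibility observations from the humid, low-visibility ones; each resulting leaf would predict the mean visibility of its own group.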
The mean squared error measures how far the model's predictions deviate from the actual values.

Root Node: The node from which the population first begins to branch out based on different features.
Decision Nodes: The nodes that result from splitting the root node.
Leaf Nodes: Leaf nodes, or terminal nodes, are the nodes where further splitting is not allowed.
Sub-tree: Just as a small part of a graph is referred to as a sub-graph, a sub-section of the decision tree is known as a sub-tree.
Pruning: Removing certain nodes to prevent overfitting.

Fig -3: Decision Tree

6. MODEL TRAINING

1. Data Export from Db - To be used for model training, data from the stored database is exported as a CSV file.

2. Data Preprocessing
• Remove columns that do not help train the model. These columns were chosen during the EDA.
• Replace invalid values with numpy "nan" so that an imputer can be run on them.
• Check the columns for null values. If present, use the KNN imputer to impute the null values.
• Scale the training and test sets of data independently.

Fig -4: Correlation between the columns
Fig -5: Dropping the columns with high correlation

3. Clustering - The preprocessed data is clustered using the KMeans technique. The elbow plot is used to determine the ideal number of clusters, and the "KneeLocator" function is used to determine the number of clusters dynamically. The idea behind clustering is to train different algorithms on the data in different clusters.

4. Training data in clusters - The KMeans model is trained on the preprocessed data and is then saved for later use in prediction.

5. Model Selection - We choose the best model for each cluster after the clusters have been formed.
We use the "XGBoost Regressor" and the "Decision Tree Regressor" algorithms. For each cluster, both algorithms are run with the best parameters found by GridSearch. The R-squared scores of the two models are computed, and the model with the higher score is chosen. The model is chosen in the same way for every cluster, and the selected model for each cluster is saved for use in prediction.

7. RESULT

Machine learning models analyze historical weather data, such as temperature, humidity, wind speed, and air pollution levels, to estimate visibility conditions. By training on past visibility data and the corresponding weather variables, the model can learn patterns and relationships to predict visibility in the future. The results of such predictions can provide insights into potential changes in visibility due to climate factors, allowing for better planning and mitigation strategies.

The Flask web application provides a URL for the user interface used to predict the visibility distance.

Fig -6: URL generated for predictions
Fig -7: Graphical User Interface

Predictions made through the Flask web application or URL are saved to a .csv file. The Predictions.csv file contains the predictions for different climatological conditions.

Fig -8: Prediction data

8. CONCLUSION

The prediction of visibility distance based on climatic indicators is a significant research area with practical implications in various domains such as transportation, aviation, and safety. Through the analysis of climatic data and the application of regression models, researchers have made notable advancements in understanding the relationship between climatic indicators and visibility distance. The literature survey has provided valuable insights into the selection of relevant indicators, the use of regression algorithms, feature engineering techniques, and model evaluation methods. However, there are still opportunities for further research and improvement.
Future studies can explore advanced machine learning techniques, incorporate additional data sources, consider spatial and temporal variability, and develop real-time prediction systems. Furthermore, the integration of uncertainty estimation, application-specific studies, and benchmarking efforts can enhance the accuracy, reliability, and applicability of visibility distance prediction models. Overall, continued research in this field has the potential to improve safety measures, enhance decision-making processes, and contribute to a better understanding of the impact of climatic indicators on visibility conditions.

9. FUTURE SCOPE

The field of predicting visibility distance based on climatic indicators holds significant potential for future research and development. Advanced machine learning techniques, such as deep learning and reinforcement learning, offer promising avenues for improving prediction accuracy by capturing complex relationships in the data. Additionally, the integration of additional data sources, such as air quality measurements and traffic data, can provide a more comprehensive understanding of visibility conditions. Spatial and temporal analysis can be explored to account for localized variations and capture temporal trends. Hybrid modeling approaches, combining different regression models or ensemble methods, can enhance prediction robustness. Real-time prediction systems that leverage live data streams can be developed for immediate decision-making. Incorporating uncertainty estimation techniques can provide valuable insights for risk assessment. Application-specific studies focused on domains like autonomous driving or aviation can tailor models to specific requirements. Validating and benchmarking models using standardized datasets can establish baselines and enable fair comparisons. By pursuing these future research directions, the field can advance, leading to more accurate and reliable visibility distance prediction models that enhance safety and decision-making in various applications.

10. ACKNOWLEDGEMENT

We would like to acknowledge the contribution of the following people, without whose help and guidance this research would not have been completed. We acknowledge the counsel and support of the faculty of Maharaja Agrasen Institute of Technology for providing us a platform to research the topic "Visibility Prediction in Different Climate", and we would also like to thank our HOD, Dr. Namita Gupta, for giving us the opportunity and time to conduct research on our topic. This acknowledgment would remain incomplete if we failed to express our deep sense of obligation to our guide, Asst. Prof. Dr. Jyoti Kaushik (CSE Department). We are indeed fortunate and proud to have been supervised by her during our research, which would have been difficult without her motivation, constant support, and valuable suggestions. We shall ever remain indebted to our parents and friends for their support and encouragement during the research. It would have been impossible without their cooperation and support.