Forest Fire Prediction Using Machine Learning Techniques
Abstract—Forest fire prediction is a key component of forest fire control. Forest fires are a major environmental problem: they cause ecological destruction in the form of a threatened landscape of natural resources, disrupt the stability of the ecosystem, increase the risk of other natural hazards, and deplete resources such as water, contributing to global warming and water pollution. Fire detection is a key element in controlling such incidents, and prediction of forest fires is expected to reduce their impact in the future. Many fire detection algorithms are available, each taking a different approach to detecting fire. In existing work, the fire-affected region is predicted from satellite images. To predict the occurrence of a forest fire, the proposed system uses meteorological parameters such as temperature, rain, wind and humidity. We use random forest regression with hyperparameter tuning by RandomizedSearchCV; the random forest fits several decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting. The analysis of the models indicates that the selected meteorological parameters can represent forest fire events. This paper presents a comparative study of different models for predicting forest fires: Decision Tree, Random Forest, Support Vector Machine (SVM) and Artificial Neural Network (ANN) algorithms. Hyperparameter tuning with RandomizedSearchCV gives the best results: mean absolute error (MAE) 0.03, mean squared error (MSE) 0.004 and root mean squared error (RMSE) 0.07.

Index Terms—Decision Tree, Random Forest, SVM, ANN, Forest Fire Prediction.

I. INTRODUCTION

Forest fires are a matter of concern because they cause extensive damage to the environment, property and human life. Hence, it is crucial to detect a fire at an early stage. One of the main causes of forest fires is global warming, that is, the increase in the average temperature of the Earth. Other causes are lightning during thunderstorms and human negligence. Each year an average of 1.2 million acres of forest in the U.S. are destroyed by wildfires. In India, forest fires increased by 125% between 2016 and 2018. Nowadays there are numerous fire models that predict the spread of fires, such as physical models and mathematical models. These models rely on data collected during forest fire simulations and on laboratory experiments to specify and predict fire growth in several areas. Recently, simulation tools have been used to predict forest fires; however, simulation tools face issues such as the accuracy of the input data and the execution time of the simulation tool. Machine learning is a sub-branch of artificial intelligence (AI) in which computers learn from data. Machine learning may be divided into three classes: supervised, unsupervised and reinforcement learning. Supervised machine learning algorithms include regression, Support Vector Machine (SVM), Artificial Neural Networks (ANN) and decision trees. In unsupervised learning, the data attributes are not labeled, so the algorithm has to define the labels itself; the structure of the data set and the relationships between the features are learned by the algorithm [1].

• The main motivation for forest fire prediction is to support proper resource allocation and to help the firefighters of the fire management team in the best possible way.
• The main factors behind fires are meteorological conditions. The climatic data are obtained from local sensors installed at the nearest meteorological stations.
• Land with a potentially high fire risk shows many indicators; evaluating them closely can inform the forecast.
• Every year, fire destroys millions of hectares of land. These fires burn vast areas and generate more carbon monoxide than total vehicle traffic.
• Monitoring potential danger areas and giving early warning of fire can greatly reduce response time, as well as the potential for damage and firefighting costs.

II. LITERATURE SURVEY

A. Forest Fire Prediction using Artificial Intelligence
George E. Sakr et al. (2010) proposed a forest fire prediction method based on artificial intelligence. Their forest fire risk forecasting algorithm is built on support vector machines. Data from Lebanon were used to apply the algorithm, which proved able to estimate the risk of fire correctly.

B. Forest Fire Prediction using Image Mining Technique
Divya T L et al. (2015) in their paper have presented, by analysing a series of pixel values, an image mining technique
Fig. 1. Block diagram of Forest Fire Prediction.

The block diagram of the proposed system shows the overall workflow. We gathered the data set, which consists of meteorological data, from Kaggle. We then performed exploratory analysis and pre-processing, in which we remove noisy data and convert categorical data to numerical data so that the dataset is easier to work with. After pre-processing, the hotspot location is identified from the meteorological data available in the data set. The models are then applied to predict the chance of a fire occurring, and a notification is sent to the nearest station.

IV. METHODOLOGY

A. SYSTEM DESIGN
System architecture is a conceptual model that describes the structure, behavior and views of the system. A system architecture may be

Fig. 3. Low Level Diagram of Forest Fire Prediction.

– A few key points regarding detailed design are given below:
∗ The data used in this paper were collected from Kaggle. The dataset contains 517 observations and 13 variables from the Montesinho natural park in Portugal. For each incident, the weekday, month, coordinates and burnt area are registered, along with meteorological data such as rain, temperature, humidity and wind. The program reads this input and develops a regression model using spatial, temporal and weather variables.
∗ After data collection, data pre-processing takes place, in which the dataset is put into a standard format.
∗ After data preparation, a suitable model is selected based on the dataset.
Authorized licensed use limited to: Sri Sai Ram Engineering College. Downloaded on May 12,2024 at 09:18:14 UTC from IEEE Xplore. Restrictions apply.
∗ In this project, the regression techniques used for prediction are Random Forest (RF), Decision Tree (DT), Support Vector Regression (SVR) and Naive Bayes.
∗ After the models are implemented, they are evaluated.
∗ Finally, predictions are made and the accuracy of each model is reported [3].

Fig. 4. Architecture Level Diagram of Forest Fire Prediction.

– A few key points regarding detailed design are given below:
∗ An architectural diagram is a diagram of the system used to explain the overall outline of the software system and the interactions, limits and boundaries between the elements.
∗ After the data are split into training data and test data, pre-processing and feature extraction are applied to the training data, and the data are stored in a database.
∗ The test data are taken from the database and tested, a quantitative analysis is carried out, and the resulting fire prediction is displayed.
This section discusses the development.

1) Exploratory Analysis: One of the best methods used in data science these days is Exploratory Data Analysis. There is a difference between Data Analysis and Exploratory Data Analysis which people often fail to understand in the initial days of their career. Exploratory Data Analysis is a complement to inferential statistics, which tends to be fairly rigid, with rules and formulas. We explore a data set and perform exploratory data analysis on it. To begin, we first import libraries and then define data plotting functions using matplotlib. We then load the Forest Fire dataset and carry out the data exploration process, in which we display the following:
• Head of the dataset: the head function displays only the top 5 records.
• Columns of the dataset: this attribute tells us how many observations and variables the data set has. It is used to check the dimensions of the data. The forest fire data set has 517 observations and 13 variables.
• Null values: displays the null values in the dataset.
• Information of the dataset: gives complete information about the dataset, to help us understand and analyze it.
Now that we can read the data, we can plot graphs to visualize it.
• A distribution graph (histogram/bar graph) was plotted for columns that have between 1 and 50 unique values.
• A correlation matrix was plotted for columns that have more than 1 unique value. With it we can evaluate the dependence between two variables and see how the two variables move together.
This section discussed the exploratory analysis [4].

2) PreProcessing Analysis: When we talk about data, we normally think of large databases with a massive number of rows and columns. Although this is often the case, it is not always so: data may come in many different forms, such as structured tables, images, audio files and videos.
Data preprocessing is the stage in which the data are converted, or encoded, into a state that the computer can readily parse. Preprocessing offers a number of utility functions and transformer classes to convert raw feature vectors into a suitable representation, and to convert raw data into a clean data set (data gathered from different sources arrive in a raw format that is not suitable for analysis).
• A correlation matrix was plotted for the preprocessed data to display the correlation between the meteorological features: relative humidity, wind speed, temperature and rain. It shows the dependence between two variables and measures how they move together.
• From this result, we can see that relative humidity and temperature are negatively correlated.
• If the correlation value is greater than zero, the correlation is positive; if it is less than zero, the correlation is negative. Relative humidity and temperature have a negative correlation value, so one tends to decrease as the other increases.
• We converted the categorical data to numerical data, which is called one-hot encoding.

One-hot encoding: Machine learning models require both input and output variables to be in numeric form. This means that if the data include categorical variables, you must encode them as numbers before you can fit and evaluate the model.
• We converted the categorical month and day data to numerical data as 1 and 0.
• After encoding, we split our dataset in the ratio 75:25 into a training dataset and a testing dataset. Then we
tried fitting Random Forest regression to the dataset and predicting the result [5].
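The preprocessing and modelling steps described above (one-hot encoding of month and day, a 75:25 split, random forest regression, and the RandomizedSearchCV tuning named in the abstract) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the target column name `area` and its formula are hypothetical, and the hyperparameter grid is an arbitrary small example, so the printed error values are unrelated to the results reported in the paper.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (hypothetical values, NOT the paper's dataset).
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "month": rng.choice(["jun", "jul", "aug", "sep"], size=n),
    "day": rng.choice(["mon", "tue", "wed", "thu", "fri", "sat", "sun"], size=n),
    "temp": rng.uniform(5, 33, size=n),
    "RH": rng.uniform(15, 100, size=n),
    "wind": rng.uniform(0.4, 9.4, size=n),
    "rain": rng.uniform(0, 1, size=n),
})
# Hypothetical target: a made-up function of the weather columns plus noise.
df["area"] = (0.05 * df["temp"] - 0.01 * df["RH"] + 0.1 * df["wind"]
              + rng.normal(0, 0.1, size=n))

# One-hot encode the categorical month and day columns as 0/1 indicators.
encoded = pd.get_dummies(df, columns=["month", "day"], dtype=int)
X = encoded.drop(columns="area")
y = encoded["area"]

# 75:25 split into training and testing data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Randomized search over a small, assumed hyperparameter grid.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="neg_mean_absolute_error",
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the best model found on the held-out 25%.
y_pred = search.best_estimator_.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
print(f"best params: {search.best_params_}")
print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}")
```

RandomizedSearchCV samples a fixed number of parameter combinations (`n_iter`) rather than trying every one, which keeps tuning cheap on a dataset of this size; the grid above would need to be widened for a real comparison.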
Exploratory Analysis Results:
1. Distribution graph of the month column versus count
We show distribution graphs for the columns that have between 1 and 50 unique values.
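As a rough illustration of the exploration steps behind these results, the sketch below inspects a small data frame, selects the columns that qualify for a distribution graph, and computes a correlation matrix. It is a sketch under assumptions: the data frame is a synthetic stand-in with hypothetical column names (`month`, `day`, `temp`, `RH`, `wind`, `rain`), deliberately constructed so that `RH` falls as `temp` rises, and plotting is omitted so that the example runs without a display.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (hypothetical, NOT the paper's dataset).
# RH is constructed to decrease as temp increases, purely for illustration.
n = 40
idx = np.arange(n)
temp = np.linspace(5.0, 33.0, n)
df = pd.DataFrame({
    "month": ["jul", "aug", "sep", "aug"] * 10,
    "day": ["fri", "sat", "sun", "mon", "tue"] * 8,
    "temp": temp,
    "RH": 95.0 - 2.0 * temp + 5.0 * np.sin(idx),
    "wind": 4.0 + 2.0 * np.cos(idx),
    "rain": np.zeros(n),
})

print(df.head())          # top 5 records
print(df.shape)           # (observations, variables)
print(df.isnull().sum())  # null values per column
df.info()                 # column types and non-null counts

# Columns with more than 1 and fewer than 50 unique values are the
# candidates for a distribution graph (histogram or bar graph).
n_unique = df.nunique()
dist_cols = [c for c in df.columns if 1 < n_unique[c] < 50]

# The counts a bar graph of the month column would display.
month_counts = df["month"].value_counts()
print(month_counts)

# Correlation matrix over numeric columns with more than 1 unique value.
numeric_cols = df.select_dtypes(include="number").columns
corr_cols = [c for c in numeric_cols if n_unique[c] > 1]
corr = df[corr_cols].corr()
print(corr)
```

The constant `rain` column is filtered out in both steps, because a column with a single unique value has no distribution to plot and no defined correlation.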
5. Scatter and density plots
We show the columns with more than 1 unique value.
Fig. 15. Scatter plot graph

6. Correlation graph
We show the columns with more than 1 unique value.
Fig. 16. Correlation graph

7. Correlation of features
We display the dependency between two variables.
Fig. 17. Correlation graph

8. Comparison of different models
Fig. 18. Distribution graph

Speed) are taken into account. Extreme temperatures, moderate humidity and high wind speeds significantly raise the chance of burning. It is also found that the number of fires in forests is higher than in other surface areas. As the risk of fire increases significantly in forests, data mining techniques are to be used for fire prediction purposes. This project can be extended further so that the models are better equipped and the results improve. We may also build a UI for the application to provide some real-time performance. The workflow of the UI could be as follows: the user enters the locality and the zip code. Using the zip code, we obtain the latitude and longitude from any suitable API and pass the coordinates as parameters to get the weather conditions, such as peak temperature, minimum temperature, humidity and wind speed, for a given day.

REFERENCES
[1] A. Alonso-Betanzos, O. Fontenla-Romero, B. Guijarro-Berdiñas, E. Hernández-Pereira, M. Inmaculada Paz Andrade, E. Jiménez, J. Luis Legido Soto, and T. Carballas, "An intelligent system for forest fire risk prediction and fire fighting management in Galicia," Expert Systems with Applications, vol. 25, no. 4, pp. 545–554, 2003.
[2] N. Aronszajn, Introduction to the theory of Hilbert spaces. Stillwater, Oklahoma: Reasearch [sic] Foundation, 1950.
[3] T. Cheng and J. Wang, "Applications of spatio-temporal data mining and knowledge for forest fire," in ISPRS Technical Commission VII Mid Term Symposium, Enschede, 2006, pp. 148–153.
[4] "Integrated Spatio-temporal Data Mining for Forest Fire Prediction," Transactions in GIS, vol. 12, no. 5, pp. 591–611, 2008.
[5] K. Clarke, J. Brass, and P. Riggan, "A cellular automaton model of wildfire propagation and extinction," Photogrammetric Engineering and Remote Sensing, vol. 60, no. 11, pp. 1355–1367, 1994.
[6] Z. Li, Y. Kaufman, C. Ithoku, R. Fraser, A. Trishchenko, L. Giglio, J. Jin, and X. Yu, "A review of AVHRR-based active fire detection algorithms: Principles, limitations, and recommendations," Global and Regional Vegetation Fire Monitoring from Space.
[7] J. Han, K. Ryu, K. Chi, and Y. Yeon, "Statistics Based Predictive Geo-spatial Data Mining: Forest Fire Hazardous Area Mapping Application," Lecture Notes in Computer Science, pp. 370–381, 2003.
[8] R. Jaiswal, M. Saumitra, D. Kumaran, and S. Rajesh, "Forest fire risk zone mapping from satellite imagery and GIS," International Journal of Applied Earth Observation and Geo-information, vol. 4, pp. 1–10, 2002.
[9] G. Mitri and I. Gitas, "A semi-automated object-oriented model for burned area mapping in the Mediterranean region using Landsat-TM imagery," International Journal of Wildland Fire, vol. 13, no. 3, pp. 367–376, 2004.
[10] A. Muzy, T. Marcelli, A. Aiello, P. Santoni, J. Santucci, and J. Balbi, "An object oriented environment applied to a semi-physical model of fire spread across a fuel bed," in Actes de la conférence ESS 2001 conference, 2001, pp. 641–643.