INTERDISCIPLINARY PROJECT REPORT

at
Sathyabama Institute of Science and Technology
(Deemed to be University)

Submitted in partial fulfillment of the requirements for the award of

Bachelor of Engineering Degree in Computer Science and Engineering

By
LEVAKU VENKATA KOWSHIK REDDY
(Reg No : 40731054)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF COMPUTING

SATHYABAMA INSTITUTE OF SCIENCE AND TECHNOLOGY

JEPPIAAR NAGAR, RAJIV GANDHI SALAI,
CHENNAI – 600119, TAMILNADU

APRIL 2023
SATHYABAMA
INSTITUTE OF SCIENCE AND TECHNOLOGY
(DEEMED TO BE UNIVERSITY)
Accredited with Grade “A” by NAAC
(Established under Section 3 of UGC Act, 1956)
JEPPIAAR NAGAR, RAJIV GANDHI SALAI, CHENNAI– 600119
www.sathyabama.ac.in

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

BONAFIDE CERTIFICATE

This is to certify that this Project Report is the bonafide work of LEVAKU
VENKATA KOWSHIK REDDY (Reg No : 40731054) who carried out the project
entitled “Early risk prediction of forest fires based on Machine Learning
Algorithm” under my supervision from Feb 2023 to April 2023.

Internal Guide

Ms. G.Anbu Selvi M.Tech.,(Ph.D)

Head of the Department

Dr. S. Vigneshwari, M.E., Ph.D

Submitted for Viva voce Examination held on 17-04-2023

Internal Examiner External Examiner

DECLARATION

I, LEVAKU VENKATA KOWSHIK REDDY (40731054), hereby declare that the Project Report
entitled “Early risk prediction of forest fires based on Machine Learning Algorithm”,
done by me under the guidance of Ms. G. Anbu Selvi, M.Tech., (Ph.D), is submitted in partial
fulfillment of the requirements for the award of Bachelor of Engineering degree in
Computer Science and Engineering.

DATE: 17-04-2023

PLACE: Chennai SIGNATURE OF THE CANDIDATE

ACKNOWLEDGEMENT

I am pleased to acknowledge my sincere thanks to the Board of Management of
SATHYABAMA for their kind encouragement in doing this project and for completing it
successfully. I am grateful to them.

I convey my thanks to Dr. T. Sasikala, M.E., Ph.D., Dean, School of Computing, and
Dr. S. Vigneshwari, M.E., Ph.D., Head of the Department of Computer Science and
Engineering, for providing me the necessary support and details at the right time during the
progressive reviews.

I would like to express my sincere and deep sense of gratitude to my Project Guide,
Ms. G. Anbu Selvi, M.Tech., (Ph.D), whose valuable guidance, suggestions and constant
encouragement paved the way for the successful completion of my project work.

I wish to express my thanks to all Teaching and Non-teaching staff members of the
Department of Computer Science and Engineering who were helpful in many ways for
the completion of the project.
TRAINING CERTIFICATE
ABSTRACT

Algerian forest fires have become a major environmental and socio-economic
concern due to their devastating impacts on forests, ecosystems,
and human lives. Accurately predicting forest fires is critical for timely
detection, management, and prevention of these disasters. Machine
learning algorithms have shown promising results in predicting forest fires,
and their potential for use in Algeria is currently being investigated. This
study aims to explore the use of machine learning algorithms for predicting
forest fires in Algeria by analyzing historical data on weather, vegetation,
and forest fires. The research will focus on building predictive models using
machine learning algorithms such as Random Forest, Support Vector
Machines, and Artificial Neural Networks. The models will be trained and
tested on a dataset of forest fire incidents in Algeria to evaluate their
accuracy and performance. The results will provide insights into the
effectiveness of machine learning algorithms in predicting forest fires in
Algeria and their potential use in the country's fire management strategies.
This study will contribute to the development of early warning systems for
forest fires in Algeria, which could help minimize the devastating impacts of
these disasters on the environment and society.

TABLE OF CONTENTS

CHAPTER No    TITLE

              ABSTRACT
              LIST OF FIGURES

1             INTRODUCTION
1.1           ESTIMATING FOREST FIRES
1.1.1         TYPES OF FOREST FIRES
1.2           ESTIMATING FOREST FIRES DATASET INFORMATION
1.3           COMMON MACHINE LEARNING ALGORITHMS AND GOALS

2             AIM AND SCOPE OF THE PRESENT INVESTIGATION
2.1           AIM
2.2           SCOPE
2.3           DATA PREPARATION
2.4           READING THE DATASET

3             EXPERIMENTAL OR MATERIALS AND METHODS, ALGORITHMS USED
3.1           TYPES OF CLASSIFICATION ALGORITHMS USED
3.2           DECISION TREE ALGORITHM
3.3           RANDOM FORESTS ALGORITHM
3.4           MACHINE LEARNING LIBRARIES
3.5           IMPORTED LIBRARIES
3.6           IMPLEMENTATION OF DECISION TREE ALGORITHM
3.7           IMPLEMENTATION OF RANDOM FOREST ALGORITHM

4             RESULTS AND DISCUSSION, PERFORMANCE ANALYSIS
4.1           MODEL ANALYSIS

5             SUMMARY AND CONCLUSIONS
5.1           SUMMARY AND CONCLUSIONS

              REFERENCES
              APPENDIX
              SCREENSHOTS AND OUTPUTS

LIST OF FIGURES

FIGURE NO     FIGURE NAME

1.1.1         Types of forest fires
1.2           Excel sheet of the given estimated forest fires predictions dataset
1.3.1         Types of machine learning with field of use
1.3.2         Linear Regression
1.3.3         Naive Bayes Classifier
1.3.4         Logistic Regression
1.3.5         Decision Tree
1.3.6         Random Forest
1.3.7         K Means Clustering
2.4.1         Syntax for reading dataset
2.4.2         Imported estimated forest fires dataset in jupyter notebook
2.4.3         Using isnull() function checking for null values
3.2.1         Decision Tree Algorithm
3.3.1         Random Forests Algorithm
3.4.1         Various python libraries for machine learning
3.5.1         Pandas library is used to read the data
3.5.2         Imported libraries in jupyter notebook
3.6.1         Given dataset
3.6.2         Applying decision tree algorithm
4.1.1         Accuracy, precision, recall, f1-score for decision tree algorithm
4.1.2         Accuracy of Random forest algorithm

CHAPTER-1
INTRODUCTION

1.1 ESTIMATING FOREST FIRES

Estimating forest fires is a crucial aspect of fire management, as it helps identify the size,
intensity, and spread of the fire. One of the primary methods used for estimating forest
fires is remote sensing, which involves using satellite data to detect heat signatures and
smoke plumes. This data is processed to create maps and images that provide insights
into the location and intensity of the fire. Other methods used for estimating forest fires
include ground-based observations, aerial surveys, and fire behavior modeling. These
methods provide additional information on fire behavior, fuel types, and weather
conditions that can help in predicting the fire's future behavior and potential impact.

Accurate estimation of forest fires is essential for timely and effective fire management.
Early detection of forest fires and accurate estimation of their size and intensity can help
fire managers make informed decisions about resource allocation and prioritization of
firefighting efforts. It can also help in the evacuation of affected communities and the
implementation of preventative measures to minimize the fire's impact. Therefore,
ongoing research on improving estimation methods and developing more advanced
technologies, such as machine learning algorithms, is critical for enhancing forest fire
management and reducing the negative impacts of these disasters.

1.1.1 TYPES OF FOREST FIRES

There are three basic types of wildfires:

• Crown Fires
• Surface Fires
• Ground Fires

Fig: 1.1.1 Types of Forest Fires

1.2 ESTIMATING FOREST FIRES DATASET INFORMATION

There are 14 attributes in total that describe the conditions likely to determine forest
fires: day, month, year, temperature, relative humidity, wind speed, rain, Fine Fuel
Moisture Code, Duff Moisture Code, Drought Code, Initial Spread Index, Buildup Index,
Fire Weather Index, and output.

Fig 1.2- Excel sheet of the given estimated forest fires predictions dataset

1.3 COMMON MACHINE LEARNING ALGORITHMS AND GOALS

There are three types of machine learning algorithms that are widely used.
They are:

• Supervised Learning
• Reinforcement Learning
• Unsupervised Learning

Supervised Learning:

Supervised learning is a machine learning paradigm for problems where the available
data consists of labelled examples, meaning that each data point contains features and
an associated label. The trained model is used to predict the label of new observations
from their features. Depending on the characteristics of the target variable, the task is
either classification (discrete variable) or regression (continuous variable).

Unsupervised Learning:

Unsupervised learning is a machine learning paradigm for problems where the available
data consists of unlabelled examples, meaning that each data point contains features
only, without an associated label. The goal of unsupervised learning algorithms is to learn
useful patterns or structural properties of the data. It finds the structures in unlabelled
data.

Reinforcement Learning:

Reinforcement learning is an area of machine learning concerned with how intelligent
agents ought to take actions in an environment in order to maximize the notion of
cumulative reward. It works on the action-reward principle. An agent learns to reach
the goal by continuously calculating the rewards that it gains from its actions.

Fig 1.3.1 -Types of machine learning with field of use

ALGORITHMS

1. Linear Regression:
Linear regression is a supervised machine learning algorithm in which the
predicted output is continuous and has a constant slope. It predicts values
within a continuous range instead of trying to classify them into categories. It is used to
predict the value of one variable based on the value of another variable. To choose
this algorithm, there needs to be a linear relationship between the independent and target
variables. The scatter plot shows a positive correlation between an independent
variable (x-axis) and a dependent variable (y-axis).
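
As a minimal sketch of the idea, the least-squares line through a handful of points can be computed directly. The function name fit_line and the data points are made up for illustration and are not part of the project code.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted y for a new x = 5
```

Because the output is a number on a continuous scale, this kind of model suits regression targets rather than class labels such as the fire / no-fire output used in this project.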

Fig 1.3.2-Linear Regression

2. Naive Bayes classifiers:

The Naive Bayes classifier is a supervised machine learning model for constructing
classifiers that assign class labels to problem instances, represented as
vectors of feature values, where the class labels are drawn from some finite set. It
assumes that the features are independent of each other and that there is no correlation
between them. Because of this simplifying assumption of uncorrelated features, it gets its
name, Naive Bayes.

Fig 1.3.3-Naive Bayes Classifier
Where,

p(A|B): Probability of event A given event B has already occurred

p(B|A): Probability of event B given event A has already occurred

p(A): Probability of event A

p(B): Probability of event B
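
The four quantities defined above are related by Bayes' theorem, which is the rule a Naive Bayes classifier applies. It is written out here as a reminder, using the same symbols:

```latex
p(A \mid B) = \frac{p(B \mid A)\, p(A)}{p(B)}
```

In classification, A plays the role of the class label and B the observed feature values, so the classifier picks the label with the largest p(A|B).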

3. Logistic Regression:

Logistic Regression is a supervised learning algorithm which is mostly used in binary
classification problems. Although the word "regression" seems to contradict classification,
the emphasis here is on "logistic", which refers to the logistic function that performs the
classification task. It is a simple but effective classification algorithm. The logistic
function is also called the sigmoid function.

Logistic regression takes a linear equation as input and uses the sigmoid function and log
odds to perform binary classification. As a result, an S-shaped (sigmoid) curve is obtained
as the output.
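
A minimal sketch of that computation follows. The weights, bias, and input values are hypothetical and chosen only to show how a linear score is turned into a probability and then into a class.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))  # linear equation
    return sigmoid(z)


# Hypothetical weights and inputs, for illustration only
weights = [0.8, -0.5]
bias = 0.1
p = predict_probability([2.0, 1.0], weights, bias)
label = 1 if p >= 0.5 else 0  # threshold the probability to get a class
```

The 0.5 threshold is a common default; it corresponds to a linear score of zero, where the log odds of the two classes are equal.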

Fig 1.3.4 – Logistic Regression

4. Decision Trees:

Decision trees are a non-parametric supervised learning method used for
classification and regression. The goal is to create a model that predicts the value of a
target variable by learning simple decision rules inferred from the data features. A tree
can be seen as a piecewise constant approximation. A fully grown tree can achieve high
accuracy on the training set but perform poorly on new data. The depth of the tree is
controlled by the max_depth parameter of the decision tree algorithm in scikit-learn.

Fig 1.3.5-Decision Tree

5. Random Forest:

Random forests, or random decision forests, are an ensemble learning method for
classification, regression and other tasks that operates by constructing a multitude of
decision trees at training time. For classification tasks, the output of the random forest
is the class selected by the most trees.

Fig1.3.6-Random Forest

6. K Means Clustering:

K-means clustering is a method of vector quantization, originally from signal
processing, that aims to partition n observations into k clusters in which each
observation belongs to the cluster with the nearest mean, which serves as a prototype of the
cluster. It is a way to group a set of data points together by looking for
similarities and dissimilarities among them. It is an unsupervised learning method, so there
is no label associated with the data points; the algorithm tries to find the underlying
structure of the data. Clustering is not classification.
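
The assign-then-update loop behind k-means can be sketched on one-dimensional data. The function kmeans_1d, the starting centroids, and the points are illustrative assumptions, not code from this project.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data with two obvious groups
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

No labels are passed in at any point, which is what separates this from the classifiers used later in the report.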

Fig 1.3.7-K Means Clustering

CHAPTER-2

AIM AND SCOPE OF PRESENT INVESTIGATION

2.1 AIM:
To predict the forest fires from the given forest fires dataset.

2.2 SCOPE:
There is significant scope for predicting forest fires using advanced technologies such
as satellite imagery, artificial intelligence (AI), and machine learning (ML) algorithms.
These tools can help monitor and analyze various environmental factors that contribute
to the risk of forest fires, such as temperature, humidity, wind speed and direction,
vegetation moisture content, and topography. Satellite imagery, in particular, can provide
valuable information about the extent and severity of forest fires, as well as the location
and distribution of smoke and other airborne pollutants. AI and ML algorithms can also
be used to analyze historical data and real-time environmental data to identify patterns
and predict the likelihood of a forest fire in a particular region.
In the given dataset,
• Data set characteristics: Multivariate
• Attribute characteristics: Categorical, Integer
• Associated task: Classification
• No. of instances: 122
• No. of attributes: 14

2.3 DATA PREPARATION:

In this project, Python serves as the key tool for carrying out the machine learning
algorithms. Anaconda Navigator, a desktop Graphical User Interface (GUI), is used to
launch Jupyter Notebook, a web-based interactive application that allows editing and
running notebook documents in a web browser. It is a powerful tool for interactively
developing and presenting machine learning and data science projects. After importing
the required libraries, the dataset is read into the notebook as a DataFrame (a
two-dimensional labeled data structure with columns of potentially different types)
using read_csv, which reads a CSV file.

2.4 READING THE DATASET:

Fig 2.4.1 -syntax for reading the dataset

A provided dataset is not always complete. In that case we need to prepare the data so
that missing or invalid entries are handled in a way the model can interpret.

Fig 2.4.2- imported estimated forest fires dataset in jupyter notebook

Check for null values using the isnull() function.

Fig 2.4.3 -using isnull()function checking for null values
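
The reading and null-check steps can be sketched as follows. So that the example is self-contained, it reads a three-row excerpt from a string rather than the project's AFFA.csv. The rows, the 0/1 coding of Output, and the column names day and Rain are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical three-row excerpt; the project itself reads "AFFA.csv"
csv_text = """day,month,Temperature,Rain,Output
1,6,29,0.0,0
2,6,,1.3,0
3,6,26,,1
"""
forest_data = pd.read_csv(io.StringIO(csv_text))

null_mask = forest_data.isnull()   # True wherever a value is missing
null_counts = null_mask.sum()      # number of missing values per column
rows_with_nulls = int(null_mask.any(axis=1).sum())
print(null_counts)
```

If every count is zero the dataset can be used as it is; otherwise the affected rows need to be dropped or filled before training.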

CHAPTER 3
EXPERIMENTAL OR MATERIAL AND METHODS
ALGORITHMS USED

The given dataset poses a classification problem, so we used classification
algorithms to make predictions and measured their accuracy.

3.1 TYPES OF CLASSIFICATION ALGORITHMS USED:

1. Decision Tree Algorithm
2. Random Forests Algorithm

3.2 DECISION TREE ALGORITHM:

A decision tree is a decision support tool that uses a tree-like model of decisions and
their possible consequences, including chance event outcomes, resource costs, and
utility. It is one way to display an algorithm that only contains conditional control
statements.

The goal is to create a model that predicts the value of a target variable by learning
simple decision rules inferred from the data features. A tree can be seen as a piecewise
constant approximation.

For instance, decision trees learn from data to approximate a sine curve with a set of
if-then-else decision rules. The deeper the tree, the more complex the decision rules
and the more closely the model fits the training data.

Some of the advantages of decision trees are that they are simple to understand and to
interpret and that they require very little data preparation, whereas other methods
often require data normalization, dummy variables, etc.

DecisionTreeClassifier is a class capable of performing multi-class classification on a
dataset. As with other classifiers, DecisionTreeClassifier takes as input two arrays:
an array X, sparse or dense, of shape (n_samples, n_features) holding the training
samples, and an array Y of integer values, of shape (n_samples,), holding the class
labels for the training samples.
Why use Decision Trees?

• Decision Trees usually mimic human thinking ability while making a decision,
so they are easy to understand.

• The logic behind the decision tree can be easily understood because it
shows a tree-like structure.

Fig 3.2.1-Decision Tree Algorithm

Decision Tree Terminologies

Root Node: The root node is where the decision tree starts. It represents the entire
dataset, which further gets divided into two or more homogeneous sets.

Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split
further after reaching a leaf node.

Splitting: Splitting is the process of dividing a decision node or the root node into
sub-nodes according to the given conditions.

Parent/Child node: A node that is split into sub-nodes is called the parent of those
sub-nodes, and the sub-nodes are called its child nodes. The root node is the topmost
parent.

How does the Decision Tree algorithm Work?

In a decision tree, to predict the class of a given record, the algorithm starts from
the root node of the tree. It compares the value of the root attribute with the
record's attribute value and, based on the comparison, follows the branch and
jumps to the next node. At the next node, the algorithm again compares the attribute
value with those of the sub-nodes and moves further. It continues the process until it
reaches a leaf node of the tree. The complete process can be better understood using
the algorithm below:

• Step-1: Begin the tree with the root node, say S, which contains the
complete dataset.
• Step-2: Find the best attribute in the dataset using an Attribute Selection
Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values for the
best attribute.
• Step-4: Generate the decision tree node, which contains the best attribute.
• Step-5: Recursively make new decision trees using the subsets of the
dataset created in Step-3. Continue this process until a stage is reached
where the nodes cannot be classified further; such a final node is called a
leaf node.
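
Step-2 can be sketched with Gini impurity as the attribute selection measure. Gini is also the criterion passed to one of the classifiers in the source code. The helper names and the six temperature readings below are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0 for a pure node, larger for mixed nodes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Try each midpoint between sorted distinct values.

    Returns (threshold, weighted impurity) for the split with the lowest
    weighted impurity, or (None, impurity of the node) if no split helps.
    """
    distinct = sorted(set(values))
    best = (None, gini(labels))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical readings: temperature and whether a fire occurred (1) or not (0)
temperature = [22, 25, 27, 31, 33, 35]
fire = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(temperature, fire)
```

A full tree builder would run this search over every attribute, split on the best one, and recurse on the two subsets as described in Step-5.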

Example: Suppose there is a candidate who has a job offer and wants to decide
whether to accept the offer or not. To solve this problem, the decision tree
starts with the root node (Salary attribute, chosen by ASM). The root node splits further
into the next decision node (distance from the office) and one leaf node based on the
corresponding labels. The next decision node further gets split into one decision node
(Cab facility) and one leaf node.

Finally, the decision node splits into two leaf nodes (Accepted offer and Declined offer).

3.3 RANDOM FORESTS:
Random forests, or random decision forests, are an ensemble learning method for
classification, regression and other tasks that operates by constructing a multitude of
decision trees at training time. For classification tasks, the output of the random forest
is the class selected by most trees. It is a supervised machine learning
algorithm made up of decision trees, which are used in classification and regression. It
plays a major role in our model. It is called a "forest" because it grows a forest of decision
trees. The predictions from these trees are then merged to give more accurate
predictions. While a single decision tree has one outcome and a narrow range of groups,
the forest gives a more accurate result with a larger number of groups and decisions.
It has the added benefit of adding randomness to the model by finding the best feature
among a random subset of features. Overall, these benefits create a model with wide
diversity, which many data scientists favor.

Fig 3.3.1 -Random Forest Algorithm

Assumptions for Random Forest
Since the random forest combines multiple trees to predict the class of the dataset, it
is possible that some decision trees may predict the correct output, while others may
not. But together, the trees are expected to predict the correct output. Therefore, below
are two assumptions for a better Random Forest classifier:

• There should be some actual values in the feature variables of the dataset so that the
classifier can predict accurate results rather than a guessed result.
• The predictions from each tree must have very low correlations.

Why use Random Forest?

Below are some points that explain why we should use the Random Forest algorithm:

• It often takes less training time than other algorithms.
• It predicts output with high accuracy, and it runs efficiently even on large
datasets.
• It can also maintain accuracy when a large proportion of data is missing.

How does Random Forest algorithm work?

Random Forest works in two phases: the first is to create the random forest by combining N
decision trees, and the second is to make predictions with each tree created in the first
phase. The working process can be explained in the steps below:

Step-1: Select K random data points from the training set.

Step-2: Build the decision trees associated with the selected data points.

Step-3: Choose the number N of decision trees that you want to build.

Step-4: Repeat Steps 1 & 2.

Step-5: For new data points, find the predictions of each decision tree, and assign
the new data points to the category that wins the majority of votes.
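
The five steps above can be sketched in plain Python. To keep the example short, each "tree" is reduced to a single-threshold stump on one feature and the random feature subsets of a real random forest are omitted. All names and the six data rows are illustrative assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(rows, k, rng):
    """Step 1: draw k rows at random, with replacement."""
    return [rng.choice(rows) for _ in range(k)]


def train_stump(rows):
    """Step 2 (simplified): a one-split 'tree' on a single feature.

    The threshold is the midpoint between the two class means, and class 1
    is assumed to have the larger feature values.
    """
    zeros = [x for x, y in rows if y == 0]
    ones = [x for x, y in rows if y == 1]
    if not zeros or not ones:  # the sample happened to contain one class only
        only = 0 if zeros else 1
        return lambda x: only
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0


def majority_vote(predictions):
    """Step 5: the class predicted by the most trees wins."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
# Hypothetical rows of (temperature, fire)
rows = [(22, 0), (25, 0), (27, 0), (31, 1), (33, 1), (35, 1)]
n_trees = 25  # Step 3: choose N
# Step 4: repeat Steps 1 and 2 for each tree
forest = [train_stump(bootstrap_sample(rows, len(rows), rng)) for _ in range(n_trees)]

prediction_hot = majority_vote([tree(40) for tree in forest])
prediction_cool = majority_vote([tree(15) for tree in forest])
```

Individual stumps differ because each sees a different bootstrap sample, and the vote smooths out those differences.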
Applications of Random Forest:

There are mainly four sectors where Random Forest is mostly used:

1. Banking: The banking sector mostly uses this algorithm for the identification of loan risk.

2. Medicine: With the help of this algorithm, disease trends and risks of the disease
can be identified.

3. Land Use: We can identify the areas of similar land use by this algorithm.

4. Marketing: Marketing trends can be identified using this algorithm.

Advantages of Random Forest:

• Random Forest is capable of performing both classification and regression tasks.
• It is capable of handling large datasets with high dimensionality.
• It enhances the accuracy of the model and reduces the risk of overfitting.

Disadvantages of Random Forest:

Although random forest can be used for both classification and regression tasks, it is
less suitable for regression tasks.

3.4 MACHINE LEARNING LIBRARIES:

Libraries are collections of prewritten code that users can use to simplify tasks. In this
project, Python is used as the implementation tool, as it has a larger collection of
libraries than most other programming languages. More than 60% of machine learning
developers use Python, as it is easy to learn. Since Python has a comparatively large
collection of libraries, let us look at the libraries that came in handy for our dataset.

Fig 3.4.1-Various python libraries for machine learning

1. Sklearn:

Sklearn stands for scikit-learn, a machine learning library. It provides various
classification, regression and clustering algorithms, including k-means, random forest,
support vector machines, gradient boosting and DBSCAN, and it is built on the NumPy
and SciPy libraries.

From the sklearn.tree module we import DecisionTreeClassifier, a class capable of
performing multi-class classification on a dataset. As with other classifiers,
DecisionTreeClassifier takes two arrays as input: an array X, sparse or dense, of shape
(n_samples, n_features) holding the training samples, and an array Y of integer values,
of shape (n_samples,), holding the class labels for the training samples.

From sklearn.model_selection we import train_test_split, which is used to split the data
for training and testing the model. Evaluating on held-out data shows how well the model
handles new data, and selecting a proper model allows it to generate accurate results
when making predictions. To proceed, we train the model on one part of the dataset and
test it against another part. By default, train_test_split makes a random partition into
two subsets. We can also specify a random state so that the split is reproducible. We
pass the dataset to be split together with the size to allocate to the train or test subset.
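
The size argument deserves a short illustration, because train_test_split treats a float and an integer test_size differently. The arrays below are hypothetical stand-in data and not the forest fires dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 10 samples, 2 features, binary labels
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# A float test_size is a fraction of the data: 0.3 of 10 samples gives 3 test rows
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=100
)

# An int test_size is an absolute count: exactly 4 test rows
X_train_n, X_test_n, y_train_n, y_test_n = train_test_split(
    X, y, test_size=4, random_state=100
)
```

Fixing random_state makes the same rows land in the test set on every run, which keeps reported accuracies comparable between runs.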
2. Math:

Math is a built-in module that you can use for mathematical tasks. It has a set of methods
and constants. It is a standard module in Python and is always available.

3. Seaborn:

Seaborn is a library built on top of matplotlib. It is used for data visualization and
exploratory data analysis. It works easily with DataFrames and the pandas library, and
it is closely integrated with the data structures from pandas. The graphs created can
also be customized easily. It provides default styles and color palettes to make
statistical plots more attractive.

4. Matplotlib.pyplot:

Matplotlib.pyplot is a state-based interface to matplotlib. It provides a MATLAB-like way
of plotting. Each pyplot function makes some change to a figure.

3.5 IMPORTED LIBRARIES

1. Pandas:
Pandas is a widely used data analysis and manipulation library for Python. It provides
a lot of functions and methods that expedite the data analysis and preprocessing steps.
It also provides fast, flexible and expressive data structures designed to make working
with relational or labeled data both easy and intuitive. It is considered a fundamental
high-level building block for performing practical, real-world data analysis in Python.
Its main data structures for analysis are DataFrame and Series.

Fig 3.5.1- Pandas library is used to read the dataset

2. Numpy:
Numpy stands for Numerical Python. It is a library consisting of multidimensional array
objects and a large collection of routines for processing those arrays. Using it,
mathematical and logical operations on arrays can be performed. The difference between
Numpy and Pandas is that Numpy works on numerical data whereas Pandas works on tabular data.

Fig 3.5.2 – Imported libraries in jupyter notebook

3.6 IMPLEMENTATION OF DECISION TREE ALGORITHM

Decision trees are a non-parametric supervised learning method used for classification
and regression. The goal is to create a model that predicts the value of a target variable
by learning simple decision rules inferred from the data features. A tree can be seen
as a piecewise constant approximation. A fully grown tree can achieve high accuracy on the
training set but perform poorly on new data. The depth of the tree is controlled by the
max_depth parameter of the decision tree algorithm in scikit-learn. For instance, decision
trees learn from data to approximate a sine curve with a set of if-then-else decision rules.
The deeper the tree, the more complex the decision rules and the more closely the model fits
the training data. Some of the advantages of decision trees are that they are simple to
understand and to interpret and that they require very little data preparation, whereas
other methods often require data normalization, dummy variables, etc.

1. Import the packages and classes you need.
2. Provide data to work with and, if needed, apply appropriate transformations.
3. Create a classification model and fit it with existing data.
4. Check the results of model fitting to know whether the model is satisfactory.
5. Apply the model for predictions.

Fig 3.6.1-Given dataset

Fig 3.6.2-Applying decision tree algorithm

3.7 IMPLEMENTATION OF RANDOM FORESTS ALGORITHM:
Random forests, or random decision forests, are an ensemble learning method for
classification, regression and other tasks that operates by constructing a multitude of
decision trees at training time. For classification tasks, the output of the random forest
is the class selected by most trees. It is a supervised machine learning
algorithm made up of decision trees, which are used in classification and regression. It
plays a major role in our model. A forest is made up of trees, and more
trees mean a more robust forest. Similarly, the random forest algorithm creates decision
trees on data samples, gets the prediction from each of them, and finally
selects the best solution by means of voting. It is an ensemble method which is better
than a single decision tree because it reduces over-fitting by combining the results
of many trees.

Fig 3.7.1 Applying random forest algorithm

CHAPTER 4

RESULTS AND DISCUSSION, PERFORMANCE ANALYSIS

4.1 MODEL ANALYSIS

The above algorithms are written in Python with the help of numpy and executed using
Jupyter Notebook. The accuracy of the decision tree algorithm when executed on the
given dataset, i.e. the Forest Fires prediction dataset, is 100.00, and the accuracy of
the random forest is also 100.00.

The dataset contains 122 samples with 14 attributes (day, month, year,
temperature, relative humidity, wind speed, rain, Fine Fuel Moisture Code, Duff Moisture
Code, Drought Code, Initial Spread Index, Buildup Index, Fire Weather Index, output).
Our task is to determine the accuracy of the decision tree and random forest algorithms
when trained with the provided dataset.

ACCURACY, PRECISION, RECALL, F1-SCORE

Fig 4.1.1 Accuracy,precision,recall,f1-score for decision tree algorithm
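
For reference, the four metrics named above can be computed from the counts of a binary confusion matrix. This is a minimal sketch with made-up labels, and the values it produces are not results from this project.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels coded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = fire, 0 = no fire
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Precision penalizes false alarms and recall penalizes missed fires, so reporting both alongside accuracy gives a fuller picture than accuracy alone.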

Fig 4.1.2 Accuracy of Random forest algorithm

CHAPTER 5
5.1 SUMMARY AND CONCLUSIONS

Our goal was to predict forest fire risk accurately using Random Forest and Decision Tree
algorithms. We trained both models and tested their accuracy. The accuracy reached
100 for the decision tree classifier and 100 for the random forest classifier. While the
accuracy is high, the model is not substantially overfitted, as the cross-validation scores
for each of the ensemble methods all differed from the model accuracy by less than 1%. We
also evaluated the F1-score, precision, and recall for these classifiers.
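
A comparison between cross-validation scores and held-out accuracy can be sketched with scikit-learn's cross_val_score. The data here is synthetic, generated with make_classification only to make the example runnable, so the numbers it yields say nothing about the forest fires dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in with the same row count as the dataset (122 rows, 10 features)
X, y = make_classification(n_samples=122, n_features=10, random_state=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=100
)

model = RandomForestClassifier(n_estimators=100, random_state=100)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)  # one accuracy per fold

model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
gap = abs(cv_scores.mean() - test_accuracy)  # a small gap suggests little overfitting
```

A large gap between the mean fold accuracy and the held-out accuracy would be a sign that the reported accuracy depends on one particular split.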

REFERENCES

[1] Reference code for forest fires

https://www.kaggle.com/datasets/elikplim/forest-fires-data-set/code

[2] Dataset

https://drive.google.com/drive/folders/1xf6uwCPMXsHxHhpax8agm4dkJQWNavDE

[3] Dataset Information

https://archive.ics.uci.edu/ml/datasets/Algerian+Forest+Fires+Dataset++

WORKING ENVIRONMENT
ANACONDA NAVIGATOR is desktop GUI used to launch applications and also manage
packages in one place.

CODING ENVIRONMENT

Jupyter notebook from the anaconda navigator is launched along with all the
preinstalled packages for python.

SCREENSHOTS AND OUTPUTS

SOURCE CODE

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import pyplot as plt
import sklearn
from sklearn import tree
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
from sklearn.model_selection import train_test_split
forest_data = pd.read_csv("AFFA.csv")
forest_data = pd.DataFrame(forest_data)
forest_data
forest_data.shape
forest_data.head()
forest_data = forest_data.drop(['year'], axis=1)
forest_data.shape
forest_data.head()
# Feature matrix (10 columns) and target labels
X = np.asarray(forest_data[['month', 'Temperature', 'Relative_Humidity', 'Wind_Speed',
                            'Fine_Fuel_Moisture_Code', 'Druff_Moisture_Code', 'Drought_Code',
                            'Initial_Spread_Index', 'Buildup_Index', 'Fire_Weather_Index']])
Y = np.asarray(forest_data['Output'])

# Hold out 30% of the rows for testing (a float test_size is a fraction of the
# data; an integer would be read as an absolute number of test samples)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 100)

# Decision tree using the Gini impurity criterion
gini_classifier = DecisionTreeClassifier(criterion = "gini", random_state = 100,
                                         max_depth=None, min_samples_leaf=5)
gini_classifier.fit(X_train, y_train)
y_pred = gini_classifier.predict(X_test)
print("Predicted Values using Gini: ")
print(y_pred)
print()

# Evaluate the Gini tree on the held-out test set
print("Results using Gini")
print("Confusion Matrix: ", confusion_matrix(y_test, y_pred))
print("Accuracy : ", accuracy_score(y_test, y_pred) * 100)
print("Report : ", classification_report(y_test, y_pred))

# Decision tree using the entropy criterion, limited to a depth of 3
entropy_classifier = DecisionTreeClassifier(criterion = "entropy", random_state = 100,
                                            max_depth = 3, min_samples_leaf = 5)
entropy_classifier.fit(X_train, y_train)
y_pred1 = entropy_classifier.predict(X_test)

print("Predicted Values using entropy: ")
print(y_pred1)
print()

# Evaluate the entropy tree on the same test set
print("Confusion Matrix: ", confusion_matrix(y_test, y_pred1))
print("Accuracy : ", accuracy_score(y_test, y_pred1) * 100)
print("Report : ", classification_report(y_test, y_pred1))

# Visualize the fitted Gini decision tree
fig = plt.figure(figsize=(25,20))
t = tree.plot_tree(gini_classifier, filled = True)

# Random forest on the same features, with a 70/30 train/test split
X = np.asarray(forest_data[['month', 'Temperature', 'Relative_Humidity', 'Wind_Speed',
                            'Fine_Fuel_Moisture_Code', 'Druff_Moisture_Code', 'Drought_Code',
                            'Initial_Spread_Index', 'Buildup_Index', 'Fire_Weather_Index']])
Y = np.asarray(forest_data['Output'])
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 100)
random_classifier = RandomForestClassifier(n_estimators = 100)
random_classifier.fit(X_train, y_train)
y_pred = random_classifier.predict(X_test)
print("ACCURACY OF THE MODEL: ", metrics.accuracy_score(y_test, y_pred))
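
The conclusion compares cross-validation scores with the model accuracy, but the listing does not show how such a comparison is made. The sketch below is one possible way to do it, not the procedure used in this project: it assumes 5-fold cross-validation with scikit-learn's `cross_val_score`, and it uses synthetic data from `make_classification` as a stand-in for AFFA.csv so that it runs on its own. The variable names and the fixed `random_state` are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (an assumption): 10 features, like the listing's X
X, y = make_classification(n_samples=250, n_features=10, n_informative=5,
                           random_state=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=100)

# Fit on the training portion and score on the held-out test set
model = RandomForestClassifier(n_estimators=100, random_state=100)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# 5-fold cross-validation on the training portion only;
# cross_val_score clones and refits the model for each fold
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# A small gap between the two suggests the model is not badly overfitted
gap = abs(test_accuracy - cv_scores.mean())
print("Test accuracy:", test_accuracy)
print("Mean CV accuracy:", cv_scores.mean())
print("Absolute gap:", gap)
```

With the real data, `X` and `y` would be replaced by the feature matrix and the 'Output' labels from the listing.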
------THE END-----
