Mini Project Report
P.E.S COLLEGE OF ENGINEERING,
MANDYA, 571401
(An Autonomous Institution under VTU, Belgaum)
Submitted by
PRAJWAL M D
LIKITH RAJ K R
MANVITHA S M
TANYA PRIYADARSHINI A R
USN: 4PS21CS066, 4PS21CS046, 4PS21CS051, 4PS21CS106
P.E.S COLLEGE OF ENGINEERING,
MANDYA - 571401
(An Autonomous Institution under VTU, Belgaum)
CERTIFICATE
ACKNOWLEDGEMENT
We would like to express our heartfelt gratitude to all those who contributed to the successful
completion of our project on "Forest Fire Prediction".
First and foremost, we are deeply thankful to our instructor, S K Shivashankar, for guiding us throughout
this project. Your expertise, dedication, and commitment to nurturing our understanding of machine
learning and web development were invaluable. Your insights and feedback played a pivotal role in shaping
this project. We are also indebted to our fellow team members who provided a collaborative and supportive
environment during the course of this project. Your contributions, discussions, and shared knowledge
significantly enriched our understanding of forest fire prediction using Random Forest Regressor and
Django.
We extend our gratitude to the creators and contributors of scikit-learn, which provides the Random
Forest Regressor, and Django for building tools that simplify machine learning and web development. Their
open-source work and continuous updates make it possible for countless developers to create effective and
efficient applications. We appreciate the extensive resources and references available online, especially the Django
and scikit-learn documentation, Stack Overflow, and various web development and machine learning
forums. These sources were instrumental in addressing specific technical challenges and enhancing the
quality of our work.
We must acknowledge the academic institution and library resources that provided access to books,
journals, and online materials that were crucial for research and learning during this project.
Finally, we would like to express our gratitude to the readers and users of this project. It is our hope that
this work serves as a valuable resource for anyone seeking to explore forest fire prediction using Random
Forest Regressor and Django in web development.
Sincerely,
Prajwal M D
Likith Raj K R
Manvitha S M
Tanya Priyadarshini A R
ABSTRACT
Forest fires pose a significant threat to ecosystems, human lives, and property. Accurate prediction
and timely intervention can mitigate these risks substantially. This project presents a comprehensive
approach to predicting forest fires using a Random Forest Regressor model, integrated with a user-friendly
web application developed with Django. The primary objective of this project is to predict the occurrence
and severity of forest fires based on various environmental and meteorological factors. The project aims to
provide an accessible platform for stakeholders, including forest management authorities, environmental
researchers, and the general public, to obtain reliable forest fire predictions.
Our approach involves several key steps. We gathered historical data on forest fires, including variables
such as temperature, humidity, latitude, longitude, fire radiative power, scan level, and confidence, among
others. This data was cleaned and preprocessed to handle missing values and outliers, ensuring a robust
dataset for model training. We selected the Random Forest Regressor due to its high accuracy and ability to
handle complex, non-linear relationships between input variables and fire occurrence. The model was
trained on a portion of the dataset and validated using cross-validation techniques to ensure its
generalizability and performance.
To make the predictions accessible, we developed a web application using Django, a high-level Python web
framework. The web app provides a user-friendly interface where users can input current environmental
conditions and receive real-time predictions of forest fire risks. The application also includes visualizations
to help users understand the factors contributing to fire risks. Extensive testing was conducted to verify the
accuracy of predictions and the reliability of the web application under various conditions.
The Random Forest Regressor demonstrated high accuracy in predicting forest fire occurrences,
outperforming other models like linear regression and decision trees. The web application successfully
integrates the model, providing users with intuitive and actionable insights. Visualizations within the app
highlight key predictors and their impact on fire risk, aiding users in making informed decisions. This
project underscores the potential of machine learning and web technologies in addressing environmental
challenges. The Random Forest Regressor, coupled with the Django web application, offers a powerful tool
for predicting forest fires, facilitating timely interventions and resource allocation. Future work will focus
on incorporating real-time data feeds and expanding the model to include more diverse environmental
factors.
CONTENTS
I. INTRODUCTION
II. SOFTWARE REQUIREMENTS SPECIFICATIONS
III. TECHNOLOGY
IV. SCREENSHOTS
V. CONCLUSION
VI. BIBLIOGRAPHY
I. INTRODUCTION
Forest fires, also known as wildfires, are uncontrolled fires that spread rapidly through vegetation,
forested areas, and grasslands. These fires pose significant threats to ecosystems, wildlife, human lives, and
property. The increasing frequency and intensity of forest fires have been linked to climate change, human
activities, and natural factors. As such, predicting forest fires has become a critical task for forest
management authorities, environmental researchers, and policymakers.
This project aims to develop a machine learning model using the Random Forest Regressor algorithm to
predict forest fire occurrences and severities based on various environmental and meteorological factors.
Additionally, we have developed a user-friendly web application using the Django framework to make
these predictions accessible to a wide range of users. This application allows users to input current
environmental conditions and receive real-time predictions of forest fire risks, along with visualizations to
help understand the contributing factors.
The primary objective of this project is to provide an effective and accurate tool for forest fire prediction,
which can aid in early warning systems, resource allocation, and preventive measures. By leveraging
historical data and advanced machine learning techniques, we aim to create a reliable model that can predict
the likelihood and intensity of forest fires with high accuracy. The integration of this model into a web
application ensures that it is accessible and user-friendly, allowing stakeholders to make informed decisions
based on the predictions.
In this report, we will discuss the software requirements specifications, the technology used, and provide
detailed insights into the development process of the forest fire prediction model and the Django web
application. We will also present screenshots of the web application and conclude with the potential
implications of this project and future directions for improvement.
II. SOFTWARE REQUIREMENTS SPECIFICATIONS
A. FUNCTIONAL REQUIREMENTS
1. Data Ingestion:
The system must be capable of reading and processing historical forest fire data from a CSV
file. This involves loading the data into a data frame using Python libraries such as Pandas and
performing initial checks for data integrity and completeness.
2. Data Preprocessing:
The system must handle missing values, outliers, and ensure data consistency. This includes
identifying and filling or removing missing values, detecting and correcting outliers, and
standardizing data formats. Preprocessing also involves feature engineering, where new relevant
features may be created to improve the model's performance.
3. Model Training:
The system must train a Random Forest Regressor model using the cleaned dataset. This
involves splitting the data into training and testing sets, selecting appropriate hyperparameters, and
using cross-validation techniques to optimize the model's performance. The trained model should be
saved using Pickle for later use in the web application.
4. Prediction Interface:
The web application must allow users to input environmental parameters such as
temperature, humidity, wind speed, and precipitation to receive forest fire risk predictions. The
interface should be intuitive and user-friendly, providing clear instructions on how to input the data
and interpret the results.
5. Visualization:
The web application must provide visualizations to help users understand the factors
contributing to fire risk. This includes charts and graphs that display the relationships between input
variables and predicted fire risk, as well as historical trends and patterns. Visualizations should be
generated using libraries like Matplotlib or Plotly and integrated seamlessly into the web
application.
B. NON-FUNCTIONAL REQUIREMENTS
1. Performance:
The system must provide predictions with minimal latency. This requires optimizing the
model and the web application to ensure quick response times, even when handling large datasets or
multiple concurrent users.
2. Scalability:
The web application should handle multiple concurrent users without significant degradation
in performance. This involves implementing efficient algorithms, using appropriate data structures,
and leveraging cloud services for scaling if needed.
3. Usability:
The interface should be intuitive and user-friendly. This includes designing a clean and
simple layout, providing clear instructions and feedback, and ensuring that the application is
accessible to users with varying levels of technical expertise.
4. Maintainability:
The codebase should be modular and well-documented to facilitate future updates. This
involves following best practices in software development, such as using version control, writing
unit tests, and maintaining comprehensive documentation for the code and the application.
C. SYSTEM REQUIREMENTS
1. Hardware Requirements:
RAM: 8 GB
Storage: 100 GB
2. Software Requirements:
Programming Language: Python (with Pandas, NumPy, and Scikit-learn)
Web Framework: Django
Database: SQLite
The choice of hardware and software requirements ensures that the system is capable of handling the
computational demands of training a machine learning model and running a web application. The hardware
specifications, including a modern processor, sufficient RAM, and ample storage, provide the necessary
resources for efficient data processing and model training. The software stack, including Python and its
libraries, Django for web development, and SQLite for database management, offers a robust and scalable
platform for developing and deploying the application.
III. TECHNOLOGY
A. DATASET AND PREPROCESSING
The dataset used in this project contains several key columns that are essential for predicting
forest fires: longitude, latitude, confidence, fire radiative power, scan level, last date fire occurred,
type of forest, and time of day (night or day). These variables provide a comprehensive view of the
environmental and meteorological conditions associated with forest fires.
1. Data Collection:
The dataset was obtained from various sources, including government agencies, research
institutions, and publicly available databases. The data was compiled into a CSV file, which
facilitated easy manipulation and analysis using Python libraries.
2. Data Cleaning:
The raw data contained missing values, outliers, and inconsistencies that needed to be
addressed. We used Pandas to handle missing values by either filling them with appropriate values
(e.g., mean or median) or removing rows with missing data. Outliers were detected using statistical
methods and either corrected or removed to ensure data quality. Data consistency was ensured by
standardizing units and formats across all columns.
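The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration on a made-up frame, not the project's exact code: the column names (frp, confidence, daynight), the helper clean_fire_data, and the choice of median filling are all assumptions.

```python
import pandas as pd


def clean_fire_data(df: pd.DataFrame) -> pd.DataFrame:
    """Fill numeric gaps with the column median, then drop rows still missing a value."""
    cleaned = df.copy()
    numeric_cols = cleaned.select_dtypes(include="number").columns
    # Median filling is less sensitive to extreme readings than mean filling.
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].median())
    # Rows that still contain missing (non-numeric) values are removed.
    return cleaned.dropna().reset_index(drop=True)


# Hypothetical sample; the real dataset has more columns.
raw = pd.DataFrame({
    "frp": [10.0, None, 30.0, 50.0],
    "confidence": [80, 60, None, 90],
    "daynight": ["D", "N", "D", None],
})
cleaned = clean_fire_data(raw)
print(cleaned)
```

Filling before dropping keeps as many rows as possible; only the row with no day/night label is discarded here.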
B. PANDAS
Pandas is a powerful data manipulation and analysis library for Python. It provides data structures such
as DataFrame and Series, which are ideal for handling structured data like the one used in this project.
Pandas offers various functions for data cleaning, transformation, and analysis. Some key features used in
this project include:
Data Loading:
Pandas can read data from various file formats, including CSV, Excel, SQL databases, and
more. In this project, we used pandas.read_csv() to load the dataset into a DataFrame.
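A small sketch of loading a CSV with pandas.read_csv() and running basic integrity checks is given here. The three-row extract and its column names are hypothetical stand-ins for the project's file, which is not reproduced in this report.

```python
import io

import pandas as pd

# Hypothetical three-row extract standing in for the project's CSV file.
csv_text = """latitude,longitude,frp,confidence,daynight
12.52,76.89,14.2,81,D
12.61,76.95,,64,N
12.48,77.02,9.7,72,D
"""

# pandas.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Initial integrity checks: table shape and missing values per column.
n_rows, n_cols = df.shape
missing_per_column = df.isna().sum()
print(n_rows, n_cols)
print(missing_per_column[missing_per_column > 0])
```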
Data Transformation:
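# Detecting outliers: keep rows whose numeric columns all lie within 3 standard deviations
import numpy as np
from scipy import stats
# df is the DataFrame loaded earlier; missing values should be handled before this step
z_scores = np.abs(stats.zscore(df.select_dtypes(include="number")))
df = df[(z_scores < 3).all(axis=1)]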
3. Feature Engineering:
Feature engineering involves creating new features that may improve the model's
performance. In this project, we derived additional features such as the time elapsed since the last
fire occurrence, average temperature, humidity levels, and other relevant metrics. These engineered
features provided more informative input for the machine learning model.
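As an illustration of one such derived feature, the time elapsed since the last fire can be computed from two date columns. The column names and dates below are hypothetical; the report does not list the exact fields used.

```python
import pandas as pd

# Hypothetical records; column names are illustrative only.
df = pd.DataFrame({
    "last_fire_date": ["2023-01-10", "2023-03-01", "2023-03-25"],
    "observation_date": ["2023-04-01", "2023-04-01", "2023-04-01"],
})

# Parse the text columns into proper datetime values.
for col in ("last_fire_date", "observation_date"):
    df[col] = pd.to_datetime(df[col])

# Days elapsed between the last recorded fire and the observation date.
df["days_since_last_fire"] = (df["observation_date"] - df["last_fire_date"]).dt.days
print(df["days_since_last_fire"].tolist())
```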
4. Data Transformation:
Before feeding the data into the model, we transformed the categorical variables (e.g., type of forest,
time of day) into numerical formats using techniques such as one-hot encoding. This step ensured
that all input variables were in a format suitable for the Random Forest Regressor.
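A minimal sketch of one-hot encoding with pandas.get_dummies() follows. The category values and column names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical rows with one numeric and two categorical columns.
df = pd.DataFrame({
    "frp": [14.2, 9.7, 21.5],
    "forest_type": ["deciduous", "evergreen", "deciduous"],
    "daynight": ["D", "N", "D"],
})

# Each category becomes its own 0/1 column; numeric columns pass through unchanged.
encoded = pd.get_dummies(df, columns=["forest_type", "daynight"], dtype=int)
print(encoded.columns.tolist())
```

At prediction time the input must be encoded into the same columns, in the same order, as the training data; otherwise the model receives misaligned features.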
C. RANDOM FOREST REGRESSOR
The Random Forest Regressor was chosen for this project due to its robustness and ability to handle
complex, non-linear relationships between input variables and fire occurrence. Random Forest is an
ensemble learning method that constructs multiple decision trees during training and outputs the average
prediction of the individual trees. Averaging over many trees reduces overfitting and improves accuracy.
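The averaging behaviour can be checked directly in Scikit-learn. The sketch below uses synthetic data (not the project's dataset) and compares a hand-computed mean of the individual trees' predictions with the forest's own output.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with a non-linear target, standing in for the fire dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = np.sin(4 * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=200)

forest = RandomForestRegressor(n_estimators=25, random_state=0).fit(X, y)

# Average the predictions of the individual trees by hand...
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

# ...and compare with the forest's own prediction.
forest_prediction = forest.predict(X[:5])
print(np.allclose(manual_average, forest_prediction))
```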
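1. Model Selection:
The Random Forest Regressor was selected after it outperformed other models such as
linear regression and decision trees.
2. Hyperparameter Tuning:
Hyperparameters were tuned with an exhaustive grid search (GridSearchCV) using 5-fold
cross-validation and negative mean squared error as the scoring metric.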
3. Model Training:
The cleaned and preprocessed dataset was split into training and testing sets. We used the
training set to fit the Random Forest Regressor model and evaluated its performance on the testing
set. The model achieved an accuracy of 94.07%, indicating its effectiveness in predicting forest fire
occurrences.
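The split, fit, and evaluate workflow can be sketched as below on synthetic data. The report does not state which metric lies behind the 94.07% figure; for a regressor, Scikit-learn's score() method returns R-squared, so R-squared and mean squared error are assumed here. The 80/20 split and n_estimators=100 are also assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned dataset.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(500, 4))
y = 3 * X[:, 0] + np.sin(5 * X[:, 1]) + 0.05 * rng.normal(size=500)

# Hold out 20% of the rows for testing (assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on data the model has not seen during training.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
print(f"R^2 = {r2:.3f}, MSE = {mse:.4f}")
```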
D. MODEL SERIALIZATION
To make the model accessible for real-time predictions in the web application, we serialized the
trained model using the Pickle library. Pickle allows us to save the model's state to a file, which can be
loaded later for making predictions without retraining.
import pickle
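A minimal sketch of the save-and-load round trip with Pickle is shown here. The file name model.pkl and the small synthetic model are assumptions for illustration; the project's own file name is not given in this report.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Small synthetic model standing in for the trained forest fire model.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(100, 3))
y = X.sum(axis=1)
model = RandomForestRegressor(n_estimators=10, random_state=1).fit(X, y)

# 'model.pkl' is an assumed file name; a temporary directory keeps the sketch self-cleaning.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions exactly.
same_predictions = np.array_equal(model.predict(X[:5]), restored.predict(X[:5]))
print(same_predictions)
```

Pickle files should only be loaded from trusted sources, and the Scikit-learn version used for loading should match the one used for saving.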
E. WEB APPLICATION DEVELOPMENT
The Django framework was used to develop the web application, which provides a user-friendly
interface for inputting environmental parameters and receiving forest fire risk predictions. Django's
Model-View-Template (MVT) architecture and robust features make it suitable for building scalable web
applications.
1. Django Framework:
Django is a high-level Python web framework that promotes rapid development and clean,
pragmatic design. It follows the Model-View-Template (MVT) architectural pattern, a variant of
Model-View-Controller (MVC), which separates the data (Model), request-handling logic (View),
and presentation (Template). This separation of concerns enhances code maintainability and
reusability.
2. Frontend Development:
The frontend of the web application was developed using HTML, CSS, and JavaScript. We
utilized Django's templating engine to dynamically generate web pages based on user input and
model predictions. The interface was designed to be intuitive and user-friendly, providing clear
instructions and feedback to users.
3. Backend Development:
The backend of the application was implemented using Django's ORM to handle database
operations and manage user input. We created views to handle HTTP requests, process the input
data, and generate predictions using the serialized model. The predictions were then displayed on
the web interface, along with visualizations to help users understand the contributing factors.
4. Model Integration:
The serialized model was loaded in the Django application to make predictions based on
user input. We created a view that accepts user input, preprocesses the data, and uses the loaded
model to generate predictions. The results were then rendered on a web page using Django
templates.
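# Loading the serialized model in Django
import pickle
from django.shortcuts import render
from django.http import JsonResponse
from .forms import PredictionForm

# Load the model once at import time; 'model.pkl' is an assumed file name
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

def predict_fire_risk(request):
    if request.method == 'POST':
        form = PredictionForm(request.POST)
        if form.is_valid():
            data = form.cleaned_data
            # Preprocess the input data into the feature order used for training
            # ...
            features = [list(data.values())]
            # Make prediction (cast to float so the value is JSON-serializable)
            prediction = float(model.predict(features)[0])
            return JsonResponse({'prediction': prediction})
    else:
        form = PredictionForm()
    # Show the empty form on GET, or the form with its errors on an invalid POST
    return render(request, 'predict.html', {'form': form})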
5. Deployment:
The final model and web application were deployed locally to ensure that all functionalities work as
expected. For a production environment, the deployment can be scaled using cloud services such as
AWS or Azure. The local deployment included setting up a web server (Apache or Nginx) and
configuring it to serve the Django application.
F. NUMPY
NumPy is a fundamental library for scientific computing in Python. It provides support for arrays,
matrices, and many mathematical functions to operate on these data structures. In this project, NumPy was
used for numerical computations, handling array operations, and performing mathematical transformations
required for data preprocessing and model training.
Array Handling:
NumPy arrays are efficient and convenient for storing and manipulating large datasets. We
used NumPy arrays to store feature matrices and target variables, facilitating fast and efficient
computations.
Mathematical Functions:
NumPy offers a wide range of mathematical functions for statistical analysis, linear algebra,
and random number generation. These functions were used for data normalization, statistical
analysis, and generating random subsets of data for model validation.
import numpy as np
# Normalizing features: subtract each column's mean and divide by its standard deviation
normalized_features = (features - np.mean(features, axis=0)) / np.std(features, axis=0)
G. SCIKIT-LEARN
Scikit-learn is a versatile machine learning library for Python that provides simple and efficient tools for
data mining and data analysis. It offers various machine learning algorithms, including regression,
classification, clustering, and model selection tools. In this project, Scikit-learn was used for model
training, hyperparameter tuning, and evaluation.
Model Training:
Scikit-learn provides easy-to-use classes for training machine learning models. We used the
RandomForestRegressor class to train our prediction model.
Hyperparameter Tuning:
The GridSearchCV class in Scikit-learn allows for exhaustive search over specified
parameter values for an estimator. We used this class to perform hyperparameter tuning for the
Random Forest model.
Model Evaluation:
Scikit-learn provides various metrics for model evaluation, such as mean squared error, R-
squared, and cross-validation scores. These metrics were used to assess the model's performance.
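from sklearn.model_selection import GridSearchCV
# model is the RandomForestRegressor; param_grid maps hyperparameter names to candidate values
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=5, scoring='neg_mean_squared_error')
grid_search.fit(X_train, y_train)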
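For reference, a self-contained version of this grid search is sketched below on synthetic data. The parameter grid is illustrative only; the values actually searched in the project are not given in this report.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the training data.
rng = np.random.default_rng(7)
X = rng.uniform(0, 1, size=(150, 3))
y = 2 * X[:, 0] + X[:, 1] ** 2 + 0.05 * rng.normal(size=150)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7
)

model = RandomForestRegressor(random_state=7)
# Illustrative grid only (assumed values).
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}

grid_search = GridSearchCV(
    estimator=model,
    param_grid=param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
grid_search.fit(X_train, y_train)

best_params = grid_search.best_params_
best_cv_mse = -grid_search.best_score_  # scoring is negated MSE, so flip the sign
test_r2 = grid_search.best_estimator_.score(X_test, y_test)
print(best_params, best_cv_mse, test_r2)
```

Because the scoring string is neg_mean_squared_error, larger (less negative) scores are better, which is why the sign is flipped before reporting the error.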
H. DJANGO
Django is a high-level Python web framework that promotes rapid development and clean, pragmatic
design. It follows the Model-View-Template (MVT) architectural pattern, a variant of Model-View-Controller
(MVC), which separates the data (Model), request-handling logic (View), and presentation (Template). This
separation of concerns enhances code maintainability and reusability.
Model-View-Template Architecture:
Django's architecture promotes organized and maintainable code by separating the data
layer, presentation layer, and business logic. Models represent the data structure, views handle
requests and control flow, and templates render the HTML output.
Object-Relational Mapping (ORM):
Django's ORM provides an intuitive and convenient way to interact with the database using
Python code. It abstracts the complexities of SQL queries and allows developers to focus on the
application logic.
Form Handling:
Django provides robust form handling capabilities, including form validation, error
handling, and CSRF protection. This ensures that user inputs are processed securely and accurately.
# Field from the prediction form (forms.py)
last_date_fire_occurred = forms.DateField(label='Last Date Fire Occurred', required=True)
In summary, the technology stack chosen for this project includes Pandas for data manipulation,
NumPy for numerical computations, Scikit-learn for machine learning model development, and Django for
web application development. These libraries and frameworks provide a robust and efficient platform for
developing a reliable and user-friendly forest fire prediction system.
IV. SCREENSHOTS
Fig. 1: Web App
Fig. 2: Confidence Score
V. CONCLUSION
In conclusion, the development of the forest fire prediction model and the accompanying Django web
application demonstrates the effective use of machine learning and web technologies to address a critical
environmental issue. The Random Forest Regressor model achieved a high accuracy of 94.07%, indicating
its potential for reliable forest fire prediction. The integration of this model into a user-friendly web
application ensures that it is accessible to a wide range of users, from forest management authorities to the
general public.
The project highlights the importance of data preprocessing, feature engineering, and model optimization in
developing accurate predictive models. The use of Django for web development showcases the benefits of
modern web frameworks in creating scalable and maintainable applications. Future work could involve
enhancing the model with additional data sources, implementing real-time data integration, and deploying
the application on a cloud platform for wider accessibility and scalability.
VI. BIBLIOGRAPHY
[1] Géron, A. (2022). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow:
Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media.
[2] McKinney, W. (2024). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and
IPython. O'Reilly Media.
[3] Grinberg, M. (2023). Flask Web Development: Developing Web Applications with Python.
O'Reilly Media.