Report Reference
CERTIFICATE
Submitted by
is a bonafide work carried out by the students under the supervision of Prof. C. R. Patil,
and it is submitted towards the partial fulfilment of the requirements of Bachelor of
Engineering (Computer Engineering) during the academic year 2023-2024.
Acknowledgment
First and foremost, we would like to thank our project guide, Prof. C. R. Patil, for
her guidance and support. We will forever remain grateful for the constant support
and guidance she extended in preparing this project report.
Through our many discussions, she helped us form and solidify our ideas. With a deep
sense of gratitude, we wish to express our sincere thanks to Prof. Dr. S. S. Sane for
his immense help in planning and executing the work on time. Our grateful thanks
to the departmental staff members for their support.
We would also like to thank our wonderful colleagues and friends for listening to our
ideas, asking questions, and providing feedback and suggestions for improving them.
Vaidehi Patil
Pranav Shimpi
Sayali Kulkarni
Sanket Shirsath
(B.E. Computer Engg.)
INDEX

1 Introduction
  1.1 Project Idea
  1.2 Motivation of the Project
  1.3 Literature Survey

3 Project Plan
  3.1 Project Timeline
  3.2 Team Organization
    3.2.1 Team structure

5 Detailed Design
  5.1 Architectural Design
  5.2 Data design
    5.2.1 Data structure
    5.2.2 Database description
  5.3 Component design / Data Model
    5.3.1 Class Diagram
    5.3.2 Flow Chart

6 Experimental setup
  6.1 Data Set
    6.1.1 Biomass History
    6.1.2 Distance Matrix
  6.2 Technology Used
    6.2.1 Prophet
    6.2.2 MERN stack
    6.2.3 React Native
    6.2.4 Interpolation
    6.2.5 KNN
    6.2.6 TensorFlow
    6.2.7 PyCharm
    6.2.8 Visual Studio Code
  6.3 Performance Parameters
  6.4 Efficiency Issues

Annexure C Sponsorship detail (if any)
INTRODUCTION
1.1 PROJECT IDEA
This project aims to develop a predictive model that uses advanced AI and time
series analysis to accurately forecast future biomass availability by leveraging
historical biomass data and integrating vital environmental factors. The project
employs time series algorithms, such as ARIMA and LSTM, to analyze the temporal
patterns in the data and make more accurate predictions. It also employs
interpolation techniques to optimize the distribution network, calculating optimal
distances between harvesting sites, depots, and refineries. This ensures efficient
logistics, minimizes resource wastage, and creates cost-effective distribution
routes that reduce transportation costs and energy consumption. To support
decision-making, the project incorporates statistical dashboards that seamlessly
integrate with Google Maps, providing visual aids to decode geospatial patterns and
identify efficient distribution pathways. Additionally, the project identifies and
analyzes the raw materials essential for biomass production and their transformation
into biofuel, aligning with sustainability goals and contributing to the generation
of clean, renewable energy. This comprehensive approach not only enhances resource
management but also serves as a powerful tool for reducing greenhouse gas emissions
and mitigating climate change.
• Sustainable Operations
The project promotes sustainable and environmentally friendly operations by
ensuring timely deliveries and minimizing resource wastage. This aligns with
broader sustainability goals and contributes to a cleaner and more sustainable
future.
• Decision-Making Support
The project incorporates statistical dashboards that seamlessly integrate with
Google Maps, providing visual aids to decode geospatial patterns and identify
efficient distribution pathways. This supports decision-making processes and
provides actionable insights for resource management.
5. To identify and analyze the raw materials for biomass production and their
transformation into biofuel
2.1.2.1 Assumptions
• Data Availability: The project assumes that historical biomass data, environ-
mental data, and relevant information on biomass resources are readily avail-
able for analysis and modeling.
• Testing and Validation: The project includes extensive testing and valida-
tion to ensure the effectiveness of the predictive model, optimized distribution
network, and visual representation in real-world scenarios and diverse envi-
ronmental conditions.
2.2 METHODOLOGY
• Time Series Analysis: Develop predictive models, such as ARIMA and Prophet,
to analyze biomass trends and forecast availability.
• Raw Material Analysis: Analyze the raw materials to find the components
needed to convert biomass into biofuel.
2.3 OUTCOME
• User Adoption and Efficiency: Users will adopt the system and leverage its
features efficiently to support decision-making processes and resource
management.
PROJECT PLAN
3.1 PROJECT TIMELINE
A project timeline chart, or Gantt chart, is a visual tool used in project management
to display project tasks and their timing. It shows tasks as bars over time, with their
start and end dates, dependencies, milestones, and progress. It helps plan, track, and
communicate project schedules.
The team consists of three distinct roles, namely the project mentor, the project
leader, and the team members.
1. Project Mentor
The project mentor receives regular updates about progress directly from the team
lead. For the team, the team lead is their only point of contact, so the team
lead and the mentor work closely together. Here, Prof. C. R. Patil acts as the mentor.
2. Team leader
This individual coordinates all directions from the mentor and is responsible for
guiding the technical aspects of the project. Here, Pranav Shimpi takes up the
role of team lead.
3. Team member(s)
Also known as the workhorses of the team, they carry out the ground
implementation. Here, Vaidehi Patil, Sanket Shirsath, and Sayali Kulkarni form
part of the team.
Name              Role
Vaidehi Patil     Research, Documentation and Implementation
Pranav Shimpi     UI Designing, Documentation and Implementation
Sayali Kulkarni   Research, Documentation and Implementation
Sanket Shirsath   Backend Implementation and Interface Design
SOFTWARE REQUIREMENT SPECIFICATION
4.1 FUNCTIONAL REQUIREMENTS
• Reliability: The predictive model and distribution network can be made more
reliable by using redundant components and implementing fail-safe mecha-
nisms. The system can also be monitored for performance and errors to iden-
tify and fix any problems quickly.
• Security: The system can be made more secure by using encryption, authen-
tication, and authorization mechanisms. The system can also be regularly au-
dited for vulnerabilities to identify and fix any security holes.
4.3 CONSTRAINTS
• The dashboard should have a timeline view to show the historical trend of
biomass production and availability.
• The dashboard should have a forecasting view to show the predicted biomass
production and availability for future periods.
• The GIS should allow users to create and manage custom data layers, such as
biomass management zones.
• The GIS should allow users to export custom data layers to different formats,
such as Shapefile and GeoJSON.
• If the framework is used to deploy the model in production, more powerful
hardware may be required to handle the load.
• The framework must use the Prophet library to forecast time series data.
– User interface hardware (e.g., touch screens, tablets) for interacting with
the system and providing decision-making support.
• Security Measures
– Regular data backups and disaster recovery plans to minimize data loss
and ensure business continuity.
• Environmental Monitoring
– Database: Necessary for storing historical and real-time biomass and en-
vironmental data. Popular choices include PostgreSQL, MySQL, and
SQLite.
4.6 INTERFACES
DETAILED DESIGN
5.1 ARCHITECTURAL DESIGN
• Data warehouse: The data warehouse stores all of the historical and real-time
data that is used by the system. This data includes biomass data, environmental
data, and distribution network data.
• Predictive model: The predictive model uses historical biomass data and en-
vironmental data to forecast future biomass availability.
• Distribution network: The distribution network uses the forecasts from the
predictive model to calculate optimal distribution routes and costs.
The different components of the system interact with each other as follows:
• The data preprocessing component takes data from the data warehouse and
cleans and prepares it for use by the predictive model and distribution network.
• The predictive model takes the preprocessed data and forecasts future biomass
availability.
• The distribution network takes the forecasts from the predictive model and
calculates optimal distribution routes and costs.
• The statistical dashboards take the data and calculations from the other com-
ponents and provide users with a visual representation.
• Real-time data collection: This component collects real-time data from vari-
ous sources, such as sensors and harvesting equipment.
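The component interactions described above can be sketched as a simple pipeline. Every function name and value below is an illustrative stand-in, not the project's actual API: `preprocess` plays the role of the data preprocessing component, `forecast_biomass` stands in for the predictive model, and `plan_route` stands in for the distribution network's cost calculation.

```python
def preprocess(raw_rows):
    # Data preprocessing component: drop rows with missing biomass
    # values and sort by year before modelling.
    clean = [r for r in raw_rows if r["biomass"] is not None]
    return sorted(clean, key=lambda r: r["year"])

def forecast_biomass(rows):
    # Naive stand-in for the predictive model: repeat the last observation.
    return rows[-1]["biomass"]

def plan_route(predicted, cost_per_tonne_km=0.5, distance_km=40.0):
    # Stand-in for the distribution component: cost of moving the
    # forecast tonnage over an assumed route.
    return predicted * cost_per_tonne_km * distance_km

raw = [
    {"year": 2021, "biomass": 120.0},
    {"year": 2023, "biomass": None},   # missing reading, dropped in preprocessing
    {"year": 2022, "biomass": 131.5},
]
rows = preprocess(raw)
pred = forecast_biomass(rows)
cost = plan_route(pred)
```

The statistical dashboards would then render `pred` and `cost` on the map view; the real predictive model replaces the naive last-value forecast with ARIMA or Prophet output.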
* Latitude: This field indicates the latitude of the location where the
biomass data was collected.
* Year: This field indicates the year in which the biomass data was
collected.
* Biomass: This field indicates the quantity of biomass that was avail-
able at the location in the given year.
• Environmental Data
* raw material id: Unique identifier for the raw material type
* The global data structure should contain the same columns as the
internal data structure, plus an optional location id column to store
the location where the measurement was taken.
Column                  Type     Description
raw material id         Integer  Unique identifier for the raw material type
measurement type id     Integer  Unique identifier for the measurement type
time period id          Integer  Unique identifier for the time period
value                   Float    The value of the measurement
location id (Optional)  Integer  Unique identifier for the location where the
                                 measurement was taken

Table 5.4: Raw Material Analysis Table
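A minimal sketch of the record described in Table 5.4, expressed as a Python dataclass. The class name is a hypothetical stand-in; the field names mirror the table's columns, with the optional location defaulting to absent.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RawMaterialMeasurement:
    raw_material_id: int        # unique identifier for the raw material type
    measurement_type_id: int    # unique identifier for the measurement type
    time_period_id: int         # unique identifier for the time period
    value: float                # the value of the measurement
    location_id: Optional[int] = None  # optional measurement location

# Example record with invented values; location omitted.
row = RawMaterialMeasurement(raw_material_id=7, measurement_type_id=2,
                             time_period_id=2023, value=14.6)
```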
• Historical Biomass Dataset: This dataset contains historical biomass data, in-
cluding the location, biomass type, measurement type, time period, and value.
This data can be used to train and evaluate a time series-based AI model to
predict future biomass availability.
• Distance Matrix Dataset: This dataset contains the distances between all
pairs of locations. This data can be used to calculate the cost and time of
transporting biomass between different locations.
• Raw Material Analysis Dataset: This dataset contains information about the
composition, moisture content, and heating value of different biomass types.
This data can be used to identify the best biomass types for different applica-
tions and to optimize biomass supply chains.
EXPERIMENTAL SETUP
6.1 DATA SET
The internal data structure of the biomass history dataset includes essential attributes
such as a unique index, longitude, latitude, year, and biomass quantity collected at
various locations over time. This rich dataset provides a comprehensive repository
of historical biomass information, encompassing diverse locations, biomass types,
measurement methods, and temporal periods. Leveraging this extensive data, an
AI model can be trained and evaluated using time series analysis, enabling predic-
tive insights into future biomass availability. By utilizing the spatial and temporal
dimensions encapsulated in this dataset, the AI model can forecast and anticipate
the availability of biomass, supporting informed decision-making and planning in
various sectors reliant on sustainable biomass resources.
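As a hedged illustration of this internal data structure, the snippet below builds a small frame with the attributes named above (longitude, latitude, year, biomass) using pandas, then groups it into per-location yearly series of the kind a time series model would consume. The literal values are invented.

```python
import pandas as pd

# Toy stand-in for the biomass history dataset; real rows come from the
# project's CSV and number in the thousands.
biomass_history = pd.DataFrame({
    "longitude": [75.12, 75.12, 73.85],
    "latitude":  [19.88, 19.88, 18.52],
    "year":      [2017, 2018, 2017],
    "biomass":   [412.5, 398.1, 530.0],
})

# Per-location yearly totals, ready to feed a forecasting model.
series = (biomass_history
          .groupby(["latitude", "longitude", "year"])["biomass"]
          .sum())
```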
The dataset containing distances between all pairs of locations, represented as a 2418
x 2418 matrix, serves as a critical resource for assessing the transportation logistics
involved in moving biomass between different locations. This comprehensive ma-
trix provides insights into the costs and time requirements for transporting biomass
from a source grid block to various destination grid blocks. Notably, the asymme-
try within the matrix reflects nuanced factors such as one-way routes, U-turns, or
other transport-related variables, leading to differing distances for journeys from a
source to a destination and vice versa. Leveraging this detailed spatial informa-
tion, logistical planning and optimization strategies can be developed, considering
the varying distances and directional dependencies for effective biomass transporta-
tion, ultimately aiding in efficient resource allocation and decision-making within
the biomass industry.
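The asymmetric lookup described above can be illustrated with a toy 3 x 3 matrix in place of the real 2418 x 2418 one; the distances are invented.

```python
import numpy as np

# dist[i, j] = road distance (km) from grid block i to grid block j.
dist = np.array([
    [ 0.0, 12.4, 30.1],
    [13.0,  0.0,  8.7],   # dist[1, 0] != dist[0, 1]: one-way routes, U-turns
    [29.5,  9.2,  0.0],
])

src, dst = 0, 1
outbound = dist[src, dst]   # source -> destination
inbound  = dist[dst, src]   # destination -> source; may differ
```

Because the matrix is asymmetric, any route-cost calculation must look up the direction actually travelled rather than treating `dist` as symmetric.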
6.2.3 React Native
React Native is a framework for building mobile applications using JavaScript and
React. It allows developers to create native mobile apps for iOS and Android using
a single codebase. React Native provides a set of components that map to native
UI components, allowing developers to create a seamless user experience on both
platforms. React Native also supports hot-reloading, which means that changes to
the code can be instantly reflected in the app without requiring a rebuild. React
Native is popular for its ease of use, performance, and ability to create cross-platform
mobile apps.
6.2.5 KNN
6.2.8 Visual Studio Code
Visual Studio Code (VSCode) is a highly popular, open-source code editor devel-
oped by Microsoft. It is designed to be lightweight, efficient, and capable of sup-
porting a wide array of programming languages including JavaScript, TypeScript,
Python, Java, C++, C#, and more. VSCode boasts an array of key features that cater
to developers’ needs, including an integrated terminal, rich extension support, Git
integration, debugging tools, remote development capabilities, and real-time code
sharing. The integrated terminal in VSCode enables developers to run shell com-
mands and scripts directly from the editor, while its vast ecosystem of extensions
allows developers to enhance its functionality and customize its appearance. Git
integration facilitates version control and collaboration, and the built-in debugger
assists in identifying and rectifying issues in the code. VSCode also supports remote
development, allowing developers to work on remote machines, containers, and vir-
tual machines, and its Live Share extension enables real-time collaborative coding.
• Recall: Recall is the proportion of true positive predictions among all actual
positive instances.
Recall = TP / (TP + FN)
• Error Rate: Error Rate is the proportion of incorrect predictions among all
predictions made.
Error Rate = (FP + FN) / (P + N)
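The two formulas above translate directly into code; this is a minimal sketch with invented confusion-matrix counts.

```python
def recall(tp, fn):
    # True positives over all actual positives (TP + FN).
    return tp / (tp + fn)

def error_rate(fp, fn, p, n):
    # Incorrect predictions over all instances, where
    # P = TP + FN (actual positives) and N = TN + FP (actual negatives).
    return (fp + fn) / (p + n)

r = recall(tp=40, fn=10)                  # 40 / 50
e = error_rate(fp=5, fn=10, p=50, n=50)   # 15 / 100
```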
1. Data set Size: The size of the data set may vary from one data set to
another, so we might face scenarios where the model underfits or overfits.
Therefore, defining the minimum and maximum size of the data set is important.
2. Storage: As the size of the real-world data set grows, storage issues could
arise when the model is implemented for real-world applications.
3. Scalability: We must ensure that the system continues to work properly as
the number of users and the rate at which the system is used increase.
6. Model Complexity: Complex models may require more training time, mem-
ory, and processing power. Simplifying model architectures or using model
compression techniques can improve efficiency.
PLAGIARISM REPORT
ANNEXURE B