Crop and Fertilizer Recommendation System
I hereby declare that the work presented in this report entitled “CROP AND FERTILIZER
RECOMMENDATION SYSTEM” in partial fulfillment of the requirements for the award
of the degree of Bachelor of Technology in Computer Science and
Engineering/Information Technology, submitted in the Department of Computer Science &
Engineering and Information Technology, Jaypee University of Information Technology,
Waknaghat, is an authentic record of my own work carried out over a period from January
2022 to May 2022 under the supervision of Prof. Dr. Vivek Kumar Sehgal, Professor &
Head, Department of Computer Science & Engineering and Information Technology.
The matter embodied in the report has not been submitted for the award of any other degree or
diploma.
This is to certify that the above statement made by the candidate is true to the best of my
knowledge.
ACKNOWLEDGEMENT
Firstly, I express my heartiest thanks and gratefulness to almighty God for His divine blessing,
which made it possible to complete the project work successfully.
I am grateful and wish to express my profound indebtedness to my supervisor, Prof. Dr. Vivek
Kumar Sehgal, Professor & Head, Department of Computer Science & Engineering and
Information Technology, Jaypee University of Information Technology, Waknaghat. The deep
knowledge and keen interest of my supervisor in the field of “CROP AND FERTILIZER
RECOMMENDATION SYSTEM” helped me to carry out this project. His endless patience, scholarly
guidance, continual encouragement, constant and energetic supervision, constructive criticism,
valuable advice, and reading and correcting of many inferior drafts at all stages have made it
possible to complete this project.
I would like to express my heartiest gratitude to Prof. Dr. Vivek Kumar Sehgal, Professor &
Head, Department of Computer Science & Engineering and Information Technology, for his
kind help to finish my project.
I would also like to thank everyone who has helped me, directly or indirectly, in making this
project a success. In particular, I would like to thank the various staff members, both
teaching and non-teaching, who extended their timely help and facilitated my work.
Finally, I must acknowledge with due respect the constant support and patience of my parents.
Mayank Tomar
Preetam Kumar Tripathy
TABLE OF CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF GRAPHS
ABSTRACT
Chapter-1 Introduction
1.1) Introduction
1.3) Objective
1.5) Motivation
3.3) Approaches
3.4) Dataset
3.6) Algorithms of Machine learning used
4.4) Output
Chapter-5 Conclusions
5.1) Conclusion
References
LIST OF FIGURES
LIST OF GRAPHS
ABSTRACT
In terms of population, India is the second largest country in the world. Many people are
dependent on agriculture, but the sector lacks efficiency and technology, especially in our
country. By bridging the gap between traditional agriculture and data science, effective crop
cultivation can be achieved. In a developing country like India, agriculture is the main
source of income for many people. In modern times, agricultural development and growth is
driven by many factors such as innovations, environments, techniques, and cultures. Also, the
involvement of modern technology or information technology in several decisions made by
farmers helps them in gaining better results. For the process of decision making, data mining
techniques related to agriculture are used.
“Data mining” is the process of finding patterns in large datasets and extracting them, or,
in other words, finding useful information in existing data. There are mainly three steps
involved in the process of data mining: data pre-processing (cleaning, integration, selection,
and transformation of data take place), data extraction (useful data is extracted in this
step), and evaluation and presentation (data analysis and result presentation take place). By
applying data mining techniques to historical climate and crop production data, several
predictions can be made based on the knowledge gathered, which can help farmers achieve better
crop productivity.
CHAPTER-1 INTRODUCTION
1.1) Introduction
Ever since humans began practicing agriculture, it has been one of the most important human
activities. Today, agriculture is not only a means of survival; it also plays a huge part in
the economy of any country. Agriculture plays a vital role in India’s economy and in the
future of humanity, and it provides a large share of employment for Indians. As a result, with
the passage of time, the need for production has increased rapidly. Thus, in order to produce
in mass quantities, people are using technology in an extremely wrong way.
With the improvement of technology, new hybrid varieties are created day by day. In comparison
with naturally grown crops, these hybrid varieties do not provide the essential contents.
Depending more on unnatural techniques may lead to soil acidification and crusting. These
activities all lead to environmental pollution. Such unnatural activities are undertaken to
avoid or reduce losses. However, once the farmer has the correct data on the crop yield, it
will help the farmer avoid or reduce the loss.
India is the second largest country in the world in terms of population. Many people are
dependent on agriculture, but the sector lacks efficiency and technology, especially in our
country. By bridging the gap between traditional agriculture and data science, effective crop
cultivation can be achieved. It is important to have a good production of crops. The crop
yield is directly influenced by factors such as soil type, composition of the soil, seed
quality, and lack of technical facilities.
The agriculture sector acts as the backbone of India by providing food security and playing a
major role in the Indian economy. Drastic changes in climatic conditions are affecting farmers
through poor yields, which also affects them economically. Because of this, predicting which
crop to grow is becoming difficult for farmers. This project will help farmers by making it
easier to predict the crop to sow for maximum profit.
In India, agriculture plays an important role in the economic sector and in global
development. More than 60% of the country's land is used for agriculture to meet the needs of
1.3 billion people. Adopting new technologies for agriculture is therefore important, and it
will help our country's farmers to make a profit. Crop and fertilizer prediction in most parts
of India is based on the farmers' experience. Most farmers prefer the previous crop, a
neighboring farmer's crop, or the crop most common in the surrounding region for their land,
and do not have sufficient information about the content of the soil, such as phosphorus,
potassium, and nitrogen.
"An ML based website that recommends the best crop you can plant, the fertilizer you can
use."
In this project, we are launching a website that provides the following applications:
crop recommendation and fertilizer recommendation.
Most Indians have farming as their occupation. Farmers plant the same crop over and over again
without trying new varieties, and fertilize at random without knowing which nutrients are
missing or in what amount. This directly affects crop yield and acidifies the soil, resulting
in reduced soil fertility.
We are designing the system using machine learning to help farmers with crop and fertilizer
prediction. The right crop is recommended for a specific soil, keeping climatic boundaries in
mind. The system also provides information about the required content and amount of
fertilizer and the seeds needed for planting.
With the help of our system, farmers can try to cultivate different varieties with the right
technique, which will help them maximize their profit.
1.3) Objective
• Recommend crops that should be planted by farmers based on several criteria and help
them make an informed decision before planting.
• In this project, we are launching a website where the following applications are made:
• In the crop recommendation app, the user provides soil data and the app predicts which
crop the user should grow.
• In the fertilizer app, the user enters soil data and the type of crop being planted, and
the app predicts what the soil is lacking or has in excess and recommends improvements.
In the system, we propose testing multiple algorithms; by reading the classification
report, we compare the algorithms and select the best one.
For the given datasets, the system should report the test accuracy, precision, and recall of
each algorithm so that the algorithms can be compared.
• Comparing algorithms
• Finding the best algorithm
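The comparison of algorithms by accuracy, precision, and recall can be sketched in plain Python. The snippet below is only an illustration under stated assumptions: the helper names (`accuracy`, `precision_recall`), the model names, and the toy label lists are invented for the example and are not taken from the project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical test labels and the predictions of two candidate models.
y_true = ["rice", "maize", "rice", "rice", "maize", "maize"]
predictions = {
    "model_a": ["rice", "maize", "rice", "maize", "maize", "maize"],
    "model_b": ["rice", "rice", "maize", "maize", "maize", "rice"],
}

# Score every model on the same test labels and keep the most accurate one.
scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
best_model = max(scores, key=scores.get)
```

In practice the same numbers are usually read from a library's classification report; the hand-written version only makes the definitions explicit.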
1.5) Motivation
Farming is a major Indian occupation. About 70% of small and medium enterprises are based
on agriculture. So, to improve farming, many farmers have started using new technologies and
methods. Identifying crop suitability and yield based on various production factors can
increase crop quality and yield, thereby increasing economic growth and profitability.
For agriculture to continue to grow, many farmers have begun to use the latest technology and
methods. However, there is a huge gap in knowledge about crop production and how it can
affect farm profitability.
Choosing a crop to plant is one of the biggest challenges farmers face. There are several
factors involved. By recommending the most suitable crops and the right fertilizer, a crop
recommendation system can help farmers choose the right crop and improve its yield.
CHAPTER-2 LITERATURE SURVEY
Recommendation systems for crops and fertilizers are already available on the market, and many
more are under development. They consider various factors such as climatic conditions at the
time of planting, rainfall, humidity, and soil content.
Much research has been done in this field; the following are some of the studies and papers
that have been carried out.
The article “Prediction of crop yield and fertilizer recommendation using machine learning
algorithms” [1] concludes that predicting crop yield based on location, together with proper
implementation of the algorithms, shows that a higher crop yield can be achieved. From that
work it can be concluded that, for soil classification, Random Forest performs well, with an
accuracy of 86.35% compared to Support Vector Machine.
For crop yield prediction, Support Vector Machine performs very well, with an accuracy of
99.47% compared to the Random Forest algorithm. The work can be extended further to add the
following functionality. A mobile application can be built to help farmers by uploading
images of farms. Crop disease detection can be done using image processing, in which the user
gets pesticide suggestions based on disease images. A smart irrigation system can be
implemented for farms to get a higher yield.
The paper [2] by Rakesh Kumar, M.P. Singh, Prabhat Kumar and J.P. Singh proposed the use of
seven machine learning techniques, i.e., ANN, SVM, KNN, Decision Tree, Random Forest,
GBDT and Regularized Gradient Forest, for crop selection. The system is designed to retrieve
all the crops sown and their time of growing in a particular season. The yield rate of each
crop is obtained and the crops giving better yields are selected. The system also proposes a
sequence of crops to be planted to get a higher yield.
Leo Breiman [3] focuses on the accuracy, strength, and correlation of the random forest
algorithm. The random forest algorithm creates decision trees on different data samples,
predicts the data from each subset, and then, by voting, gives a better answer for the
system. Random Forest uses the bagging method to train on the data. To boost the accuracy,
the randomness injected has to minimize the correlation ρ while maintaining strength.
CHAPTER-3 SYSTEM DEVELOPMENT
Graph 2 Crop recommendation architecture
Graph 3 Model deployment architecture
Methods of Machine Learning
In machine learning, tasks are generally divided into broad categories. These
classifications are based on how data is received and how the system responds to it.
Two of the most widely used machine learning methods are unsupervised learning, which gives
the algorithm no labelled data so that it has to find structure within its input data, and
supervised learning, which trains algorithms based on example input and output data that is
labelled by humans. Let's take a deeper look at each of these approaches.
• Supervised
In this type of learning, the machine learning model is provided with a dataset containing
inputs as well as their correct outputs. In other words, labelled datasets are provided to
the algorithm for training (guided training).
• Unsupervised
In this type of learning, labelled datasets are not provided. The algorithm tries to find
patterns in the data on its own. Less human involvement or supervision is required than in
supervised learning.
It can handle unstructured and unlabelled data more easily, which makes it easier to analyze
complex data and find patterns in it.
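The difference between the two methods can be illustrated with a small sketch in plain Python. It is not the project's code: the rainfall readings, the labels, and the function names are hypothetical, and the two techniques shown (a nearest-centroid classifier and a one-dimensional two-means clustering) are stand-ins chosen for brevity.

```python
def fit_centroids(samples, labels):
    """Supervised: average the labelled samples of each class."""
    sums, counts = {}, {}
    for x, label in zip(samples, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(samples, iterations=10):
    """Unsupervised: split unlabelled values into two groups (1-D k-means, k=2).

    Assumes the samples contain at least two distinct values.
    """
    low, high = min(samples), max(samples)
    for _ in range(iterations):
        group_low = [x for x in samples if abs(x - low) <= abs(x - high)]
        group_high = [x for x in samples if abs(x - low) > abs(x - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high


# Hypothetical rainfall readings (mm); labels exist only in the supervised case.
rainfall = [40.0, 55.0, 60.0, 200.0, 220.0, 250.0]
labels = ["dry", "dry", "dry", "wet", "wet", "wet"]

centroids = fit_centroids(rainfall, labels)   # learns from inputs AND outputs
predicted = predict(centroids, 70.0)
groups = two_means(rainfall)                  # sees the inputs only
```

The supervised part needs the `labels` list to learn anything, while the unsupervised part recovers the same two groups from the raw values alone.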
3.3) Approaches
As a field, machine learning is closely related to computational statistics, so having a
background in statistics helps you to better understand and apply machine learning techniques.
For those who have never studied statistics before, the definitions of correlation and
regression, the two most commonly used methods of assessing the relationship between
quantitative variables, are a good place to start. Correlation is a measure of the degree of
association between two variables that are not designated as dependent or independent.
Regression, at its basic level, is used to examine the relationship between one dependent
variable and one independent variable. Because it can be used to predict the dependent
variable when the independent variable is known, regression provides predictive capabilities.
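To make the two terms concrete, the following sketch computes a Pearson correlation and fits a simple least-squares regression line in plain Python. The rainfall and yield values are made up for the illustration, and the function names are our own.

```python
import math


def pearson_correlation(xs, ys):
    """Degree of linear association between two variables, in [-1, 1]."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: rainfall (independent) and crop yield (dependent).
rainfall = [100.0, 150.0, 200.0, 250.0]
crop_yield = [2.0, 3.0, 4.0, 5.0]

r = pearson_correlation(rainfall, crop_yield)
slope, intercept = fit_line(rainfall, crop_yield)

# Regression gives a prediction for an unseen value of the independent variable.
predicted_yield = slope * 300.0 + intercept
```

Correlation only summarizes how strongly the two variables move together; the fitted line is what allows a prediction for a new rainfall value.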
3.4) Dataset
We have considered two datasets. The first supports the recommendation of crops, and the
second supports the prediction or recommendation of fertilizer.
Good crop production or a good yield depends on various factors. The first dataset provides
the factors that are involved in the production of a crop, and with its help a crop
recommendation model can be created.
The dataset for crop recommendation has the following data fields: nitrogen, phosphorus,
potassium, soil pH, humidity, temperature, and rainfall.
• Dataset for fertilizer recommendation
Finding the right crop to grow is not enough for a good yield; we must also find which
fertilizer should be used for crop care.
Figure 2 Dataset for fertilizer prediction
Data is collected from various sources, so it may contain many missing values. The raw data
is therefore processed so that it can be used easily in different tasks, such as machine
learning models and other data science tasks.
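One common way to handle missing values can be sketched with pandas. The column names and values below are hypothetical, and mean imputation is only one possible choice; the report does not state which strategy was actually applied.

```python
import pandas as pd

# Hypothetical raw records with gaps (None) in both features and labels.
raw = pd.DataFrame({
    "N": [90, 85, None, 74],
    "P": [42, None, 55, 35],
    "K": [43, 41, 44, 40],
    "label": ["rice", "rice", "maize", None],
})

# Rows without a target label cannot be used for supervised training.
clean = raw.dropna(subset=["label"]).copy()

# Fill the remaining numeric gaps with the mean of each column.
numeric_cols = ["N", "P", "K"]
clean[numeric_cols] = clean[numeric_cols].fillna(clean[numeric_cols].mean())
```

After these two steps every remaining row has a label and a complete set of numeric features, which is what most training routines expect.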
Model Building
Model building is the process of creating a mathematical model that helps in predicting or
calculating future outcomes based on data collected in the past.
E.g.-
A retailer wants to know the default behavior of its credit card customers. It wants to
predict the probability of default for each customer in the next three months.
A customer with a healthy credit history over the last years has a low chance of default
(probability closer to 0).
Algorithm Selection
Training Model
Prediction / Scoring
Algorithm Selection
Example-
Is there a dependent (target) variable?
• Yes: Supervised Learning
• No: Unsupervised Learning
Is dependent variable continuous?
• Yes: Regression
• No: Classification
Algorithms
Logistic Regression
Decision Tree
Random Forest
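The selection flow above can be written as a small function. This is only a sketch of the decision logic; the function name and the returned strings are our own, not part of any library.

```python
def choose_learning_task(has_target, target_is_continuous=False):
    """Follow the two questions of the algorithm-selection flow."""
    if not has_target:
        # No dependent variable to predict: only structure can be discovered.
        return "unsupervised learning"
    if target_is_continuous:
        return "supervised learning: regression"
    return "supervised learning: classification"


# A crop label is a category, not a continuous number.
crop_task = choose_learning_task(has_target=True, target_is_continuous=False)
```

For crop recommendation the target is a crop name, so the flow ends at classification, which is where the listed algorithms apply.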
Training Model
Predictive Modelling
E.g.-
Types
Supervised Learning
Unsupervised learning
Clustering:
A clustering problem is where you want to discover the inherent groupings in the data, such
as grouping customers by purchasing behavior.
Association: An association rule learning problem is where you want to discover rules that
describe large portions of your data, such as people that buy X also tend to buy Y.
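The "people that buy X also tend to buy Y" idea is usually measured by the confidence of a rule. A minimal sketch, assuming a made-up list of purchase baskets and a helper name of our own:

```python
def rule_confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    It is the share of transactions containing the antecedent
    that also contain the consequent.
    """
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    with_both = [t for t in with_antecedent if consequent in t]
    return len(with_both) / len(with_antecedent)


# Hypothetical purchase baskets.
baskets = [
    {"seed", "fertilizer"},
    {"seed", "fertilizer", "pesticide"},
    {"seed"},
    {"pesticide"},
]

confidence = rule_confidence(baskets, "seed", "fertilizer")
```

Here two of the three baskets that contain seed also contain fertilizer, so the rule holds in two thirds of the relevant cases.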
i. Problem definition
v. Predictive Modelling
3.6) Algorithms of Machine Learning used
• Logistic Regression
• Naive Bayes
This algorithm assumes that all the features of the dataset are independent of each other.
It tends to work better on larger datasets. A naive Bayes classifier can be viewed as a DAG
(directed acyclic graph) in which the class node is the parent of every feature node.
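The independence assumption means the score of a class is its prior multiplied by one likelihood per feature. The sketch below shows this for categorical features in plain Python; the toy data, the function names, and the add-one smoothing constant are assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and, per class and feature position, the observed values."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts


def predict_naive_bayes(model, row, values_per_feature=2):
    """Pick the class with the highest prior * product of feature likelihoods."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # Independence assumption: one likelihood factor per feature.
            # Add-one smoothing avoids a zero factor for unseen values.
            seen = feature_counts[(label, i)][value]
            score *= (seen + 1) / (count + values_per_feature)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical samples: (rainfall level, soil type) -> crop.
rows = [("high", "acidic"), ("high", "neutral"), ("low", "neutral"),
        ("low", "acidic"), ("high", "neutral")]
labels = ["rice", "rice", "millet", "millet", "rice"]

model = train_naive_bayes(rows, labels)
prediction = predict_naive_bayes(model, ("high", "neutral"))
```

Each feature contributes its factor without looking at the other features, which is exactly what the DAG with a single parent node expresses.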
• Random forest
Random Forest has the ability to analyze crop growth in relation to the current climatic
conditions and biophysical change. The random forest algorithm creates decision trees on
different data samples, predicts the data from each subset, and then, by voting, gives a
better solution for the system. Random Forest uses the bagging method to train on the data,
which increases the accuracy of the result.
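A minimal sketch of bagging with scikit-learn follows. It runs on synthetic data generated by `make_classification`, not on the project's dataset, and the parameter values (50 trees, 7 features, the random seeds) are arbitrary choices for the illustration. Note that scikit-learn combines the trees by averaging their predicted class probabilities, which plays the role of the vote described above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a soil/climate dataset with 7 numeric features.
X, y = make_classification(n_samples=300, n_features=7, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# bootstrap=True: each tree is trained on a random sample drawn with replacement.
forest = RandomForestClassifier(n_estimators=50, bootstrap=True, random_state=0)
forest.fit(X_train, y_train)

# Accuracy of the combined trees on data not seen during training.
accuracy = forest.score(X_test, y_test)
```

The fitted trees are available as `forest.estimators_`, which makes it easy to check that the ensemble really consists of many separately trained decision trees.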
• Decision Tree
A decision tree is one of the most powerful and popular tools for classification and
prediction. A decision tree is a flowchart-like tree structure, where each internal node
denotes a test on an attribute, each branch represents an outcome of the test, and each leaf
node (terminal node) holds a class label.
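The structure described above can be shown with a hand-written tree. The attributes, thresholds, and crops in this sketch are purely illustrative; a trained model would learn them from data.

```python
# Hypothetical tree: internal nodes test an attribute, leaves hold a class label.
TREE = {
    "feature": "rainfall", "threshold": 150.0,
    "below": {"label": "millet"},
    "above": {
        "feature": "ph", "threshold": 6.0,
        "below": {"label": "tea"},
        "above": {"label": "rice"},
    },
}


def classify(node, sample):
    """Walk from the root to a leaf, following one branch per test."""
    while "label" not in node:
        # "below" means strictly less than the threshold, "above" means at or over it.
        branch = "below" if sample[node["feature"]] < node["threshold"] else "above"
        node = node[branch]
    return node["label"]


crop = classify(TREE, {"rainfall": 200.0, "ph": 6.5})
```

Every prediction corresponds to one root-to-leaf path, so the reason for a recommendation can be read directly from the tests along that path.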
• Support Vector Machine
Support Vector Machine (SVM) is a relatively simple supervised machine learning algorithm used
for classification and/or regression. It is preferred mainly for classification but is
sometimes very useful for regression as well. Basically, SVM finds a hyperplane that creates a
boundary between the classes of data. In 2-dimensional space, this hyperplane is simply a line.
In SVM, we plot each data item in the dataset in an N-dimensional space, where N is the
number of features/attributes in the data. Next, we find the optimal hyperplane to separate
the data. Inherently, then, SVM performs binary classification (i.e., it chooses between two
classes); problems with more than two classes are handled by combining several binary
classifiers.
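Once the hyperplane is known, classifying a point only requires checking which side of it the point lies on. The sketch below assumes an already-found hyperplane; the weights and bias are hypothetical, and finding the optimal ones is the training step that is not shown here.

```python
def linear_svm_predict(weights, bias, point):
    """Return +1 or -1 depending on the side of the hyperplane w.x + b = 0."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if score >= 0 else -1


# In two dimensions the hyperplane is a line; here it is x1 - x2 = 0.
weights, bias = (1.0, -1.0), 0.0

side_a = linear_svm_predict(weights, bias, (3.0, 1.0))
side_b = linear_svm_predict(weights, bias, (1.0, 3.0))
```

The two possible return values correspond to the two classes, which is why a single SVM separates exactly two classes at a time.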
Graph 5 Support vector machine
3.7) Tools and libraries used
Python:
To carry out this project in the best possible manner, we decided to use the Python
language, which comes with several pre-built libraries (such as pandas, NumPy, and SciPy)
and is loaded with numerous features for implementing data science and machine learning
techniques, which allowed us to design the model efficiently. For building this project we
used numerous Python libraries for different operations.
● Python - Python is a robust programming language with a wide range of capabilities. Its
broad features make aspect-oriented programming (including metaprogramming and metaobjects)
simple. Python uses dynamic typing and a combination of reference counting and garbage
collection for memory management. It also supports dynamic name resolution (late binding),
which binds method and variable names during program execution.
Patches to non-critical parts of CPython that would offer a marginal increase in speed at the
cost of clarity are rejected by Python developers, who try to avoid premature optimization.
When speed is crucial, a Python programmer can move time-critical functions to extension
modules written in C, or use PyPy, a just-in-time compiler.
Cython is also available; it translates Python scripts into C and makes direct C-level API
calls into the Python interpreter. Python's creators attempt to make the language as fun to
use as possible. Python's design offers some support for functional programming in the Lisp
tradition. It has filter, map, and reduce functions, as well as list comprehensions,
dictionaries, sets, and generator expressions.
Two modules in the standard library (itertools and functools) implement functional tools
borrowed from Haskell and Standard ML.
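The functional features mentioned above can be seen side by side in a few lines. The rainfall list is a made-up example; every function used comes from Python itself or its standard library.

```python
from functools import reduce
from itertools import accumulate

rainfall = [120, 80, 200, 150]  # hypothetical monthly rainfall in mm

heavy = list(filter(lambda mm: mm >= 150, rainfall))   # keep matching items
in_cm = list(map(lambda mm: mm / 10, rainfall))        # transform every item
total = reduce(lambda a, b: a + b, rainfall)           # fold the list into one value

squares = [mm ** 2 for mm in rainfall]                 # list comprehension
hundreds = {mm // 100 for mm in rainfall}              # set comprehension
light_total = sum(mm for mm in rainfall if mm < 150)   # generator expression

running = list(accumulate(rainfall))                   # running totals from itertools
```

The comprehension forms and the `filter`/`map`/`reduce` forms express the same operations; which one reads better depends on the case.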
We are using Python because it works on a wide range of platforms; Python is a
platform-independent language. Python is as simple as English. Python has many libraries and
a simple syntax similar to English, whereas Java and C++ have more complicated code. Python
programs typically contain fewer lines than programs written in other languages. That is why
we chose Python for artificial intelligence, machine learning, and dealing with massive
volumes of data. Python is an object-oriented programming language. Classes, objects,
polymorphism, encapsulation, inheritance, and abstraction are all concepts in Python.
HTML:
The webpage can be divided into multiple small sections, and each section has specific
information in it. So, when you write an HTML document, you are giving the browser a set of
instructions on how to display content on the web page.
CSS:
HTML only defines the structure of the web page, with no styling. To style the webpage
(font, color, size, and much more) we use CSS, or Cascading Style Sheets. With its help we
can style any element of our web page. To render our web page, the browser reads HTML and CSS
together.
CSS is of three types:
Inline
Internal
External
JavaScript:
JavaScript, commonly known as JS, is an open, cross-platform, interpreted programming
language.
It helps in creating frontend and backend applications using various frameworks. It not only
makes a website look good but also makes it functional.
Flask:
Flask is a small web framework written in Python. It has no database abstraction layer, form
validation, or any other components where third-party libraries already provide common
functions. However, Flask supports extensions that can add features to the app as if they
were implemented in Flask itself. There are extensions for object-relational mappers, form
validation, upload handling, various open authentication technologies, and several common
framework-related tools.
Flask gives the developer a variety of options when designing web applications: it provides
tools that help to build a website, but it does not force dependencies on the developer or
dictate what the project should look like.
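A minimal sketch of how a prediction could be served with Flask is shown below. The route name `/crop-predict`, the JSON field `rainfall`, and the `recommend_crop` rule are all hypothetical placeholders; in a real deployment the function would call a trained model instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def recommend_crop(features):
    """Placeholder for a trained model; this rule is purely illustrative."""
    return "rice" if features["rainfall"] > 150 else "millet"


@app.route("/crop-predict", methods=["POST"])
def crop_predict():
    # Read the values sent by the form or client as JSON.
    data = request.get_json()
    features = {"rainfall": float(data["rainfall"])}
    return jsonify({"crop": recommend_crop(features)})
```

The view function stays thin: it parses the input, delegates to the recommendation function, and returns the result, so the model can be replaced without touching the routing code.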
NumPy library:
NumPy is a Python library that adds support for large, multi-dimensional arrays and
matrices, as well as a large collection of mathematical functions that operate on these
arrays.
Using NumPy in Python gives functionality comparable to MATLAB, since both are interpreted
and both allow users to write fast programs as long as most operations work on arrays or
matrices rather than scalars. Along with these core features, there are several others.
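Working on whole arrays instead of scalars looks like this in practice. The soil values are hypothetical; the calls used (`np.array`, `mean` with `axis`, and broadcasting in the subtraction) are standard NumPy.

```python
import numpy as np

# Rows are samples, columns are N, P, K (hypothetical values).
soil = np.array([[90.0, 42.0, 43.0],
                 [85.0, 58.0, 41.0],
                 [60.0, 55.0, 44.0]])

# One call computes the mean of every column; no explicit loop is needed.
column_means = soil.mean(axis=0)

# Broadcasting subtracts the matching column mean from every row.
centered = soil - column_means

highest_nitrogen = soil[:, 0].max()
```

The subtraction applies to all nine values at once, which is the array-oriented style that keeps NumPy programs short and fast.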
Pandas library:
It is a software library for Python for data manipulation and analysis. It provides data
structures and functions for managing numerical tables and time series. It is free software
released under the three-clause BSD license. The name is derived from the term "panel data",
an econometrics term for data sets that include observations over multiple time periods for
the same individuals.
It offers a powerful group-by engine that allows split-apply-combine operations on data
sets, as well as high-performance merging and joining of data.
Hierarchical indexing provides an intuitive way of working with high-dimensional data in a
lower-dimensional data structure.
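Grouping and hierarchical indexing can be sketched on a small table. The crops, seasons, and nitrogen values are invented for the example.

```python
import pandas as pd

# Hypothetical nitrogen readings per crop and season.
readings = pd.DataFrame({
    "crop": ["rice", "rice", "maize", "maize"],
    "season": ["kharif", "rabi", "kharif", "rabi"],
    "N": [90, 80, 70, 60],
})

# Split by crop, apply the mean, combine into one Series.
mean_n = readings.groupby("crop")["N"].mean()

# Hierarchical (two-level) index: crop first, then season.
indexed = readings.set_index(["crop", "season"]).sort_index()
rice_kharif_n = indexed.loc[("rice", "kharif"), "N"]
```

The two-level index stores what is conceptually a crop-by-season grid in an ordinary two-dimensional table.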
Matplotlib:
John Hunter and many others built the matplotlib Python library to create graphs, charts, and
high-quality figures. The library is extensive and is capable of changing very fine details
of a figure. Some of the key concepts and functions in matplotlib are:
Figure and axes
The entire illustration is called a figure, and every plot on it is an axes. The figure can
be considered as a way to draw multiple plots.
Plotting
Data is the first thing needed to draw a graph. A dictionary can be declared with keys and
values serving as the x and y values. Next, scatter(), bar(), and pie(), along with a host of
other functions, can be used to create a plot.
Axis
Adjustments are possible using the figure and axes obtained from subplots(). The set()
function is used to adjust x-axis and y-axis properties.
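The three concepts can be combined in a short sketch. The crop counts are hypothetical, and the `Agg` backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is required
import matplotlib.pyplot as plt

# Dictionary keys serve as x values and dictionary values as y values.
samples_per_crop = {"rice": 120, "maize": 80, "millet": 45}  # hypothetical counts

# subplots() returns the figure and one axes to draw on.
fig, ax = plt.subplots()
bars = ax.bar(list(samples_per_crop.keys()), list(samples_per_crop.values()))

# set() adjusts several axis properties in one call.
ax.set(xlabel="Crop", ylabel="Samples", title="Samples per crop")
```

Calling `fig.savefig` with a file name would write the chart to disk.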
Scikit learn:
Scikit-learn is one of the best-known Python machine learning libraries. The sklearn library
contains many practical tools for machine learning and statistical modeling, including
classification, regression, clustering, and dimensionality reduction. Scikit-learn is used to
build machine learning models; it should not be used for reading, manipulating, or
summarizing data.
Scikit-learn comes with many features. Some of them are listed here to help explain its
breadth:
• Supervised learning algorithms: Almost any supervised learning algorithm you may have
heard of is likely to be part of scikit-learn. From generalized linear models to SVM and
decision trees, all are in the scikit-learn toolbox. This wide range of machine learning
algorithms is one of the main reasons for its high level of use. I started using scikit-learn
to solve supervised learning problems, and I would recommend the same to people who are new
to scikit-learn or machine learning.
• Unsupervised learning algorithms: There is also a wide variety of unsupervised algorithms,
ranging from clustering, factor analysis, and principal component analysis to unsupervised
neural networks.
• Cross-validation: sklearn offers a variety of methods for checking the accuracy of
supervised models on unseen data.
• Various toy datasets: These were useful when learning scikit-learn. I had learned SAS with
different educational data sets, and having them at hand helped a lot when learning a new
library.
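Two of these features, toy datasets and cross-validation, can be tried together in a few lines. The sketch uses the iris dataset bundled with scikit-learn and a decision tree as an arbitrary choice of model; it is not the project's evaluation code.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# A bundled toy dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

model = DecisionTreeClassifier(random_state=0)

# 5-fold cross-validation: train on four folds, score on the held-out fold.
scores = cross_val_score(model, X, y, cv=5)
mean_accuracy = scores.mean()
```

Each of the five scores is measured on data the model did not see during that round of training, which gives a more honest estimate than scoring on the training data.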
Chapter-4 PERFORMANCE ANALYSIS
The data used in this project was made by augmenting and combining India’s publicly available
data sets on weather, soil, etc. This data is relatively simple, with very few but useful
features, unlike the complex factors that affect crop yields.
The data contains nitrogen, phosphorus, potassium, and soil pH values. It also contains the
humidity, temperature, and rainfall required for a particular crop.
• Logistic regression
• Naive Bayes
• Random Forest
• Decision tree
• SVM (Support vector machine)
4.3) Accuracy Comparison of Algorithms
4.4) Output
Home page
A home page is generally the first page a visitor sees when visiting a website, and it also
works as a navigation page for reaching the other pages of the website. Therefore, a
good-looking home page design is essential for a website.
Figure 9 Home page
Crop recommendation system
It recommends crops that should be planted by farmers based on a number of criteria and helps
them make an informed decision before planting.
Figure 11 Crop recommendation system with input
Figure 12 Crop recommendation system giving output
Fertilizer recommendation system
In the fertilizer recommendation system, the user can input the soil data (such as the
amounts of nitrogen, phosphorus, and potassium) and the crop to be grown, and the application
will predict what is best for the soil to maximize the crop yield and will recommend
improvements.
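One simple way to turn soil readings into a suggestion is to compare them with target values for the chosen crop. The sketch below is an assumption about how such a rule could look, not the project's actual logic: the `IDEAL_NPK` table, its numbers, and the wording of the messages are all hypothetical.

```python
# Hypothetical target N-P-K values per crop; real values would come from
# the fertilizer dataset.
IDEAL_NPK = {
    "rice": {"N": 80, "P": 40, "K": 40},
    "maize": {"N": 80, "P": 40, "K": 20},
}


def fertilizer_advice(crop, soil):
    """Report the nutrient that deviates most from the crop's target value."""
    ideal = IDEAL_NPK[crop]
    nutrient = max(ideal, key=lambda name: abs(ideal[name] - soil[name]))
    gap = ideal[nutrient] - soil[nutrient]
    if gap > 0:
        return f"{nutrient} is low"
    if gap < 0:
        return f"{nutrient} is high"
    return "soil is balanced"


advice = fertilizer_advice("rice", {"N": 50, "P": 42, "K": 38})
```

Reporting only the largest deviation keeps the message short; a fuller version could list every nutrient that is outside an acceptable range.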
Figure 14 Fertilizer recommendation system with input
Chapter-5 CONCLUSIONS
5.1) Conclusions
In this project we try to get the best crop and fertilizer recommendation with the help of
machine learning. Several machine learning techniques were applied and their accuracy was
calculated. Numerous algorithms were used on the datasets to get the best output, which leads
to the best crop and fertilizer recommendation for the particular soil of a particular region.
This system will help farmers to visualize crop yields based on climatic and subsistence
boundaries.
Using it, a farmer can decide whether to plant a crop or to look for another crop if the
yield forecast is unfavorable.
This tool can help the farmer make the best decisions when it comes to growing a crop. It may
also predict negative effects on the plant.
Currently our farmers use outdated technology or do not use technology effectively, so there
is a chance of choosing the wrong crop to cultivate, which reduces the profit from production.
To reduce these losses we try to create a farmer-friendly system that helps in predicting
which crop is best for a specific soil. The project also gives recommendations about the
fertilizer needed by the soil, the seeds needed for cultivation, the expected yield, and the
market price. This enables farmers to make the right choice of crop so that the agricultural
sector can develop with new ideas.
5.2) Future Scope
In upcoming updates to this project, we can use deep learning techniques for plant disease
prediction with the help of images, and we can also implement IoT techniques for getting the
contents of the soil directly from the fields.
• Current market conditions and analysis, for information on crop market rates, production
costs, and fertilizer.
• A mobile app can be developed to assist farmers with uploading farm photos.
• Plant disease detection using image processing, in which the user finds pesticides based
on pictures of the disease.
REFERENCES
[4] Priya, P., Muthaiah, U., Balamurugan, M., “Predicting Yield of the Crop Using Machine
Learning Algorithm”, 2015
[5] Mishra, S., Mishra, D., Santra, G. H., “Applications of machine learning techniques in
agricultural crop production”, 2016
[6] Ramesh Medar, Vijay S, Shweta, “Crop Yield Prediction using Machine Learning
Techniques”, 2019
[7] https://www.data.gov.in
[9] https://en.wikipedia.org/wiki/Agriculture
[10] https://www.ibm.com/weather
[11] https://openweathermap.org
[12] https://builtin.com/data-science/random-forest-algorithm
[13] https://tutorialspoint/machine-learning/logistic-regression
[14] http://scikit-learn.org/modules/naive-bayes