CROP & FERTILIZER RECOMANDATION SYSTEM USING ML

Uploaded by

malik cp

CROP & FERTILIZER RECOMANDATION SYSTEM

USING ML

Project report submitted in partial fulfillment of the requirement for


the award of the Degree of Master of Computer Applications

By

ASHUTOSH RAWAT 2201406020


ANKUR UPADHYAY 2201406087
ANKIT YADAV 2201406013
ROHIT KUMAR 2201406093
GAURAV KUMAR 2201406089

Under the guidance


of
Dr. ASHISH SAINI
Assistant Professor , Department of Computer Science & Engineering

Department of Computer Application


Quantum School of Technology
Quantum University, ROORKEE

2024
CERTIFICATE

This is to certify that the project report entitled “CROP & FERTILIZER
RECOMANDATION SYSTEM USING ML”, being submitted by ASHUTOSH
RAWAT, ANKUR UPADHYAY, ANKIT YADAV, ROHIT KUMAR and GAURAV
KUMAR in partial fulfillment for the award of the Degree of Master of Computer
Applications, is a record of bonafide work carried out by them under my guidance and
supervision.

(Name & Signature) (Name & Signature)

Dr. Ashish Saini Program Officer, Computer Application
DECLARATION

I hereby declare that the project entitled “CROP & FERTILIZER
RECOMANDATION SYSTEM USING ML”, submitted for the MCA
Degree in Computer Application, is my original work and that the project has
not formed the basis of, or been submitted for, the award of any degree,
diploma, or other similar title in any other college / institute /
university.

Name:

Signature:

Place:

Date:
ACKNOWLEDGEMENT

At the very outset of this report, we would like to extend our sincere and
heartfelt gratitude to all the people who have helped us in this
endeavor. Without their active guidance, help, cooperation and
encouragement, we would not have made headway in the project. We are
extremely thankful to our guide, Dr. Ashish Saini,
for valuable guidance and support in completing this project in its
present form. We extend our gratitude to Quantum University for giving us
this opportunity. We also acknowledge, with a deep sense of reverence,
our gratitude towards our parents and the members of our families, who have
always supported us morally as well as economically.
Last but not least, our gratitude goes to all our friends who directly
or indirectly helped us to complete this project report.

Thanking You
TABLE OF CONTENTS

CERTIFICATE OF THE SUPERVISOR
DECLARATION
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES
ABSTRACT

1. Introduction
1.1 Overview
1.2 What is crop fertilization and recommendation system
1.3 Importance of crop fertilization and recommendation system
2. Literature Review
3. Methodology
3.1 Data Pre-processing
3.2 Splitting dataset into testing and training data sets
3.3 Random Forest
3.4 Support Vector Machine (SVM)
3.5 Gradient Descent
3.6 Long Short-Term Memory (LSTM)
3.7 Logistic Regression
3.8 Decision Tree
3.9 KNN (K-Nearest Neighbors)
3.10 Linear Regression
3.11 Naive Bayes Classifiers
3.12 K-Means Clustering
4. Implementation
4.1 Crop_app.py
4.2 Home_1.html
4.3 index.html
4.4 Prediction.html
4.5 First.css
4.6 My home.css
4.7 Prediction_css.css
5. Conclusion
6. Future Work
7. References
TABLE OF FIGURES

Figure No.   Title
01           Working of Crop Fertilization System
02           Yield Forecast
03           Data Preprocessing
04           Splitting Dataset
05           Working of Random Forest
06           Working of SVM
07           Working of Gradient Descent
08           Working of LSTM
09           Working of Decision Tree
10           KNN Algorithm Working Visualization
11           Linear Regression
12           Working of K-Means
ABSTRACT

India is an agriculture-based economy in which agriculture and its
allied sectors account for 20% of total GDP. Agriculture therefore
plays an important role in the Indian economy, which depends largely
on crop production. The application (Smart Farm) developed in this
research helps users predict crop yield using different climatic
parameters. Deep learning has been applied to the crop yield
prediction problem; however, a systematic analysis of the studies is
lacking. Therefore, this study aims to provide an overview of the
state-of-the-art application of deep learning in crop yield prediction.
Machine Learning (ML), with its power to make predictions about complex
systems, may remove this barrier to the development of locally based N
recommendations. A machine learning approach that suggests a suitable
crop based on soil parameters can help farmers cultivate crops
accordingly and produce more yield. In this paper, a Random Forest
classifier is used to train the machine learning model on a soil
dataset using Python. Model performance is evaluated using a
confusion matrix and a classification report containing precision,
recall and F1 score. The model achieves 99% accuracy without parameter
tuning. This paper shows the best way of selecting crops and predicting
yield with minimum cost and effort. Artificial Neural Networks are
considered robust tools for modelling and prediction. We observed that
the Convolutional Neural Network (CNN) is the most common algorithm
and has the best performance in terms of Root Mean Square Error
(RMSE). This paper explores various ML techniques used in the field of
crop yield estimation and provides an in-depth analysis of their
accuracy.

Keywords: Agriculture, Artificial Neural Network, Convolutional
Neural Network, Crop yield prediction, Machine learning method.
1 INTRODUCTION

Agriculture is extremely important to the global economy.
Understanding global crop yield is critical for resolving food security
issues and mitigating the effects of climate change as the human
population continues to grow. Crop yield forecasting is a significant
agricultural problem. Weather conditions (rain, temperature, etc.) and
pesticides have a great impact on agricultural yield. The government
supports mostly rice and wheat, with a few other crops thrown in, through
measures such as the “minimum support price”. Funding crops in this way
is not a strong and healthy system. Agriculture is the broadest economic
sector and plays a significant role in the overall socio-economic fabric
of India. It contributes to total GDP and employs over half of the
population. Many researchers aim to improve agricultural planning and
crop productivity while maintaining good crop quality. Our main goal is
therefore to get the maximum yield of crops. Many machine learning
classification techniques are available for obtaining a good-quality
crop yield. Machine learning algorithms are grouped into supervised,
unsupervised, and reinforcement learning; each has its own importance
and limitations. In supervised learning, the algorithm builds a
mathematical model from a set of data that contains both the inputs and
the desired outputs. In unsupervised learning, the algorithm builds a
mathematical model from a set of data that contains only inputs and no
desired output labels. In semi-supervised learning, the algorithm
develops mathematical models from incomplete training data, where a
portion of the sample inputs do not have labels. Although deep learning
algorithms can provide better performance, the challenges of using deep
learning techniques for crop yield prediction are not well covered in
the literature. Both the performance and the challenges depend on the
crop type, the kind of data, the sources, and the implementation
framework. In this study, we perform a Systematic Literature
Review (SLR) to get an overview of the literature on these topics. The
study aims to implement the Random Forest classifier in Python on
a dataset containing 22 varieties of crops. The model’s performance
is calculated under two criteria: entropy and Gini index. The aim
of the model is to suggest, more accurately, a crop for cultivation
given the particular soil type and climatic conditions. Fertilization is
one of the essential links in agricultural production, and it is a
necessary means to supplement soil nutrients and improve crop yield and
quality [1]. About 30-50% of crop yield increases are attributed to
chemical fertilizers [2]. Studies have shown that China’s chemical
fertilizer use in the past was at the highest level among major
countries worldwide, and the intensity of use is increasing yearly
[3,4]. Meanwhile, excessive fertilization and unreasonable fertilization
structures still exist in China [5]. However, there is no simple linear
relationship between the amount of fertilizer applied and the derived
economic benefits of crop planting.

Fig 1 - Working of Crop Fertilization System

Our observations regarding this study are beneficial not only for
researchers in this field but also for practitioners who would like to
develop novel crop yield prediction models for their own usage. For
researchers in this field, the challenges are important since they will
be aware of these issues before they develop their own models. For
practitioners, the development of new crop yield prediction models
involves several challenging steps that are addressed in this SLR
paper. For instance, the selection of model parameters and the
algorithms require critical thinking using the literature.

1.1 OVERVIEW

Crop Recommendation Systems (CRS) are computer-based tools that
help farmers make informed decisions about which crops to plant
based on factors such as soil type, weather patterns, and historical
crop yields. CRS can optimize crop yields while minimizing resource
usage such as water, fertilizer, and pesticides. Machine learning
models, such as decision trees, support vector machines, and neural
networks, are commonly used in CRS, but these models are often
considered "black boxes" with limited transparency and
interpretability, which can reduce trust in the system.

1.2 What is Crop Fertilization and recommendation system?

In the Indian economy, agriculture plays a major role, since more
than half of the population depends on agriculture for their income.
Growing crops for thousands of years on the same land without
replenishing it has led to nutrient depletion. Soil nutrients
directly affect the growth of a crop and its production. Plant diseases
can also be caused by insufficient levels of soil nutrients. Conversely,
applying an excessive amount of fertilizer may also harm crop
development. The soil's nutrient content also changes as the wet season
gives way to the dry season. It has been observed that by applying
fertilizer at the recommended dose, calculated from soil test values,
farmers can harvest approximately 8–21% more yield of various types of
crops compared with their usual practice. Our project aims to find a
suitable fertilizer for the given crop based on parameters such as
moisture, temperature, and nitrogen, potassium, and phosphorus
levels. Along with the fertilizer recommendation, the farmer will
also be shown a few fertilizer shops in the nearby area.

Fig 2 - Yield Forecast


1.3 Importance of Crop Fertilization and
recommendation system.

Nutrient Supply:

• Essential Nutrients: Fertilization provides essential nutrients like Nitrogen,
  Phosphorus, and Potassium, which are crucial for plant growth.
• Micronutrients: It also supplies micronutrients such as Zinc, Copper, and
  Manganese, vital for plant health and development.

Improved Yield:

• Enhanced Productivity: Proper fertilization significantly increases crop yields,
  ensuring higher productivity per unit area.
• Quality Improvement: It improves the quality of the produce, enhancing attributes
  such as size, color, and nutritional content.

Soil Health:

• Soil Fertility: Regular application of fertilizers maintains soil fertility by replenishing
  lost nutrients.
• Microbial Activity: It promotes beneficial microbial activity in the soil, which aids in
  nutrient cycling and soil structure improvement.

Economic Benefits:

• Increased Income: Higher yields translate to increased income for farmers.
• Cost-Effectiveness: Efficient use of fertilizers can reduce the cost per unit of
  produce, making farming more profitable.

Sustainable Agriculture:

• Balanced Use: Proper fertilization practices ensure the balanced use of nutrients,
  preventing soil degradation and promoting sustainable farming practices.
• Environmental Protection: It reduces the risk of nutrient runoff and leaching,
  protecting water bodies from contamination.
2. LITERATURE REVIEW:-

[1] In this paper, the SVM method is used to classify crop data and a
CNN is used to reduce the relative error. With these methods, crop
yield losses are reduced irrespective of environmental disturbances.

[2] In this paper, K-means clustering is used to create clusters, the
Apriori algorithm is used to count how frequently a crop occurs for a
specific location, and the Naive Bayes algorithm is used to find the
exact crop.

[3] In this paper, a Decision Tree, which uses a greedy methodology, and
the Random Forest algorithm are used to predict the best crop. This
helps farmers decide which crop to cultivate in the field.

[4] In this paper, the K-nearest neighbors algorithm, Naïve Bayes and a
Decision Tree are used to predict the crop yield. This helps farmers
identify the yield of crops in different soil and atmospheric conditions.

[5] In this paper, J48 and IBK are used for classification, LWL is used
to assign instance weights, and the LAD tree is used to classify based
on a binary target value. This is useful to farmers for early
prediction and decision making.

[6] In this paper, the Naïve Bayes and KNN algorithms are used in
order to achieve maximum crop yield. The accuracy of the yield
prediction can also be checked by comparing the different methods.

[7] In this paper, LSTM and Simple RNN methods are used to predict
temperature and rainfall. The authors finally found that the Random
Forest Regressor gives higher accuracy.

[8] In this paper, Feed Forward Neural Network (FNN) and Recurrent
Neural Network (RNN) techniques are used. Comparing the FNN and RNN on
the basis of loss, the RNN has a lower error rate and is therefore
better for crop yield prediction.

[9] In this paper, the authors developed a user-friendly webpage, and
the accuracy of the predictions made by the Random Forest algorithm is
above 75 percent.

[10] In this proposed work, a Hadoop framework based on the Random
Forest algorithm is described that works faster.
3. METHODOLOGY:-

3.1. Data preprocessing: Data preprocessing is a technique for
transforming raw data into a clean data set. Whenever data is gathered
from various sources, it arrives in a raw form that cannot be analyzed
directly by machine learning or deep learning methods.

Fig 3 – Data Preprocessing

3.2. Splitting Dataset into Testing and Training Sets:

The final step of data preprocessing is splitting the data into training
and testing sets. The train-test-split method has been used to split the
data, with the test set being 2% of the total dataset and the random
state set to 71.
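
As a rough illustration of the split described above, the following sketch shuffles a list of rows with a fixed seed and holds out a fraction of them for testing. It uses only the Python standard library and is not the report's actual code; the function name `train_test_split_sketch`, the placeholder rows, and the rule of keeping at least one test row are assumptions made for this example.

```python
import random

def train_test_split_sketch(rows, test_fraction=0.02, seed=71):
    """Shuffle rows reproducibly, then split them into train and test lists."""
    shuffled = list(rows)                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)      # fixed seed gives a repeatable split
    n_test = max(1, round(len(shuffled) * test_fraction))  # assumed: at least 1 test row
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))                        # placeholder for 100 dataset rows
train, test = train_test_split_sketch(rows)
print(len(train), len(test))
```

With 100 placeholder rows and a 2% test fraction, two rows are held out and 98 remain for training; re-running with the same seed gives the same split.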

Fig 4 – Splitting Dataset

3.3. Random Forest: Random forest is a very popular machine
learning algorithm that can be applied to both classification and
regression problems. The algorithm is based on ensemble learning,
which combines several classifiers to solve a complex problem and
improve the precision and performance of the applied model. Random
Forest is a classifier that trains numerous decision trees on subsets
of a dataset and combines their outputs (a majority vote for
classification, an average for regression) to increase prediction
accuracy.
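
The aggregation step of a random forest can be sketched on its own. The example below assumes the individual trees have already produced their predictions and only shows how those predictions are combined; the crop labels and numbers are made-up values for illustration, and training the trees is not covered.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Classification: return the label predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

def average(tree_predictions):
    """Regression: return the mean of the per-tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)

# Hypothetical outputs of five trees for one sample
votes = ["rice", "maize", "rice", "rice", "jute"]
print(majority_vote(votes))        # label chosen by the ensemble
print(average([2.0, 3.0, 4.0]))    # ensemble output for a regression task
```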

Fig 5 Working of Random Forest

3.4. Support Vector Machine (SVM): The goal of the support
vector machine algorithm is to find a hyperplane in N-dimensional
space (N is the number of features in the dataset) that distinctly
classifies the data points. To separate the two classes of data points,
many possible hyperplanes could be chosen; the algorithm looks for the
one with the maximum margin, that is, the greatest distance to the
nearest data points of both classes. Maximizing the margin distance
provides some reinforcement so that future data points can be
classified with more confidence.
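
To make the role of the hyperplane concrete, the sketch below classifies points with a linear decision function and measures their distance to the hyperplane. It assumes the hyperplane (weights `w` and bias `b`) has already been found; the values used here are hypothetical and the training step that maximizes the margin is not shown.

```python
import math

def decision_value(w, b, x):
    """Signed value of w . x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def distance_to_hyperplane(w, b, x):
    """Geometric distance from x to the hyperplane w . x + b = 0."""
    return abs(decision_value(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))

w, b = [1.0, 1.0], -3.0            # hypothetical hyperplane x1 + x2 = 3
print(classify(w, b, [3.0, 2.0]))  # point on the positive side
print(classify(w, b, [0.0, 1.0]))  # point on the negative side
print(distance_to_hyperplane(w, b, [3.0, 2.0]))
```

The margin is the smallest such distance over the training points, which is the quantity an SVM tries to make as large as possible.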

Fig 6 – Working of SVM

3.5. Gradient Descent: Gradient descent is a well-known
optimization algorithm that is frequently used in machine learning and
deep learning. It finds the coefficients that minimize the cost function
as far as possible by locating a local minimum of a differentiable
function. Gradient descent begins with initial parameter values and
then uses calculus to iteratively adjust the values so that they
minimize the given cost function. The greater the gradient, the steeper
the slope, and thus the larger the step the model takes at that
iteration.
The following equation describes what the algorithm does:

𝑏 = 𝑎 − 𝛾∇𝑓(𝑎)
where:
b = next position
a = current position
(-) = the negative sign denotes the minimization component of
gradient descent
𝛾 = gamma, the weighting factor (learning rate) that scales the
gradient term
∇f(a) = the gradient at a, which points in the direction of steepest
increase; its negative, −∇f(a), is the direction of the sharpest decrease.
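
The update rule b = a − γ∇f(a) can be tried on a simple one-dimensional function. The sketch below is a minimal illustration; the example function f(a) = (a − 3)², the starting point, the value of gamma and the number of steps are all assumptions chosen for this example.

```python
def gradient_descent(grad, start, gamma=0.1, steps=100):
    """Repeatedly apply b = a - gamma * grad(a) and return the final position."""
    a = start
    for _ in range(steps):
        a = a - gamma * grad(a)    # move against the gradient
    return a

# Example: f(a) = (a - 3)^2 has gradient 2 * (a - 3) and its minimum at a = 3
minimum = gradient_descent(lambda a: 2 * (a - 3), start=0.0)
print(round(minimum, 4))
```

With these settings each step shrinks the distance to the minimum by a constant factor, so the result ends up very close to 3.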

Fig 7 – Working of Gradient Descent


3.6. Long Short-Term Memory (LSTM): Long short-term
memory (LSTM) is a type of recurrent neural network (RNN) that is
capable of learning long-term dependencies. An RNN is recurrent in
nature, as the name implies: it performs the same function for every
data input, while the output for the current input depends
significantly on prior calculations. It can process not only single
data points, for example images, but also whole sequences of data, for
example speech or video.
LSTM helps resolve one of the major problems faced by any RNN,
which is the vanishing gradient problem. The model is trained by
back-propagation.

Fig 8 – Working Of LSTM


3.7 Logistic Regression

Logistic regression is used for binary classification. It uses the
sigmoid function, which takes the independent variables as input and
produces a probability value between 0 and 1.
For example, with two classes, Class 0 and Class 1, if the value of
the logistic function for an input is greater than 0.5 (the threshold
value), the input belongs to Class 1; otherwise it belongs to Class 0.
It is referred to as regression because it is an extension of linear
regression, but it is mainly used for classification problems.

Key Points:
• Logistic regression predicts the output of a categorical dependent
  variable. Therefore, the outcome must be a categorical or discrete
  value.
• It can be either Yes or No, 0 or 1, True or False, etc., but instead
  of giving the exact value as 0 or 1, it gives probabilistic values
  which lie between 0 and 1.
• In logistic regression, instead of fitting a regression line, we fit
  an “S”-shaped logistic function, whose output approaches two limiting
  values (0 and 1).

Logistic Function – Sigmoid Function

• The sigmoid function is a mathematical function used to map the
  predicted values to probabilities.
• It maps any real value into another value within the range of 0 to
  1. The value of the logistic regression must be between 0 and 1 and
  cannot go beyond this limit, so it forms a curve like the “S”
  form.
• The S-form curve is called the sigmoid function or the logistic
  function.
• In logistic regression, we use the concept of the threshold value,
  which decides between the outcomes 0 and 1: values above the
  threshold are mapped to 1, and values below the threshold are mapped
  to 0.

Types of Logistic Regression


On the basis of the categories, Logistic Regression can be classified
into three types:
1. Binomial: In binomial Logistic regression, there can be only two
possible types of the dependent variables, such as 0 or 1, Pass or
Fail, etc.
2. Multinomial: In multinomial Logistic regression, there can be 3 or
more possible unordered types of the dependent variable, such as
“cat”, “dogs”, or “sheep”
3. Ordinal: In ordinal Logistic regression, there can be 3 or more
possible ordered types of dependent variables, such as “low”,
“Medium”, or “High”.

Sigmoid Function
Now we use the sigmoid function, where the input is z, to find a
probability between 0 and 1, i.e. the predicted y.

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
Sigmoid function

As shown above, the sigmoid function converts continuous variable
data into a probability, i.e. a value between 0 and 1.
• $\sigma(z)$ tends towards 1 as $z \to \infty$
• $\sigma(z)$ tends towards 0 as $z \to -\infty$
• $\sigma(z)$ is always bounded between 0 and 1

where the probability of belonging to a class can be measured as:

$$P(y=1) = \sigma(z), \qquad P(y=0) = 1 - \sigma(z)$$
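
The sigmoid mapping and the 0.5 threshold rule described in this section can be sketched in a few lines. This is a minimal illustration, not the report's code; the function names and sample inputs are assumptions, and very large negative inputs would need extra care to avoid overflow in `math.exp`.

```python
import math

def sigmoid(z):
    """Map any real z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(z, threshold=0.5):
    """Return Class 1 if P(y=1) = sigmoid(z) exceeds the threshold, else Class 0."""
    return 1 if sigmoid(z) > threshold else 0

for z in (-2.0, 0.0, 2.0):
    p1 = sigmoid(z)                # P(y = 1)
    p0 = 1.0 - p1                  # P(y = 0)
    print(z, round(p1, 3), round(p0, 3), predict_class(z))
```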
3.8 Decision Tree in Machine Learning
A decision tree is a type of supervised learning algorithm that is
commonly used in machine learning to model and predict outcomes
based on input data. It is a tree-like structure where each internal
node tests an attribute, each branch corresponds to an attribute value
and each leaf node represents the final decision or prediction.
Decision trees can be used to solve both regression and classification
problems.

Decision Tree Terminologies

There are specialized terms associated with decision trees that denote
various components and facets of the tree structure and decision-making
procedure:
• Root Node: The topmost node of a decision tree, which represents the
  original choice or feature from which the tree branches.
• Internal Nodes (Decision Nodes): Nodes in the tree whose choices
  are determined by the values of particular attributes. These nodes
  have branches that lead to other nodes.
• Leaf Nodes (Terminal Nodes): The ends of the branches, where
  choices or predictions are made. Leaf nodes have no further
  branches.
• Branches (Edges): Links between nodes that show how decisions
  are made in response to particular conditions.
• Splitting: The process of dividing a node into two or more
  sub-nodes based on a decision criterion. It involves selecting a
  feature and a threshold to create subsets of data.
• Parent Node: A node that is split into child nodes; the original
  node from which a split originates.
• Child Node: Nodes created as a result of a split from a parent node.
• Decision Criterion: The rule or condition used to determine how
  the data should be split at a decision node. It involves comparing
  feature values against a threshold.
• Pruning: The process of removing branches or nodes from a
  decision tree to improve its generalisation and prevent overfitting.
Understanding these terminologies is crucial for interpreting and
working with decision trees in machine learning applications.
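
Two common measures for scoring a candidate split are the Gini index and entropy, the two criteria named in the introduction. The sketch below computes both for a list of class labels; it only illustrates the impurity measures, not a full tree-building procedure, and the sample labels are made up.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())

mixed = ["rice", "rice", "maize", "maize"]   # evenly mixed node
pure = ["rice", "rice", "rice"]              # node with a single class
print(gini(mixed), entropy(mixed))
print(gini(pure), entropy(pure))
```

A node containing a single class scores zero on both measures, while an evenly mixed two-class node scores the maximum; a split is preferred when it lowers the impurity of the child nodes.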

Fig 9 – Working of Decision Tree

3.9 KNN (K-Nearest Neighbors)

The K-Nearest Neighbors (KNN) algorithm is a supervised machine
learning method employed to tackle classification and regression
problems. Evelyn Fix and Joseph Hodges developed this algorithm in
1951, and it was subsequently expanded by Thomas Cover. This
section explores the fundamentals and workings of the KNN algorithm.
KNN is one of the most basic yet essential classification algorithms
in machine learning. It belongs to the supervised learning domain
and is widely applied in pattern recognition, data mining, and
intrusion detection.
It is widely applicable in real-life scenarios since it is
non-parametric, meaning it does not make any underlying assumptions
about the distribution of data (as opposed to other algorithms such as
GMM, which assume a Gaussian distribution of the given data). We
are given some prior data (also called training data), which classifies
coordinates into groups identified by an attribute.
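
A small sketch of KNN classification follows: it finds the k training points closest to a query (by Euclidean distance) and returns the most common label among them. The training points, labels and the choice k = 3 are hypothetical values chosen for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (features, label) pairs; returns the majority label
    among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical training data: two groups of 2-D coordinates
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 6.5), "B"), ((6.5, 7.0), "B"),
]
print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.2)))
```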

Fig 10 - KNN Algorithm working visualization


3.10 Linear Regression

Machine learning is a branch of artificial intelligence that focuses
on the development of algorithms and statistical models that can
learn from and make predictions on data. Linear regression is a
supervised machine-learning algorithm that learns from labelled
datasets and maps the data points to the best-fitting linear function,
which can then be used for prediction on new datasets.
First, we should know what a supervised machine learning
algorithm is. It is a type of machine learning where the algorithm
learns from labelled data, meaning a dataset whose target values are
already known. Supervised learning has two types:
• Classification: It predicts the class of the data based on the
  independent input variables. A class is a categorical or discrete
  value, for example whether the image of an animal shows a cat or a dog.
• Regression: It predicts continuous output variables based on
  the independent input variables, for example the prediction of house
  prices based on parameters like house age, distance from the
  main road, location, area, etc.

What is Linear Regression?


Linear regression is a type of supervised machine learning algorithm
that computes the linear relationship between the dependent variable
and one or more independent features by fitting a linear equation to
observed data.
When there is only one independent feature, it is known as Simple
Linear Regression, and when there are more than one feature, it is
known as Multiple Linear Regression.
Similarly, when there is only one dependent variable, it is
considered Univariate Linear Regression, while when there are more
than one dependent variables, it is known as Multivariate Regression.

What is the best Fit Line?

Our primary objective while using linear regression is to locate the


best-fit line, which implies that the error between the predicted and
actual values should be kept to a minimum. There will be the least
error in the best-fit line.
The best Fit Line equation provides a straight line that represents the
relationship between the dependent and independent variables. The
slope of the line indicates how much the dependent variable changes
for a unit change in the independent variable(s).
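
For simple linear regression (one independent feature), the best-fit line under the least-squares criterion has a closed-form slope and intercept. The sketch below computes them directly; the sample data points are made up and chosen to lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]                  # hypothetical data on the line y = 2x + 1
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
print(slope * 5 + intercept)       # prediction for a new input x = 5
```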

Fig 11 - Linear Regression


3.11 Naive Bayes Classifiers

Naive Bayes classifiers are a collection of classification algorithms


based on Bayes’ Theorem. It is not a single algorithm but a family of
algorithms where all of them share a common principle, i.e. every pair
of features being classified is independent of each other. To start with,
let us consider a dataset.
One of the simplest and most effective classification algorithms, the
Naïve Bayes classifier aids in the rapid development of machine
learning models with rapid prediction capabilities.
The Naïve Bayes algorithm is used for classification problems. It is
widely used in text classification, where the data has high
dimensionality (as each word represents one feature in the data). It is
used in spam filtering, sentiment detection, rating classification, etc.
The advantage of Naïve Bayes is its speed: it is fast, and making
predictions is easy even with high-dimensional data.

Why it is Called Naive Bayes?

The “Naive” part of the name indicates the simplifying assumption


made by the Naïve Bayes classifier. The classifier assumes that the
features used to describe an observation are conditionally
independent, given the class label. The “Bayes” part of the name
refers to Reverend Thomas Bayes, an 18th-century statistician and
theologian who formulated Bayes’ theorem.
Consider a fictional dataset that describes the weather conditions for
playing a game of golf. Given the weather conditions, each tuple
classifies the conditions as fit (“Yes”) or unfit (“No”) for playing
golf. Here is a tabular representation of our dataset.

     Outlook   Temperature  Humidity  Windy  Play Golf
0    Rainy     Hot          High      False  No
1    Rainy     Hot          High      True   No
2    Overcast  Hot          High      False  Yes
3    Sunny     Mild         High      False  Yes
4    Sunny     Cool         Normal    False  Yes
5    Sunny     Cool         Normal    True   No
6    Overcast  Cool         Normal    True   Yes
7    Rainy     Mild         High      False  No
8    Rainy     Cool         Normal    False  Yes
9    Sunny     Mild         Normal    False  Yes
10   Rainy     Mild         Normal    True   Yes
11   Overcast  Mild         High      True   Yes
12   Overcast  Hot          Normal    False  Yes
13   Sunny     Mild         High      True   No

The dataset is divided into two parts, namely, the feature matrix and
the response vector.

• The feature matrix contains all the vectors (rows) of the dataset, in
  which each vector consists of the values of the dependent features. In
  the above dataset, the features are ‘Outlook’, ‘Temperature’,
  ‘Humidity’ and ‘Windy’.
• The response vector contains the value of the class variable
  (prediction or output) for each row of the feature matrix. In the
  above dataset, the class variable name is ‘Play golf’.
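
The golf dataset can be used to work through a Naive Bayes prediction by counting. The sketch below multiplies the class prior by the per-feature conditional frequencies and then normalizes. It applies no smoothing, and the query (Sunny, Hot, Normal, not windy) is an assumed example rather than one given in the text.

```python
from collections import Counter

# Rows: (Outlook, Temperature, Humidity, Windy, Play Golf)
rows = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Sunny", "Mild", "High", True, "No"),
]

def naive_bayes_posterior(rows, query):
    """Return normalized P(class | query), assuming features are
    conditionally independent given the class (no smoothing)."""
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, count in class_counts.items():
        score = count / len(rows)                      # prior P(class)
        for i, value in enumerate(query):
            matches = sum(1 for row in rows
                          if row[-1] == label and row[i] == value)
            score *= matches / count                   # P(feature_i = value | class)
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

posterior = naive_bayes_posterior(rows, ("Sunny", "Hot", "Normal", False))
print(posterior)
```

For this query the counts give 9/14 · 3/9 · 2/9 · 6/9 · 6/9 for “Yes” and 5/14 · 2/5 · 2/5 · 1/5 · 2/5 for “No”, so after normalization “Yes” comes out at roughly 0.82.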

3.12 K-means Clustering


Unsupervised machine learning is the process of teaching a
computer to use unlabeled, unclassified data, enabling the
algorithm to operate on that data without supervision. Without any
previous training on the data, the machine’s job in this case is to
organize unsorted data according to similarities, patterns, and
differences.
K-means clustering assigns data points to one of K clusters
depending on their distance from the centers of the clusters. It starts
by randomly placing the cluster centroids in the space. Then each
data point is assigned to one of the clusters based on its distance from
the centroid of the cluster. After every point has been assigned to a
cluster, new cluster centroids are computed. This process runs
iteratively until it finds good clusters. In the analysis we assume that
the number of clusters is given in advance and that we have to put each
point into one of the groups.

How does k-means clustering work?


We are given a data set of items, with certain features, and values for
these features (like a vector). The task is to categorize those items
into groups. To achieve this, we will use the K-means algorithm, an
unsupervised learning algorithm. ‘K’ in the name of the algorithm
represents the number of groups/clusters we want to classify our
items into.
(It will help if you think of items as points in an n-dimensional
space). The algorithm will categorize the items into k groups or
clusters of similarity. To calculate that similarity, we will use the
Euclidean distance as a measurement.
The algorithm works as follows:
1. First, we randomly initialize k points, called means or cluster
centroids.
2. We categorize each item to its closest mean, and we update the
mean’s coordinates, which are the averages of the items
categorized in that cluster so far.
3. We repeat the process for a given number of iterations and at the
end, we have our clusters.
The “points” mentioned above are called means because they are the
mean values of the items categorized in them. To initialize these
means, we have a lot of options. An intuitive method is to initialize
the means at random items in the data set. Another method is to
initialize the means at random values between the boundaries of the
data set (if for a feature x, the items have values in [0,3], we will
initialize the means with values for x at [0,3]).
The above algorithm in pseudocode is as follows:

Initialize k means with random values
--> For a given number of iterations:
    --> Iterate through items:
        --> Find the mean closest to the item by calculating
            the euclidean distance of the item with each of the means
        --> Assign item to mean
        --> Update mean by shifting it to the average of the items in that
            cluster
Fig 12 – Working of K-Means
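The pseudocode can be turned into a small runnable Python sketch. Note the assumptions: this is a batch variant that recomputes each mean once per pass over the items rather than after every single assignment, the means are initialized at random items, and the names k_means, iterations and seed are illustrative rather than taken from the project code. It requires Python 3.8 or later for math.dist.

```python
import math
import random


def k_means(items, k, iterations=10, seed=0):
    """Cluster items (lists of numbers) into k groups; returns (means, labels)."""
    rng = random.Random(seed)
    # Initialize the means at k randomly chosen items.
    means = [list(p) for p in rng.sample(items, k)]
    labels = [0] * len(items)

    for _ in range(iterations):
        # Assignment step: attach each item to its closest mean.
        for idx, item in enumerate(items):
            labels[idx] = min(range(k),
                              key=lambda c: math.dist(item, means[c]))
        # Update step: move each mean to the average of its items.
        for c in range(k):
            members = [items[i] for i in range(len(items)) if labels[i] == c]
            if members:  # an empty cluster keeps its previous mean
                means[c] = [sum(col) / len(members) for col in zip(*members)]
    return means, labels
```

For example, calling k_means(points, 2) on two well-separated groups of points returns one label per point, with points of the same group sharing a label.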
4 DEVELOPMENT

4.1 Crop_app.py

import joblib
from flask import Flask, render_template, request

app = Flask(__name__)

# Load the trained model once at startup instead of on every request.
model = joblib.load('crop_app')

ERROR_MESSAGE = ("Sorry... Error in entered values in the form. "
                 "Please check the values and fill it again")


@app.route('/')
def home():
    return render_template('Home_1.html')


@app.route('/Predict')
def prediction():
    return render_template('Index.html')


@app.route('/form', methods=["POST"])
def brain():
    try:
        # Non-numeric input raises ValueError and is reported to the user.
        Nitrogen = float(request.form['Nitrogen'])
        Phosphorus = float(request.form['phosphorus'])
        Potassium = float(request.form['potassium'])
        Temperature = float(request.form['temperature'])
        Humidity = float(request.form['humidity'])
        Ph = float(request.form['ph'])
        Rainfall = float(request.form['rainfall'])
    except ValueError:
        return ERROR_MESSAGE

    # The order must match the feature columns used to train the model.
    values = [Nitrogen, Phosphorus, Potassium, Temperature, Humidity, Ph, Rainfall]

    if 0 < Ph <= 14 and Temperature < 100 and Humidity > 0:
        # predict() expects a list of rows and returns one label per row.
        acc = model.predict([values])
        return render_template('Prediction.html', prediction=str(acc[0]))
    else:
        return ERROR_MESSAGE


if __name__ == '__main__':
    app.run(debug=True)

#N P K temperature humidity ph rainfall label
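The range check in the form handler could also be written as a separate helper that reports every invalid field. The following is a minimal sketch, not the project's implementation: the names FIELD_RANGES and parse_form are hypothetical, and the limits are assumed to be the min/max attributes declared on the inputs in index.html.

```python
# Field name -> (min, max), copied from the min/max attributes in index.html.
FIELD_RANGES = {
    "Nitrogen": (0, 100),
    "phosphorus": (0, 100),
    "potassium": (0, 100),
    "temperature": (0, 100),
    "humidity": (0, 100),
    "ph": (0, 14),
    "rainfall": (0, 10000),
}


def parse_form(form):
    """Return (values, errors) for a dict-like form such as request.form."""
    values, errors = [], []
    for name, (low, high) in FIELD_RANGES.items():
        raw = form.get(name, "")
        try:
            number = float(raw)
        except ValueError:
            errors.append(f"{name}: '{raw}' is not a number")
            continue
        if not low <= number <= high:
            errors.append(f"{name}: {number} is outside {low}-{high}")
            continue
        values.append(number)
    return values, errors
```

When errors is empty, values holds the seven features in the column order listed above and could be passed to the model; otherwise the messages can be shown to the user.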

4.2 Home_1.html

<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Crop Recommendation System</title>
<style>
body{
background-image:url(./one.jpg);
background-repeat: no-repeat;
background-size: cover;
}

header{ background-color:rgb(34, 34, 34);

padding: 20px;}

nav ul {
list-style: none;
margin: 0;
padding: 0;
text-decoration: none;
display: flex;
justify-content: space-between;
color:#ffff;
}

nav li {
display: inline-block;
margin-right: 20px;
text-decoration: none;
}
nav ul li a{
color:#ffff;
text-decoration: none;
}

main {
text-align: center;
padding: 50px;
}

h1 {
font-size: 36px;
margin-bottom: 20px;
}

p {
font-size: 16px;
color: rgb(218, 213, 213);
}

h3 {
font-size: 20px;
color: rgb(224, 221, 221);
}

h2 {
font-size: 25px;
color: rgb(218, 216, 216);
}

form{text-align: center;}

input{color: gold;
background-color: black;
font-size: 25px;
font-weight: bold;
width: 10%;
border-radius: 5%;
font-family: Arial, Helvetica, sans-serif;}
#p{
color: black;
font-size: 23px;
font-weight: 500;
}

</style>
</head>
<body>
<header>
<nav>
<ul>
<li><a href="#">Home</a></li>
<li><a href="#">About</a></li>
<li><a href="#">Services</a></li>
<li><a href="#">Contact</a></li>
</ul>
</nav>
</header>
<main>
<h1>Crop Recommendation System</h1>
<p id="p">RS Farming welcomes you to our website. We hope you are earning
well from your farm, and we are here to offer some suggestions about your
crop cultivation. We recommend which crop you should grow on your farm for
better yields. It is very easy to predict which crop to grow on the basis
of your soil nutrients; you only need to know them, which you can do by
having your soil tested in a lab.</p>
<br>
<h3 style="color: black;">Let's Start New Journey With RS Farming</h3>
<br>
<h2 style="color: black;">Click the following button for a crop
recommendation</h2>
</main>
<form action="/Predict">
<input style="cursor: pointer;" type="submit" id="Predict"
name="Predict" value="Predict">
</form>
</body>
</html>
4.3 index.html

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Crop Recommendation System</title>

<style>
body{background-image:url(./image.jpg);
background-repeat: no-repeat;
background-size: cover;
}

form{
font-family:'Times New Roman', Times, serif;
font-weight:bold;
font-size:32px;
color:darkred;
text-align:center;
padding:2%;
}

input{
color:black;
font-family:'Franklin Gothic Medium', 'Arial Narrow', Arial, sans-serif;
font-weight:bold;
padding:1%;
border-radius:5%;
width:15%;
background-color:goldenrod;
}

</style>

</head>

<body>
<form action="/form" method="POST">
<label for="Nitrogen">Nitrogen:</label>
<input type="number" step="any" id="Nitrogen" name="Nitrogen" min="0" max="100"
placeholder="Nitrogen" required><br><br>

<label for="phosphorus">Phosphorus:</label>
<input type="number" step="any" id="phosphorus" name="phosphorus" min="0" max="100"
placeholder="Phosphorus" required><br><br>

<label for="potassium">Potassium:</label>
<input type="number" step="any" id="potassium" name="potassium" min="0" max="100"
placeholder="Potassium" required><br><br>

<label for="temperature">Temperature:</label>
<input type="number" step="any" id="temperature" name="temperature" min="0" max="100"
placeholder="Temperature in Degrees Celsius" required><br><br>

<label for="humidity">Humidity:</label>
<input type="number" step="any" id="humidity" name="humidity" min="0" max="100"
placeholder="Relative Humidity in Percentage" required><br><br>

<label for="ph">pH:</label>
<input type="number" step="any" id="ph" name="ph" min="0" max="14"
placeholder="pH should be between 0 and 14" required><br><br>

<label for="rainfall">Rainfall:</label>
<input type="number" step="any" id="rainfall" name="rainfall" min="0" max="10000"
placeholder="Rainfall in mm" required><br><br>

<input type="submit" id="submit" name="submit" value="Predict Your Crop">


</form>

</body>
</html>

4.4 Prediction.html

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Crop Recommendation System</title>

<style>
body {
background-color: blanchedalmond;
background-size: cover;
height: 100vh;
display: flex;
align-items: center;
justify-content: center;
font-size: 36px;
color:greenyellow;
font-family: Arial, sans-serif;
text-shadow: 2px 2px #333;
}
</style>

</head>
<body>
<h4>According to your soil nutrients</h4>
<h4> You Should Grow -&gt; </h4>
<br>
<br>
<h4>{{prediction}}</h4>
</body>
</html>

4.5 first.css

body {
background-image:url(prediction_back.jpg);
background-size: cover;
background-repeat: no-repeat;
background-position: center;
}

form {
width: 500px;
margin: 0 auto;
text-align: center;
padding: 20px;
background-color: rgba(255, 255, 255, 0.5);
border-radius: 10px;
}

input[type="text"] {
width: 100%;
padding: 10px;
margin-top: 10px;
border-radius: 5px;
border: none;
}

input[type="submit"] {
width: 100%;
padding: 10px;
margin-top: 10px;
border-radius: 5px;
border: none;
background-color: #4CAF50;
color: white;
cursor: pointer;
}

label {
font-size: 16px;
margin-top: 20px;
display: block;
}

4.6 Myhome.css

input[type="submit"] {
display: block;
margin: 0 auto;
width: 100px;
padding: 10px;
border: none;
border-radius: 5px;
background-color: #4CAF50;
color: white;
cursor: pointer;
}

body {
background-image: url(one.jpg);
background-size: cover;
background-repeat: no-repeat;
}

header {
background-color: #333;
color: white;
padding: 20px;
}

nav ul {
list-style: none;
margin: 0;
padding: 0;
display: flex;
justify-content: space-between;
}

nav li {
display: inline-block;
margin-right: 20px;
}

nav a {
color: white;
text-decoration: none;
}

main {
text-align: center;
padding: 50px;
}

h1 {
font-size: 36px;
margin-bottom: 20px;
}

p {
font-size: 16px;
color: rgb(218, 213, 213);
}

h3 {
font-size: 20px;
color: rgb(224, 221, 221);
}
h2 {
font-size: 25px;
color: rgb(218, 216, 216);
}

4.7 Prediction_css.css

body {
background-image: url('image.jpg');
background-size: cover;
height: 100vh;
display: flex;
align-items: center;
justify-content: center;
font-size: 36px;
color: white;
font-family: Arial, sans-serif;
text-shadow: 2px 2px #333;
}
5 CONCLUSION

Based on the discussion and analysis above, the machine learning
models used (Random Forest, Support Vector Machine (SVM), and Lasso
Regression) outperform the deep learning models used (Gradient Descent
and Long Short-Term Memory (LSTM)) in terms of accuracy. This could be
because models such as LSTM require a larger amount of data than the
other models to make good predictions. Furthermore, most of the models
perform better on the specified parameters, whereas Gradient Descent
and Lasso Regression perform better when applied to the dataset with
all of the features. While soil and rainfall quantity are important in
crop production and general farming, a deeper investigation of these
elements, as well as a larger database, is required before prediction
models can be used to study them in real-life settings. Finally, the
Random Forest algorithm outperforms all other models on each of the
datasets. The current research can be extended by performing further
analysis and forecasting the factors that influence crop yield. A
larger dataset and more historically accurate data about the
environment and weather during each crop year are required to identify
the best-performing model among the deep learning and machine learning
models. To find the best-performing technique, more deep learning
models need to be tested on the datasets. In the field of crop yield
prediction, remote sensing data could be merged with district-level
statistical data to improve the model’s performance.
6 FUTURE WORK
Advanced Crop Fertilization Techniques

1. Precision Agriculture:
o Variable Rate Technology (VRT): Implementing VRT to apply fertilizers at
variable rates across a field based on soil nutrient levels and crop
requirements.
o Remote Sensing and Drones: Using drones equipped with sensors to monitor
crop health and soil conditions, enabling precise application of fertilizers.
2. Organic and Sustainable Fertilizers:
o Biofertilizers: Researching and developing biofertilizers that use beneficial
microbes to enhance soil fertility.
o Organic Amendments: Promoting the use of organic matter such as compost
and manure to improve soil structure and nutrient content.
3. Integrated Nutrient Management (INM):
o Combining Organic and Inorganic Fertilizers: Developing strategies that
combine the use of organic and inorganic fertilizers to optimize nutrient
availability and soil health.
o Nutrient Recycling: Investigating methods for recycling nutrients from
agricultural waste and other organic materials back into the soil.
4. Smart Fertilizers:
o Controlled-Release Fertilizers: Advancing the development of fertilizers that
release nutrients slowly over time, matching the crop's growth cycle.
o Nano-fertilizers: Exploring the use of nanotechnology to create fertilizers that
improve nutrient uptake efficiency.

Enhancements in Crop Recommendation Systems

1. Integration of IoT and Big Data:
o IoT Devices: Incorporating IoT devices such as soil moisture sensors, weather
stations, and GPS to gather real-time data for more accurate recommendations.
o Big Data Analytics: Utilizing big data analytics to process large volumes of
agricultural data and derive insights for better decision-making.
2. Advanced Machine Learning and AI:
o Deep Learning: Applying deep learning techniques to handle complex
patterns and relationships in agricultural data.
o Reinforcement Learning: Implementing reinforcement learning to
continuously improve recommendations based on feedback and changing
conditions.
3. Personalized Recommendations:
o Farmer-Specific Advice: Customizing recommendations based on individual
farmer's land characteristics, crop history, and management practices.
o Adaptive Systems: Developing systems that adapt recommendations over
time based on new data and farmer feedback.