
Fertilizer_Forecasting_using_Machine_Learning

The document presents research on fertilizer forecasting using machine learning, specifically employing the Random Forest algorithm to improve prediction accuracy for fertilizer requirements in agriculture. It highlights the importance of precise fertilizer application for enhancing crop yield while minimizing environmental impacts, and discusses various machine learning models and their effectiveness. The study concludes that Random Forest outperforms other models in accuracy, achieving over 99% in predicting optimal fertilizer usage based on various input factors.


Proceedings of the International Conference on Inventive Computation Technologies (ICICT 2023)

IEEE Xplore Part Number: CFP23F70-ART; ISBN: 979-8-3503-9849-6

Fertilizer Forecasting using Machine Learning


Dr. O. Rama Devi, P. Naga Lakshmi, S. Naga Babu, K. Vinaya Sree Bai, Sowmya, Akansha
Department of Artificial Intelligence and Data Science
Lakireddy Bali Reddy College of Engineering (Autonomous)
Mylavaram, AP, India

2023 International Conference on Inventive Computation Technologies (ICICT) | 979-8-3503-9849-6/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICICT57646.2023.10134061

Abstract: Machine learning is a recent technology that can process very large volumes of data. Agriculture is the primary source of income in India. The main goals are to increase profitability and to produce enough food for everyone in India, so agriculture is being combined with cutting-edge technology to advance the industry and achieve these goals. In this research, predictions are made about the fertilizers that will increase crop yield and boost profits. Fertilizer prediction is a crucial task in agriculture that involves determining the appropriate type and quantity of fertilizer to use for a given crop. The task poses a variety of difficulties despite being crucial for raising agricultural yields and reducing the environmental impact of farming. To overcome them, machine learning methods such as Random Forest have been employed. Random Forest is chosen because it demonstrates greater accuracy than other methods such as linear regression and K-Nearest Neighbors. This paper takes into account past conditions and farmers' experience, using datasets from Kaggle. The datasets are used to predict fertilizers based on environmental, soil, and plant conditions. This research work therefore predicts the fertilizers that are suitable for the above-mentioned conditions.

Keywords: Advanced technology, Agriculture, Linear Regression, K-Nearest Neighbors (KNN), Kaggle, Random Forest, Fertilizer Prediction.

1. INTRODUCTION

To forecast future fertilizer needs, machine learning models are built using past data on crop output, soil composition, and other relevant characteristics. These algorithms are able to evaluate enormous volumes of data and spot trends that people might be unable to see, leading to more precise predictions and improved agricultural yields.

Fertilizers are critical for increasing crop growth and productivity, which is essential for ensuring food security and combating global hunger. However, excessive fertilizer use can have negative environmental and health consequences, such as soil and water pollution and greenhouse gas emissions. As a result, accurate prediction of fertilizer requirements for a specific crop is critical for efficient fertilizer use and management, which can reduce the negative impacts on the environment and human health.

Several studies have been conducted over the years to develop models for predicting fertilizer requirements based on soil and plant characteristics, weather conditions, and other relevant factors. These models were created using a variety of techniques, including statistical analysis, machine learning, and artificial intelligence. Most of these models, however, have limited accuracy and may not be suitable for predicting fertilizer requirements under different soil and weather conditions [5].

In recent years, there has been increasing interest in adopting cutting-edge technology to boost the precision of fertilizer prediction models, including remote sensing, geospatial analysis, and big data analytics. These technologies can offer real-time information on the qualities of the soil and plants, the weather, and other pertinent variables, which can be utilized to create more precise and effective fertilizer prediction models. This study reviews the state of the art in fertilizer prediction models and investigates the possibility of applying cutting-edge technology to raise the models' accuracy.


Moreover, the research gaps are outlined by highlighting the difficulties and limitations of current fertilizer prediction models. The work addresses issues with global food security by advancing the development of more effective and sustainable agriculture practices. In this research, an overall approach to fertilizer prediction using machine learning algorithms is presented.

2. LITERATURE REVIEW

● Behera et al. used an artificial neural network (ANN) that combines satellite data and machine learning to predict crop yield and fertilizer requirements. The ANN model achieved high accuracy in predicting crop yield and fertilizer requirements for a single crop. Its disadvantage is that it is limited to a single crop and does not consider soil type or other environmental factors [5].

● Lalitha and Ramanathan proposed a fuzzy-logic-based model to recommend fertilizer doses based on soil test results and crop requirements. The fuzzy logic model achieved high accuracy in recommending fertilizer doses. Its disadvantage is that its performance was not compared with that of other machine learning approaches [8].

● Liu et al. used Support Vector Regression (SVR) to develop a method for predicting fertilizer requirements based on soil properties and crop characteristics. The SVR model achieved higher prediction accuracy than traditional regression models. Its disadvantage is the limited evaluation of the model's performance under different soil and climate conditions [9].

● Yang et al. used a Convolutional Neural Network (CNN) to develop a deep learning model that predicts rice yield and recommends fertilizer rates based on weather and soil data. The CNN model achieved high accuracy in predicting rice yield and recommending optimal fertilizer rates. However, the work is limited to a single crop and does not consider the economic feasibility of the recommended fertilizer rates [14].

● Zhang et al. used Random Forest (RF) in a hybrid approach combining spectral data and machine learning to predict nitrogen content in the soil. The RF model achieved higher prediction accuracy than traditional regression models. Its disadvantage is that it focuses solely on nitrogen prediction and does not consider other nutrient elements [15].

3. PROPOSED WORK

A machine learning model based on a random forest classifier is developed to predict the most appropriate fertilizer, helping rural farmers overcome their lack of information.

Fig.1: Process to Follow

Choosing a model: There are many different types of machine learning models that can be used for fertilizer prediction, such as regression, decision trees, and neural networks. The selection of a model depends on the specific characteristics of the dataset and the goals of the prediction.

Splitting the data: Before training, the data is divided into training, validation, and test sets. The training set is used to fit the model. The validation set is used to evaluate the performance of the model during training and to make any necessary adjustments. The test set is used to evaluate the final performance of the model after training is complete.

Feature selection: This step selects the most relevant features from the dataset to use as inputs to the model. These features should be correlated with fertilizer application and crop yield.

Feature selection process:
● Train a random forest model using all the available features in the dataset.
● Retrieve the feature importances of each variable from the trained model. The feature importance score for a variable measures the relative importance of that variable in the model's predictions.
● Sort the features based on their importance scores, and select the top K features, where K is a number chosen based on the specific problem and the number of available features.
● Train a new random forest model using only the selected features.
● Evaluate the performance of the new model on a validation set to see if the model has improved compared to the original model.
● If the new model performs better, use it for prediction on new data. Otherwise, try adjusting the number of features selected, or use other feature selection techniques such as recursive feature elimination or L1 regularization.

Model training: Once the model and features are selected, the model is trained on the training dataset. The goal of training is to find the optimal set of parameters for the model that minimizes the error between the predicted and actual values.

Model evaluation: After training the model, its performance is evaluated on the validation dataset. This gives an idea of how well the model generalizes to new data and whether any adjustments are needed to improve its performance.
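The data split and the feature selection process described above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' code: the synthetic dataset from `make_classification`, the 60/20/20 split sizes, `K = 4`, the helper `fit_forest`, and the tie rule in the last step are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the fertilizer dataset (assumption: the real data
# would be loaded from the Kaggle files instead).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)

# Split into training (360), validation (120), and test (120) sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=240, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=120, stratify=y_tmp, random_state=0)


def fit_forest(X_fit, y_fit):
    """Hypothetical helper: fit a random forest with fixed settings."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    return model.fit(X_fit, y_fit)


# Train on all features and read the importance score of each feature.
full_model = fit_forest(X_train, y_train)
importances = full_model.feature_importances_

# Sort features by importance (descending) and keep the top K.
K = 4
top_k = np.argsort(importances)[::-1][:K]

# Train a new forest using only the selected features.
reduced_model = fit_forest(X_train[:, top_k], y_train)

# Compare both models on the validation set.
full_acc = full_model.score(X_val, y_val)
reduced_acc = reduced_model.score(X_val[:, top_k], y_val)

# Keep the reduced model when it does at least as well; a tie favours
# fewer features (this tie rule is an assumption of the sketch).
use_reduced = reduced_acc >= full_acc
print("selected feature indices:", sorted(top_k.tolist()))
print(f"validation accuracy: full={full_acc:.3f}, top-{K}={reduced_acc:.3f}")
print("use reduced model:", bool(use_reduced))
```

The test set is deliberately left untouched here, so that it can still serve for the final evaluation once the feature set and hyper-parameters are fixed.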

Hyper-parameter tuning: Machine learning models have hyper-parameters that can be adjusted to improve their performance. Hyper-parameter tuning involves trying different values for these parameters to find the combination that results in the highest accuracy on the validation dataset.

Final testing: Once the model is tuned, its performance can be evaluated on the test dataset to get a final estimate of its accuracy. This indicates the model's ability to predict fertilizer application and crop yield in real-world situations.

a. Data Integration and Pre-processing

The first step is to gather data from multiple sources and pre-process it in order to extract the relevant information from the dataset. After that, the data is visualized to make sure it is valid. In order to generate the most accurate answer, a model is developed using random forest.

b. Model Selection

The main reasons for selecting Random Forest are:

Handling non-linearity: Random Forest can handle non-linear relationships between the input features and the target variable. This is important in agriculture as the factors affecting crop yield are often complex and non-linear.

Robustness to noise: Random Forest is less susceptible to over-fitting than other algorithms such as decision trees. This is important in agriculture as the data can often be noisy due to various factors such as environmental conditions and measurement errors.

Feature importance: Random Forest can provide insights into the relative importance of different features for predicting the target variable. This can be useful for identifying the key factors affecting crop yield and determining which fertilizers are most effective.

Ensemble method: Random Forest is an ensemble method, meaning it combines the predictions of multiple decision trees to improve the accuracy of the model. This can lead to more accurate predictions than using a single decision tree.

Scalability: Random Forest is highly scalable and can handle large datasets with many features. This is important in agriculture where large datasets are often used to train models.

To achieve this, various Python modules are imported, such as sklearn to partition the dataset and train the model, and matplotlib to visualize the data. After the data is divided into training and testing sets, it is transformed using methods such as Standard Scaler and Label Encoding and then fitted to the model.

c. Model Training

The point of training is to find the optimal set of parameters for the model that minimizes the error between the predicted and actual values. Machine learning models have hyper-parameters that can be adjusted to improve their performance. Hyper-parameter tuning involves trying different values for these parameters to find the combination that results in the highest accuracy on the validation dataset.

1. Gather and preprocess data: Collect the relevant data on fertilizer usage and crop yield, as well as any other relevant factors that may affect crop growth, such as soil type, climate, and rainfall. Clean and preprocess the data to remove any missing values, outliers, or irrelevant features.

2. Split the data into training and testing sets: Split the preprocessed data into two sets, one for training the model and the other for testing the model's accuracy.

3. Define the input features and target variable: Determine which variables are to be used as input features to predict the target variable (i.e., fertilizer application). These could include factors such as crop type, soil type, rainfall, and previous fertilizer usage.

4. Train the random forest model: Use the training set to train the random forest model. This involves setting the hyper-parameters (e.g., number of trees, tree depth) and fitting the model to the data.

5. Evaluate the model's performance: Use the testing set to evaluate the model's performance. Calculate metrics such as accuracy, precision, recall, and F1 score to assess the model's predictive power.

6. Refine the model: If the model's performance is not satisfactory, try adjusting the hyper-parameters or adding/removing input features to see if the model improves.

7. Deploy the model: Once the model has been trained and evaluated, deploy it to predict fertilizer usage for new crops based on their input features.

4. RESULTS

a. Inputs and output predictions

After the model has been trained, it is given inputs, namely temperature, humidity, moisture, crop type, soil type, nitrogen, potassium, and phosphorus.

Fig.2: Scatter plot of the results

To generate the scatter plot shown in Fig.2, the random forest model is trained using the training data. The trained model is then used to predict the fertilizer values for the test data, and a scatter plot is created using the matplotlib library, with the predicted values on
the x-axis and the actual values on the y-axis. The proposed model suggests the optimal fertilizer with accuracy above 99% based on the input.
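As a rough illustration of the workflow described in the paper (label encoding, standard scaling, a train/test split, random forest training, hyper-parameter tuning, and evaluation), a minimal scikit-learn sketch might look like the following. Everything about the data is an assumption: the rows are randomly generated, the category and fertilizer names are placeholders, and the labelling rule (recommend the fertilizer for the scarcest nutrient) is a toy rule invented for the example, so the scores it prints say nothing about the paper's 99.27% figure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 400

# Hypothetical inputs mirroring the features named in the paper.
soil = rng.choice(["Sandy", "Loamy", "Clayey"], size=n)
crop = rng.choice(["Maize", "Paddy", "Wheat"], size=n)
numeric = np.column_stack([
    rng.uniform(20, 40, n),   # temperature
    rng.uniform(40, 80, n),   # humidity
    rng.uniform(20, 70, n),   # moisture
    rng.integers(0, 50, n),   # nitrogen
    rng.integers(0, 50, n),   # potassium
    rng.integers(0, 50, n),   # phosphorus
])

# Toy labelling rule (assumption): pick the fertilizer that supplies
# whichever of nitrogen, potassium, phosphorus is lowest.
names = np.array(["Urea", "Potash", "DAP"])
fertilizer = names[np.argmin(numeric[:, 3:6], axis=1)]

# Label-encode the categorical inputs and the target.
X = np.column_stack([
    numeric,
    LabelEncoder().fit_transform(soil),
    LabelEncoder().fit_transform(crop),
])
target_encoder = LabelEncoder()
y = target_encoder.fit_transform(fertilizer)

# Hold out 100 of the 400 rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=100, stratify=y, random_state=0)

# Fit the scaler on the training rows only, then apply it to both sets.
# (Tree ensembles do not require scaling; it is kept because the paper
# lists Standard Scaler among its transformations.)
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Try a small grid of hyper-parameters with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=3,
    scoring="accuracy",
)
search.fit(X_train_s, y_train)

# Evaluate the best model once on the held-out test set.
y_pred = search.best_estimator_.predict(X_test_s)
test_acc = accuracy_score(y_test, y_pred)
test_f1 = f1_score(y_test, y_pred, average="macro")
print("best hyper-parameters:", search.best_params_)
print(f"test accuracy={test_acc:.3f}, macro F1={test_f1:.3f}")
print("first predictions:", target_encoder.inverse_transform(y_pred[:5]))
```

Cross-validation inside `GridSearchCV` stands in for the separate validation set mentioned in the paper; with a dedicated validation split, the same grid could be scored on that split instead.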

Table 1: Comparing Different Models' Accuracy

MODEL                        ACCURACY
K-Nearest Neighbors          75%
Linear Regression            60%
Multiple Linear Regression   79%
Decision Trees               83%
Support Vector Regression    88%
Random Forest                99.27%

1. Random forest is a powerful and popular machine learning algorithm that can be used for fertilizer prediction. It is an ensemble learning algorithm that combines multiple decision trees to make predictions. One advantage of random forest is that it is able to handle a large number of input variables and can deal with missing data.

2. Linear regression is a simple and widely used model for fertilizer prediction. It works well when there is a linear relationship between the input and target variable. However, it may not be suitable when the relationship between the variables is nonlinear.

3. Multiple linear regression assumes a linear relationship between the input variables and the target variable, which may not be appropriate for fertilizer prediction. Multiple linear regression is sensitive to outliers, and the presence of outliers in the dataset can affect the accuracy of the predictions. Multiple linear regression may not be suitable for large datasets as it may suffer from the curse of dimensionality.

4. Decision tree is a good starting point for fertilizer prediction as it is simple to implement and interpret. However, if the dataset is large and complex, a random forest may be a better choice as it can handle high dimensionality and noisy data and provide better accuracy.

5. Random forest is generally faster and more accurate than support vector regression for large datasets with many features. On the other hand, support vector regression is generally more accurate than random forest for small datasets with few features. Additionally, random forest is easier to interpret than support vector regression.

Fig.3: Accuracy for Random Forest

5. CONCLUSION

Fertilizer prediction is a crucial factor in crop growth. Among the many machine learning models available for predicting fertilizer, Random Forest has been used because it gives higher accuracy than other models such as Linear Regression, Logistic Regression, and KNN. Random Forest was also chosen because it is an ensemble that combines many decision trees as base models. Fertilizer prediction uses many parameters, such as the pH of the soil, nitrogen level, phosphorus level, and potassium level, and based on these parameters the model selects the fertilizer to apply.

6. REFERENCES

[1] "A machine learning approach for predicting fertilizer recommendations in precision agriculture" by Mohanad Dawood, Laith Alkurdi, and Ali Rodan. https://www.sciencedirect.com/science/article/pii/S2352340919300824
[2] "An Expert System for Fertilizer Recommendation Using Machine Learning Techniques" by M. R. Islam, M. Z. Islam, and M. A. R. Sarkar. https://ieeexplore.ieee.org/document/8808612
[3] "A fuzzy logic-based approach for fertilizer recommendation in precision agriculture" by F. Z. Khan, M. Y. Javed, and M. M. Iqbal. https://www.sciencedirect.com/science/article/pii/S1364815217312756
[4] Anna Chlingaryan, Salah Sukkarieh, Brett Whelan (2018). "Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: A review", Computers and Electronics in Agriculture 151, 61-69, Elsevier.
[5] Behera, S. K., Dutta, S., & Tripathi, S. (2018). Estimation of crop yield and fertilizer requirement using machine learning algorithms. International Journal of Agricultural and Biological Engineering.
[6] "Crop Nutrient Management: A Machine Learning Approach for Fertilizer Recommendation" by A. K. M. Azad, M. Z. Islam, and M. A. R. Sarkar. https://ieeexplore.ieee.org/document/8824854
[7] "Fertilizer Recommendation for Crops: A Machine Learning Approach" by R. Srinivasan, R. Krishnan, and A. Elango. https://link.springer.com/chapter/10.1007/978-981-13-8410-2_32
[8] Lalitha, V., & Ramanathan, A. L. (2019). Development of a fuzzy expert system for crop specific fertilizer recommendation. Computers and Electronics in Agriculture.
[9] Liu, Y., Hu, Z., Zhai, L., Zhang, X., & Cao, W. (2020). Development of a Support Vector Regression Model for Predicting Fertilizer Requirements Based on Soil Properties and Crop Characteristics.
[10] Niketa Gandhi et al. (2016). "Rice Crop Yield Forecasting of Tropical Wet and Dry Climatic Zone of India Using Data Mining Techniques", IEEE International Conference on Advances in Computer Applications (ICACA).
[11] Rahul Tangia. (2019, October 4). India's Biggest Challenge: The Future of Farming. Retrieved from https://www.theindiaforum.in/article/india-s-biggestchallenge-future-farming
[12] S. Bhanumathi, M. Vineeth, N. Rohit. "Crop Yield Prediction and Efficient use of Fertilizers", (2019) International Conference on Communication and Signal Processing (ICCSP).
[13] TongKe, Fan. "Smart agriculture based on cloud computing and IOT." Journal of Convergence Information Technology 8.2 (2013).
[14] Yang, J., Yang, G., Zhang, Y., Sun, J., & Wang, Y. (2020). A deep learning-based crop yield prediction and fertilizer recommendation system. Computers and Electronics in Agriculture.
[15] Zhang, X., Xu, X., Jia, L., Li, Y., Wang, P., & Li, Z. (2019). Prediction of soil nitrogen content based on spectral data and machine learning algorithms.
