A Deep Learning-Assisted Framework for Plant Disease Identification with CNN Models
Submitted in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
In
ELECTRONICS AND COMMUNICATION ENGINEERING
By
G. HARINI 21001A0443
T. SAI JAHNAVI 21001A0446
M. DEEPSIKA REDDY 21001A0433
I. ESWAR NAIK 21001A0441
N. HARIKA 21001A0439
Submitted to
CERTIFICATE
This is to certify that the project work entitled “A Deep Learning-Assisted Framework
for Plant Disease Identification with CNN Models”, being submitted in partial fulfillment of
the requirements for the award of the degree of “Bachelor of Technology” in Electronics and
Communication Engineering at Jawaharlal Nehru Technological University College of
Engineering Ananthapuramu, is a bonafide record of project work carried out under my
supervision. The results provided in this report have not been submitted to any other University
or Institute for the award of any Degree.
G. HARINI 21001A0443
T. SAI JAHNAVI 21001A0446
M. DEEPSIKA REDDY 21001A0433
I. ESWAR NAIK 21001A0441
N. HARIKA 21001A0439
Ananthapuramu-515002
DECLARATION
We hereby declare that the dissertation entitled “A DEEP LEARNING-ASSISTED
FRAMEWORK FOR PLANT DISEASE IDENTIFICATION WITH CNN MODELS” is a record of
work carried out by us. We further declare that the work reported in this project has not been
submitted, and will not be submitted, either in part or in full, for the award of any other degree
or diploma in any other university or institute.
Place: Anantapur
Date:
G. HARINI 21001A0443
T. SAI JAHNAVI 21001A0446
M. DEEPSIKA REDDY 21001A0433
I. ESWAR NAIK 21001A0441
N. HARIKA 21001A0439
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of any task would
be incomplete without mentioning the people who made it possible, whose constant guidance
and encouragement crowned our efforts with success.
With Gratitude
G. HARINI 21001A0443
T. SAI JAHNAVI 21001A0446
M. DEEPSIKA REDDY 21001A0433
I. ESWAR NAIK 21001A0441
N. HARIKA 21001A0439
ABSTRACT
The rapid spread of plant diseases continues to be a major challenge in modern
agriculture, contributing to significant reductions in crop yield and quality. Early and
accurate diagnosis of plant diseases is critical to ensure timely intervention and
sustainable crop management. This project proposes an intelligent, automated system for
plant disease detection and affected area estimation by leveraging deep learning and
computer vision techniques. A lightweight yet powerful Convolutional Neural Network
(CNN) based on the MobileNetV2 architecture was trained on the extensive PlantVillage
dataset, encompassing 43,444 images across 38 distinct classes. The model achieved a
remarkable training accuracy of 98.02% and a validation accuracy of 97.82%, confirming
its effectiveness in identifying a wide range of plant diseases.
1 INTRODUCTION
1.1 Introduction 1
1.3 Objectives 5
2 LITERATURE SURVEY 7
3 EXISTING METHODS
4 METHODOLOGY
5.1 Results 20
6.1 Conclusion 26
7 REFERENCES 30
LIST OF FIGURES
1 Bacterial Blight 2
2 Rust 2
3 Powdery Mildew 3
4 Browning 3
5 Necrosis 3
1.1 INTRODUCTION
Agriculture plays a crucial role in the sustenance of human life and global economic
development. However, one of the major challenges in agriculture is the timely identification and
control of plant diseases, which significantly impact crop yield, quality, and farmer income.
Traditionally, plant disease detection relies on manual inspection, which is subjective, slow, and often
inaccurate due to limited expertise and environmental factors. The need for a reliable, automated
system has led to increased interest in artificial intelligence and computer vision techniques for solving
this problem.
This project focuses on developing a comprehensive system for plant disease detection and
affected area estimation using deep learning and image processing techniques[1]. The goal is to classify
plant leaf images into their respective disease categories and to quantify the percentage of the leaf area
that is affected by the disease, providing both diagnosis and severity estimation.
The project began by downloading and preparing the Plant Village dataset, which contains
over 43,000 labeled images of healthy and diseased leaves across 38 classes. The dataset was
organized into training and validation folders. Image preprocessing steps such as resizing to a uniform
shape (128×128 or 64×64), normalization, and augmentation were applied to increase model
generalization and reduce overfitting.
A Convolutional Neural Network (CNN) was then designed and trained using TensorFlow and
Keras. Later, the CNN was enhanced using MobileNetV2, a pre-trained model, to achieve faster and
more accurate classification with fewer computational resources[1]. The training process
yielded excellent performance, reaching 98.02% training accuracy and 97.82% validation
accuracy after several epochs. Data was fed using TensorFlow’s image_dataset_from_directory
method, and overfitting was minimized through real-time augmentation and caching.
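The loading-and-caching setup described above can be sketched as follows. The directory layout, image size, and batch size are illustrative assumptions, not values taken from the report:

```python
import tensorflow as tf

def load_datasets(root, img_size=128, batch_size=32):
    """Load train/val splits from a directory tree of class sub-folders."""
    autotune = tf.data.AUTOTUNE
    train_ds = tf.keras.utils.image_dataset_from_directory(
        f"{root}/train", image_size=(img_size, img_size),
        batch_size=batch_size, label_mode="categorical")
    val_ds = tf.keras.utils.image_dataset_from_directory(
        f"{root}/val", image_size=(img_size, img_size),
        batch_size=batch_size, label_mode="categorical")
    # Cache decoded images and prefetch the next batch while the
    # current one is being trained on, as described in the text.
    train_ds = train_ds.cache().shuffle(1000).prefetch(autotune)
    val_ds = val_ds.cache().prefetch(autotune)
    return train_ds, val_ds
```

`label_mode="categorical"` yields one-hot labels, matching the categorical crossentropy loss used later in training.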
After successfully classifying diseases, the next major component was to calculate the
percentage of diseased area on the leaf. This was achieved through HSV-based image segmentation
using OpenCV[2]. Diseased regions were highlighted by defining color thresholds and
applying morphological operations to clean up noise. The contours of the diseased parts were
extracted, and their area was compared with the total image area to estimate the severity percentage.
By combining image classification with visual area analysis, this project offers a powerful
tool for farmers, agronomists, and researchers[3]. It not only identifies the disease accurately but
also informs the user of how much of the plant is affected, enabling early diagnosis, better
decision-making, and targeted treatment. This system can be further extended into mobile or
real-time IoT-based platforms for broader agricultural use.
Bacterial blight is a plant disease caused by various bacterial pathogens that leads to
symptoms like leaf spots, wilting, and stem lesions, ultimately affecting the plant's health
and yield. The bacteria can spread through water, wind, insects, and contaminated tools,
thriving in favorable environmental conditions like high humidity and warm temperatures.
Necrosis in plants describes the localized death of plant cells and tissues, often
appearing as browning or blackening on leaves, stems, or other parts. This cell death can be
triggered by various factors including pathogen infections, nutrient deficiencies, and
environmental stresses.
1.3 OBJECTIVES
The study aims to build a reliable and scalable system that leverages deep learning
techniques to automatically detect and identify various plant diseases. This reduces the
dependency on manual inspection and enables early intervention, which is critical in
minimizing crop damage[5].
Proper image preprocessing is essential for robust model training. This includes
resizing all input images to a uniform dimension (e.g., 128×128), normalizing pixel values,
and applying data augmentation techniques to increase dataset variability and reduce
overfitting.
The final objective is to design the system in a way that it can be extended into real-
world applications. This includes the potential integration into mobile apps or IoT
platforms for on-field diagnosis, making it a practical tool for farmers, agronomists, and
agricultural extension officers.
2.1 INTRODUCTION
The lack of integration between classification and severity analysis presents a clear
gap in the existing literature. Most systems provide only a disease label without indicating
the extent of damage, which limits their practical utility. This project addresses that
limitation by combining a CNN-based classification model with image processing
techniques to segment and calculate the affected area of the leaf. By doing so, it provides
not just a diagnosis, but also an estimate of disease progression, thus offering a more
comprehensive and actionable tool for modern agricultural practices.
CHAPTER-3
EXISTING METHODS
Plant disease detection has traditionally relied on manual and semi-automated methods,
many of which present significant limitations in terms of accuracy, speed, and scalability.
Below is a summary of key existing approaches used prior to the emergence
of deep learning:
CHAPTER-4
METHODOLOGY
The methodology adopted in this project is structured to systematically address the
dual objectives of disease classification and diseased area estimation from plant leaf images.
The approach integrates deep learning for disease identification and image processing
techniques for quantifying the extent of infection. The entire process can be divided into
several key stages, as detailed below.
The first step involves data acquisition and preprocessing. The publicly available
PlantVillage dataset was used as the primary source for model training. This dataset contains
over 43,000 images of plant leaves categorized into 38 distinct classes, representing both
healthy and diseased conditions. The images were organized into training and validation
directories. To prepare the images for input into the model, preprocessing steps such as
resizing (to 128×128 or 64×64 pixels), normalization (scaling pixel values between 0 and
1), and data augmentation (including rotation, flipping, and zooming) were performed.
These steps ensured consistency across the dataset and helped enhance the model's ability
to generalize.
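As a sketch of these preprocessing steps in TensorFlow (the specific rotation and zoom factors are illustrative assumptions, not values from the report):

```python
import tensorflow as tf
from tensorflow.keras import layers

IMG_SIZE = 128  # the report also experimented with 64x64

# Augmentation applied on the fly during training; factors are illustrative.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

def preprocess(image, label, training=False):
    """Resize to a uniform shape and scale pixel values into [0, 1]."""
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    image = tf.cast(image, tf.float32) / 255.0
    if training:
        # Augmentation layers expect a leading batch dimension.
        image = augment(tf.expand_dims(image, 0), training=True)[0]
    return image, label
```

Applying augmentation only when `training=True` keeps the validation set untouched, so validation accuracy reflects performance on unmodified images.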
Following preprocessing, a Convolutional Neural Network (CNN) was implemented
for plant disease classification. Initially, a custom CNN architecture was trained, and later
the performance was optimized using MobileNetV2, a lightweight pre-trained model
known for its efficiency and accuracy. Transfer learning was employed by loading the base
MobileNetV2 model without its top classification layers and fine-tuning it on the
PlantVillage dataset. The model was compiled using the RMSprop optimizer, categorical
crossentropy loss function, and accuracy as the evaluation metric. It was trained over
multiple epochs until it achieved a training accuracy of 98.02% and validation accuracy of
97.82%. TensorFlow’s image_dataset_from_directory function was used to streamline the
loading and batching of image data, with caching and prefetching applied to boost training
efficiency.
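A minimal sketch of the transfer-learning setup described above. The optimizer and loss follow the report; the dropout rate, learning rate, and initially frozen base are illustrative choices the report does not specify:

```python
import tensorflow as tf

NUM_CLASSES = 38
IMG_SIZE = 128

def build_model(weights="imagenet"):
    """MobileNetV2 without its top layers, plus a new classifier head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(IMG_SIZE, IMG_SIZE, 3),
        include_top=False,
        weights=weights,
    )
    base.trainable = False  # freeze pretrained features before fine-tuning

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.2),  # illustrative regularization choice
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-4),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Freezing the base first and fine-tuning the head is the usual way to avoid destroying the pretrained ImageNet features early in training.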
Once the classification component was successfully developed and evaluated, the
project proceeded to the second major task: estimating the affected area of the diseased leaf.
For this purpose, image processing techniques using OpenCV were applied. The input image
was first converted from BGR to HSV color space, which is more effective for isolating
specific color ranges associated with diseased regions. A lower and upper HSV threshold
was defined to segment the infected areas based on color. Morphological operations such as
erosion, dilation, and closing were applied to remove noise and refine the segmented
regions. Using contour detection, the system identified distinct diseased patches and
calculated the sum of their areas.
DEPARTMENT OF ECE, JNTUACEA 14
Plant Disease Detection
To estimate the percentage of the affected area, the total pixel area of all detected
contours was divided by the total pixel area of the leaf image and multiplied by 100. This
gave an accurate percentage value representing the extent of disease spread on the leaf. To
make the results more intuitive, the diseased regions were visually highlighted on the image
using contours, enabling users to verify the segmented output.
The final system includes both disease prediction and area estimation functionalities.
The user can input any image of a plant leaf from the dataset, and the system will output the
predicted disease name along with a visualization and percentage of the affected area. All
components were developed and tested using Python in Jupyter Notebook, with key libraries
including TensorFlow, OpenCV, NumPy, and Matplotlib.
In summary, the methodology adopted in this project effectively combines the
predictive power of deep learning with the analytical capability of image processing. This
two-stage approach offers a more comprehensive plant disease monitoring solution,
capable of supporting real-time diagnosis and facilitating better decision-making in
agricultural practices.
Processing pipeline: Image Processing (convert to HSV, color thresholding,
morphological operations) → Visual Output (original + contour image).
Horizontal Flipping
Random horizontal flipping was used to reflect the image along the vertical axis. Since
leaf structures are often symmetrical, this technique is effective in doubling the dataset
variation without changing the disease characteristics.
Brightness Adjustment
Minor changes in image brightness were applied to simulate different lighting
conditions such as natural daylight or shaded environments. This ensures that the model
does not misclassify based on lighting alone.
Rescaling
All images were rescaled by dividing pixel values by 255 to normalize them to a range
between 0 and 1. This improves convergence during training and standardizes input data.
These augmentation strategies not only improve model generalization and accuracy,
but also significantly reduce overfitting by ensuring that the model does not memorize
specific patterns or features from the original dataset. This is particularly valuable in
agricultural scenarios where input conditions can vary widely in the field.
The software implementation of this project involves the integration of deep learning
for plant disease classification and image processing techniques for quantifying the diseased
area on plant leaves. The implementation was carried out in a modular and systematic
manner, ensuring clarity, maintainability, and scalability of the code. The entire
development was conducted using Python in the Jupyter Notebook environment, which
allowed for interactive coding, visualization, and testing.
The first phase of implementation was focused on dataset preparation and
preprocessing. The PlantVillage dataset, which contains over 43,000 labeled images across
38 classes, was downloaded, extracted, and organized into train and val directories. Each
class folder contained corresponding disease or healthy leaf images. File paths were loaded
using Python’s os and pathlib libraries, and image augmentation techniques such as rotation,
zooming, flipping, and contrast adjustments were applied using TensorFlow’s
ImageDataGenerator and image_dataset_from_directory functions. This helped to
artificially expand the dataset and make the model robust against overfitting.
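A small sketch of the dataset-organization step using `pathlib`, as mentioned above; the helper name and the idea of reporting per-class counts are illustrative assumptions:

```python
from pathlib import Path

def count_images_per_class(split_dir):
    """Count image files in each class sub-folder of a train/ or val/ directory."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir()
            if f.suffix.lower() in {".jpg", ".jpeg", ".png"}
        )
    return counts
```

Checking per-class counts before training is a quick way to spot class imbalance or folders that failed to extract.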
Next, the model training module was implemented using the TensorFlow and Keras
libraries. Initially, a basic CNN was constructed with convolutional, pooling, and dense
layers. However, to improve performance and speed, the project transitioned to using
MobileNetV2, a pre-trained lightweight model suitable for resource-constrained
environments. Transfer learning was employed by loading MobileNetV2 without the top
classification layer and adding custom dense layers tailored for the 38-class problem. The
model was compiled with the RMSprop optimizer and categorical_crossentropy as the loss
function, and trained for several epochs with real-time training accuracy and loss
visualization using Matplotlib. After training, the model was saved in both .keras and .h5
formats for future inference.
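Saving in both formats and reloading for inference can be sketched as follows, here with a tiny stand-in model rather than the trained classifier; the file names are illustrative:

```python
import tensorflow as tf

# A tiny stand-in for the trained 38-class classifier.
inputs = tf.keras.Input(shape=(128, 128, 3))
x = tf.keras.layers.GlobalAveragePooling2D()(inputs)
outputs = tf.keras.layers.Dense(38, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.save("plant_disease_model.keras")  # native Keras format
model.save("plant_disease_model.h5")     # legacy HDF5, for older tooling

# Later, for inference:
reloaded = tf.keras.models.load_model("plant_disease_model.keras")
```

The `.keras` file is the format recommended by current Keras releases; the `.h5` copy mainly helps with older tools that predate it.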
Following successful classification training, attention shifted to disease area
estimation. This was implemented using the OpenCV library for image processing. The
selected test image was read using cv2.imread(), and converted to HSV color space using
cv2.cvtColor() to facilitate color-based segmentation. HSV thresholds were defined to
isolate colors typically associated with diseased regions (brown, yellow, and dark patches).
A binary mask was created using cv2.inRange() and cleaned with morphological operations
such as opening and closing to reduce noise.
Contours representing diseased regions were extracted using cv2.findContours() and
drawn on the original image using cv2.drawContours() for visual confirmation. The area of
each contour was calculated and summed, and then divided by the total image area to
compute the percentage of affected area. This percentage was displayed alongside a side-by-side
view of the original and contour-annotated images.
The system also included a prediction pipeline, where a new image could be passed
through preprocessing, prediction, and area estimation steps in a single workflow. This
modular structure made the system highly usable and testable. Several checkpoints and
error-handling mechanisms were included to verify if the input path was valid, the image
was readable, and the results were within expected bounds.
The complete software stack included:
Python – primary programming language
Jupyter Notebook – development and testing environment
TensorFlow/Keras – deep learning model development
OpenCV – image processing and area estimation
Matplotlib – data visualization and plotting
NumPy – numerical operations and matrix handling
scikit-learn – (optional) for metrics and performance evaluation
In conclusion, the software implementation effectively bridges the gap between
plant disease identification and severity estimation. It is designed in a modular fashion,
allowing each component—data loading, classification, area calculation, and
visualization—to be developed, tested, and improved independently. The result is a reliable
and user-friendly tool with the potential for real-world deployment in agriculture.
CHAPTER-5
RESULTS
The results of the project are evaluated in terms of the system’s accuracy in detecting
plant diseases, its ability to generalize across multiple classes, and the effectiveness of
estimating the diseased area on plant leaves. The implementation combined deep learning
for classification and image processing techniques for quantification, and both components
were tested thoroughly to assess their performance.
The training and validation accuracy/loss graphs were analyzed, and both sets of
curves showed a converging trend. The close proximity of the curves confirms that the
model does not suffer from significant overfitting or underfitting. These results demonstrate
that the use of a lightweight pretrained model like MobileNetV2, in combination with
transfer learning and data augmentation, can yield high performance even with limited
computational resources.
To evaluate the system’s prediction capability, several test images from the
validation set and a few manually selected images from the dataset were passed through the
model. The model was able to correctly classify the majority of samples, even when there
were variations in lighting, orientation, and background. This confirms that the model has
learned robust features that generalize well across different inputs.
In addition to classification, the project included the estimation of the affected area
on the leaf, which is a novel feature not present in many traditional systems. For this, the
OpenCV library was used to process the image and segment the infected region. The HSV
color space was particularly effective for isolating disease-like discolorations. Threshold
values were selected based on visual experimentation, and morphological operations helped
refine the segmented areas.
Despite minor false positives in area estimation, the system offers valuable insights
by combining qualitative classification with quantitative severity measurement. This dual
output can assist farmers not only in diagnosing plant health but also in determining the
urgency and intensity of the required treatment.
The figure above illustrates a case where the model has accurately identified a
healthy grape leaf with no signs of disease. The left panel displays the original image of the
leaf, which visually appears free from any discoloration, spots, or lesions. The right panel
shows the analysis output where the model confirms the absence of any infected regions,
indicating a diseased area of 0.00%. No segmentation or contour markings are present,
which aligns with the predicted class label: Grape healthy. This outcome demonstrates
the model's robustness in distinguishing between healthy and diseased leaves, ensuring that
false positives are minimized. It also highlights the model's capability to not only detect
disease when present but also accurately validate plant health, making it a reliable tool for
automated plant monitoring in agricultural settings.
The figure above demonstrates the effectiveness of the proposed model in detecting
plant disease and calculating the affected area on an apple leaf. The left panel displays the
original image of the leaf, which shows visible symptoms of infection. The right panel
presents the processed output where the diseased region, identified as Apple Black Rot, is
accurately segmented and highlighted with a bright green contour. The model successfully
classifies the disease type using a convolutional neural network trained on a comprehensive
plant disease dataset. In addition to disease classification, the model quantifies the severity
of infection, estimating that approximately 3.50% of the leaf area is affected. This
percentage is calculated by analyzing pixel data corresponding to the infected region. The
figure highlights the model’s dual capability to not only identify the disease but also provide
an accurate assessment of its impact, aiding in timely and informed decision-making for
crop protection.
The graph presented illustrates the training and validation accuracy of the plant
disease detection model over a span of 10 epochs. The blue curve denotes the training
accuracy, while the orange curve represents the validation accuracy. As the training
progresses, the training accuracy shows a steady improvement from approximately 83.5%
to nearly 96.8%, reflecting the model’s ability to effectively learn patterns from the training
dataset. Simultaneously, the validation accuracy also increases consistently, starting from
around 92% and stabilizing at approximately 94–95%, indicating good generalization on
unseen data.
The proximity of the training and validation accuracy curves suggests that the model
is not overfitting, maintaining a healthy balance between learning the training data and
performing well on the validation set. This steady convergence and the eventual flattening
of both curves imply that the model is reaching its optimal learning potential. Such a
performance outcome validates the robustness of the model architecture and the
effectiveness of the preprocessing, augmentation, and training strategies employed. This
accuracy progression supports the reliability of the system for practical applications in
detecting plant diseases with high precision.
The table above presents a sample output of the model’s dual functionality—disease
classification and affected area estimation—on various plant leaf images. Each row
corresponds to a different test image, detailing the predicted disease class and the percentage
of the leaf area determined to be infected.
For the image Tomato Late_blight_1234.JPG, the model correctly identified the
disease as Tomato Late Blight and estimated that approximately 18.56% of the leaf area is
affected.
In the case of Apple Cedar_apple_rust_0164.JPG, the predicted class was Apple
Cedar Apple Rust, with an affected area of 12.45%, highlighting a moderate level of
infection.
For Grape Black_rot_3438.JPG, the model predicted Grape Black Rot with a
higher severity level of 23.78%, indicating more extensive damage.
Lastly, Potato Healthy2.JPG was identified as Potato Healthy, with an affected area of
0.00%, confirming that the leaf is free from any visible disease symptoms.
These results demonstrate the system’s ability to not only recognize the type of
disease with high accuracy but also to quantify the severity by calculating the infected
portion of the leaf. This combination of qualitative and quantitative analysis is essential for
supporting early diagnosis, guiding treatment decisions, and enabling precision agriculture
practices.
CHAPTER-6
CONCLUSION AND FUTURE SCOPE
6.1 CONCLUSION
The study successfully demonstrates the effectiveness of integrating deep learning and
image processing techniques to automate plant disease detection and estimate the severity
of infection. With agriculture forming the backbone of many economies and food systems,
the need for intelligent, data-driven, and scalable solutions to combat crop loss has never
been more pressing. This project addresses that need by developing a dual-function system
capable of classifying plant diseases and quantifying the percentage of the leaf area affected.
A robust CNN-based classification model was implemented using the MobileNetV2
architecture, which was fine-tuned on the PlantVillage dataset. The model achieved
impressive accuracy—98.02% on the training set and 97.82% on the validation set—
demonstrating its capability to learn meaningful patterns across 38 distinct disease and
healthy classes. Data augmentation, image normalization, and transfer learning were
essential in achieving high generalization performance while optimizing resource usage.
Beyond classification, the system incorporates an image processing pipeline using
OpenCV to analyze the physical spread of disease on a given leaf. By converting the image
to HSV color space, applying adaptive thresholding, and using morphological
transformations, the infected regions were effectively isolated and quantified. The resulting
area was expressed as a percentage of the total leaf surface, offering valuable insight into
the severity of the condition. Although the area detection approach occasionally
misinterprets shadows or natural discolorations as disease, it provides a solid foundation for
further refinement.
Overall, the project demonstrates how modern AI technologies can contribute to
precision agriculture by enhancing early disease diagnosis and providing actionable
insights. The developed system is scalable, user-friendly, and can be further integrated into
mobile or IoT-based platforms for real-time field usage. By empowering farmers with
timely and accurate disease detection, this work has the potential to minimize crop losses,
reduce pesticide misuse, and promote sustainable farming practices.
CHAPTER-7
REFERENCES
[1] https://ieeexplore.ieee.org/document/93992
[2] https://ieeexplore.ieee.org/document/71551
[3] https://ieeexplore.ieee.org/document/84385
[4] Mohanty, S.P., Hughes, D.P., & Salathé, M. (2016). Using deep learning for image-
based plant disease detection. Frontiers in Plant Science, 7, 1419.
https://doi.org/10.3389/fpls.2016.01419
[5] Ferentinos, K. P. (2018). Deep learning models for plant disease detection and diagnosis.
Computers and Electronics in Agriculture, 145, 311–318.
https://doi.org/10.1016/j.compag.2018.01.009
[6] PlantVillage Dataset. PlantVillage - Disease Classification Dataset. Available at:
https://www.kaggle.com/emmarex/plantdisease
[7] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L.C. (2018). MobileNetV2:
Inverted Residuals and Linear Bottlenecks. Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 4510–4520.
[8] Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for
Biomedical Image Segmentation. In International Conference on Medical Image
Computing and Computer-Assisted Intervention (pp. 234–241). Springer.
[9] OpenCV Documentation. OpenCV: Open Source Computer Vision Library. Available
at: https://docs.opencv.org/
[10] TensorFlow Developers. (2023). TensorFlow API Documentation. Available at:
https://www.tensorflow.org/api_docs
[11] Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-
Scale Image Recognition. arXiv preprint arXiv:1409.1556.
[12] Sibiya, M., & Sumbwanyambe, M. (2019). A survey on deep learning techniques for
image-based plant disease detection. Computers and Electronics in Agriculture, 172,
105–130.