Computers and Electronics in Agriculture 198 (2022) 107093

journal homepage: www.elsevier.com/locate/compag

Deep diagnosis: A real-time apple leaf disease detection system based on deep learning

Asif Iqbal Khan a,*, S.M.K. Quadri b, Saba Banday c, Junaid Latief Shah d
a Dept. of Computer Science, Jamia Millia Islamia, New Delhi, India
b Head, Dept. of Computer Science, Jamia Millia Islamia, New Delhi, India
c Dept. of Pathology, Sher-i-Kashmir University of Agricultural Sciences, Shalimar, India
d Higher Education Department, J&K, India

ARTICLE INFO

Keywords: Deep learning; Apple leaf disease detection; Convolutional Neural Network; Object detection

ABSTRACT

Background and objective: Diseases and pests are among the major reasons for the low productivity of apples, which in turn results in huge economic losses to the apple industry every year. Early detection of apple diseases can help in controlling the spread of infections and ensure better productivity. However, early diagnosis and identification of diseases is challenging due to many factors, such as the presence of multiple symptoms on the same leaf, non-homogeneous backgrounds, differences in leaf colour due to the age of infected cells, and varying disease spot sizes.
Methods: In this study, we first constructed an expert-annotated apple disease dataset of suitable size consisting of around 9000 high quality RGB images covering all the main foliar diseases and symptoms. Next, we propose a deep learning based apple disease detection system which can efficiently and accurately identify the symptoms. The proposed system works in two stages. The first stage is a tailor-made lightweight classification model which classifies the input images into diseased, healthy or damaged categories. The second stage (detection stage) starts only if a disease is detected in the first stage, and performs the actual detection and localization of each symptom in diseased leaf images.
Results: The proposed approach obtained encouraging results, reaching around 88% classification accuracy, and our best detection model achieved an mAP of 42%. The preliminary results of this study look promising even on small or tiny spots. The qualitative results validate that the proposed system is effective in detecting various types of apple diseases and can be used as a practical tool by farmers and apple growers to aid them in diagnosis, quantification and follow-up of infections. Furthermore, the work can in future be extended to other fruits and vegetables as well.

1. Introduction

Apples are cultivated worldwide and are among the most widely grown fruits in the world. The apple is one of the most productive fruits due to its high remedial and nutritive values. In India, the valley of Kashmir holds the largest share of apple production, with more than 75% of the total. Currently, around 160,000 ha of land in the Valley is under apple cultivation with an annual productivity of around 180,000 MTs (Directorate of Horticulture, 2021), of which an enormous proportion is exported to different parts of the world. However, diseases and pests cause huge economic loss to the apple industry every year. Diseases like Alternaria, Scab, and Mosaic remain a major threat to apple growers. In July 2013, Alternaria broke out in apple orchards and spread like wildfire, infecting more than 70% of the cultivars in many districts across the Valley. The disease resulted in extensive fruit fall and hence a decrease in fruit production (Bhat et al., 2015). The same disease attack was again reported in 2018 (Javed Iqbal,). According to domain experts and scientists, one of the main reasons for the spread of this catastrophic disease was the lack of a proper disease forecasting and detection system (Bhat et al., 2015). In 2020, a severe scab infection broke out and more than 30 percent of the crop was affected by diseases despite the use of fungicides. The infection resulted in heavy crop wastage (Raashid Hassan, 2021).
An on-time detection system would detect the disease and prevent its spread to other plants, thus preventing substantial economic losses. Pests and diseases not only reduce the

* Corresponding author.
E-mail addresses: [email protected] (A.I. Khan), [email protected] (S.M.K. Quadri).

https://doi.org/10.1016/j.compag.2022.107093
Received 23 December 2021; Received in revised form 20 May 2022; Accepted 24 May 2022
Available online 31 May 2022
0168-1699/© 2022 Elsevier B.V. All rights reserved.

production but also adversely affect fruit quality. Therefore, timely detection of diseases is crucial for enhancing both the quality and quantity of apples. Identifying a disease correctly when it first appears would help the farmer take proper precautionary measures or apply just the right amount of pesticide, yielding both economic and environmental benefits.
Traditionally, plant disease diagnosis was done by domain experts visually examining the leaves. However, this practice is labour intensive and time-consuming (Dutot et al., 2013). The experts must also be proficient and have extensive knowledge of the various diseases, their symptoms and treatment. A satisfactory alternative to such a labour-intensive task is an automatic detection system that can detect plant diseases at an early stage. In this context, various computer vision and image processing techniques have been studied and developed over the years to detect and identify plant diseases. Dubey and Jalal (2012) introduced a technique for classification of apple fruit diseases. They first used K-Means clustering for defect segmentation, followed by a multi-class Support Vector Machine (SVM) for classification. Features such as Global Color Histogram (GCH), Color Coherence Vector (CCV), Local Binary Pattern (LBP) and Completed Local Binary Pattern (CLBP) were used for classification. Mokhtar et al. (2014) proposed an SVM-based technique to detect tomato diseases. Dandawate and Kokare (2015) and Raza et al. (2015) also used SVM-based approaches for classification of plant diseases. Chuanlei et al. (2017) proposed a pattern recognition based technique which used a region growing algorithm (RGA) to detect apple leaf diseases and achieved an accuracy of 90% on a database of 90 disease images.
Deep Learning is a sub-field of Machine Learning which has gained popularity in recent years. Deep Learning algorithms are a special type of Artificial Neural Network which extract high-level representations of data during training without any human intervention. Deep learning has recently solved many complex problems and has shown excellent performance in many computer vision and machine learning tasks such as image classification, object detection, speech recognition, voice recognition, natural language processing and medical imaging (Abbas et al., 2021). Due to their ability to learn features directly from images, deep learning algorithms like Convolutional Neural Networks (CNN) have found application in agriculture as well. They are now widely used in plant disease identification and diagnosis, for example for apples (Liu et al., 2018), bananas (Amara et al., 2017), cucumber (Kawasaki et al., 2015), tomato (Brahimi et al., 2017), multiple plant diseases (Prasanna Mohanty et al., 2016) and cassava (Ramcharan et al., 2017).
In 2017, Brahimi et al. introduced deep learning for classifying tomato diseases based on leaf images and achieved a classification accuracy of up to 99%, easily outperforming the conventional methods. Ferentinos (2018) trained different CNN architectures (AlexNet, VGG and GoogLeNet) on an open database containing 58 categories of plants or diseases. According to the experimental results, the VGG architecture achieved the best accuracy of 99.53%. Liu et al. (2018) proposed a deep model based on AlexNet (Krizhevsky et al., 2012) and GoogLeNet (Szegedy et al., 2015) for classification of apple leaf diseases. Zhang et al. (2019) proposed a deep convolutional neural network model (DCNN) for apple disease classification. The proposed model used one global average pooling layer instead of all the fully-connected layers and utilized a modified Softmax for improvement. Jiang et al. (2019) used an object detection algorithm, SSD with Inception module and Rainbow concatenation, to detect 5 types of apple diseases. The model was trained on a hybrid dataset (artificially generated and collected from the field), but the dataset is not publicly available for testing. Zhong and Zhao (2020) proposed a DenseNet-121 based approach for multi-classification of apple diseases and achieved an accuracy of 93.1%. However, the results were obtained on a small test set that contains only 3 diseases divided into 5 classes based on the intensity of the disease.
All the models and approaches proposed for apple disease identification discussed above have one common weakness: they have all been tested on single-symptom images, i.e. the images in the dataset contain only one type of disease. In reality, more often than not, a leaf can be infected with multiple diseases of varying intensities. Hence, there remains a lack of performance in real-field scenarios.

1.1. Challenges

Plant disease detection involves many challenges and complexities due to various factors. In this research, we try to overcome these challenges and propose a real-time and accurate end-to-end apple disease detection system.
Plant disease patterns vary with the season and other factors such as CO2 concentration, humidity, temperature, and water availability. These factors can remarkably affect disease development, as each disease may respond differently to these variations (Velasquez et al., 2018). Moreover, disease patterns vary significantly (high intra-class variation) due to factors like leaf morphology, non-homogeneous background, age of infected cells, differences in leaf colour and illumination during imaging. On the other hand, the visual symptoms of different diseases may sometimes appear similar (low inter-class variation) due to varying lighting conditions, aging etc.
The next challenge that we deal with is the detection of small spots during early stages. Detecting small objects is a difficult task, and most object detection algorithms have a tough time dealing with small objects (Nguyen et al., 2020). Disease spots come in a wide variety of shapes, sizes and colours and occur randomly on the leaf surface. In incipient form, these spots are usually small or even tiny, and with time they change colour, size and shape, making detection difficult for traditional detection algorithms.
Another challenge is the availability of an apple leaf disease dataset. Deep learning models are data hungry and require a huge amount of data for training. To the best of our knowledge, there is no sufficiently large dataset available that can be utilized for this research. There are a few apple leaf disease classification datasets available online, but those datasets cover only a few diseases and, since they are not annotated, they are not suitable for this research. So we built a sufficiently sized quality dataset covering all the major apple diseases and pests. Therefore, the contribution of this research is twofold.

a) First, we have created an expert-annotated apple disease dataset of suitable size consisting of high quality RGB images covering all the main foliar diseases and symptoms. A good quality dataset is the first important requirement for this research. Since there is no suitable dataset available right now, our first aim was to build a suitable dataset which can be used to train and test our models. The dataset contains diseased apple leaf images with different backgrounds, collected from various fields/orchards under different lighting conditions using different capturing devices.
b) Second, we present a deep learning based real-time leaf disease detection system to identify seven types of diseases and pests that affect apple plants, which can help fruit growers accurately identify various diseases on time and provide helpful recommendations.

2. Dataset

2.1. Apple leaf disease dataset

The dataset preparation consists of two parts: i) data collection: collection of diseased apple leaf images from various fields and orchards, and ii) data annotation: labelling of images as per their symptoms. Both these tasks are time consuming and resource intensive.
Data Collection: Our dataset preparation took a huge amount of human effort as well as material resources. First, it took many human resources to collect the images of diseased apple leaves by visiting various fields and


orchards in different seasons. The total data collection took around two years to complete. There are around seven varieties of apples and eight types of diseases and pests commonly found in the valley. For this study, we have considered only five commonly found diseases, viz. Scab, Alternaria, Apple Mosaic, Marssonina leaf blotch (MLB) and powdery mildew. We also made sure that our dataset contains leaf images from each variety. In addition, disease patterns change with time and other factors like temperature, rainfall and humidity, resulting in high intra-class variation. Some diseases only appear in a particular season and time; for example, rainy and humid weather is favourable for the development of Marssonina Leaf Blotch (MLB). Therefore, the disease appears very late, towards the end of July, and is more noticeable in August. As a result, a lot of time and effort went into waiting for and monitoring those diseases. In order to deal with these challenges, we enriched our dataset with as many variations of each disease as possible. We not only captured an image of an infected leaf but also monitored the leaf throughout the season and captured various images at different stages and ages.
Apart from the diseases mentioned above, we have also considered identifying other visual symptoms, for example insect attack and Necrosis. Necrosis means death of body cells or tissues. It is not a disease but a symptom of disease or stress due to injury, radiation or chemicals. The symptom can be dark watery spots or sometimes tan or black papery spots on leaves. Portions of the plant may appear yellow or wilted, indicating a severe systemic disease that leads to death of cells. The most common causes of leaf Necrosis are diseases like Alternaria and Mosaic. However, weather-related problems, insect activity and nutritional deficiency may also result in death of leaf cells or tissues.
We collected more than 6000 leaf images in two years. To reflect the real world scenario, we took pictures with different cameras, including mobile phones of different brands. The images have non-homogeneous backgrounds and were taken at different disease maturity stages and under different lighting conditions. Fig. 1 shows some of the sample images from our collected dataset. After image capturing, we manually inspected these images and removed all duplicate, low quality and damaged images, as well as those which were beyond recognition due to severe disease attack. Finally we were left with 5201 images ready for

Fig. 1. Sample images from the prepared dataset. Leaves with visual symptoms: a) Scab b) Alternaria c) Powdery Mildew d) MLB e) Apple Mosaic f) Multiple g) Necrosis h) Insect i) Healthy.


annotation, of which 2739 images had a single symptom/disease and 2462 images had multiple symptoms. Table 1 lists the class-wise breakup of images.

Table 1
Number of images of each disease in our apple disease dataset.
Disease/Symptoms        Number of Images
Scab only               639
Alternaria only         418
Mosaic only             312
MLB only                371
Powdery Mildew only     320
Necrosis only           125
Insect only             221
Multiple symptoms       2462
Healthy                 221
Damaged                 112
Total                   5201

Data Annotation: Data annotation is a vital step in the success of machine learning models. In data annotation, each spot/object in an image is manually labelled with its tag or label. There are several tools available for image annotation, and in this study we used LabelImg (Tzutalin, 2015). LabelImg is a graphical image annotation tool written in Python which generates annotations as XML files in PASCAL VOC format. It also supports the YOLO and CreateML formats (Tzutalin, 2015). This labelled data is used by supervised machine learning algorithms during training.
Data annotation is a manual process and thus requires human resources, time and domain knowledge. Any fault or error in labelling can lead to lower success rates of the model. Domain knowledge is essential for labelling: through data labelling, the knowledge of a Subject Matter Expert (domain expert) with years of experience is instilled into machine learning models. Therefore, for this study, data labelling was performed under the supervision of two domain experts who monitored the whole process, and the dataset was used only after each annotation had been cross-checked by these experts.
Data Augmentation: Data augmentation is needed for supplementing and enriching the training data. After collecting and annotating around 5000 images, various data augmentation operations were applied to make the dataset richer and more diverse. This helps in increasing the generalizability of deep models and in overcoming the problem of overfitting. With more images and the diversity achieved by data augmentation techniques, the model can learn as many relevant features as possible during the training process, thus avoiding overfitting and achieving higher performance. There are many augmentation options available, and the ones we used are:

i) Rotation: Randomly rotate images clockwise or anti-clockwise up to a certain angle.
ii) Flip: Randomly flip the images vertically or horizontally.
iii) Noise: Randomly inject noise (salt and pepper) into some images.
iv) Brightness: Increase/decrease the brightness of random images by a certain amount.
v) Exposure: Adjust the gamma exposure (darker or brighter) of an image up to a certain amount.
vi) Cut-out: Remove sections of an image to simulate occlusion.

Apart from these image-level augmentations, we also applied a few bounding-box-level augmentation operations. To visualize how these transformations affect the original images, a plot showing all the transformations next to the original image for comparison is shown in Fig. 2.
After applying data augmentation, our final dataset comprises 9000 images fully annotated with 9 classes. The prepared dataset is the first of its kind in terms of size and scale. A sufficiently sized dataset not only helps the current study but would potentially be helpful for future research in the area.

3. Methodology

Object detection is a complex task due to many challenges, and disease detection from leaf images comes with its own set of challenges. To overcome such challenges we divided the overall disease detection process into two main stages: a) Initial Classification and b) Detection. The overall pipeline is shown in Fig. 3 and the overall structure of our detection system is presented in Fig. 4. The Initial Classification stage first classifies the input image into one of the following ten categories:

i) Healthy: If the leaf is healthy and free from any disease.
ii) Scab: If the leaf is infected with Scab only.
iii) Alternaria: If the leaf is infected with Alternaria only.
iv) Powdery-Mildew: If the leaf is infected with Powdery-Mildew only.
v) Mosaic: If the leaf is infected with Mosaic virus only.
vi) MLB: If the leaf is infected with Marssonina Leaf Blotch only.
vii) Multiple: If the leaf is infected with multiple diseases, i.e. the leaf has multiple visual symptoms on it.
viii) Necrosis: If the leaf has dead tissues.
ix) Insect: If the leaf is attacked by insect/s.
x) Damaged: If the leaf or a major portion of the leaf is damaged due to severe disease intensity, rendering it unrecognizable.

If the input image is classified as 'Healthy' or 'Damaged', the process stops without going further; otherwise the image is passed to the Detection stage. The reason for using a classification stage rather than using the detection stage directly is to improve performance and reduce overall processing time, as overlapping symptoms deform a leaf area and result in wrong detections. Another reason to use a two-stage system is to deal with low inter-class variation. Many diseases, if not treated on time, often lead to death of cells and tissues (Necrosis) and finally damage the whole leaf, making it difficult to diagnose the cause or disease. This is where the low inter-class variation situation arises. The classification stage is used to filter out any such scenario which might confuse our detector and result in an inaccurate diagnosis.
Detection stage processing starts only if a disease is detected in the first stage. This way all the invalid inputs (a healthy leaf image with no visual symptoms, or a severely damaged leaf) are filtered out. There is no need to go through the detection phase if there are no symptoms at all or the leaf area is severely damaged and cannot be diagnosed. The individual stages are discussed in detail below.

3.1. The initial classification stage

The 'Initial Classification stage' of our proposed system consists of a classification model which classifies an input image into one of the 10 classes discussed above. The classification model is a lightweight Convolutional Neural Network architecture tailored for apple disease classification. It is based on the Xception architecture (Chollet, 2017). Xception, which stands for Extreme version of Inception (Szegedy et al., 2016) (its predecessor model), is a 71-layer-deep CNN architecture pre-trained on the ImageNet dataset. Xception uses depthwise separable convolution layers with residual connections instead of classical convolutions. Depthwise separable convolution replaces the classic n × n × k convolution operation with a 1 × 1 × k point-wise convolution operation followed by a channel-wise n × n spatial convolution operation. This reduces the number of operations to roughly 1/k + 1/n² of the classic convolution when the layer has k input and k output channels. Residual connections are 'skip connections' which allow gradients to flow through a network directly, without passing through non-linear activation functions, thus avoiding the problem of vanishing gradients. In residual connections, the output of a series of weight layers is added to the original input


Fig. 2. Effect of different augmentation operations on original image.
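
The image-level augmentations listed in Section 2.1 and illustrated in Fig. 2 can be sketched with NumPy alone. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names and parameter choices (noise fraction, brightness delta, gamma value, cut-out size) are introduced here for illustration, rotation is restricted to multiples of 90 degrees to avoid an interpolation dependency, and the bounding-box-level operations are omitted.

```python
import numpy as np

def flip(img, horizontal=True):
    """Mirror the image left-right (horizontal) or top-bottom (vertical)."""
    return img[:, ::-1].copy() if horizontal else img[::-1].copy()

def rotate90(img, k=1):
    """Rotate by k * 90 degrees (assumed stand-in for small-angle rotation)."""
    return np.rot90(img, k).copy()

def salt_and_pepper(img, fraction, rng):
    """Set roughly `fraction` of the pixels to pure black or pure white."""
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < fraction / 2] = 0        # pepper
    out[mask > 1 - fraction / 2] = 255  # salt
    return out

def brightness(img, delta):
    """Add `delta` to every pixel and clip to the valid 8-bit range."""
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)

def gamma(img, g):
    """Gamma adjustment: g < 1 brightens, g > 1 darkens."""
    scaled = (img.astype(np.float64) / 255.0) ** g
    return np.round(scaled * 255.0).astype(np.uint8)

def cutout(img, size, rng):
    """Zero a random size x size square to simulate occlusion."""
    out = img.copy()
    h, w = img.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    out[top:top + size, left:left + size] = 0
    return out
```

Each function takes an H × W × 3 `uint8` array and returns a new array, so the operations can be chained or applied with random parameters drawn from `np.random.default_rng()`. Any operation that moves pixels (flip, rotation) would also need the matching transform applied to the bounding-box annotations.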

Fig. 3. Pipeline of overall methodology.
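
The gating logic of the two-stage pipeline in Fig. 3 can be sketched as follows. This is an illustrative sketch under assumptions: `classify` and `detect` are hypothetical callables standing in for the Stage 1 classifier and the Stage 2 detector, and the `Detection` record and `diagnose` function are names introduced here, not part of the authors' code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Stage 1 categories for which the detector is skipped (see Section 3).
STOP_CLASSES = {"Healthy", "Damaged"}

@dataclass
class Detection:
    label: str                       # symptom name, e.g. "Scab"
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
    score: float                     # detector confidence in [0, 1]

def diagnose(image, classify: Callable, detect: Callable) -> Tuple[str, List[Detection]]:
    """Run Stage 1 and invoke Stage 2 only when symptoms need localizing."""
    category = classify(image)
    if category in STOP_CLASSES:
        return category, []          # nothing to localize, detector not called
    return category, detect(image)
```

Keeping the detector behind this check means the more expensive model runs only on images the classifier has flagged, which is the processing-time argument made in Section 3.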

and then passed through a non-linear activation function, as shown in Fig. 5. We used Xception as the base model and added a Global Max-pooling layer followed by a dropout layer and a final fully-connected layer at the end. The resultant model has 20,881,970 parameters in total, of which 20,827,442 are trainable and 54,528 are non-trainable. Architecture details, layer-wise parameters and output


Fig. 4. Overall structure of our detection system.

shape of the classification model are shown in Table 2.

Fig. 5. Residual Connection.

3.2. The detection stage

Object detection is the task of classifying and localizing objects in an image. In recent years, the rapid advances of deep learning techniques have greatly accelerated the momentum of object detection. With advanced deep learning models and increasing computing power, the performance of object detectors has greatly improved, achieving many breakthroughs. Object detection approaches are broadly classified into two types: region proposal based approaches, also known as two-stage approaches, and single-stage approaches based on regression or classification (Nguyen et al., 2020). Single-stage object detection approaches are generally faster and hence preferred by many people.
Although object detection has come a long way in the past decade or so, it still has many key challenges to overcome. Some of these challenges include high intra-class variation, low inter-class variation and model efficiency. To overcome these challenges, researchers have come up with innovations like novel region proposals, multi-scale feature maps, divided grid cells and new loss functions. These innovations have increased the performance of object detection algorithms. However, one challenge which has attracted attention recently is the detection of small objects. Most of the state-of-the-art object detectors, both single-stage and two-stage, have struggled with detecting small objects.
Our problem of disease detection is an object detection problem in which different symptoms have to be identified and localized. These symptoms can be of different sizes, varying from a few millimetres to a few centimetres, thus making their detection difficult for most of the state-of-the-art detectors. Nguyen et al. (2020) ran different models with different backbones on multi-scale objects to find out which model and backbone network are suitable for small objects. They observed that Faster R-CNN with a ResNet-101 + FPN backbone achieved the top mAP (mean Average Precision) of 41.2% on a small object dataset, whereas among single-stage approaches Yolov3 608 × 608 with Darknet-53 obtained 33.1% mAP. As per these results, Faster RCNN seems to be the ideal choice for our problem, but since Yolov4 (Bochkovskiy et al., 2020) has been released, which improved the accuracy over Yolov3 by around 10% and also decreased the inference time by a margin, we decided to try both approaches. Besides, we also chose another well-known detection model, EfficientDet (Tan et al., 2020), which has achieved better results than Yolo models on the MS COCO dataset. All three algorithms are among the best detection techniques available.

(a) YOLO v4

You Only Look Once (YOLO) is a family of one-stage object detection techniques first introduced in 2016. Yolov4 is a recent member of the family and an enhanced version of its predecessor Yolov3. Yolo v4 is presently one of the best real-time object detection algorithms both in terms of Mean Average Precision (mAP) and speed. YOLO v4 combines many universal features (features which are independent of the dataset, models, and type of task) such as Weighted-Residual-Connections (WRC), Self-adversarial-training (SAT), Cross mini-Batch Normalization (CmBN), Cross-Stage-Partial-connections (CSP), Mish activation function, Mosaic data augmentation, DropBlock regularization, and

Table 2
Details of CNN Architecture.
Layer (type)            Output Shape    Param #
Xception (Model)        5 × 5 × 2048    20861480
GlobalMaxPooling2D      2048            0
dropout (Dropout)       2048            0
dense (Dense)           10              20490
Total Parameters: 20,881,970.
Trainable Parameters: 20,827,442.
Non-trainable Parameters: 54,528.
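
The totals in Table 2 can be checked by hand: a fully-connected layer with 2048 inputs and 10 outputs has 2048 × 10 weights plus 10 biases, and the pooling and dropout layers add no parameters. The short sketch below redoes that arithmetic. The variable names are introduced here for illustration, and the Xception base count is taken from Table 2 rather than recomputed.

```python
XCEPTION_BASE = 20_861_480      # "Xception (Model)" row of Table 2
FEATURES, CLASSES = 2048, 10    # pooled feature size and number of categories
NON_TRAINABLE = 54_528          # as reported under Table 2

dense = FEATURES * CLASSES + CLASSES   # weights + biases of the final layer
total = XCEPTION_BASE + dense          # pooling and dropout contribute 0
trainable = total - NON_TRAINABLE

print(dense, total, trainable)
```

The three values agree with the "dense (Dense)" row and with the total and trainable counts reported under Table 2.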


CIoU loss (Bochkovskiy et al., 2020).

(b) Faster RCNN

Faster R-CNN (Ren et al., 2015) is a region-based object detector which is an improved version of its predecessor Fast-RCNN (Girshick, 2015). It first runs an image through a backbone network (CNN). Then, on the last feature map of the convolution layers, a fully convolutional network called the region proposal network (RPN) is trained, which outputs a set of bounding boxes along with their objectness scores, which determine the likelihood of an object. In our experiment, we used a ResNet101 + FPN based Faster RCNN. ResNet101 is a 101-layer-deep Convolutional Neural Network. It obtains the best speed/accuracy trade-off. The Feature Pyramid Network (FPN) (Lin et al., 2017) generates a pyramid of feature maps at multiple scales, useful for detecting small objects.

(c) EfficientDet

EfficientDet is another state-of-the-art detection model by the Google Brain Team which has achieved slightly better results on the MS COCO dataset but at the cost of speed. EfficientDet uses a weighted bi-directional feature pyramid network (BiFPN) for easy and fast multi-scale feature fusion and a single compound scaling factor that uniformly governs the width, depth and resolution for all backbone, feature network and prediction networks at the same time (Tan et al., 2020). EfficientDet uses EfficientNet as its backbone network, a state-of-the-art CNN developed by the same Google Brain Team. EfficientNet is small and efficient, making it ideal for detectors.

3.3. Implementation and training

In this section, we present the details of our experimental setup, training and datasets used for evaluation.
Classification Model (Stage 1): The classification model Xception was implemented in Keras on top of Tensorflow on a workstation with 16 GB RAM, an Intel(R) Xeon(R) @ 2.30 GHz processor and a Tesla K80 12 GB graphics card. Model parameters were initialized through transfer learning and then re-trained on the prepared dataset of around 9000 images (after augmentation) using stochastic gradient descent (SGD) with a learning rate of 0.0001, input size 460, batch size of 16 and 100 epochs. The model was further fine-tuned using the Adam optimizer with a learning rate of 0.00005. All the hyper-parameters and their values are given in Table 3 below.
Detection Stage (Stage 2): To train the detection models, the dataset was split into 70% train, 20% validation and 10% test sets. The models were trained on the training set, and finally tested on the test set. The total number of instances of each class in our training set is given in Table 4. Our first model, Yolov4, was implemented using the Darknet framework. Darknet is a custom open source neural network framework

Table 4
Number of instances of each disease in training set.
Disease/Symptoms        Number of Instances
Scab                    5910
Alternaria              7674
Mosaic                  1716
MLB only                1180
Powdery Mildew          1795
Necrosis                2490
Insect                  881

labelmap file is also generated which maps the IDs to string labels (class names). We used Roboflow (Roboflow, 2020) to convert the dataset into the required format. Roboflow provides online services to annotate and convert annotations to or from any other format. We trained Yolov4 for 2000 epochs with an input size of 416 × 416, after which the model performance stopped showing any improvement.
Our second model, Faster-RCNN, was implemented in Detectron2. Detectron2 is Facebook AI Research's (FAIR) object detection library written in PyTorch. Detectron2 is flexible and extensible, allowing high-quality implementations of state-of-the-art object detection algorithms. We used Detectron2's implementation of Faster-RCNN and trained the model for 5000 iterations, with an initial learning rate of 0.01 for the first 1000 iterations and then 0.001 for the next 4000 iterations. The model was trained using the stochastic gradient descent (SGD) optimization method with a gamma value of 0.1, momentum of 0.9, and weight decay of 0.0001.
Our final model, EfficientDet-D0, was implemented in Keras and Tensorflow2 and trained for 11,000 steps with batch size 16, input size 600, and gamma value 0.1. The optimization method used was the momentum optimizer with a momentum value of 0.899. All the hyper-parameters and their values for each model are given in Table 5 below.

4. Detection of small spots

One of the most challenging and important problems in object detection is the detection of small objects. This single problem has a significant impact on the performance of detection algorithms. For example, the mAP of EfficientDet on large objects is 51%, whereas on small objects the performance drops more than fourfold to just 12%. Similarly, for Yolo v4 the mAP for large and small objects is 53% and 26% respectively (Bochkovskiy et al., 2020). To overcome this problem, we used the following strategies to improve the performance of detection models on small spots.

• Increased image capture resolution: Higher resolution means rich and detailed features.
• Image tiling: Tiling helps in zooming in on small spots without increasing the input size.
• Small strides for convolution operation: small stride means less
written in C and CUDA by Joseph Redmon (Redmon Joseph, 2013). As a
pixels are skipped during feature extraction.
first step, the annotations done using LabelImg (xml files) have to be
• Tune model anchors: pre-set anchor boxes can be suboptimal for
converted in Darknet format. In this format, a text file is generated for
training data. It is good practice to custom tune them as per the
each image which contains the annotations and a numeric IDs. A
requirement.
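The image tiling strategy listed above can be illustrated with a minimal sketch. It only computes the crop boxes for a grid of tiles; the function name `tile_boxes`, the `overlap` parameter and the 1248x832 image size are assumptions made for illustration and are not taken from the paper. The 416-pixel tile size mirrors the Yolov4 input size reported in Section 3.3.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int, overlap: int = 0) -> List[Box]:
    """Return crop boxes that cover a width x height image with square tiles.

    Hypothetical helper: tiles are laid out on a regular grid and the last
    tile in each row/column is shifted back so it stays inside the image.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("tile must be positive and 0 <= overlap < tile")
    step = tile - overlap

    def starts(size: int) -> List[int]:
        if size <= tile:
            return [0]  # a single (possibly smaller) tile covers this axis
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # final tile is flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


# Example with an assumed 1248x832 photo split into 416x416 tiles.
boxes = tile_boxes(1248, 832, 416)
print(len(boxes), boxes[0], boxes[-1])
```

Each tile can then be passed to the detector at its native input size, so a small spot occupies a larger share of the network input than it would in the down-scaled full image. Detections would still need to be mapped back to full-image coordinates by adding the tile offset.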

Table 3
Hyperparameters used for Classification Model.
Hyper Parameter   Value
Weights           ImageNet
Learning Rate     1e-4
Epochs            100
Batch Size        16
Decay Rate        0.5
Patience          5
Dropouts          0.4
Activations       ReLU for Conv and Softmax for Classification
Loss Function     Categorical CrossEntropy

Table 5
Detection models and their hyper-parameter values.
Method            Momentum   Decay     Gamma   Lr      Batch Size   Training Time   Stepsize   Input Size
EfficientDet-D0   0.89       0.0005    0.1     0.001   16           12hrs           11,000     600
Yolov4            0.843      0.00036   –       0.003   8            8hrs            2000       416
Faster R-CNN      0.9        0.0001    0.1     0.001   16           6hrs            5000       600

5. Results and discussions

5.1. Quantitative results

This section presents the performance evaluation of both the classification and detection models.

6. Performance of classification model

We evaluated the proposed classification model on a hold-out test set. The multi-class classification results were recorded and then average numbers were calculated.


The performance of the model is presented in the form of a confusion matrix (CM) in Fig. 6. Overall accuracy, precision, recall and F-measure, computed by the formulae given below, are summarized in Table 6.

\[ \text{Accuracy} = \frac{\text{No. of images correctly classified}}{\text{Total no. of images}} \]

\[ \text{Precision} = \frac{\sum \text{TP}}{\sum \text{TP} + \sum \text{FP}} \]

\[ \text{Recall} = \frac{\sum \text{TP}}{\sum \text{TP} + \sum \text{FN}} \]

\[ \text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]

Here TP, FP and FN denote true positives, false positives and false negatives respectively.

Table 6
Class-wise results of classification model.
Disease      Precision   Recall   F1-Score   Actual/Classified
Alternaria   0.90        0.82     0.86       116/105
Damaged      0.86        0.60     0.71       30/21
Healthy      0.99        0.96     0.97       90/87
Insect       0.56        0.82     0.67       28/41
MLB          0.85        0.93     0.89       91/100
Mosaic       0.58        0.56     0.57       45/43
Multiple     0.76        0.73     0.75       226/219
Necrosis     0.70        0.70     0.70       23/23
PWM          0.81        0.92     0.86       75/85
Scab         0.83        0.83     0.83       212/212
Average      0.784       0.787    0.781
Overall Accuracy 81.09%
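As a worked illustration of the formulae above, the following sketch recomputes the Alternaria row and the "Average" row of Table 6. The true-positive count of 95 is an assumption inferred from the rounded precision and recall values (it is not stated in the paper), and the helper name `precision_recall_f1` is hypothetical. The per-class lists are copied from Table 6 in row order.

```python
from statistics import mean


def precision_recall_f1(tp: int, fp: int, fn: int):
    """Per-class precision, recall and F-measure from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Alternaria row of Table 6: 116 actual images, 105 classified as Alternaria.
# ASSUMPTION: 95 true positives, inferred from the rounded table values.
actual, classified, tp = 116, 105, 95
p, r, f1 = precision_recall_f1(tp, fp=classified - tp, fn=actual - tp)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# prints: precision=0.90 recall=0.82 f1=0.86

# The "Average" row is consistent with an unweighted (macro) mean over classes.
precisions = [0.90, 0.86, 0.99, 0.56, 0.85, 0.58, 0.76, 0.70, 0.81, 0.83]
recalls = [0.82, 0.60, 0.96, 0.82, 0.93, 0.56, 0.73, 0.70, 0.92, 0.83]
f1_scores = [0.86, 0.71, 0.97, 0.67, 0.89, 0.57, 0.75, 0.70, 0.86, 0.83]
macro = tuple(round(mean(v), 3) for v in (precisions, recalls, f1_scores))
print(macro)
```

Because the macro mean weights every class equally, small classes such as Insect (28 images) and Necrosis (23 images) pull the average down as much as large classes such as Scab (212 images).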
The aforementioned metrics are among the most widely used measures of classification performance. The classification model achieved an overall accuracy of 81.09%, with average precision and recall of 78.4% and 78.7% respectively. The performance of a classifier can drop significantly when it has to separate many classes with little variation between them. Leaf images infected with different diseases at different intensities can confuse the classification model and thus lower its performance. For example, the confusion matrix in Fig. 6 shows that many Alternaria examples have been classified as Necrosis Blotch. This can happen when Alternaria spots turn into tan or black papery spots over time, making them look like another symptom such as Necrosis. Similarly, compared to other classes, the model is more prone to confusing Multiple diseases with Damaged, because multiple overlapping spots damage the leaf tissue.

Fig. 6. Confusion matrix of our classification results.

If we combine all the single diseases into a single Diseased class, the overall accuracy increases significantly. We slightly modified the same classification model and fine-tuned it for four classes only. After fine-tuning, the model was tested on a test set comprising 340, 226, 90 and 30 images belonging to the ‘Diseased’, ‘Multiple’, ‘Healthy’ and ‘Damaged’ classes respectively. The confusion matrix is presented in Fig. 7. It can be seen that the overall accuracy of the model increased from 81.09% to 87.9%. In addition, class-wise precision, recall, F-measure, and accuracy are given in Table 7.

Table 7
Class-wise results of 4-class classification model.
Disease    Precision   Recall   F1-Score   Actual/Classified
Diseased   0.91        0.88     0.89       340/327
Multiple   0.80        0.89     0.85       226/251
Healthy    0.98        0.94     0.96       90/87
Damaged    0.86        0.60     0.71       30/21
Average    0.88        0.82     0.85
Overall Accuracy 87.9%

To assess the proposed model against alternatives, we trained three other well-known CNN models, Inception-V2 (Ioffe and Szegedy, 2015), Mobilenet-V2 (Chu et al., 2020) and NasNet mobile (Zoph et al., 2018), on the same dataset and compared their results. The results presented in Table 8 show that the Xception model, which relies on depth-wise separable convolutions and multidimensional feature extraction, achieved an accuracy of 81.09% and outperformed the other three models by about 3 to 7 percentage points.

Table 8
Performance comparison of different classification models.
Model              Xception   Mobilenet-V2   Inception-V2   NasNet
Overall Accuracy   81.09%     74.2%          77.21%         78.2%

7. Performance of detection models

We evaluated the performance of the detection models on a test set from our prepared Apple Disease dataset. The test set comprises around 10% of the total images. We used the mean Average Precision (mAP) metric, the main evaluation indicator for detection models. The test results of all the detection models are presented in Table 9. Out of the three detection models, Faster-RCNN with mAP of 42.01% outperformed EfficientDet and Yolov4. After Faster-RCNN, the next best model was Yolov4 with mAP of 41.1%, and EfficientDet with mAP of 32.1% came third. According to the results, all three models achieved their best AP score for the Powdery mildew (PWM) category, while Scab is the worst performing category. Scab disease spots vary immensely in shape and size. They can be black, olive-green or sometimes purple in colour, and as the number of spots grows they can turn sooty. PWM, on the other hand, appears as white powder that very often covers the whole leaf surface and is thus easily identifiable.

Table 9
Class-wise Test results (AP) of our three detection models.
Disease      EfficientDet   Yolov4   Faster-RCNN
Alternaria   39.75          53.75    58.1
Insect       42.6           55.6     57.6
MLB          19.4           28.4     30.4
Mosaic       26.3           32.3     31.3
Necrosis     23.8           30.8     32.1
PWM          56.1           58.4     59.2
Scab         16.9           27.9     25.9
mAP          32.1           41.1     42.01
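Since mAP is the mean of the class-wise AP values, the relationship between the rows of Table 9 can be sketched as follows. The AP values are copied from Table 9; the dictionary layout and variable names are assumptions for illustration. Means recomputed this way from the rounded class-wise values need not match the reported mAP row to the last digit.

```python
from statistics import mean

CLASSES = ["Alternaria", "Insect", "MLB", "Mosaic", "Necrosis", "PWM", "Scab"]

# Class-wise AP values copied from Table 9, in the order of CLASSES.
CLASS_AP = {
    "EfficientDet": [39.75, 42.6, 19.4, 26.3, 23.8, 56.1, 16.9],
    "Yolov4": [53.75, 55.6, 28.4, 32.3, 30.8, 58.4, 27.9],
    "Faster-RCNN": [58.1, 57.6, 30.4, 31.3, 32.1, 59.2, 25.9],
}

for model, aps in CLASS_AP.items():
    m_ap = mean(aps)  # mAP = unweighted mean of the per-class AP values
    best = CLASSES[aps.index(max(aps))]
    worst = CLASSES[aps.index(min(aps))]
    print(f"{model}: mAP={m_ap:.2f}, best={best}, worst={worst}")
```

For all three models the best class is PWM and the worst is Scab, which agrees with the discussion above.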

7.1. Qualitative results

We evaluated the performance of our best detection model for all categories in our Apple Disease dataset. Our detector is able to detect various complex symptoms effectively. It shows excellent results due to its robustness in dealing with objects of different shapes, scales, colours, etc. Fig. 8 shows some images accurately predicted by our proposed detector.

From the qualitative results, it is evident that our model has been very accurate in detecting various diseases. However, the quantitative results are less than ideal. Faster-RCNN has performed slightly better than the other two models, but it has achieved a low mAP and we can see a lack of performance in some categories. This is because evaluating object detection models is not as straightforward as evaluating classification models. The reason is that an image can contain many objects belonging to different classes. So, in order to evaluate an object detection model, three things have to be verified: object class, corresponding location (bounding box) and confidence. First, all the objects in an image have to be found, and then it has to be verified that all the found objects belong to their correct classes. In a detection task, where different objects overlap and come in different shapes and sizes, mAP can be very misleading, as we have observed in our experiments.

Undoubtedly, mAP successfully summarizes the performance of an object detection model in one number, but unravelling different errors from mAP is challenging because all forms of errors, such as misclassification, mislocalization or duplicate detection, contribute to false positives. These errors are not the same but have an equal effect on mAP. For example, misclassification means the model is not generalizing well and is a serious performance issue, while duplicate detection means the model is able to detect objects but does so redundantly. In tasks where the number of detected objects is not important, duplicate detections can be ignored.

In order to understand the quantitative performance of our model, we analysed its results and found many duplicate detections, especially for Scab disease. It is the same category for which the AP score is low compared to other categories. Fig. 9 shows some examples of duplicate detections.

Another challenge that we were dealing with was detecting small spots. All three of our detection models performed well in detecting spots of moderate to large size; however, Faster-RCNN outperformed the other two models in detecting small spots as well. Fig. 10 presents the comparison of all three models in detecting small spots from leaf images. EfficientDet performed well on medium and large sized spots but missed most of the small spots, as shown in Fig. 10. The performance of Yolov4 on small spots was better than EfficientDet but not as good as Faster-RCNN.

Fig. 7. Confusion matrix of our 4-class classification results.
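The four-class grouping behind Fig. 7, and the way the classification stage gates the detection stage, can be sketched as a simple label mapping. This is a minimal sketch under stated assumptions: the set of single-disease labels is taken from the class names in Table 6, and treating both 'Diseased' and 'Multiple' as triggers for the detection stage is an assumption, since the paper only says that detection starts when a disease is found. The function names are hypothetical.

```python
# Single-disease class names as they appear in Table 6 (assumed grouping).
SINGLE_DISEASES = {"Alternaria", "Insect", "MLB", "Mosaic", "Necrosis", "PWM", "Scab"}
PASS_THROUGH = {"Multiple", "Healthy", "Damaged"}


def to_four_class(label: str) -> str:
    """Map a 10-class label to the 4-class scheme used in Fig. 7."""
    if label in SINGLE_DISEASES:
        return "Diseased"
    if label in PASS_THROUGH:
        return label
    raise ValueError(f"unknown label: {label!r}")


def needs_detection(stage1_label: str) -> bool:
    """Return True when the stage-1 result should trigger the detection stage.

    ASSUMPTION: both 'Diseased' and 'Multiple' count as a detected disease.
    """
    return to_four_class(stage1_label) in {"Diseased", "Multiple"}


for label in ["Scab", "Multiple", "Healthy", "Damaged"]:
    print(label, "->", to_four_class(label), "| run detector:", needs_detection(label))
```

Gating the detector this way means healthy and damaged leaves never incur the cost of the slower detection stage.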


Fig. 8. Results predicted by our detection models: 1) Scab, 2) Multiple diseases (Mosaic, Alternaria, Necrosis), 3) Powdery Mildew, 4) Alternaria, 5) Alternaria and Scab, 6) Insect and Scab.

Fig. 9. Duplicate (overlapping) detections: a) 6 duplicate scab detections, b) 1 duplicate mosaic detection.
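Duplicate detections of the kind shown in Fig. 9 are commonly identified by measuring the overlap (intersection over union, IoU) between boxes of the same class. The sketch below shows a generic greedy suppression of same-class duplicates; it is an illustration only and is not described in the paper as part of the authors' pipeline. The function names, the 0.5 threshold and the sample boxes are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float, str]       # (box, confidence, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def suppress_duplicates(detections: List[Detection],
                        iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the highest-confidence box among heavily overlapping same-class boxes."""
    kept: List[Detection] = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        box, _, label = det
        is_duplicate = any(
            k[2] == label and iou(box, k[0]) >= iou_threshold for k in kept
        )
        if not is_duplicate:
            kept.append(det)
    return kept


# Hypothetical detections: two overlapping scab boxes and one mosaic box.
sample = [
    ((10, 10, 50, 50), 0.9, "Scab"),
    ((12, 12, 52, 52), 0.8, "Scab"),
    ((100, 100, 140, 140), 0.7, "Mosaic"),
]
print(suppress_duplicates(sample))
```

Counting how many boxes such a filter removes per class is one way to separate duplicate detections from misclassification and mislocalization errors, which mAP alone does not distinguish.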

Fig. 10. Model performance on small size spots.

7.2. Accuracy vs speed comparison

Inference time is another important metric of object detection models; it determines their detection speed, which is crucial for real-time detection. We evaluated our models in terms of their inference speed and the results are presented in Table 10. Yolo v4, which achieved the second best testing result (41.1 mAP), is the fastest of all with a detection speed of 47 FPS. EfficientDet is at the second position in terms of speed with 39 FPS. Our best model with mAP of 42.01%, Faster-RCNN, which is a two-stage detection algorithm, runs slower at 6 FPS and is hence not suitable for real-time detection. The speed comparison between models shows that Yolo v4 has a clear advantage in terms of run speed, as it runs nearly 8 times faster than Faster-RCNN (6 FPS) while maintaining a good accuracy.

Table 10
Accuracy vs detection speed of our three detection models.
Performance   EfficientDet   Yolov4   Faster-RCNN
mAP           32.1           41.1     42.01
FPS           38             47       6

8. Conclusion

Plant disease detection is a challenging problem due to variable disease patterns, complex backgrounds and lighting conditions. In this research work, we constructed an expert-annotated apple disease dataset of suitable size consisting of around 9000 high quality RGB real-field images covering all the main foliar diseases and symptoms. Next, we proposed a deep learning based apple disease detection system which can efficiently and accurately identify the symptoms. The proposed system works in two stages: the first stage is a tailor-made lightweight classification model which classifies the input images into diseased, healthy or damaged categories, and the second stage (detection stage) starts processing only if a disease is detected in the first stage. The detection stage performs the actual detection and localization of each symptom from diseased leaf images. Both classification and detection models were initialized using transfer learning and then trained on the prepared dataset. We used three different models for the detection stage and evaluated them in terms of accuracy (mAP) and speed. The top performing model, Faster-RCNN, achieved mAP of 42.01% at a detection speed of 6 FPS, while the second best model, Yolo v4, reached mAP of 41.1% and is the fastest at 47 FPS. The qualitative results show that Faster-RCNN is best suited for the task, as it is able to detect even the smallest of the spots easily and accurately. On the other hand, the second best performing model, Yolo v4, is best suited for real-time detection because of its fast inference speed. The results demonstrate that the proposed detection system can be a very helpful tool for farmers and apple growers to aid them in diagnosis, quantification and follow-up of infections. Furthermore, in future, the work can be extended to other fruits and vegetables as well.

CRediT authorship contribution statement

Asif Iqbal Khan: Conceptualization, Methodology, Software. S.M.K. Quadri: Supervision, Visualization. Saba Banday: Investigation. Junaid Latief Shah: Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The study is part of the project “Deep Learning for Apple Diseases: Development of an Intelligent System for Detection of Apple Fruit Diseases” sponsored by the UGC Dr. D. S. Kothari Post-Doctoral Fellowship Scheme.

Dataset: A subset of the dataset can be downloaded from the link below.
https://drive.google.com/drive/folders/1geT010iYOHqfXIsDumB6AaUUcCb8-QT9?usp=sharing

References

Abbas, A., Jain, S., Gour, M., Vankudothu, S., 2021 Aug. Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput. Electron. Agric. 1 (187), 106279.
Amara, J., Bouaziz, B., Algergawy, A., 2017. A deep learning-based approach for banana leaf diseases classification. Datenbanksysteme für Business, Technologie und Web (BTW 2017)-Workshopband.
Bhat, K.A., Peerzada, S.H., Anwar, A., 2015. Alternaria epidemic of apple in Kashmir. African J. Microbiol. Res. 9 (12), 831–837.
Bochkovskiy, A., Wang, C.Y., Liao, H.Y., 2020. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934. 2020 Apr 23.
Brahimi, Boukhalfa, Mohammed, Kamel, Moussaoui, Abdelouahab, 2017. Deep Learning for Tomato Diseases: Classification and Symptoms Visualization.
Chollet, F., 2017. Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1251-1258.
Chu, X., Zhang, B., Xu, R., 2020. Moga: Searching beyond mobilenetv3. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020 May 4, IEEE, pp. 4042-4046.
Chuanlei, Z., Shanwen, Z., Jucheng, Y., Yancui, S., Jia, C., 2017 Mar 31. Apple leaf disease identification using genetic algorithm and correlation based feature selection method. Int. J. Agric. Biol. Eng. 10 (2), 74–83.
Directorate of Horticulture, Kashmir. Available from: http://hortikashmir.gov.in/Area Production data.html [Accessed 10-Nov-2021].
Dubey, S.R., Jalal, A.S., 2012. Detection and classification of apple fruit diseases using complete local binary patterns. In: 2012 Third International Conference on Computer and Communication Technology 2012 Nov 23. IEEE, pp. 346-351.
Dandawate, Y., 2015. An automated approach for classification of plant diseases towards development of futuristic decision support system in Indian perspective. In: Proceedings of the International Conference on Advances in Computing, Communications and Informatics. IEEE, Kochi, India, pp. 794–799.
Dutot, M., Nelson, L.M., Tyson, R.C., 2013 Nov. Predicting the spread of postharvest disease in stored fruit, with application to apples. Postharvest Biol. Technol. 1 (85), 45–56.
Ferentinos, K.P., 2018 Feb. Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 1 (145), 311–318.
Girshick, R., 2015. Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp. 1440-1448.
Jiang, P., Chen, Y., Liu, B., He, D., Liang, C., 2019 May. Real-time detection of apple leaf diseases using deep learning approach based on improved convolutional neural networks. IEEE Access 6 (7), 59069–59080.
Kawasaki, Y., Uga, H., Kagiwada, S., Iyatomi, H., 2015. Basic study of automated diagnosis of viral plant diseases using convolutional neural networks. In: International symposium on visual computing 2015 Dec 14, Springer, Cham, pp. 638-645.
Lin, T.Y., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S., 2017. Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2117-2125.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25.
Liu, B., Zhang, Y., He, D., Li, Y., 2018 Jan. Identification of apple leaf diseases based on deep convolutional neural networks. Symmetry 10 (1), 11.
Ioffe, S., Szegedy, C., 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning 2015 Jun 1, PMLR, pp. 448-456.
Mokhtar, U., Bendary, N.E., Hassenian, A.E., Emary, E., Mahmoud, M.A., Hefny, H., Tolba, M.F., 2014. SVM-based detection of tomato leaves diseases. In: Intelligent Systems, 2015. Springer, Cham, pp. 641–652.
Nguyen, N.-D., Do, T., Ngo, T.D., Le, D.-D., 2020. An evaluation of deep learning methods for small object detection. Journal of Electrical and Computer Engineering 2020, 1–18.
Prasanna Mohanty, S., Hughes, D., Salathe, M., 2016. Using Deep Learning for Image-Based Plant Disease Detection. arXiv e-prints. 2016 Apr:arXiv-1604.
Raashid Hassan, 2021. Grim omen for apple growers as scab noticed on trees after plenty of rain. Available from: https://kashmirreader.com/2021/04/19/grim-omen-for-apple-growers-as-scab-noticed-on-trees-after-plenty-of-rain 19th April 2021 [Accessed 10-Nov-2021].
Ramcharan, A., Baranowski, K., McCloskey, P., Ahmed, B., Legg, J., Hughes, D.P., 2017 Oct. Deep learning for image-based cassava disease detection. Front. Plant Sci. 27 (8), 1852.
Redmon Joseph, 2013. Darknet: Open Source Neural Networks in C, http://pjreddie.com/darknet.
Raza, S.E.A., Prince, G., Clarkson, J.P., 2015. Automatic detection of diseased tomato plants using thermal and stereo visible light images. Plos ONE 10, e0123262. https://doi.org/10.1371/journal.pone.0123262.
Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems. 28, 91–99.
Roboflow, 2020, https://roboflow.com [Accessed 15-Nov-2021].
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z., 2016. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818-2826.
Tan, M., Pang, R., Le, Q.V., 2020. Efficientdet: Scalable and efficient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 10781-10790.
Tzutalin, L., Git code, 2015. Available from: https://github.com/tzutalin/labelImg [Accessed 2020 Apr].
Velasquez, A.C., Castroverde, C.D., He, S.Y., 2018 May 21. Plant–pathogen warfare under changing climate conditions. Curr. Biol. 28 (10), R619–R634.
Zhang, Zhang, Q., Li, P., 2019. Apple disease recognition based on improved deep convolution neural network. J. Forest. Eng. 4 (04), 107–112.
Zhong, Y., Zhao, M., 2020 Jan. Research on deep learning in apple leaf disease recognition. Comput. Electron. Agric. 1 (168), 105146.
Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V., 2018. Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8697-8710.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A., 2015. Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1-9.
https://www.greaterkashmir.com/business/apple-growers-fear-losses-damage-to-orchards-due-to-alterneria-outbreak, Javed Iqbal 2018–. (Accessed 18 November 2021).
