Using Deep Transfer Learning For Image-Based Plant Disease Identification
Keywords: Plant disease identification; Deep learning; Convolutional neural networks; Transfer learning; Image classification

Abstract: Plant diseases have a disastrous impact on the safety of food production, and they can cause a significant reduction in both the quality and quantity of agricultural products. In severe cases, plant diseases may even lead to a total loss of the grain harvest. Thus, the automatic identification and diagnosis of plant diseases are highly desired in the field of agricultural information. Many methods have been proposed for this task, among which deep learning is becoming the preferred one due to its impressive performance. In this work, we study transfer learning of deep convolutional neural networks for the identification of plant leaf diseases, and consider using pre-trained models learned from typical massive datasets, which are then transferred to the specific task and trained on our own data. The VGGNet pre-trained on ImageNet and the Inception module are selected in our approach. Instead of starting the training from scratch by randomly initializing the weights, we initialize the weights with the networks pre-trained on the large labeled dataset ImageNet. The proposed approach presents a substantial performance improvement with respect to other state-of-the-art methods: it achieves a validation accuracy of no less than 91.83% on the public dataset, and even under complex background conditions, its average accuracy reaches 92.00% for the class prediction of rice plant images. The experimental results demonstrate the validity of the proposed approach and show that it accomplishes plant disease detection efficiently.
1. Introduction

The occurrence of plant diseases has negative effects on agricultural production, and if plant diseases are not detected in time, food insecurity will increase (Faithpraise et al., 2013). In particular, the main crops, such as rice and maize, are essential for guaranteeing the food supply and agricultural production. Early warning and forecasting are the basis of effective prevention and control of plant diseases, and they play crucial roles in the management and decision-making of agricultural production. Until now, however, the visual observations of experienced producers are still the primary approach to plant disease detection in rural areas of developing countries. This requires continuous monitoring by experts, which might be prohibitively expensive on large farms (Al-Hiary et al., 2011; Bai et al., 2018). Besides, in some remote areas, farmers may have to travel long distances to contact experts, which makes the consulting too expensive and time-consuming. Such an approach can only be applied in limited areas and cannot be extended well. Automatic recognition of plant diseases is therefore an essential research topic, as it may prove beneficial in monitoring large fields of crops and thus automatically detecting the symptoms of diseases as soon as they appear on plant leaves (Al Bashish et al., 2011; Pooja et al., 2017; Khirade and Patil, 2015). Therefore, looking for a fast, automatic, less expensive, and accurate method to perform plant disease detection is of great practical significance.

Plenty of previous works have considered image recognition, where a particular classifier is used to categorize the images as healthy or diseased. Generally, the leaves of plants are the first source for plant disease identification, and the symptoms of most diseases may start to appear on the leaves (Ebrahimi et al., 2017; García et al., 2017). In the past decades, the primary classification techniques popularly used for disease identification in plants include the k-nearest neighbor (KNN) (Guettari et al., 2016), support vector machine (SVM) (Deepa and Umarani, 2017), Fisher linear discriminant (FLD) (Ramezani and Ghaemmaghami, 2010), artificial neural network (ANN) (Sheikhan et al., 2012), random forest (RF) (Kodovsky et al., 2011), and so on. As is well known, the disease recognition rates of these classical approaches rely heavily on lesion segmentation and on features hand-designed by various algorithms, such as the seven invariant moments, SIFT (Scale-Invariant Feature Transform), the Gabor transform, global–local singular values, and sparse representation (Guo et al., 2007; Zhang and Wang, 2016; Zhang et al., 2017), etc.
⁎ Corresponding author. E-mail address: [email protected] (D. Zhang).
https://doi.org/10.1016/j.compag.2020.105393
Received 3 November 2019; received in revised form 27 March 2020; accepted 29 March 2020; available online 6 April 2020.
0168-1699/© 2020 Elsevier B.V. All rights reserved.
Fig. 1. Sample images of plant diseases: (a) Rice Leaf Smut; (b) Rice Leaf Scald; (c) Rice Bacterial Leaf Streak; (d) Rice White Tip; (e) Maize Gray Leaf Spot; (f) Maize Common Rust; (g) Maize Eyespot; (h) Maize Northern Leaf Blight.
However, the artificially-designed features require expensive work and expert knowledge, and they have a certain subjectivity. In particular, it is not easy to determine which features are optimal and robust for disease identification among the many extracted features. Besides, under complex background conditions, most methods fail to effectively segment the leaf and the corresponding lesion image from the background, which leads to unreliable disease recognition results. Thus, the automatic recognition of plant disease images is still a challenging task due to the complexity of diseased leaf images.

More recently, deep learning techniques, particularly convolutional neural networks (CNNs), are quickly becoming the preferred methods to overcome these challenges (Barbedo, 2018). The CNN is the most popular classifier for image recognition in both large- and small-scale problems, and it has shown outstanding ability in image processing and classification (Kamilaris and Prenafeta-Boldú, 2018; Kussul et al., 2017; Yalcin, 2017). For example, Mohanty, Hughes and Salathé (Mohanty et al., 2016) trained a deep learning model for recognizing 14 crop species and 26 crop diseases; their trained model achieves an accuracy of 99.35% on a held-out test set. Ma et al. (Ma et al., 2018) used a deep CNN to conduct symptom-wise recognition of four cucumber diseases, i.e., downy mildew, anthracnose, powdery mildew, and target leaf spots, and reached a recognition accuracy of 93.4%. Kawasaki et al. (Kawasaki et al., 2015) introduced a CNN-based system to recognize cucumber leaf diseases, which realizes an accuracy of 94.9%. Although very good results have been reported in the literature, investigations so far have used image databases with limited diversity: most photographic materials include images taken solely in experimental (laboratory) setups, not in real field scenarios. Indeed, images captured in cultivation field conditions include a wide diversity of backgrounds and an extensive variety of symptom characteristics (Barbedo, 2018). Additionally, a vast number of parameters need to be trained for a CNN and its variants, while training these CNN architectures from scratch also requires many labeled samples and substantial computing resources. Collecting a large labeled dataset is undoubtedly a challenging task. Despite these limitations, the previous investigations have successfully demonstrated the potential of deep learning algorithms. In particular, deep transfer learning alleviates the problem faced by classical deep learning methods: a pre-trained network is used, and only the parameters of the last classification levels need to be inferred from scratch (Kessentini et al., 2019), which is naturally employed in practical applications.

In this work, we study transfer learning for deep CNNs with the aim of enhancing the learning ability for tiny lesion symptoms along with decreasing the computational complexity. The proposed approach is generally composed of two parts: the first part is the pre-trained module, which is used as a basic feature extractor; the other is an auxiliary structure that utilizes multi-scale feature maps for detection. Specifically, the enhanced VGGNet with the Inception module is selected in our approach. The conventional VGGNet is modified by replacing its last convolutional layers with an extended convolutional layer of 3 × 3 × 512, where batch normalization is added and the Swish activation function is used to directly replace ReLU. Then, the convolutional layer is followed by two Inception modules, which are used to extract the multi-scale features of the images input from the previous layer, and the fully connected layers are replaced by a global pooling layer to conduct the dimension reduction of the feature maps. After that, a fully-connected Softmax layer with the practical number of categories is added as the top layer of the modified networks. In this way, the newly formed network, which we term INC-VGGN, is used for the class prediction of plant disease images.

The remainder of this paper is organized in the following manner. Section 2 introduces the collection of the image dataset, gives an overall flow summary, and mainly discusses the methodology used to accomplish the task of plant disease identification along with the related concepts and the proposed approach. Section 3 is dedicated to the experiments; multiple experiments are conducted, and the experimental results are evaluated together with a comparative analysis. Finally, the paper is summarized in Section 4.

2. Materials and methods

2.1. Data acquisition

About 1000 crop leaf images, including 500 rice images and 466 maize images, were provided by the Fujian Institute of Subtropical Botany, Xiamen, China. The images were captured under non-uniform illumination intensities and cluttered field background conditions. All the crop images collected in this work are labeled by disease category and saved in JPG format. Among them, the rice diseases include Rice Stackburn, Rice Leaf Scald, Rice Leaf Smut, Rice White Tip, and Bacterial Leaf Streak, while the maize diseases consist of Phaeosphaeria Spot, Maize Eyespot, Gray Leaf Spot, and Goss's Bacterial Wilt, etc. For the subsequent calculations, these images are first uniformly converted to the RGB model with Photoshop tools, and the sizes of the images are then adjusted to 224 × 224 pixels. Some of the sample images are displayed in Fig. 1.
Fig. 2. Overall flow of the proposed approach: plant disease image collection and labeling with expert domain knowledge, image pre-processing and augmentation, model establishment (transfer learning from a pre-trained model to the INC-VGGN model), and prediction of the disease categories of the images to be detected, with the sample library updated by the results.
2.2. Overview

As depicted in Fig. 2, a general overview of our method for plant disease identification is presented as follows. Firstly, the samples of plant disease images are collected and labeled based on the knowledge of experts in the field. Then, image-processing techniques, including grey transformation, image filtering, image sharpening, resizing, etc., are performed on the acquired images, and new sample images are generated to enrich the dataset using data augmentation methods; for example, random rotation, flipping, and translation are utilized to enlarge the dataset. After that, the sample images are input to the proposed method (aliased as INC-VGGN) for model training. Finally, the trained model is applied for the class prediction of unseen images, and the results of plant disease identification are obtained. The detailed descriptions of these phases are given in the subsequent sections.

2.3. Related works

2.3.1. Convolutional neural networks

Convolutional neural networks are a category of neural networks designed for image recognition and classification, and they have achieved excellent results (Huang et al., 2017; Szegedy et al., 2015; Cetinic et al., 2018). Different from the traditional approaches, CNNs can learn high-level robust features directly from the original image instead of extracting specific features manually. In the identification of plant species and diseases, it has been demonstrated that CNNs can provide better performance than the traditional feature extraction methods (Mohanty et al., 2016; Dyrmann et al., 2016). A typical CNN architecture mainly consists of convolutional layers, pooling layers, and fully-connected layers (Lu et al., 2017), which are described as follows.

1 Convolutional layers

The convolutional layer is the crucial component of a CNN; it extracts specific features of the image with convolution kernels of different sizes. A set of feature maps of the input images can be extracted after applying the convolutional layers several times. Let H_i represent the feature map of the i-th layer in the CNN; then H_i can be generated as follows:

H_i = φ(H_{i−1} ⊗ W_i + b_i)  (1)

where H_i denotes the feature map of the current network layer, H_{i−1} is the convolution feature of the previous layer (H_0 is the original image), ⊗ denotes the convolution operation, W_i is the weight of the i-th layer, b_i is the offset vector of the i-th layer, and φ(·) represents the rectified linear unit (ReLU) function.

2 Pooling layers

The function of pooling layers is to reduce the spatial dimension, which lowers the computational complexity and effectively controls the risk of over-fitting. In the l-th pooling layer, the output feature on the j-th local receptive field can be calculated by Eq. (2):

x_j^l = down(x_j^{l−1}, s)  (2)

where down(·) represents the down-sampling function, x_j^{l−1} is the feature vector in the previous layer, and s is the pooling size.

3 Fully-connected layers

After the convolutional and pooling layers, there are one or several fully-connected (FC) layers, and the purpose of the FC layers is to use the extracted features for image classification. The Softmax function is usually employed to conduct the class prediction with the features extracted from the previous layers. Mathematically, the Softmax function is written as:

softmax(z)_j = e^{z_j} / Σ_{k=1}^{K} e^{z_k}, for j = 1, …, K  (3)

where K represents the dimension of the vector z.

2.3.2. VGGNet

VGGNet (Simonyan and Zisserman, 2014) is a family of convolutional neural networks developed by researchers from the Visual Geometry Group of Oxford University and Google DeepMind. It consists of 16 or more convolutional, pooling, and fully-connected layers (Khan et al., 2019), as shown in Fig. 3. VGGNet adopts a cascaded network structure, with a pooling layer after every 2 or 3 convolutional layers that use small 3 × 3 convolution filters. VGGNet includes two typical models, VGG-16 and VGG-19, which are available as pre-trained models with 16 and 19 weight layers, respectively, trained on the ImageNet (Russakovsky et al., 2015) dataset. Thus, in our experiments, VGG-19 is considered as the base model and modified to generate the new networks, and the labeled sample images are then exploited to train and fine-tune the formed networks.
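To make these building blocks concrete, the following is a minimal, self-contained Keras sketch of the convolution–pooling–Softmax stack of Eqs. (1)–(3); the filter counts and the 4-class output are illustrative assumptions, not the networks evaluated in this paper.

```python
# A toy CNN assembling the three building blocks above: convolutional
# layers (Eq. (1)), pooling layers (Eq. (2)), and a fully-connected
# Softmax classifier (Eq. (3)). All sizes are illustrative assumptions.
from tensorflow.keras import layers, models

def build_toy_cnn(input_shape=(224, 224, 3), num_classes=4):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),  # Eq. (1)
        layers.MaxPooling2D((2, 2)),                                   # Eq. (2)
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),             # fully-connected layer
        layers.Dense(num_classes, activation="softmax"),  # Eq. (3)
    ])

model = build_toy_cnn()
model.summary()  # prints the layer stack and parameter counts
```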
The VGGNet structure is depicted in Fig. 3.

2.3.3. Transfer learning

Pre-trained models are commonly learned from large public image datasets. In this paper, we consider using the pre-trained models learned from the massive typical dataset ImageNet, and then transfer to the specific task trained on the objective dataset. The main processes of the transfer learning approach are described as follows.

1 Determine the base networks

Determine the base networks of transfer learning and assign the network weights (W_1, W_2, …, W_n) using the pre-trained CNN model. The weights of the bottom layers can be downloaded from a well-trained CNN (https://keras.io/applications/).

2 Establish a new neural network

Based on the bottom layers, the network structure can be modified, such as by replacing layers, inserting layers, and deleting layers from the networks, etc. In this way, a new network structure can be generated.

3 Train and fine-tune the networks

The newly formed networks are then trained on the target dataset by minimizing the loss function E(W):

E(W) = −Σ_{i=1}^{n} Σ_{k=1}^{K} [y_{ik} log P(x_i = k) + (1 − y_{ik}) log(1 − P(x_i = k))]  (4)

where W represents the weighting matrices of the convolutional and fully-connected layers, n is the number of training samples, i is the index of training samples, and k is the index of classes. P(x_i = k) is the probability of the input x_i belonging to the predicted k-th class.

Specifically, the stochastic gradient descent (SGD) (Ghazi et al., 2017) algorithm is often used to calculate the optimal W by minimizing the loss function E on the target dataset, as defined in Eq. (5):

W_k = W_{k−1} − a·(∂E(W)/∂W)  (5)

where a is the learning rate and k is the iteration index.

Thereby, in this work, we chose the VGGNet pre-trained on ImageNet and the Inception module for transfer learning, and trained the newly formed neural networks using our own datasets. The approach combines the advantages of the VGGNet and the Inception module.

Fig. 4. Transfer learning from the VGGNet-19 model pre-trained on the source ImageNet dataset (output: 1000 classes of ImageNet).
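As a concrete illustration of this transfer-learning procedure, a hedged Keras sketch (assuming the TensorFlow 2.x API; the class count, learning rate, and momentum are illustrative assumptions, not the exact settings of this work) might look as follows.

```python
# Sketch of the transfer-learning recipe: (1) take the base network with
# ImageNet weights, (2) build a new top, and (3) train by minimizing the
# cross-entropy loss of Eq. (4) with the SGD update of Eq. (5).
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG19

base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep pre-trained bottom weights (W_1, ..., W_n) fixed

x = layers.GlobalAveragePooling2D()(base.output)
outputs = layers.Dense(4, activation="softmax")(x)  # practical number of classes
model = models.Model(base.input, outputs)

model.compile(
    optimizer=optimizers.SGD(learning_rate=1e-3, momentum=0.9),  # Eq. (5)
    loss="categorical_crossentropy",                             # Eq. (4)
    metrics=["accuracy"],
)
# model.fit(train_x, train_y, epochs=30, validation_split=0.3)  # hypothetical data
```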
2.4. The proposed approach

Batch normalization (BN) (Ioffe and Szegedy, 2015) and the Swish activation function (Ramachandran et al., 2017) have been reported to speed up training and improve recognition performance (Hung et al., 2019); thus, the BN and Swish were both selected for use in our networks. Furthermore, we modified the conventional VGGNet by replacing its full connection layers with a global pooling layer, and two Inception modules, as stated before, are introduced into the VGGNet to improve the feature extraction ability of the new network, namely, INC-VGGN.

More specifically, the details of this phase are as follows. The first few layers of a convolutional neural network typically extract the color and corner features (Zeiler and Fergus, 2014), and it is of little value to utilize Inception to extract these features. Therefore, the Conv1_1 to Pool3 layers of the VGGNet are preserved, and the subsequent layers of the VGGNet, Conv5_1 to Conv5_3, are replaced with two Inception modules and one added BN convolutional layer, in which Swish is used as the activation function instead of ReLU. That is, to enhance the multi-scale feature extraction ability of the network, a convolutional layer with batch normalization and Swish is added behind Pooling3, and then the two Inception modules, composed of Inception and Concat operations, are applied. Finally, a global pooling layer, instead of fully-connected layers, is added after the second Inception module, followed by a Softmax classifier. Fig. 5 depicts the network structure, and the relevant parameters are displayed in Table 1.
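A minimal sketch of this design in Keras is given below, assuming the TensorFlow 2.x API; the cut point into the pre-trained VGG-19 and the Inception branch widths are assumptions on our part, since the exact filter counts are fixed by Table 1 of the original design.

```python
# Sketch of the INC-VGGN idea: early VGG-19 layers kept, one added
# convolutional layer with batch normalization and Swish, two Inception
# modules (the 5x5 branch built as two 3x3 convolutions), global
# pooling, and a Softmax top. Filter counts are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG19

def swish(x):
    return x * tf.sigmoid(x)  # Swish activation: x * sigmoid(x)

def inception_module(x, f):
    b1 = layers.Conv2D(f, 1, padding="same", activation=swish)(x)
    b3 = layers.Conv2D(f, 1, padding="same", activation=swish)(x)
    b3 = layers.Conv2D(f, 3, padding="same", activation=swish)(b3)
    b5 = layers.Conv2D(f, 1, padding="same", activation=swish)(x)
    b5 = layers.Conv2D(f, 3, padding="same", activation=swish)(b5)
    b5 = layers.Conv2D(f, 3, padding="same", activation=swish)(b5)  # 5x5 as two 3x3
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    bp = layers.Conv2D(f, 1, padding="same", activation=swish)(bp)
    return layers.Concatenate()([b1, b3, b5, bp])  # the "Concat" step

base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
x = base.get_layer("block4_pool").output      # preserved layers (cut point assumed)
x = layers.Conv2D(512, 3, padding="same")(x)  # added 3x3x512 convolutional layer
x = layers.BatchNormalization()(x)
x = layers.Activation(swish)(x)
x = inception_module(x, 128)
x = inception_module(x, 128)
x = layers.GlobalAveragePooling2D()(x)        # replaces the fully-connected layers
outputs = layers.Dense(4, activation="softmax")(x)
model = models.Model(base.input, outputs, name="inc_vggn_sketch")
```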
Thus, the newly generated networks are generally composed of two parts: the first part is the pre-trained module, which is used as a basic feature extractor; the second part is the extended layers that extract the high-dimensional features and utilize the multi-scale feature maps for the classification. On the other hand, the reason why we used a global pooling layer to replace the fully-connected layers is as follows. In a traditional CNN or DCNN, the ratio of parameters in all the fully-connected layers is almost 80% of the whole network, which will increase the training and testing time and lead to a large demand for computer memory (Zhang et al., 2019); too many parameters can also result in an over-fitting problem. Particularly, the diverse images captured in real field scenarios contain a lot of noise, such as heterogeneous backgrounds and uneven illumination, which is easily fitted by an over-complicated model and causes the over-fitting problem. Additionally, considering the number of model parameters, (5 × 5 + 1) × C² = 26C² parameters are generated by a 5 × 5 convolutional kernel over C channels, while this value is 2 × (3 × 3 + 1) × C² = 20C² for two consecutive 3 × 3 convolution kernels, so the model complexity can be reduced by this consecutive convolution architecture. Instead of a conventional Inception module, this enhanced Inception module is introduced in our networks: the single 5 × 5 convolution layer is replaced with two consecutive 3 × 3 convolution layers, which not only maintains the range of the perceptive fields but also reduces the number of parameters.
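The parameter comparison above can be checked directly, using the paper's per-kernel accounting with C input and C output channels:

```python
# Checking the parameter accounting quoted above (bias counted per
# kernel, as in the text) for a hypothetical channel count C.
C = 64
single_5x5 = (5 * 5 + 1) * C ** 2      # 26*C^2 parameters
double_3x3 = 2 * (3 * 3 + 1) * C ** 2  # 20*C^2 parameters
print(single_5x5, double_3x3)          # 106496 81920, i.e. ~23% fewer parameters
```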
3. Experimental results and analysis

In our experiments, some image pre-processing algorithms were conducted using Matlab, while the data augmentation and the CNNs were implemented using Anaconda3 (Python 3.6), the Keras-GPU library (available online: https://anaconda.org/anaconda/keras-gpu, accessed on 17 June 2019), the OpenCV-Python library, etc. The deep CNN training and testing are accelerated by GPU, and the experimental hardware environment includes an Intel® Core™ i7-8750H central processing unit (CPU) at 2.20 GHz with 8 GB memory and an NVIDIA GeForce GTX 1060 graphics card (CUDA 9.0 and 6 GB memory) (GeForce GTX 1060), which is used for the model training and testing.

3.1. Experiments on the public dataset

The PlantVillage database (www.plantvillage.org) is an international general database for testing plant disease detection algorithms based on machine learning (Hughes and Salathé, 2015). To assess the performance of the proposed approach, we first conduct our experiments on this general database, and the Maize dataset of PlantVillage is selected for the testing. There are 3852 color leaf images in this dataset, divided into 4 categories: 513 gray_leaf_spot images, 1192 common_rust images, 1162 healthy images, and 985 northern_leaf_blight images. It is noted that each image is taken against a simple background and the illumination intensity is relatively uniform; some images of the same leaf are captured from different orientations. The dimension of all the images is uniformly 256 × 256 pixels. Obviously, the sample distribution of this dataset is not balanced: the number of gray_leaf_spot images is significantly smaller than that of the other classes. Therefore, to ensure the balance of the sample data, data augmentation techniques are used to enrich the sample images of the gray_leaf_spot class. Through horizontal or vertical flipping, angle rotation, shearing, and size scaling, the number of original images in the gray_leaf_spot class is enlarged using a Python script. The bounded values of the flipping, rotation, shearing, and scale transformations are random and evenly distributed in a specific range; for example, the rotation range is ±40°, the scale is changed from 0.9 to 1.1, and the shear range is 0.2, etc.
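A sketch of such augmentation with the Keras ImageDataGenerator, using the ranges quoted above (the directory layout and other arguments are assumptions), is as follows:

```python
# Augmentation with the quoted ranges: rotation within +/-40 degrees,
# scaling between 0.9 and 1.1, shear of 0.2, and random flipping.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=40,       # +/-40 degree rotation
    zoom_range=[0.9, 1.1],   # scale changed from 0.9 to 1.1
    shear_range=0.2,         # shear range of 0.2
    horizontal_flip=True,
    vertical_flip=True,
)
# gen = augmenter.flow_from_directory("maize/", target_size=(224, 224), batch_size=32)
```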
Thus, for the class of gray_leaf_spot, 87 additional synthetic images are selected to enrich this class, with a total of 20 original images utilized to generate the augmented images. Fig. 6 displays some of the augmented samples.

In this way, new sample images are generated to enrich the dataset, and at least 500 images of each category are ensured for the modeling through the data augmentation techniques. Then, the sample images are uniformly resized to the fixed dimension of 224 × 224 pixels to fit the model, and the pre-processed leaf images, with 500 images per class, are selected to build the model. The training set and the test set are divided according to the ratio of 70/30. In particular, to know how the proposed approach will perform on new unseen data, a certain number of raw images are retained to validate the effectiveness of the model. Based on the method proposed in Section 2.4, we perform the model training and validation on the Maize dataset. Moreover, to further verify the effectiveness of the proposed approach, we considered four influential CNNs, namely DenseNet (Huang et al., 2017), VGGNet (Simonyan and Zisserman, 2014), Inception V3 (Szegedy et al., 2016), and ResNet (He et al., 2016), for the comparative experiments. Using the transfer learning method, these models are created and loaded with pre-trained weights from ImageNet, and the top layers are truncated by defining a new fully-connected Softmax layer with the practical number of classes. Likewise, the models of the various CNNs are trained, and multiple experiments are conducted on the Maize dataset. The test accuracies of the different approaches are reported in Table 2 and Figs. 7–11.

Considering the statistics of correct detections (also known as true positives), misdetections (also known as false negatives), true negatives, and false positives, we can evaluate the performance of the models with indicators including the Accuracy, Sensitivity, and Specificity, as expressed in Eqs. (6)–(8):

Accuracy = (TP + TN)/(TP + TN + FP + FN)  (6)

Sensitivity = TP/(TP + FN)  (7)

Specificity = TN/(FP + TN)  (8)

where TP (true positive) is the number of instances that actually belong to the class C and are correctly identified by the classifier; FN (false negative) is, on the contrary, the number of instances that belong to the class C but are incorrectly classified; FP (false positive) is the number of instances that do not belong to class C but are mistakenly identified as this class; and TN (true negative) is the number of instances that are not in class C in reality and are correctly identified as such.

It can be seen from Table 2 that the proposed approach outperforms the other state-of-the-art methods experimented on the public dataset, even when the optimal classifier is adopted. The main reason is that the proposed approach transfers to the specific task using the VGGNet pre-trained on ImageNet together with the Inception module, and thus combines the advantages of both: the pre-trained module is employed as a basic feature extractor, while the extended layers are charged with the high-dimensional feature extraction as well as the classification. By contrast, the other methods are individual networks; although their models are trained with pre-trained weights rather than from scratch, they do not achieve the optimal results. In summary, by applying the batch normalization and the Swish activation function, the proposed INC-VGGN approach, which combines the advantages of the Inception module and VGGNet, achieves the top performance in the experiments. After 30 epochs of training, the validation accuracy of the proposed approach reaches 91.83% and the loss is 0.24, as displayed in Table 2. Furthermore, using the model trained by the proposed approach, images outside the modeling are selected for the class prediction of plant disease images. Fig. 12 depicts the confusion matrix of the detection results, and the corresponding evaluation indicators are calculated in Table 3.

As stated before, to validate the effectiveness of the proposed approach, the experiments were conducted on the new unseen data. From Table 3, it can be seen that the average predicting accuracy achieves no less than 84%.
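The indicators of Eqs. (6)–(8) can be computed per class from a confusion matrix; a small NumPy sketch (the one-vs-rest class layout is our assumption) is given below.

```python
# Per-class Accuracy, Sensitivity, and Specificity from a confusion
# matrix, following Eqs. (6)-(8); rows are actual classes and columns
# are predicted classes (one-vs-rest accounting).
import numpy as np

def one_vs_rest_metrics(cm):
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    for c in range(cm.shape[0]):
        tp = cm[c, c]
        fn = cm[c, :].sum() - tp
        fp = cm[:, c].sum() - tp
        tn = total - tp - fn - fp
        accuracy = (tp + tn) / total  # Eq. (6)
        sensitivity = tp / (tp + fn)  # Eq. (7)
        specificity = tn / (fp + tn)  # Eq. (8)
        print(f"class {c}: acc={accuracy:.4f} "
              f"sens={sensitivity:.4f} spec={specificity:.4f}")
```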
Table 2
Accuracy and loss of different approaches after 10 and 30 epochs of training (columns: pre-trained model; training accuracy %, validation accuracy %, and training loss at 10 epochs; training accuracy %, validation accuracy %, training loss, and validation loss at 30 epochs).
Fig. 7. DenseNet-201: accuracy (left) and loss (right) of the model.
Fig. 8. ResNet-50: accuracy (left) and loss (right) of the model.
Fig. 9. Inception V3: accuracy (left) and loss (right) of the model.
Fig. 10. VGGNet-19: accuracy (left) and loss (right) of the model.
Fig. 11. The proposed approach: accuracy (left) and loss (right) of the model.
Table 3
Evaluation indicators of maize disease detection (%).

Types Accuracy Sensitivity Specificity
Gray leaf spot 84.75 43.00 98.67
Common rust 93.50 85.00 96.33
Northern leaf blight 80.75 46.00 92.33
Healthy 78.00 100.00 70.67
Average 84.25 68.50 89.50

3.2. Experiments on our own image dataset

Based on the rice and maize disease images collected under field conditions (Section 2.1), the experiments are conducted according to the following steps; a small pre-processing sketch is given after the list.
1. Change the size of the images. All the images are resized to the fixed dimension of 224 × 224 pixels to fit the model. Using the data augmentation techniques, the disease images are augmented, and at least 100 images are guaranteed for each category.
2. Image pre-processing. In order to avoid image distortion, the image pre-processing is performed to blacken the shorter sides of the images so that they have the same proportions; thus, the information of the original images is retained, and image deformation is also prevented.
3. Dataset partition. Except for a certain number of validation images, the dataset D is divided into a training set A and a testing set B with the ratio of 70/30; thus, D = A + B.
4. Model training. Referring to the method proposed in Section 2.4, the training set A is applied to train the model. To fully verify the effectiveness, multiple experiments are conducted with shuffling of the images.
5. Testing and validation. The testing set B is used to evaluate the model, and new images outside the modeling are applied to verify the effectiveness of the model. The output results are compared with the actual categories, and the related evaluation indicators are calculated.
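As referenced above, a sketch of steps 2 and 3 is given here, assuming OpenCV and scikit-learn are available; the helper names are ours, not from the original implementation.

```python
# Step 2: pad ("blacken") the shorter sides so every image becomes
# square before resizing; Step 3: split the dataset 70/30.
import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def pad_to_square(img, size=224):
    h, w = img.shape[:2]
    side = max(h, w)
    canvas = np.zeros((side, side, 3), dtype=img.dtype)  # black background
    top, left = (side - h) // 2, (side - w) // 2
    canvas[top:top + h, left:left + w] = img             # center the original
    return cv2.resize(canvas, (size, size))

# images: list of HxWx3 arrays, labels: class ids (assumed prepared earlier)
# X = np.stack([pad_to_square(im) for im in images])
# A_x, B_x, A_y, B_y = train_test_split(X, labels, test_size=0.3, shuffle=True)
```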
Fig. 13. Confusion matrices of plant disease detection: (a) predicting results of rice disease images; (b) predicting results of maize disease images.

Thus, based on the above processes, we performed the model training on the rice disease image dataset and the maize disease image dataset separately. After obtaining the trained models, new unseen images are selected for the class prediction. Fig. 13(a, b) depicts the predicting results of the rice and maize disease images, and the corresponding evaluation indicators are calculated in Tables 4 and 5.

For example, it can be seen from Fig. 13(a) that 13 samples are correctly predicted for the class of “Rice White Tip”, except for 2 samples. Likewise, for the class of “Rice Leaf Smut”, 12 samples are correctly predicted, except for 3 misclassified samples. In total, 48 out of 60 plant leaf images are correctly detected by the proposed approach, and the average accuracy reaches 92.00% for the class prediction of all the rice disease images, as shown in Table 4. Thus, the INC-VGGN approach shows high accuracy for the identification of rice disease images, indicating the validity of the proposed approach. On the other hand, although the performance is worse than that on rice, we achieve an average accuracy of 80.38% for the detection of maize disease images, which is due to the scenario that some “Phaeosphaeria Spot” and “Maize Eyespot” diseases occur on the same leaf. Additionally, the cluttered field background and uneven illumination intensity can also affect the detection results. Some identified samples are displayed in Fig. 14.

As observed in Fig. 14, the top images are the original samples, the middle images are the extracted feature maps, and the bottom images are the results of the class prediction. Generally, as in Fig. 14(a–c), the predicted categories are consistent with the actual categories of these samples, and most of the rice and maize diseases are correctly detected by the proposed approach. However, a seriously cluttered field background and uneven illumination intensity may affect the feature extraction of the lesion images and cause individual incorrect classifications, as shown in Fig. 14(d). Although individual images are misclassified, most of the detected plant disease types are consistent with the actual categories. As seen in Tables 4 and 5, the average prediction accuracy is no less than 80.00% and the average specificity reaches up to 95.00% in multiple experiments, which indicates that the proposed INC-VGGN approach has a significant capability to recognize plant diseases. Based on this empirical analysis, it can be concluded that the proposed approach is effective for the identification of plant disease types and can also be extended to applications in other fields, such as online fault diagnosis, target recognition, etc.

Table 4
Evaluation indicators of rice disease detection (%).

Types Accuracy Sensitivity Specificity
Rice Stackburn 96.67 85.71 98.11
Rice Leaf Smut 90.00 80.00 93.33
Rice Leaf Scald 90.00 73.33 95.56
Rice White Tip 91.67 86.67 93.33
Bacterial Leaf Streak 91.67 75.00 94.23
Average 92.00 80.00 95.00

Table 5
Evaluation indicators of maize disease detection (%).

Types Accuracy Sensitivity Specificity
Phaeosphaeria Spot 81.01 47.37 91.67
Maize Eyespot 75.95 65.00 79.61
Gray Leaf Spot 84.81 60.00 93.22
Goss's Bacterial Wilt 79.75 70.00 83.05
Average 80.38 60.76 86.92

4. Conclusion

Plant diseases are among the main harms to the agricultural development of the world, and they have a disastrous impact on the safety of food production. In severe cases, plant diseases may lead to a complete loss of harvest. Therefore, the automatic identification of plant diseases is highly desired in agricultural information. Deep learning techniques, particularly CNNs, have shown promising performance in addressing most of the challenging problems associated with this classification task. In this paper, transfer learning for deep CNNs is studied with the aim of enhancing the learning ability for tiny lesion symptoms, and a novel deep learning architecture called INC-VGGN is proposed for the identification of plant disease images. The pre-trained VGGNet is modified by replacing its last layers with an additional convolutional layer, in which batch normalization is added and the Swish activation function is used to directly replace ReLU. Then, the convolutional layer is followed by two Inception modules, and the fully connected layers are replaced by a global pooling layer to conduct the dimension reduction of the feature maps. Finally, a fully-connected Softmax layer was added as the top layer for the classification.
Fig. 14. The detected samples of plant disease images: (a) Rice Stackburn; (b) Rice Leaf Scald; (c) Maize Eyespot; (d) Maize Gray Leaf Spot.
Thus, the newly generated networks consist of a pre-trained module and an auxiliary structure: the former is employed as a basic feature extractor, while the latter extracts the high-dimensional features and is responsible for the classification. Experimental results demonstrated the state-of-the-art performance of the model on both the public dataset and our own image dataset. It achieves a validation accuracy of 91.83% on the public dataset, and even under complex background conditions, the average accuracy reaches 92.00% for the class prediction of the collected rice disease images. In future development, we intend to deploy the model on mobile devices to monitor and identify a broader range of plant disease information automatically. Meanwhile, we plan to apply it to more real-world applications, including computer-aided diagnosis (CAD) and so on.

Declaration of Competing Interest

The authors declared that they have no conflicts of interest in this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Acknowledgment

This work is partly supported by grants from the National Natural Science Foundation of China (Project no. 61672439) and the Fundamental Research Funds for the Central Universities (#20720181004). The authors wish to thank all the editors and anonymous reviewers for their constructive advice.

References

Faithpraise, F., Birch, P., Young, R., Obu, J., Faithpraise, B., Chatwin, C., 2013. Automatic plant pest detection and recognition using k-means clustering algorithm and correspondence filters. Int. J. Adv. Biotechnol. Res. 4 (2), 189–199.
Al-Hiary, H., Bani-Ahmad, S., Reyalat, M., Braik, M., Alrahamneh, Z., 2011. Fast and accurate detection and classification of plant diseases. Int. J. Comput. Appl. 17 (1), 31–38.
Bai, X., Cao, Z., Zhao, L., Zhang, J., Lv, C., Li, C., Xie, J., 2018. Rice heading stage automatic observation by multi-classifier cascade based rice spike detection method. Agricul. Forest Meteorol. 259, 260–270.
Al Bashish, D., Braik, M., Bani-Ahmad, S., 2011. Detection and classification of leaf diseases using K-means-based segmentation and neural-networks-based classification. Information Technology Journal 10 (2), 267–275.
Pooja, V., Das, R., Kanchana, V., 2017. Identification of plant leaf diseases using image processing techniques. In: 2017 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR), pp. 130–133. IEEE.
Khirade, S.D., Patil, A.B., 2015. Plant disease detection using image processing. In: 2015 International Conference on Computing Communication Control and Automation, pp. 768–771. IEEE.
Ebrahimi, M.A., Khoshtaghaza, M.H., Minaei, S., Jamshidi, B., 2017. Vision-based pest detection based on SVM classification method. Comput. Electron. Agricul. 137, 52–58.
García, J., Pope, C., Altimiras, F., 2017. A distributed k-means segmentation algorithm applied to Lobesia botrana recognition. Complexity 2017.
Guettari, N., Capelle-Laizé, A.S., Carré, P., 2016. Blind image steganalysis based on evidential k-nearest neighbors. IEEE, pp. 2742–2746.
Deepa, S., Umarani, R., 2017. Steganalysis on images using SVM with selected hybrid features of Gini index feature selection algorithm. Int. J. Adv. Res. Comput. Sci. 8 (5).
Ramezani, M., Ghaemmaghami, S., 2010. Towards genetic feature selection in image steganalysis. In: 2010 7th IEEE Consumer Communications and Networking Conference, pp. 1–4. IEEE.
Sheikhan, M., Pezhmanpour, M., Moin, M.S., 2012. Improved contourlet-based steganalysis using binary particle swarm optimization and radial basis neural networks. Neural Comput. Appl. 21 (7), 1717–1728.
Kodovsky, J., Fridrich, J., Holub, V., 2011. Ensemble classifiers for steganalysis of digital media. IEEE Transactions on Information Forensics and Security 7 (2), 432–444.
Guo, Y., Hastie, T., Tibshirani, R., 2007. Regularized linear discriminant analysis and its application in microarrays. Biostatistics 8 (1), 86–100.
Zhang, S., Wang, Z., 2016. Cucumber disease recognition based on Global-Local Singular value decomposition. Neurocomputing 205, 341–348.
Zhang, S., Wu, X., You, Z., Zhang, L., 2017. Leaf image based cucumber disease recognition using sparse representation classification. Comput. Electron. Agricul. 134, 135–141.
Barbedo, J.G., 2018. Factors influencing the use of deep learning for plant disease recognition. Biosyst. Eng. 172, 84–91.
Kamilaris, A., Prenafeta-Boldú, F.X., 2018. Deep learning in agriculture: a survey. Comput. Electron. Agricul. 147, 70–90.
Kussul, N., Lavreniuk, M., Skakun, S., Shelestov, A., 2017. Deep learning classification of land cover and crop types using remote sensing data. IEEE Geosci. Remote Sens. Lett. 14 (5), 778–782.
Yalcin, H., 2017. Plant phenology recognition using deep learning: Deep-Pheno. In: 2017 6th International Conference on Agro-Geoinformatics, pp. 1–5. IEEE.
Mohanty, S.P., Hughes, D.P., Salathé, M., 2016. Using deep learning for image-based plant disease detection. Front. Plant Sci. 7, 1419.
Ma, J., Du, K., Zheng, F., Zhang, L., Gong, Z., Sun, Z., 2018. A recognition method for cucumber diseases using leaf symptom images based on deep convolutional neural network. Comput. Electron. Agricul. 154, 18–24.
Kawasaki, Y., Uga, H., Kagiwada, S., Iyatomi, H., 2015. Basic study of automated diagnosis of viral plant diseases using convolutional neural networks. Springer, Cham, pp. 638–645.
Kessentini, Y., Besbes, M.D., Ammar, S., Chabbouh, A., 2019. A two-stage deep neural network for multi-norm license plate detection and recognition. Expert Syst. Appl. 136, 159–170.
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q., 2017. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Rabinovich, A., 2015. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9.
Cetinic, E., Lipic, T., Grgic, S., 2018. Fine-tuning convolutional neural networks for fine art classification. Expert Syst. Appl. 114, 107–118.
Dyrmann, M., Karstoft, H., Midtiby, H.S., 2016. Plant species classification using deep convolutional neural network. Biosyst. Eng. 151, 72–80.
Lu, Y., Yi, S., Zeng, N., Liu, Y., Zhang, Y., 2017. Identification of rice diseases using deep convolutional neural networks. Neurocomputing 267, 378–384.
Simonyan, K., Zisserman, A., 2015. Very deep convolutional networks for large-scale image recognition. In: Int. Conf. Learn. Represent., pp. 1–14.
Khan, S., Islam, N., Jan, Z., Din, I.U., Rodrigues, J.J.C., 2019. A novel deep learning based framework for the detection and classification of breast cancer using transfer learning. Pattern Recog. Lett. 125, 1–6.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Berg, A.C., 2015. ImageNet large scale visual recognition challenge. Int. J. Comput. Vision 115 (3), 211–252.
Lumini, A., Nanni, L., 2019. Deep learning and transfer learning features for plankton classification. Ecol. Inform. 51, 33–43.
Ghazi, M.M., Yanikoglu, B., Aptoula, E., 2017. Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing 235, 228–235.
Chollet, F., 2017. Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258.
Ioffe, S., Szegedy, C., 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456. JMLR.org.
Ramachandran, P., Zoph, B., Le, Q.V., 2017. Searching for activation functions. CoRR abs/1710.05941. http://arxiv.org/abs/1710.05941.
Redmon, J., Farhadi, A., 2017. YOLO9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271.
Hung, J.C., Lin, K.C., Lai, N.X., 2019. Recognizing learning emotion based on convolutional neural networks and transfer learning. Appl. Soft Comput. 84.
Zeiler, M.D., Fergus, R., 2014. Visualizing and understanding convolutional networks. Springer, Cham, pp. 818–833.
Zhang, S., Zhang, S., Zhang, C., Wang, X., Shi, Y., 2019. Cucumber leaf disease identification with global pooling dilated convolutional neural network. Comput. Electron. Agricul. 162, 422–430.
Keras-GPU. Available online: https://anaconda.org/anaconda/keras-gpu (accessed on 17 Jun 2019).
GeForce GTX 1060. Available online: https://www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1060/specifications (accessed on 17 Jun 2019).
Hughes, D., Salathé, M., 2015. An open access repository of images on plant health to enable the development of mobile disease diagnostics. CoRR abs/1511.08060. https://arxiv.org/abs/1511.08060.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z., 2016. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.