Faster-PestNet: A Lightweight Deep Learning Framework for Crop Pest Detection and Classification
ABSTRACT Pests are one of the most significant risks to crops and substantially decrease food production. Prompt and precise recognition of pests helps growers prevent damage and enhance crop quality by enabling them to take appropriate preventive action. However, the visual resemblance between numerous kinds of pests makes manual examination laborious and time-consuming. To address the limitations of physical pest inspection, a novel deep-learning approach called Faster-PestNet is proposed in this work. Specifically, an improved Faster-RCNN approach is designed using MobileNet as its base network and tuned on pest samples to recognize crop pests of various categories; the resulting model is named Faster-PestNet. Initially, MobileNet is employed to extract a distinctive set of sample attributes, which are later recognized by the two-step locator of the improved Faster-RCNN model. We performed an extensive experimental analysis on a challenging dataset named IP102 and acquired an accuracy of 82.43%. Further, a local crops dataset was also collected and tested on the trained Faster-PestNet approach to show the generalization capacity of the proposed model. The analysis confirms that the presented work can tackle numerous sample distortions such as noise, blurring, light variations, and size alterations, and can accurately locate pests of numerous types and sizes on the leaf along with the associated class label. Both the visual and reported performance values confirm the effectiveness of our model.
INDEX TERMS Pest classification, deep learning, Faster R-CNN, pest detection.
a lack of expertise. However, the usage of fewer pesticides should be encouraged due to mounting environmental and health concerns [3], [4].

Spot application is a well-known method that can cut the cost of applying insecticides by up to 90%, reducing pollution and lessening the harmful impacts on beneficial insects like bees. Such applications involve applying various Computer Vision (CV) approaches with the help of image processing to develop recognition systems. CV is increasingly being utilized for various tasks, including soil and crop monitoring, fruit sorting, plant disease detection, and pest identification. Identification and classification of pests are mandatory before spraying specific areas [5]. Pest detection has traditionally relied on laborious and error-prone manual approaches. Pest and disease identification is essential to acquiring crop growth and health data, and the recent developments in computer vision have provided much aid in precision agriculture [6]. Target identification at various stages of agricultural growth is essential to forecast future yields, trigger intelligent spraying systems, and manage autonomous insecticide-spraying robots in vast farms and gardens. With technological advancements, pests can now be found via image processing, and people are increasingly interested in precision agriculture to address these issues [7]. Global positioning systems (GPS) for tractor navigation, robotics, remote sensing, data analytics, drones, and land vehicles are some of these technologies [8]. The foundation of precision agriculture is accurate pest detection; for pest detection and spot spraying, computer vision must be used to capture and analyze visual data [9].

Initially, conventional Machine Learning (ML) approaches with hand-coded features were employed to diagnose numerous crop pests. Such a pipeline consists of four major steps: detection, segmentation, feature extraction, and classification, and is carried out utilizing computer vision-based quality inspection. However, it is challenging to recognize targets with such approaches with adequate precision due to shape similarities, complex backgrounds, target overlap caused by high-density distribution, light variations in the large landscape of orchards, and many other issues [10], [11]. Deep Neural Networks (DNNs) are frequently utilized in computer vision applications to model complicated relationships and perform automatic feature extraction. Deep artificial neural networks can now be trained more quickly and effectively with improved graphics processing units (GPUs), and DNNs provide meaningful results for classifying objects. Such approaches utilize the idea of transfer learning, in which numerous pre-learned deep learning (DL) models are employed to execute a new task. DL methods utilize various Convolutional Neural Networks (CNNs) [12] for extracting effective dataset attributes without requiring domain skills [13]. DL architectures are widely utilized to address complex issues within a reasonable time due to the substantial advancement of computing equipment [14]. DL-based strategies have proven to be precise and have been successfully adapted to carry out various farming activities. As improvements in DL techniques have shown encouraging outcomes in the field of object recognition, substantial research has focused on suggesting more complex target localization structures for enhanced identification competence, such as Super-FAN [10] and unsupervised multi-stage key-points learning [11]. Additionally, several CNN-based methods, including GoogLeNet [15], AlexNet [16], VGG [17], and residual models [18], have been tested for pest identification and classification. Some of the latest object recognition approaches, such as R-CNN [19], the Fast Region-based Convolutional Network Method (Fast R-CNN) [20], the Faster Region-based Convolutional Network Method (Faster R-CNN) [21], and You Only Look Once (YOLO) [22], have also exhibited improved results in various domains of agriculture.

The DL object recognition methods listed above have shown outstanding results in designing generic target tracking systems, but they remain limited to a few practical uses for pest monitoring. Pest diagnosis has distinctive attributes compared to current object spotting and categorization jobs [21].

In real field photos, pests are typically small objects surrounded by complicated environments; as a result, any recognition system can be readily misled by the surroundings when estimating key points. Additionally, there is a significant variety in pest size and position due to the various vantage points and distances at which photos are taken in the agricultural setting, which makes correct diagnosis more difficult. Moreover, diverse pest types frequently share plenty of physical characteristics, and identical types can appear in various forms, including larvae, eggs, pupae, and adults, demonstrating significant intra- and inter-group differences. In addition, the computerized recognition procedure is made more difficult by poor illumination and adverse conditions. To increase the effectiveness regarding categorization consistency and computing expense, a more reliable and proficient automated approach for exact pest identification in the field is still needed. The proposed strategy's objective is to overcome the problems of existing structures. We have accomplished this objective by introducing an improved DL model named Faster-PestNet. Descriptively, a customized Faster-RCNN framework is suggested by using MobileNet as its base network and tuning it on pest samples to recognize crop pests of various categories; the resulting model is named Faster-PestNet. First, MobileNet is applied to compute a distinctive set of image features that are later recognized by the two-step locator of the improved Faster-RCNN model. We have proved the robustness of our approach by executing an extensive experimental analysis over complex pest samples.

The significant contributions of our work are as follows.
• An accurate method capable of computing reliable image features to enhance pest classification performance is proposed.
• The presented technique is capable of accurately detecting and classifying multiclass pests due to its high recall power.
• A vast evaluation of the proposed work is performed on a challenging dataset named IP102, and it is confirmed through experimentation that the suggested method is able to detect and classify numerous pests in the presence of several image distortions like noise, blurring, color, and light variations.
• A local crops dataset was collected, and the performance analysis of the presented work was executed on it. It captured diverse environmental settings and proved the generalization ability of our approach for agricultural applications.

The remaining paper is organized into the following sections: the existing works are discussed in Section II, and the proposed work in Section III. The results and conclusion are demonstrated in Sections IV and V, respectively.

II. LITERATURE REVIEW
With the quick development of Artificial Intelligence (AI) and to address the limitations of ML, CNNs have been fruitfully employed in agricultural research. CNN models outperform conventional techniques when automatically identifying and categorizing pest infestations [23]. Pest categorization consists of identifying and determining certain pests in realistically obtained images. Computer vision issues are better handled by combining CNNs with ensemble models as feature extractors. The techniques utilized include Single-shot multi-box detectors (SSD) [24], YOLO, Region-based Convolutional Neural Networks (R-CNNs), and Faster R-CNNs, and their use in object detection and recognition has been effective.

Several scholars have conducted recent studies on object detection methods for pest detection. Setiawan et al. [25] performed training on a CNN algorithm for pest detection and used the IP102 dataset as a baseline. This study employed a dynamic learning rate, freezing layers, and sparse regularization in conjunction with CutMix augmentation to optimize small MobileNetV2 models. The maximum accuracy, 71.32%, was achieved by amalgamating those procedures throughout training. Nanni et al. [26] utilized the IP102 and a small dataset to spot and identify pest images. The authors utilized the CNN methods AlexNet, GoogLeNet, ShuffleNet, MobileNetv2, and DenseNet201 along with the saliency methods Graph-Based Visual Saliency (GBVS), Cluster-based Saliency Detection (COS), and Spectral Residual (SPE). This work reported a maximum accuracy of 92.43% on the smaller dataset, while on the IP102 dataset it was 61.93%. To categorize crop pests, Setiawan et al. [25] used the NBAIR, Xie1, and Xie2 insect datasets with 40, 24, and 40 classes, respectively. In their experiments on these datasets, they used AlexNet, ResNet-50, ResNet-101, VGG-16, and VGG-19. For the insect datasets mentioned earlier, the proposed CNN model achieved the highest classification accuracies of 96.75%, 97.47%, and 95.97%. To use convolutional neural networks for crop pest recognition in natural situations, Liu et al. [27] manually collected datasets of 10 pests (Gryllotalpa, Leafhopper, Locust, Oriental fruit fly, Pieris rapae Linnaeus, Snail, Spodoptera litura, Stinkbug, Cydia pomonella, and Weevil). Pre-trained models called VGG-16, VGG-19, ResNet50, ResNet152, and GoogLeNet were employed in this experiment. The accuracy of ResNet50 was 91.74%, ResNet152 was 92.9%, GoogLeNet was 93.29%, VGG-16 was 91.44%, and VGG-19 was 92.26%. Liu et al. [27] used the anchor-free region convolutional neural network (AF-RCNN) technique to detect agricultural pests of many categories. On a dataset with 24 classes of pests, this approach achieved 56.4% mean Average Precision (mAP) and 85.1% recall.

For effective CNN-based pest localization and recognition, Li et al. [28] implemented data augmentation in the training phase, a test time augmentation (TTA) approach, and region proposal network (RPN) techniques. The mAP for this model was 83.23%. To retrieve depth and spatial attention across many stages of the pyramid network, Liu et al. [29] manually collected a dataset and implemented a Globally Activated Feature Pyramid Network (GaFPN). Next, a Locally Activated Region Proposal Network (LaRPN), an upgraded pest localization module, was put forward to find the exact locations of the pest objects. In the end, a ResNet50 backbone was used, achieving an accuracy of 86.9%. Images were physically gathered from two greenhouse locations in Belgium by Nieuwenhuizen et al. [30]. After that, yellow sticky traps were utilized for insect detection and counting using Faster R-CNN with ResNet-v2 as its foundation. This approach had an accuracy rate of 87.4%. For large-scale multiclass pest detection and classification, Wang et al. [31] used the PestNet technique, which comprises three phases: pest feature extraction (a CNN backbone), pest area search, and pest prediction (fusing RPN and PSSM). They achieved 75.46% mAP. A transfer learning (AlexNet) model was implemented by Dawei et al. [32]; the model's accuracy in identifying pests was 93.84%. Further, a pest dataset was created by Xia et al. [33] using photos manually gathered from search engines like Baidu and Google. The authors combined VGG19 and RPN models and obtained 89.22% insect detection and classification accuracy. Li et al. [34] used a transfer learning (DenseNet169) technique to classify pests in tomato plants. The collection includes 859 photos of tomato pests divided into 10 classes, and 88.83% accuracy was attained by DenseNet-169. The IP102 dataset was reorganized by Li et al. [34] and given the name IP_RicePests. VGGNet, ResNet, and MobileNet networks were used to train the model. The experimental results demonstrate that all three classification networks, when paired with transfer learning, have good recognition precision, with the best classification accuracy on the IP_RicePests dataset owed to careful adjustment of the ResNet50 network's parameters. ResNet50 had an accuracy of 87.41%, MobileNet had an accuracy of 86.44%, and VGG16 had an accuracy of 88.68%.

To recognize nine different types of diseases and pests on tomato plants, Sabanci et al. [35] integrated R-CNN, Faster R-CNN, and SSD deep learning meta-learning with
a visual geometry group network (VGG) and residual network [36]. A quick, precise, fine-grained object recognition model based on the YOLOv4 deep neural network was proposed by Roy et al. [37]. The proposed model's detection rate and mAP were 70.19 FPS and 96.29%, respectively. For the categorization of pests and diseases, Liu et al. [38] devised a self-supervised transformer-based pre-training technique employing Feature Relationship Conditional Filtering (FRCF) and a Latent Semantic Masking Auto-Encoder (LSMAE). The accuracy rates for this study's utilization of the IP102, CPB, and Plant Village datasets were 74.69%, 76.99%, and 99.93%, respectively. Zhang et al. [40] employed the IP102 dataset to identify pests; the Inceptionv3 model was used in this investigation, and the accuracy was 67.88%. On six classes of the IP102 dataset, Deepika and Arthi [39] constructed the Improved Mask Faster Region-Based Convolutional Neural Network (IMFR-CNN) model and attained 99.2% accuracy. Further, 20 classes from the IP102 dataset were used by Zhang et al. [40] for pest recognition; this research reported 97.8% accuracy with the Faster and Extensible Vision Transformer (FE-VIT) model. The Improved YOLO-X model was employed by Huang et al. [41] to spot forest pests, and a precision of 53.6% was reached in this investigation using the IP102 dataset. Li et al. [42] implemented the Mask-RCNN ResNet50, Faster-RCNN ResNet101, and YOLOv5 Darknet53 models for pest recognition; these methods reached accuracy levels of 99.6%, 99.4%, and 97.6%, respectively. A ResNeXt-50 (32×4d) model was used by Sanghavi et al. [43] to classify pest images using a residual neural network based on transfer learning over the IP102 dataset; this study had an accuracy rate of 86.90%. A manually compiled dataset was produced by Vishakha et al. [44] for the recognition and classification of crop pests using transfer learning. This study used the hunger games search-based deep convolutional neural network (HGS-DCNN) model with 99% accuracy, and the model yields effective and efficient outcomes when used in intricate settings to spot numerous plant diseases. For detecting pests, several researchers have investigated object identification methods based on DL [45], [46]. However, none of this research covered the topic of identifying scale insects to preserve beneficial insects, and pest control is still not done in an effective way. Existing approaches fail to recognize large numbers of crop pest classes and are unable to perform well in the presence of several image distortions like noise, blurring, color, and light variations. Therefore, there remains a need for a more effective approach to overcome the issues of existing works.

III. METHODOLOGY
This work proposes a model called Faster-PestNet for the localization and classification of numerous crop pests. Hence, we have altered the conventional Faster-RCNN approach by utilizing MobileNet as the base network, tuned it on pest samples to recognize crop pests of various categories, and given it the name Faster-PestNet. MobileNet, an efficient CNN approach designed for mobile devices with a smaller footprint than conventional CNNs, is a good fit for real-time object identification on such devices. Therefore, MobileNet is first applied to compute a distinctive set of image characteristics, which are later optimized and classified by the two-step locator of the improved Faster-RCNN model. The work takes the following steps.

The IP102 dataset, which contains images of pests belonging to 102 classes, is used, along with the locally collected crops dataset. We have used an annotated dataset to train a Faster-PestNet model using MobileNet as its foundation. The annotated photos are fed into the model, and the parameters are updated to reduce the discrepancy between the predicted and actual bounding boxes.

After training the model, we use it to detect pests in new images by feeding them into the model and applying a threshold to the predicted bounding boxes to remove false positives.

The flow of the presented work is defined in Figure 1.
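As an illustration of the thresholding step described above, the short sketch below keeps only the predicted boxes whose confidence exceeds a cut-off; the 0.5 threshold and the array names are placeholders for illustration, not values fixed by the paper.

```python
import numpy as np

def filter_detections(boxes, scores, score_threshold=0.5):
    """Keep only predicted boxes whose confidence exceeds the threshold.

    boxes:  (N, 4) array of [x1, y1, x2, y2] coordinates predicted by the detector.
    scores: (N,) array of confidence values in [0, 1].
    """
    keep = scores >= score_threshold          # boolean mask of confident detections
    return boxes[keep], scores[keep]

# Example with three candidate boxes, one of which falls below the threshold.
boxes = np.array([[10, 20, 60, 80], [5, 5, 40, 40], [100, 120, 150, 170]], dtype=float)
scores = np.array([0.91, 0.32, 0.77])
kept_boxes, kept_scores = filter_detections(boxes, scores)
print(kept_boxes.shape)   # (2, 4): the low-confidence box is discarded
```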
A. FASTER R-CNN
The object detection algorithm Faster R-CNN expands on the fundamental design of R-CNN and Fast R-CNN. A Region Proposal Network (RPN) and a Fast R-CNN detector comprise its primary parts. An image is sent into the RPN, a fully convolutional network, which outputs a list of object proposals, each reflected by a bounding box and an objectness score.

The RPN can recognize objects of various sizes since it operates across an image pyramid of various proportions. The RPN is trained to optimize two loss functions: a regression loss for forecasting the bounding box's coordinates and a classification loss for forecasting the objectness score of each proposal.

Whether a proposal comprises an object or not is indicated by the objectness score, a binary classification score. It is defined as:

p_i = 1 / (1 + e^(-w^T x_i))   (1)

where x_i represents the feature vector of the i-th proposal and w is the weight vector of the classification layer.
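A minimal sketch of Equation (1): the objectness score is a sigmoid applied to a linear projection of the proposal feature vector. The feature dimension and the random vectors below are illustrative placeholders, not values from the paper.

```python
import numpy as np

def objectness_score(x, w):
    """Equation (1): p_i = 1 / (1 + exp(-w^T x_i)) for one proposal feature vector x."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

rng = np.random.default_rng(0)
x = rng.normal(size=256)   # proposal feature vector (dimension chosen for illustration)
w = rng.normal(size=256)   # weight vector of the objectness classification layer
print(objectness_score(x, w))   # a scalar in (0, 1)
```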
The RPN generates a collection of 'k' proposals for every image and sends them to the Fast R-CNN detector for additional processing. The Fast R-CNN detector is a region-based detector that uses a Region of Interest (RoI) pooling layer to extract features from the RPN's proposal outputs. A fully connected network for classification and regression can be used once the RoI pooling layer has generated fixed-size feature maps for every set of rectangular RoIs.

B. FASTER-PestNet
The backbone network in the Faster R-CNN algorithm plays a critical role in extracting features from input images, which the RPN and classifier then utilize to detect objects.
In this project, we have utilized MobileNet as the backbone of the architecture to achieve this goal. The lightweight CNN architecture MobileNet is frequently utilized as the foundation for object detection algorithms due to its great efficiency and accuracy.

In this study, the ResNet backbone network is swapped out for a MobileNet backbone network to serve as the Faster R-CNN algorithm's backbone. The major intuition for altering the base network of the conventional Faster-RCNN approach is that the ResNet approach is computationally more complex and unable to tackle complicated sample transformational changes. To overcome the problems of the traditional model, we have employed the MobileNet approach as the feature extractor. The depth-wise separable convolutions used in the MobileNet design considerably lessen the number of parameters in the network while retaining good precision.

The Faster-PestNet algorithm's MobileNet backbone network can be characterized as follows.

1) MOBILENET BACKBONE
The MobileNet backbone network is made up of a number of fully connected (FC) layers after numerous depth-wise separable convolution layers. A fixed-size image acts as the input to the MobileNet backbone network, as given in Equation 2:

ImageSize = H × W × 3   (2)

where the number of color channels is 3, and the image's height and width are H and W, respectively.

2) DEPTH-WISE SEPARABLE CONVOLUTION
There are two components to the depth-wise separable convolution layer: (i) depth-wise convolution and (ii) pointwise convolution. The depth-wise convolution applies a single convolution filter to each input channel independently, while the pointwise convolution applies a 1 × 1 convolution to aggregate the output of the depth-wise convolution.
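The Keras sketch below shows one depthwise separable block of the kind MobileNet stacks: a per-channel depthwise convolution followed by a 1 × 1 pointwise convolution. The filter count, stride, and BatchNorm/ReLU placement are illustrative defaults, not the exact Faster-PestNet configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters, stride=1):
    """Depthwise conv (one filter per input channel) followed by a 1x1 pointwise conv."""
    x = layers.DepthwiseConv2D(kernel_size=3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(pointwise_filters, kernel_size=1, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

inputs = tf.keras.Input(shape=(320, 320, 3))          # H x W x 3 input, as in Equation (2)
outputs = depthwise_separable_block(inputs, pointwise_filters=64, stride=2)
print(tf.keras.Model(inputs, outputs).output_shape)   # (None, 160, 160, 64)
```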
3) FULLY CONNECTED LAYERS
The output from the depthwise separable convolution layers is mapped to a fixed-size feature map by the fully connected layers in the MobileNet backbone network, which the RPN and classifier can then use. The number of fully connected layers and their parameters can be adjusted based on the specific application.

C. OUTPUT FEATURE MAP
The MobileNet backbone network's output feature map is H/32 × W/32 × D, where D is the number of feature channels. It is used to identify objects in the image, and the RPN and classifier take this feature map as input.
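The H/32 × W/32 × D shape can be checked by instantiating a headless MobileNet as the feature extractor; for the standard Keras MobileNetV1 at a 320 × 320 input, D is 1024 (the exact MobileNet variant and value of D used in Faster-PestNet are not restated here, so this is an assumption for illustration).

```python
import tensorflow as tf

# Headless MobileNet used purely as a feature extractor (random weights for the shape check).
backbone = tf.keras.applications.MobileNet(input_shape=(320, 320, 3),
                                            include_top=False, weights=None)
print(backbone.output_shape)   # (None, 10, 10, 1024) -> (320/32) x (320/32) x D
```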
D. RoI POOLING
After the RPN creates a series of region proposals, the output feature map from the MobileNet backbone network is subjected to RoI pooling to extract a fixed-size feature vector for each region proposal. As part of the RoI pooling operation, a max-pooling operation is applied to each of the rectangular regions of the feature map corresponding to the region proposals.
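There is no single built-in Keras RoI pooling layer; one common TensorFlow approximation crops each proposal region from the feature map with tf.image.crop_and_resize and then max-pools it to a fixed size, as sketched below. The tensor shapes, the 7 × 7 output size, and the normalized box coordinates are assumptions for illustration.

```python
import tensorflow as tf

def roi_pool(feature_map, rois, output_size=7):
    """Extract a fixed-size feature patch for each region proposal.

    feature_map: (1, H, W, D) backbone output.
    rois:        (N, 4) boxes as [y1, x1, y2, x2], normalized to [0, 1].
    """
    box_indices = tf.zeros([tf.shape(rois)[0]], dtype=tf.int32)   # all RoIs come from image 0
    crops = tf.image.crop_and_resize(feature_map, rois, box_indices,
                                     crop_size=[output_size * 2, output_size * 2])
    # Max-pool each crop down to the target RoI resolution.
    return tf.nn.max_pool2d(crops, ksize=2, strides=2, padding="VALID")

feature_map = tf.random.normal([1, 10, 10, 1024])
rois = tf.constant([[0.1, 0.1, 0.6, 0.5], [0.3, 0.2, 0.9, 0.8]])
print(roi_pool(feature_map, rois).shape)   # (2, 7, 7, 1024)
```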
E. FULLY CONNECTED LAYERS FOR CLASSIFICATION AND REGRESSION
After the RoI pooling layer's output, two distinct fully connected layers are used for classification and regression. The object in the region proposal is classified using the classification layer, and the bounding box's coordinates are refined using the regression layer. The classification and regression layers can be represented as:

fc_cls = ReLU(W_cls · h_pool + b_cls)   (3)

where W_cls is the weight matrix, b_cls is the bias vector, h_pool is the output of the RoI pooling layer, and ReLU is the rectified linear unit activation function. Equation 3 is for the classification layer, and Equation 4 below is for the regression layer:

fc_reg = W_reg · h_pool + b_reg   (4)
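Equations (3) and (4) amount to two small dense heads applied to the pooled RoI feature vector. The sketch below uses 102 classes (the IP102 categories) and a flattened 7 × 7 RoI feature as illustrative sizes; the actual head dimensions of Faster-PestNet are not stated in this section.

```python
import tensorflow as tf
from tensorflow.keras import layers

num_classes = 102                                   # illustrative: the IP102 categories
h_pool = tf.keras.Input(shape=(7 * 7 * 1024,))      # flattened RoI-pooled feature vector

# Equation (3): classification head with a ReLU non-linearity, as written in the text.
fc_cls = layers.Dense(num_classes, activation="relu", name="fc_cls")(h_pool)
# Equation (4): linear regression head predicting the four bounding-box offsets.
fc_reg = layers.Dense(4, name="fc_reg")(h_pool)

heads = tf.keras.Model(h_pool, [fc_cls, fc_reg])
heads.summary()
```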
F. LOSS FUNCTION
A loss function is used to train the Faster-PestNet algorithm. The Faster-PestNet loss is calculated from the output of the classification and regression layers: a classification and a regression loss term are combined to form the loss function. Equations 5 and 6 represent the classification and regression loss terms, respectively.

L_cls(p, p*) = -log(p*) if p* > 0, else -log(1 - p*)   (5)

where p is the predicted probability of the object class, p* is the ground truth probability of the object class, and log is the natural logarithm.

L_reg(t, t*) = SmoothL1(t - t*)   (6)

where t is the predicted bounding box offset, t* is the ground truth bounding box offset, and SmoothL1 is the smooth L1 loss function defined as:

smoothL1(x) = 0.5x^2 if |x| < 1, and |x| - 0.5 otherwise   (7)
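A compact sketch of Equations (5)–(7): the log-loss classification term and the smooth-L1 regression term. The offset values in the example are made up, and how the two terms are weighted when combined is not restated here; note that smoothL1 as defined in Equation (7) coincides with the Huber loss with delta = 1.

```python
import tensorflow as tf

def smooth_l1(x):
    """Equation (7): 0.5 * x^2 where |x| < 1, and |x| - 0.5 elsewhere (elementwise)."""
    abs_x = tf.abs(x)
    return tf.where(abs_x < 1.0, 0.5 * tf.square(x), abs_x - 0.5)

def l_reg(t, t_star):
    """Equation (6): smooth-L1 loss between predicted and ground-truth box offsets."""
    return tf.reduce_sum(smooth_l1(t - t_star))

def l_cls(p_star):
    """Equation (5) as stated in the text: -log(p*) for positives, -log(1 - p*) otherwise."""
    return -tf.math.log(p_star) if p_star > 0 else -tf.math.log(1.0 - p_star)

t = tf.constant([0.2, -1.6, 0.4, 0.1])        # predicted offsets (illustrative)
t_star = tf.constant([0.0, -0.2, 0.5, 0.1])   # ground-truth offsets (illustrative)
print(float(l_reg(t, t_star)))                # sum of per-coordinate smooth-L1 terms
```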
A succession of depth-wise separable convolution layers, fully connected layers, RoI pooling, and finally distinct fully connected layers for classification and regression make up the MobileNet-backed Faster-PestNet method. The model is trained to precisely identify items in a picture using the loss function.

IV. EXPERIMENT DETAILS AND RESULTS
The details of execution and the evaluations performed to assess the results of the proposed approach are elaborated in this part. To thoroughly show the effectiveness of the Faster-PestNet model, we calculated pest recognition and classification results via numerous experiments and compared them with other models.

A. DATASET
For model tuning and testing, we have employed the IP102 dataset, a large-scale benchmark dataset for pest image classification and recognition tasks consisting of 102 categories of crop pests commonly found in field areas. The IP102 dataset is challenging for image categorization tasks: the dataset photos contain a broad range of perspective, scale, orientation, and illumination changes. The dataset has been utilized in several computer vision and machine learning research papers. Each of the 102 categories in the IP102 dataset has a different number of photos, and more than 75,000 photos from the 102 pest categories are included in the dataset. The photos in the collection are RGB images with a resolution of 224 × 224 pixels. The pictures were gathered from online resources, including Google Images and Flickr. The dataset is divided into separate portions: a training set, a validation set, and a test set, which contain 56,846, 8,047, and 11,955 photos, respectively. Human specialists then manually add object-level labels to the photos, identifying one or more pests in each image. The IP102 dataset is a difficult benchmark for object recognition and detection tasks due to several distinctive features. These consist of:
• Diverse types of pests: The 102 categories in the IP102 dataset are diverse in nature.
• Occlusion and clutter: Many of the images in the IP102 dataset contain occlusions and clutter, such as multiple objects in the scene or objects partially obscured by other objects.
• Imbalanced class distribution: The number of images per category in the IP102 dataset varies broadly, with some categories having only a few images and others having thousands of images.
A few samples from the IP102 dataset are given in Figure 2.

FIGURE 2. Samples from the IP102 dataset.

B. IMPLEMENTATION DETAILS
The Keras library is used in TensorFlow to implement the suggested framework. The Faster-PestNet model's final training parameters are detailed in Table 1. To create the final optimized model in our study, we varied the epochs, batch size, and learning rate hyperparameters. The experiments used the Stochastic Gradient Descent (SGD) training optimizer with a model learning rate of 0.0015. The epoch count and batch size were 200 and 32, respectively. The input picture dimensions were set at 320 × 320, and the data were split into training, validation, and test sets at random: 70% of the samples were utilized for training, 15% for validation, and 15% for testing.
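The reported training setup can be written as a Keras configuration along the following lines. Only the optimizer, learning rate, batch size, epoch count, and input size come from the paper; the faster_pestnet model object, the detection loss, and the train_ds/val_ds pipelines are placeholders.

```python
import tensorflow as tf

IMG_SIZE = (320, 320)       # input picture dimensions reported in the paper
BATCH_SIZE = 32
EPOCHS = 200
LEARNING_RATE = 0.0015      # SGD learning rate reported in the paper

optimizer = tf.keras.optimizers.SGD(learning_rate=LEARNING_RATE)

# `faster_pestnet`, `detection_loss`, `train_ds`, and `val_ds` are placeholders for the
# detector, its combined loss, and the 70/15/15 train/validation/test pipelines.
# faster_pestnet.compile(optimizer=optimizer, loss=detection_loss)
# faster_pestnet.fit(train_ds.batch(BATCH_SIZE),
#                    validation_data=val_ds.batch(BATCH_SIZE),
#                    epochs=EPOCHS)
```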
C. EVALUATION PARAMETERS
We have employed a variety of quantitative indicators, including precision (P), recall (R), accuracy (Acc), and mAP, to assess the efficacy of the proposed approach. These metrics are computed as follows:

P = TP / (TP + FP)   (8)
R = TP / (TP + FN)   (9)
Acc = (TP + TN) / (TP + TN + FP + FN)   (10)

where true positive, true negative, false positive, and false negative cases are denoted by TP, TN, FP, and FN, respectively. A pest in the image is regarded as TP if accurately identified; otherwise, it is regarded as FN. If the categorization is incorrect, a pest that is not visible in the photograph is regarded as FP.
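Equations (8)–(10) are simple ratios over the confusion counts; the short sketch below computes them (the counts in the example are made up).

```python
def detection_metrics(tp, tn, fp, fn):
    """Equations (8)-(10): precision, recall, and accuracy from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy

# Illustrative counts only.
print(detection_metrics(tp=820, tn=90, fp=75, fn=60))
```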
TABLE 3. Faster-PestNet comparison with other object locators.

TABLE 4. Faster-PestNet comparison with existing approaches.
I. GENERALIZATION ABILITY TESTING
To further show the robustness of the proposed Faster-PestNet approach, we have collected a local crops dataset comprising a total of 1950 images from various site areas and labeled them into six classes with the help of domain experts. These classes are named 'Bug', 'Pupa Borer', 'Root Borer', 'Beetle', 'Fall Army Bug', and 'Army Worm'.

We tuned the Faster-PestNet on this data sample using an 80-20% division mechanism for model learning and evaluation. Figure 8 shows the visuals attained and clearly shows that our approach can diagnose pests of numerous types in real-world examples under complicated background settings. Further, the confusion matrix for this local crops dataset is reported in Figure 9 to show the recognition ability of our approach. The values in Figure 9 prove that the Faster-PestNet approach is proficient in recognizing all classes of pests with a high recall rate. We have also measured other performance measures, namely precision, recall, F1, and accuracy, and the attained values are given in Figure 10. We have obtained 95.24%, 95.26%, 95.23%, and 95.24%, respectively, for this local crops dataset.

FIGURE 9. Faster-PestNet confusion matrix on the local crops dataset.

FIGURE 10. Faster-PestNet performance analysis on the local crops dataset.

V. CONCLUSION
In this study, we have provided a cost-effective DL system for the automated spotting and classification of crop pests. Specifically, a model named Faster-PestNet is proposed in which the MobileNet approach is used as a core network for collecting dense features. We tested our method using the IP102 dataset, an extensive and challenging standard collection for pest identification made up of in-field collected photos. We have demonstrated the viability of our method for practical pest surveillance tasks through considerable experiments. The results indicated that our system could reliably localize and categorize pests of various types, even in the context of complicated backgrounds and fluctuations in pest forms, hues, dimensions, positions, and luminance. Further, we have collected a local crop dataset and evaluated our approach on it to show better generalization capability. All reported numeric and pictorial evaluations have proved our approach's effectiveness in recognizing a large variety of pests. As a future consideration, we are willing to design a further enhanced DL approach to improve the classification results by taking into account other strategies like feature fusion. Further, we are motivated to evaluate the proposed technique in other agriculture-related applications, such as recognizing the crop diseases caused by pests.

REFERENCES
[1] J. Bruinsma, "The resource outlook to 2050: By how much do land, water and crop yields need to increase by 2050," in Proc. Exp. Meeting Feed World, 2009, pp. 24–26.
[2] S. Neethirajan, "The role of sensors, big data and machine learning in modern animal farming," Sens. Bio-Sens. Res., vol. 29, Aug. 2020, Art. no. 100367.
[3] T. Bai, J. Lin, G. Li, H. Wang, P. Ran, Z. Li, D. Li, Y. Pang, W. Wu, and G. Jeon, "A lightweight method of data encryption in BANs using electrocardiogram signal," Future Gener. Comput. Syst., vol. 92, pp. 800–811, Mar. 2019.
[4] J. Qin, F. Liu, K. Liu, G. Jeon, and X. Yang, "Lightweight hierarchical residual feature fusion network for single-image super-resolution," Neurocomputing, vol. 478, pp. 104–123, Mar. 2022.
[5] M. T. Mallick, S. Biswas, A. K. Das, H. N. Saha, A. Chakrabarti, and N. Deb, "Deep learning based automated disease detection and pest classification in Indian mung bean," Multimedia Tools Appl., vol. 82, no. 8, pp. 12017–12041, Mar. 2023.
[6] K. Rimal, K. B. Shah, and A. K. Jha, "Advanced multi-class deep learning convolution neural network approach for insect pest classification using TensorFlow," Int. J. Environ. Sci. Technol., vol. 20, no. 4, pp. 4003–4016, Apr. 2023.
[7] W. Albattah, M. Masood, A. Javed, M. Nawaz, and S. Albahli, "Custom CornerNet: A drone-based improved deep learning technique for large-scale multiclass pest localization and classification," Complex Intell. Syst., vol. 9, no. 2, pp. 1299–1316, Apr. 2023.
[8] G. Pajares, "Overview and current status of remote sensing applications based on unmanned aerial vehicles (UAVs)," Photogramm. Eng. Remote Sens., vol. 81, no. 4, pp. 281–329, 2015.
[9] E. F. I. Raj, M. Appadurai, and K. Athiappan, "Precision farming in modern agriculture," in Smart Agriculture Automation Using Advanced Technologies: Data Analytics and Machine Learning, Cloud Architecture Automation and IoT. Singapore: Springer, 2022, pp. 61–87.
[10] A. Bulat and G. Tzimiropoulos, "Super-FAN: Integrated facial landmark localization and super-resolution of real-world low resolution faces in arbitrary poses with GANs," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 109–117.
[11] P. Zhou, B. Ni, C. Geng, J. Hu, and Y. Xu, "Scale-transferrable object detection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 528–537.
[12] T. Roska and L. O. Chua, "The CNN universal machine: An analogic array computer," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 40, no. 3, pp. 163–173, Mar. 1993.
[13] L. R. Medsker and L. Jain, "Recurrent neural networks," Des. Appl., vol. 5, p. 2, Dec. 2001.
[14] A. Kamilaris and F. X. Prenafeta-Boldu, "Deep learning in agriculture: A survey," Comput. Electron. Agricult., vol. 147, pp. 70–90, Apr. 2018.
[15] P. Sermanet, A. Frome, and E. Real, "Attention for fine-grained categorization," 2014, arXiv:1412.7054.
[16] H. Kataoka, K. Iwata, and Y. Satoh, "Feature evaluation of deep convolutional neural networks for object recognition and detection," 2015, arXiv:1509.07627.
[17] L. Wang, S. Guo, W. Huang, and Y. Qiao, "Places205-VGGNet models for scene recognition," 2015, arXiv:1508.01667.
[18] S. Targ, D. Almeida, and K. Lyman, "ResNet in ResNet: Generalizing residual architectures," 2016, arXiv:1603.08029.
[19] A. Raj, V. P. Namboodiri, and T. Tuytelaars, "Subspace alignment based domain adaptation for RCNN detector," 2015, arXiv:1507.05578.
[20] R. Girshick, "Fast R-CNN," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 1440–1448.
[21] X. Zhao, W. Li, Y. Zhang, T. A. Gulliver, S. Chang, and Z. Feng, "A faster RCNN-based pedestrian detection system," in Proc. IEEE 84th Veh. Technol. Conf. (VTC-Fall), Sep. 2016, pp. 1–5.
[22] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779–788.
[23] J. G. A. Barbedo, "Detecting and classifying pests in crops using proximal images and machine learning: A review," AI, vol. 1, no. 2, pp. 312–328, Jun. 2020.
[24] A. Kumar, Z. J. Zhang, and H. Lyu, "Object detection in real time based on improved single shot multi-box detector algorithm," EURASIP J. Wireless Commun. Netw., vol. 2020, no. 1, pp. 1–18, Dec. 2020.
[25] A. Setiawan, N. Yudistira, and R. C. Wihandika, "Large scale pest classification using efficient convolutional neural network with augmentation and regularizers," Comput. Electron. Agricult., vol. 200, Sep. 2022, Art. no. 107204.
[26] L. Nanni, G. Maguolo, and F. Pancino, "Insect pest image detection and recognition based on bio-inspired methods," Ecological Informat., vol. 57, May 2020, Art. no. 101089.
[27] W. Liu, G. Wu, F. Ren, and X. Kang, "DFF-ResNet: An insect pest recognition model based on residual networks," Big Data Mining Anal., vol. 3, no. 4, pp. 300–310, Dec. 2020.
[28] R. Li, X. Jia, M. Hu, M. Zhou, D. Li, W. Liu, R. Wang, J. Zhang, C. Xie, L. Liu, F. Wang, H. Chen, T. Chen, and H. Hu, "An effective data augmentation strategy for CNN-based pest localization and recognition in the field," IEEE Access, vol. 7, pp. 160274–160283, 2019.
[29] L. Liu, C. Xie, R. Wang, P. Yang, S. Sudirman, J. Zhang, R. Li, and F. Wang, "Deep learning based automatic multiclass wild pest monitoring approach using hybrid global and local activated features," IEEE Trans. Ind. Informat., vol. 17, no. 11, pp. 7589–7598, Nov. 2021.
[30] A. Nieuwenhuizen, J. Hemming, and H. Suh, "Detection and classification of insects on stick-traps in a tomato crop using faster R-CNN," Agro Food Robot., Wageningen Univ. Res., Wageningen, The Netherlands, Tech. Rep., 2018.
[31] L. Liu, R. Wang, C. Xie, P. Yang, F. Wang, S. Sudirman, and W. Liu, "PestNet: An end-to-end deep learning approach for large-scale multi-class pest detection and classification," IEEE Access, vol. 7, pp. 45301–45312, 2019.
[32] W. Dawei, D. Limiao, N. Jiangong, G. Jiyue, Z. Hongfei, and H. Zhongzhi, "Recognition pest by image-based transfer learning," J. Sci. Food Agricult., vol. 99, no. 10, pp. 4524–4531, Mar. 2019.
[33] D. Xia, P. Chen, B. Wang, J. Zhang, and C. Xie, "Insect detection and classification based on an improved convolutional neural network," Sensors, vol. 18, no. 12, p. 4169, Nov. 2018.
[34] Z. Li, X. Jiang, X. Jia, X. Duan, Y. Wang, and J. Mu, "Classification method of significant rice pests based on deep learning," Agronomy, vol. 12, no. 9, p. 2096, Sep. 2022.
[35] K. Sabanci, M. F. Aslan, E. Ropelewska, M. F. Unlersen, and A. Durdu, "A novel convolutional-recurrent hybrid network for sunn pest-damaged wheat grain detection," Food Anal. Methods, vol. 15, no. 6, pp. 1748–1760, Jun. 2022.
[36] A. Fuentes, S. Yoon, S. Kim, and D. Park, "A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition," Sensors, vol. 17, no. 9, p. 2022, Sep. 2017.
[37] A. M. Roy, R. Bose, and J. Bhaduri, "A fast accurate fine-grain object detection model based on YOLOv4 deep neural network," Neural Comput. Appl., vol. 34, pp. 3895–3921, Jan. 2022.
[38] H. Liu, Y. Zhan, H. Xia, Q. Mao, and Y. Tan, "Self-supervised transformer-based pre-training method using latent semantic masking auto-encoder for pest and disease classification," Comput. Electron. Agricult., vol. 203, Dec. 2022, Art. no. 107448.
[39] P. Deepika and B. Arthi, "Prediction of plant pest detection using improved mask FRCNN in cloud environment," Meas., Sensors, vol. 24, Dec. 2022, Art. no. 100549.
[40] L. Zhang, J. Du, and R. Wang, "FE-VIT: A faster and extensible vision transformer based on self pre-training for pest recognition," Proc. SPIE, vol. 12349, pp. 35–42, Oct. 2022.
[41] J. Huang, Y. Huang, H. Huang, W. Zhu, J. Zhang, and X. Zhou, "An improved YOLOX algorithm for forest pest detection," Comput. Intell. Neurosci., vol. 2022, Aug. 2022, Art. no. 5787554.
[42] W. Li, T. Zhu, X. Li, J. Dong, and J. Liu, "Recommending advanced deep learning models for efficient insect pest detection," Agriculture, vol. 12, no. 7, p. 1065, Jul. 2022.
[43] C. Li, T. Zhen, and Z. Li, "Image classification of pests with residual neural network based on transfer learning," Appl. Sci., vol. 12, no. 9, p. 4356, Apr. 2022.
[44] V. B. Sanghavi, H. Bhadka, and V. J. Dubey, "Hunger games search based deep convolutional neural network for crop pest identification and classification with transfer learning," Evolving Syst., vol. 14, pp. 649–671, Jul. 2022.
[45] W. Xia, D. Han, D. Li, Z. Wu, B. Han, and J. Wang, "An ensemble learning integration of multiple CNN with improved vision transformer models for pest classification," Ann. Appl. Biol., vol. 182, no. 2, pp. 144–158, Mar. 2023.
[46] S. Wang, Q. Zeng, W. Ni, C. Cheng, and Y. Wang, "ODP-transformer: Interpretation of pest classification results using image caption generation techniques," Comput. Electron. Agricult., vol. 209, Jun. 2023, Art. no. 107863.
[47] A. Krizhevsky and V. Nair, "CIFAR-100 (Canadian Institute for Advanced Research)"; A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst., vol. 25, 2012, p. 26.
[48] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1–9.
[49] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[50] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Y. Fu, and A. C. Berg, "SSD: Single shot MultiBox detector," in Computer Vision—ECCV. Amsterdam, The Netherlands: Springer, 2016, pp. 21–37.
[51] M. Z. Alom, M. Hasan, C. Yakopcic, T. M. Taha, and V. K. Asari, "Improved inception-residual convolutional neural network for object recognition," Neural Comput. Appl., vol. 32, no. 1, pp. 279–293, Jan. 2020.
[52] A. Newell, K. Yang, and J. Deng, "Stacked hourglass networks for human pose estimation," in Computer Vision—ECCV. Amsterdam, The Netherlands: Springer, 2016, pp. 483–499.
[53] Ü. Atila, M. Uçar, K. Akyol, and E. J. Uçar, "Plant leaf disease classification using EfficientNet deep learning model," Ecol. Inform., vol. 61, Mar. 2021, Art. no. 101182.
[54] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in Computer Vision—ECCV. Zurich, Switzerland: Springer, 2014, pp. 740–755.
[55] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," 2018, arXiv:1804.02767.
[56] S. Zhang, L. Wen, X. Bian, Z. Lei, and S. Z. Li, "Single-shot refinement neural network for object detection," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 4203–4212.
[57] H. Law and J. Deng, "CornerNet: Detecting objects as paired keypoints," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 734–750.
[58] M. T. Reza, N. Mehedi, N. A. Tasneem, and M. A. Alam, "Identification of crop consuming insect pest from visual imagery using transfer learning and data augmentation on deep neural network," in Proc. 22nd Int. Conf. Comput. Inf. Technol. (ICCIT), Dec. 2019, pp. 1–6.
[59] E. Ayan, H. Erbay, and F. Varçin, "Crop pest classification with a genetic algorithm-based weighted ensemble of deep convolutional neural networks," Comput. Electron. Agricult., vol. 179, Dec. 2020, Art. no. 105809.
[60] S.-Y. Zhou and C.-Y. Su, "Efficient convolutional neural network for pest recognition—ExquisiteNet," in Proc. IEEE Eurasia Conf. IoT, Commun. Eng. (ECICE), Oct. 2020, pp. 216–219.
[61] F. Ren, W. Liu, and G. Wu, "Feature reuse residual networks for insect pest recognition," IEEE Access, vol. 7, pp. 122758–122768, 2019.
[62] H. T. Ung, H. Q. Ung, and B. T. Nguyen, "An efficient insect pest classification using multiple convolutional neural network based models," 2021, arXiv:2107.12189.
[63] T. Zheng, X. Yang, J. Lv, M. Li, S. Wang, and W. Li, "An efficient mobile model for insect image classification in the field pest management," Eng. Sci. Technol., Int. J., vol. 39, Mar. 2023, Art. no. 101335.
[64] X. Wu, C. Zhan, Y.-K. Lai, M.-M. Cheng, and J. Yang, "IP102: A large-scale benchmark dataset for insect pest recognition," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 8779–8788.

FAROOQ ALI received the B.Sc. degree in software engineering from the University of Engineering and Technology at Taxila, and the M.Sc. degree in software engineering from Riphah International University Islamabad, with a focus on the software engineering discipline. He is currently pursuing the Ph.D. degree with UET Taxila. His current research interests include image processing, medical image analysis, and computer vision.

HUMA QAYYUM received the M.Sc. degree in software engineering from the NUST College of Electrical and Mechanical Engineering, Pakistan, in 2009, and the Ph.D. degree in software engineering from the University of Engineering and Technology at Taxila, Pakistan, in 2019. Her current research interests include image processing, medical image analysis, and computer vision.

MUHAMMAD JAVED IQBAL received the M.Sc. degree in computer science from the University of Agriculture, Faisalabad, Pakistan, in 2001, the M.S./M.Phil. degree in computer science from International Islamic University Islamabad, Pakistan, in 2008, and the Ph.D. degree in computer science/information technology from Universiti Teknologi PETRONAS, Malaysia, in February 2015. He is currently a HEC-approved Ph.D. Supervisor and an Assistant Professor with the Computer Science Department, University of Engineering and Technology, Taxila, Pakistan. After the Ph.D. studies, he has been actively involved in research. He has more than 20 international publications, which include four ISI-indexed impact factor journals.