Medicinal Plant Classification Using Particle Swarm Optimized Cascaded Network
Corresponding authors: Md. Abdus Samad and Imran Ashraf
This work was supported by European University of the Atlantic.
ABSTRACT Medicinal plants have been essential to healthcare since ancient times and are integral to developing
drugs and other medical treatments. More than 25% of medicines in developed countries are produced
from medicinal plants, while in developing countries, approximately 80% of individuals receive primary
healthcare from these plants. Traditionally, these plants are identified manually by experts, which is tedious,
time-consuming, subjective and dependent on the availability of experts. Furthermore, a wrong detection can
result in serious health issues or death. This signifies the need for a more reliable approach to identifying
medicinal plants, which is accurate and practical. Several automated methods were proposed previously,
utilizing deep learning and traditional machine learning (TML) techniques, but they require singular leaf
images and failed to achieve sufficient accuracy when demonstrated in a different setting. Capturing singular
leaf images for each plant is also time-consuming and laborious. This paper presents a robust, accurate and
practical system to identify medicinal plants from smartphone-captured plant images at the site of the plants.
The proposed system utilized a cascaded architecture to extract features using a pre-trained ResNet50 model,
which were optimized using Particle Swarm Optimization (PSO) to classify the plants using a Support
Vector Machine (SVM). The proposed ResNet50-PSO-SVM network classified seven medicinal plants with
99.60% accuracy, outperforming the state-of-the-art (99%). The system was demonstrated for three different
smartphones, classifying an image in 0.15 seconds with 97.79% accuracy on average. The system’s high
accuracy, rapid identification time and robustness ensured its practical use.
INDEX TERMS Medicinal plants, particle swarm optimization, feature selection, cascaded network,
medicinal plant classification.
modern drug discovery. According to the World Health Organization (WHO), 60% of the population worldwide relies on medicinal plant-based medicines [1]. In the United States, around 80 out of 100 prescribed medicines are produced mainly or partly from medicinal plants [2]. This is even higher in underdeveloped countries. On top of that, the demand for medicinal plants is increasing rapidly throughout the world.
An estimated 350,000 medicinal plants exist worldwide, which is 10% of all vascular plants [3]. Thus, identifying medicinal plants is a challenging task. In Bangladesh, more than 116 types of medicinal plants are reported by experts [4]. Traditionally, these plants are identified based on the naked-eye observation of experts. Experts mainly observe the shape, color, texture and aroma of leaves and flowers to identify a plant. However, this manual examination is not practical as it takes a lot of time, is prone to fatigue and introduces inter-observer variability. Many researchers have proposed automated methods for medicinal plant identification, motivated by the latest advancements in computer vision and machine learning techniques. These medicinal plant classification methods were proposed for mainly two purposes: detecting leaf disease in medicinal plants and classifying the type of medicinal plants for drug development or treatment. In this study, we mainly focused on classifying medicinal plants; thus, we have only reviewed the relevant papers proposed to classify medicinal plants [5], [6], [7], [8], [9], [10], [11], [18], [19], [20], [21], [22], [23]. The major limitation of the existing medicinal plant classification methods is their dependency on singular leaf images captured from a close distance, usually less than a meter. Capturing distinct leaf images for each plant is tedious and time-consuming, compromising the primary objective of developing an automated detection system. These methods also fail to classify the plants from plant images. Moreover, they fail when the leaf images are captured using a different device or from a distance. These methods work when the input image contains a singular leaf taken from a close distance. Most of these methods cannot provide classification results at the site of examination, as they process the captured images in a different location where the method is implemented on a computer. Therefore, the existing methods are not suitable for practical use. Another limitation is their robustness. Most of the methods were tested using homogeneous data. Consequently, they failed when the images were captured using a different device. Additionally, the deep learning-based methods are highly parameterized and computationally heavy, and are therefore challenging to implement on a low-computing machine such as a smartphone. This paper proposes a cascaded network combining a pre-trained CNN, PSO and SVM to classify medicinal plants. This method can accurately predict the plant class from plant images taken from a normal distance, regardless of the imaging device. This method is lightweight yet robust.
The major contributions of this paper are listed as follows: 1) the development of a cascaded network for medicinal plant classification at the site of the plants from smartphone-captured plant images, 2) an analysis of the pre-trained CNN-PSO-traditional classifier-based cascaded approach for the classification of medicinal plants, 3) an ablation study to evaluate the contribution of PSO in the cascaded network and 4) the assessment of the proposed system in terms of robustness, rapidness and facility to provide results at the site of the plants to ensure its practical application.

II. RELATED WORKS
Several image-based automated methods were proposed to classify medicinal plants. These methods can be broadly classified into three categories: 1) manually selected features with traditional machine learning (TML) or convolutional neural network (CNN) classifiers, 2) CNN-based feature selection and classification and 3) cascade of multiple TML or CNN models. Typically, these methods targeted the local medicinal plants and fine-tuned the methods for classifying them.
Kan et al. proposed a method by manually selecting ten shape and five texture features to classify 12 medicinal plants using SVM [5]. They collected 240 leaf images from China for the 12 plants. This method achieved an accuracy of 93.3%. Another method based on leaf shape and texture features was proposed by Janani and Gopal [6]. They collected 63 leaf images for six plants and trained a custom-built artificial neural network (ANN) to classify the plants, achieving 94.4% accuracy. Begue et al. extracted 24 features related to shape, color, and texture from 30 leaf images belonging to 20 plants [7]. They then trained a random forest classifier to classify the plants, which obtained 90.1% accuracy. Habiba et al. extracted Modified Local Gradient Pattern-based texture features from 1054 leaf images of 10 medicinal plants and trained an SVM classifier to classify the plants [8]. This study collected images from Bangladesh, similar to ours, but achieved only 96.11% accuracy. Naeem et al. utilized a bank of 65 features, which include texture features, run-length matrix, and multi-spectral features [9]. They captured 600 multi-spectral images of 6 medicinal plants and classified them using Multi-Layer Perceptron (MLP) with 99.13% accuracy. However, multi-spectral imaging is costly, and it is not easy to carry the imaging device to the site of the plants; thus, it is not suitable for practical applications. Pacifico et al. also utilized color and texture features of leaves to classify plants using an MLP network [10]. They collected 1148 leaf images of 15 plants from Brazil and achieved 97% accuracy. Rohmat et al. proposed another manual feature-based approach. However, they selected the features using principal component analysis and then classified the plants using the CNN method [11]. Azadnia and Kheiralipour utilized 28 color and texture features of leaves to classify six medicinal plants [12]. They used a specially prepared image acquisition system and trained an artificial neural network to classify the quality-enhanced leaf images. This method achieved 100% accuracy in training. Anami et al. combined edge features with color
and texture to classify three types of plants using SVM [13]. This method achieved 90% accuracy when trained using 278 features from 900 images. Pushpa et al. used the KNN classifier to classify ten plants based on the gray-level texture features with 60% accuracy [14]. Another texture-based SVM classifier was proposed by Puri et al.; however, the accuracy was 91% in their method [15]. Venkataraman and Mangayarkarasi combined histogram-oriented features with texture features and classified using SVM [16]. When few classes and limited training images were available, the TML classifier-based approaches using manually selected features functioned effectively. However, effective feature selection is critical to these models’ performance. TML models are lighter and have fewer parameters than deep learning-based techniques like CNN.
On the contrary, deep learning models are highly parameterized but suitable for achieving high accuracy when the number of classes is high, given a large dataset. Duong-Trung et al. utilized a lightweight MobileNet-based CNN for classifying ten medicinal plants collected from Vietnam [17]. This method was trained and tested using 2296 leaf images and achieved 98.7% accuracy. Valdez et al. also utilized MobileNet and achieved 97.43% accuracy [18]. Akter and Hosen [19] and Musa et al. [20] proposed CNN-based methods to classify Bangladeshi medicinal plants, similar to our study. Akter and Hosen prepared a dataset of 37,693 leaf images of 10 plants and classified them with 71.3% accuracy. In contrast, Musa et al. classified six plants with 95.58% accuracy. Saikia et al. [21] also utilized a neural network for classifying six medicinal plants from 90 RGB leaf images collected from India. Dileep and Pournami combined AlexNet-based feature extraction with an SVM classifier to classify 40 medicinal plants [22]. The model was trained using 2400 images and yielded 96.76% accuracy. A similar approach was proposed by Dewedi et al. utilizing ResNet and an SVM model to classify 40 medicinal plants [23]. This model was trained using a public dataset of 6500 images. This method achieved 96.80% accuracy. Uddin et al. [24] proposed an ensemble of CNN models to classify ten medicinal plants. The method was trained using 5000 images for only ten epochs to achieve an accuracy of 99%. Our study considered this method as the state-of-the-art, as it achieved the highest accuracy for RGB images. However, this model was highly over-fitted with a loss of over 0.80. Another ensemble of CNN models was proposed by Bahri [25]. This method was trained using 40,000 images for 50 epochs to achieve 97.4% accuracy. This model also suffers from high loss, indicating an over-fitted method. Kyalkond et al. proposed a binary classifier to distinguish medicinal plants from non-medicinal plants using a custom-built CNN model [26]. This model was trained using 1600 images of 100 medicinal plants for 100 epochs. This method achieved 90% accuracy with a low loss value. However, identifying medicinal plants without knowing their class does not serve our study. Similar work was done by Pukhrambam and Sahayadhas in which they trained a DenseNet-based CNN for distinguishing medicinal plants from phytochemistry and therapeutics plants [27]. Berihu et al. proposed a GoogLeNet-based method to classify medicinal plants from Ethiopia with 96.7% accuracy [28]. Another neural network-based method was proposed by Kumar et al. to classify 25 different medicinal plant leaves using a Multi-layer Perceptron (MLP) classifier [29]. This method prepared and investigated the performance of 6 different deep learning-based methods and obtained the highest accuracy of 82.51%. Sharma studied the performance of VGG16 and VGG19 models [30]. The VGG16 model achieved a higher accuracy of 98.52% in classifying 21 plants. Widneh et al. trained a MobileNet model using 15,100 leaf images to classify 52 plants and achieved 92% accuracy [31]. These methods utilized entirely CNN-based models. They obtained excellent accuracy when trained on large datasets; nevertheless, for most of these methods, the loss climbed with the accuracy, particularly when the number of classes was high. This phenomenon suggests an over-fitted model, which fails when used in different settings.
Some studies integrated multiple networks to develop a cascaded architecture to utilize different models for feature selection and classification. The cascaded architecture featuring deep learning-based feature selection with traditional machine learning-based classification tends to achieve higher accuracy with low loss when the number of classes is high. Moreover, these systems have fewer parameters compared to the entirely CNN-based methods, and thus less computational complexity. Dileep and Pournami cascaded AlexNet for extracting the features and SVM for classification [22], which achieved 96.76% accuracy. They prepared a dataset of 2400 leaf images of 40 plants, which were captured using a scanner at 1200 or 600 dpi. Another cascaded network was proposed by Diwedi et al. [23]. They also classified 40 medicinal plants using 6500 leaf images from India and achieved 96.8% accuracy. They utilized ResNet50 for feature extraction and SVM for classification.
High accuracy is essential for medicinal plant classification to be effective because a false positive detection could cause fatalities or major health issues. However, making the system practical and easy to use is also necessary. Recent deep learning techniques such as CNNs [32], vision transformers [33], graph-based neural networks [34] and AI-generated super-resolution images [35] have excelled in various domains, especially where large amounts of labeled data are available and complex patterns need to be captured. Therefore, many researchers developed large datasets and implemented CNN for medicinal plant classification. However, collecting a large dataset of medicinal plant images takes a lot of work. Some of these datasets contained noisy images and needed to be verified by experts. CNN-based models use local receptive fields to capture small, spatially proximate patterns in the image, which enables them to focus on local features, such as edges and textures, effectively. The convolutional and pooling operations and weight-sharing strategies contribute to CNN’s remarkable feature extraction capabilities. Several medicinal plant classification
VOLUME 12, 2024 42467
M. Tarequl Islam et al.: Medicinal Plant Classification
methods were previously developed utilizing CNN-based models. However, these methods are highly parameterized, computationally intensive, and often produce over-fitted models [24], [25].
This study focused on developing an optimal and practical system for medicinal plant classification. SVM and similar traditional machine learning methods offer computational efficiency compared to complex deep learning models like CNN. SVMs can generalize well with limited data, making them suitable for scenarios where acquiring a large labeled dataset is challenging. They allow explicit feature engineering, where domain knowledge can be leveraged to select relevant features. This is suitable for medicinal plant classification and motivated many researchers to deploy SVM for classifying plants using different leaf features. However, these methods failed when the complexity of the decision boundary increased with the number of plant classes [5], [7], [8]. Therefore, in this research, we combined CNN’s excellent feature extraction capability with SVM’s ability to draw a simple decision boundary to develop an optimal method. This method utilized the ResNet50 model for feature extraction. Then, these features were further processed using PSO to select the suitable features to predict the class using SVM. The incorporation of PSO further boosted the accuracy of the system. The features were extracted using a pre-trained CNN with no learnable parameters. The learnable parameters of the proposed system are included only in the PSO and SVM modules, which makes the system lightweight and computationally efficient.

III. MATERIALS AND METHODS
A. MEDICINAL PLANT DATASET
In this study, we collaborated with ethnobotanists to prepare a dataset of 6,427 images of medicinal plants, which included 678 aloe vera, 1012 hibiscus, 1002 holy basil, 621 lemongrass, 926 neem, 1237 henna and 951 mint leaf images. These images were collected from six distinct places in Bangladesh and captured using an iPhone 12 camera (12 megapixels, f/1.6, 1.4 µm).
Initially, 7000 images were captured and then the exposure and focus quality of the images were examined to eliminate under-exposed and blurry images. To evaluate the focus and exposure quality of the image, we used a reference-less image quality evaluation method [36], in which we estimated the average width and height of the edges. After that, two ethnobotanists examined the images independently and we took only those images for which the experts’ judgements matched. This was done to confirm the class of each image, which is the ground truth. In the quality and expert evaluations, 573 images were eliminated, resulting in 6,427 images for this dataset, as shown in Table 1.

B. CASCADED ARCHITECTURE FOR THE PROPOSED SYSTEM
In this study, we have considered seven different types of medicinal plants and proposed a cascaded architecture of AI models to classify them. Multiclass classification is typically challenging, as assigning a class label to a new image among the seven classes is more complex than making the same decision where there are fewer classes. The time complexity is also higher for a multiclass classifier. Another challenge is the data imbalance problem. In a multiclass classification problem where some classes are rare and others are common, the model may learn to favor the common classes and neglect the rare ones. Finally, selecting the best model is another critical task. There is no one-size-fits-all solution; the strengths and weaknesses vary among the AI models depending on the data and the task. Therefore, in this study, we undertook an exhaustive search to select the best AI-enabled network to classify medicinal plants into one of the seven classes. We have gone through intensive data preparation, data curation, network tuning, feature selection, model evaluation, network selection and, finally, demonstration of the selected network. To deal with data imbalance, we have oversampled the images of the classes with fewer images to balance the class distribution. In order to select the best network, we considered the accuracy, time complexity and interoperability of the networks. We determined the accuracy of the model using cross-validation. Grid-based searching was used to compare and optimize different models and parameters for the best performance.
In this study, we combined the CNN models with the TML models. CNN models are computationally expensive and require a large number of images, especially for a multi-class problem. In comparison, the TML models take less time to train and can achieve good accuracy when trained using a smaller dataset [36]. However, the performance of the TML models depends on the optimally selected features [37]. Therefore, we have utilized the PSO method to select the optimal features for training the TML models. The PSO was used because of its quick convergence speed, minimal memory requirements, and ease of implementation [38]. In this study, we proposed a cascaded architecture where the CNN model works as the feature extractor, whose features are then processed by the PSO method to select the optimal features. Finally, a TML model utilizes these features to predict the output class. We used the pre-trained weights of the CNN to reduce the computation and time complexity further. This cascaded architecture allowed for a reduction in training time and resources. Moreover, it achieved sufficient accuracy and high robustness. Fig. 1 shows the architecture of the proposed system.
The proposed system can classify the plants from low-resolution smartphone-captured images. Once the image was captured, the system evaluated the quality of the image first. If the quality of the image was satisfactory, then the image was converted to sRGB space to compensate for the color variation and later resized according to the input size of the CNN model. The quality was determined in terms of sharpness and brightness indices. We measured the distances between local minima and local maxima for
each edge’s gradient, corresponding to the edge’s width. The sharpness index was the average width of the edges. A smaller average width value indicates that the image is sharper. Then, we also calculated the absolute difference in intensity between the local minima and maxima, representing the edge’s height. The average height of the edges indicates the brightness of the image. A brighter image had a better average height. The sharpness and brightness indices are given by Equation (1) and (2):

\[ \text{Sharpness index} = \frac{1}{N}\sum_{i=1}^{N} w(i) \tag{1} \]

\[ \text{Brightness index} = \frac{1}{N}\sum_{i=1}^{N} h(i) \tag{2} \]
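As an illustration of Equations (1) and (2), the following is a minimal sketch rather than the authors' implementation. It assumes a single row of grayscale intensities (`profile`) instead of a full image, and treats every strictly rising or falling run between adjacent local extrema as one edge. The function names are hypothetical; the cut-off of 6 for both indices is the one quoted in the text.

```python
def edge_profile_stats(profile):
    """Return (width, height) for every maximal strictly rising or falling run.

    Each run spans one local minimum and the adjacent local maximum (or the
    reverse). Width is the run length in pixels; height is the absolute
    intensity difference between the two ends of the run.
    """
    edges = []
    start, sign = 0, 0
    for i in range(1, len(profile)):
        diff = profile[i] - profile[i - 1]
        step = (diff > 0) - (diff < 0)  # +1 rising, -1 falling, 0 flat
        if step != sign:
            if sign != 0:  # close the run that just ended
                edges.append((i - 1 - start, abs(profile[i - 1] - profile[start])))
            start, sign = i - 1, step
    if sign != 0:  # close a run that reaches the last pixel
        edges.append((len(profile) - 1 - start, abs(profile[-1] - profile[start])))
    return edges


def quality_indices(profile):
    """Sharpness index = mean edge width (Eq. 1); brightness index = mean edge height (Eq. 2)."""
    edges = edge_profile_stats(profile)
    if not edges:
        raise ValueError("profile contains no edges")
    n = len(edges)
    sharpness = sum(w for w, _ in edges) / n
    brightness = sum(h for _, h in edges) / n
    return sharpness, brightness


def passes_pre_assessment(sharpness, brightness, max_sharpness=6, min_brightness=6):
    """An image is rejected if sharpness > 6 or brightness < 6 (thresholds from the text)."""
    return sharpness <= max_sharpness and brightness >= min_brightness


if __name__ == "__main__":
    crisp = [10, 10, 10, 200, 200, 200]  # one abrupt, high-contrast edge
    print(quality_indices(crisp), passes_pre_assessment(*quality_indices(crisp)))
```

In this simplified form, a one-pixel jump yields a sharpness index of 1, whereas a slow ramp spreads the same intensity change over many pixels and is rejected once the mean width exceeds 6.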
FIGURE 1. Architecture of the proposed cascaded network for medicinal plant classification.
In Equation (1) and (2), N is the count of edges, w(i) is the width of edge i and h(i) is the height of edge i. If an image has a sharpness index higher than 6 or a brightness index lower than 6, then it was eliminated from further processing.
Following the pre-assessment, the CNN-PSO-TML cascaded network processed the image to determine its class. The following section contains details of how the appropriate CNN and TML models were selected for the feature extraction and classification. We additionally evaluated the proposed system in this study using three different smartphones to ensure the system is robust.

C. FEATURE SELECTION FOR CASCADED NETWORK
Particle Swarm Optimization (PSO) is a population-based stochastic optimization algorithm developed based on the social behavior of flocking birds. PSO follows the techniques that a swarm of birds follows to find their desired target in a search space. The swarm’s movement is influenced by their current search directions, the best individual results or fitness to this point, the best fitness found by all birds, and a random perturbation. Every bird continues to track its best fitness. Then, it switches its position with other birds in the swarm. The swarm of birds uses this communication system to locate the desired target, something that is not possible for a single bird to do on its own. This method almost always converges on the global optimum.
In our experiment, we relied on the standard PSO algorithm by Kennedy and Eberhart [38] and utilized a similar parameter setup for feature selection as mentioned in [39] to derive the fitness for each particle, using the following fitness function:

\[ F_i = q\,a_i + \frac{1-q}{N_f^{\mathrm{selected},i}} \tag{3} \]

where $F_i$ is the fitness of the ith particle, $N_f^{\mathrm{selected},i}$ is the number of selected features, $a_i$ is the accuracy for the ith particle and q is a weight. A higher value of q yields a higher number of selected features, maximizing accuracy, whereas a lower value yields a lower number of selected features, compromising accuracy. In our experiment q was 1.

D. MODEL SELECTION FOR CASCADED NETWORK
In this study, we have performed an exhaustive search to find suitable networks for the proposed cascaded architecture. We have experimented with six different CNN models and seven TML models incorporated with the PSO to get the best network for the proposed system. For the CNN-enabled feature extraction, we investigated the performance of the deep convolutional architecture-based networks VGG16 and VGG19 [40], the depthwise separable convolution-based network Xception [42], the densely connected layer-based network DenseNet121 [43], the deep residual learning-based network ResNet50 [41], and the inverted residual structure-based model MobileNet, which is optimized for mobile devices [44]. Typically, a CNN model functions in two steps: firstly, the bottom layers of the model, which mainly contain the convolutional layers and pooling layers, extract features from the input image. Then, the features extracted by the convolutional base are utilized by the top layers of the model, which mainly contain the dense layers and dropouts, to predict the class based on the given features. One of the most significant advantages of using CNN models is their ability to extract useful features suitable for class prediction, which is often difficult to achieve using an independent feature selection method. Therefore, we have utilized the features extracted by the convolutional base of the CNN models in the cascaded architecture. However, we have utilized the pre-trained weights of the convolutional base, which were derived by using the ImageNet dataset for the CNN models. This allowed us to eliminate the convolutional base’s training using our data. Then, these features are processed by PSO to identify the optimal features. PSO identifies the minimum number of best-fitting features. Finally, these features were utilized by seven different TML models, which include Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), Naive Bayes (NB), eXtreme Gradient Boosting (XGB), K-Nearest Neighbors (KNN) and Logistic Regression (LR).
For each CNN-based model, we have investigated the performance incorporating PSO and each TML model. This generated 42 cascaded networks, which were then trained and
tested using our dataset to select the best one. We performed a hold-out test and 10-fold cross-validation to compare the cascaded networks. In the hold-out test, we used 60% of the images for training, 20% for validation and 20% to test the networks. In the 10-fold cross-validation, the 7000 images were divided into ten groups. The images were randomly distributed to each group to have 700 images, containing 100 images per class.
We also performed an ablation study to determine the role of PSO. The time requirements were also analyzed for the selected network. Finally, the system was implemented and demonstrated using three smartphones to ensure it is robust regardless of the device.

IV. RESULTS AND ANALYSIS
In this study, we evaluated the performance of 42 distinct networks, which were prepared by cascading six pre-trained CNN and seven TML models with the PSO. Each CNN model was used with PSO and one of the seven TML models, generating seven cascaded networks for each CNN model. Firstly, we selected the best candidate cascaded network of each CNN model based on their accuracy, precision, recall, and F1-Score for the test data. Then, the best candidate networks of all CNN models were compared based on their average accuracy in the 10-fold cross-validation experiment to select the best network for the proposed system. After that, the proposed method was compared with the existing methods, including the state-of-the-art method, to ensure its efficacy. We also evaluated the performance of these 42 cascaded networks without incorporating PSO-based feature selection in the ablation study to establish the role of PSO in the cascaded network. The ablation study revealed that incorporating PSO increases the networks’ accuracy. Finally, the performance of the proposed system was investigated for practical use. The results of these experiments are detailed in the following sections.

A. MODEL SELECTION AND EVALUATION
First, we compared the 42 distinct PSO-incorporated cascaded networks’ performance for the unseen test data in terms of average accuracy, precision, recall, and F1-Score, as shown in Fig. 2. This experiment identified the best candidate network for each of the VGG16, VGG19, ResNet50, Xception, DenseNet121, and MobileNet CNN models; these include VGG16-PSO-KNN, VGG19-PSO-SVM, MobileNetV2-PSO-SVM, and DenseNet121-PSO-SVM. From Fig. 2, it can be observed that the VGG16, VGG19, and ResNet50-based networks performed better than the other CNN models regardless of the type of TML classifier. The VGG16 model achieved its highest accuracy of 98.61% when cascaded with the KNN classifier, while VGG19 and ResNet50 achieved their highest accuracy of 98.93% and 99.60%, respectively, with the SVM classifier. The DenseNet121 and MobileNetV2 also obtained their best performance with the SVM classifier. Four out of six CNN models achieved the best performance when SVM was incorporated as the classifier of the cascaded network. Moreover, in our ablation study, we have also found that the SVM yielded the best performance in the absence of PSO for most CNN models. This explains the suitability of SVM for the cascaded network. The findings of our experiment suggest that the SVM classifier performs significantly better than the other classifiers when trained with optimally selected features generated by a CNN model. This finding is analogous to the results reported in [36] for artifact classification.
After that, we performed 10-fold cross-validation for the best candidate networks. For this experiment, we used 6,400 images and distributed them randomly to one of the ten groups so that each group contains 67 aloe vera, 101 hibiscus, 100 holy basil, 62 lemongrass, 92 neem, 123 henna, and 96 mint leaf images. We calculated the average accuracy for the candidate networks in the 10-fold cross-validation experiment. It involves dividing the dataset into ten subsets. The networks were trained and tested ten times, each using a different subset of images as the test set and the remaining data as the training set. This experiment provided a robust and reliable estimate of a network’s performance, suitable for selecting the best network for the proposed system. In the cross-validation experiment, the ResNet50-PSO-SVM achieved an average accuracy of 98.39 ± 0.19%, outperforming the other candidate networks, as shown in Table 2. The second highest accuracy was 96.13 ± 0.40% for the VGG19-PSO-SVM network. Therefore, the ResNet50-PSO-SVM was selected for the proposed system. Fig. 2 also showed that ResNet50-PSO-SVM outperformed all the cascaded networks in accuracy, precision, recall, and F1-score for the test data of the hold-out experiment. Additionally, we plotted the confusion matrix for the best candidate networks for the test data. Fig. 3 shows the confusion matrices. Further, we plotted the receiver operating characteristic (ROC) curves for these cascaded networks, as shown in Fig. 4. The average area under the curve (AUC) value was 1, indicating that the network separated the classes almost perfectly. The confusion matrices and the ROC curves also reveal the superiority of the ResNet50-PSO-SVM network over the other cascaded networks.
Finally, we have compared the results of the proposed ResNet50-PSO-SVM network-based method with the existing methods, as shown in Table 3. The proposed method outperformed the existing methods in average accuracy. This method utilized plant images and did not rely on singular leaf images. Moreover, this rapid method can predict the class in 0.17 seconds. Time for other methods was not reported. This table also shows that the cascaded networks achieved higher accuracy when the number of classes is high compared to the other two approaches. Standalone CNN-based methods also achieved high accuracy; however, they required a large number of images for training. Alternatively, TML models such as SVM and RF performed well when trained using selected features derived from fewer images. Moreover, SVM has few parameters compared to the CNN models, making it lightweight. The proposed method utilized pre-trained
FIGURE 2. Evaluation to select the best cascaded network for each CNN.
FIGURE 3. Confusion matrix for the best cascaded network of each CNN model.
ResNet50 for feature extraction that involves no learnable parameters. The SVM-based classifier and the PSO have few parameters; thus, this cascaded network is lightweight compared to the standalone CNN-based methods.
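The role of PSO between the feature extractor and the classifier can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary PSO variant (a sigmoid of the velocity gives the probability of keeping a feature), uses a leave-one-out nearest-centroid classifier as a dependency-free stand-in for the SVM, and replaces the ResNet50 features with hypothetical toy data. All function names and hyperparameters (`n_particles`, `w`, `c1`, `c2`) are illustrative assumptions.

```python
import math
import random


def centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on masked features.

    Stand-in for the SVM of the paper, used only to keep the sketch stdlib-only.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            s = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                s[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            d = sum((X[i][j] - s[c] / counts[label]) ** 2 for c, j in enumerate(cols))
            if d < best_dist:
                best_label, best_dist = label, d
        correct += best_label == y[i]
    return correct / len(X)


def binary_pso_select(X, y, n_particles=8, n_iters=15, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a boolean feature mask that maximizes centroid_accuracy."""
    rng = random.Random(seed)
    n = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(n)] for _ in range(n_particles)]
    pos[0] = [True] * n  # one particle starts from the full feature set
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [centroid_accuracy(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n):
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # Binary position update: keep feature j with probability sigmoid(v).
                pos[i][j] = rng.random() < 1.0 / (1.0 + math.exp(-vel[i][j]))
            fit = centroid_accuracy(X, y, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def make_toy_data(seed=1, n_per_class=15, n_noise=6):
    """Hypothetical data: 2 informative features followed by noisy ones."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            informative = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
            noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


X, y = make_toy_data()
baseline = centroid_accuracy(X, y, [True] * len(X[0]))
mask, fit = binary_pso_select(X, y)
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("accuracy with all features:", baseline, "with selected features:", fit)
```

Seeding one particle with the full feature set means the returned subset never scores below the no-selection baseline on the data used for the search, which mirrors the with-PSO versus without-PSO comparison only in spirit; a faithful reproduction would score candidate subsets with the SVM on held-out folds.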
FIGURE 4. ROC for the best cascaded network of each CNN model.
TABLE 2. The fold-wise accuracy of the best cascaded networks of each CNN model.
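The four evaluation measures used in this section can be derived from a confusion matrix such as those in Fig. 3. The sketch below assumes macro-averaging over classes, which the text does not state explicitly, and uses a hypothetical three-class matrix rather than the paper's results; `macro_metrics` is an illustrative name.

```python
def macro_metrics(cm):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum: predicted as i
        actual = sum(cm[i])                          # row sum: truly class i
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
toy_cm = [
    [9, 1, 0],
    [0, 10, 0],
    [1, 0, 9],
]
metrics = macro_metrics(toy_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Averaging these per-fold values over the cross-validation folds would give fold-wise summaries of the kind listed in Table 2.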
B. ABLATION STUDY FOR PSO
Ablation studies help understand the role of a particular component in a complex system, such as the cascaded architecture of multiple machine learning models. This ablation study investigates the performance of cascaded networks by removing PSO-based feature selection to understand its contribution. For this purpose, we trained and validated each of the 42 cascaded networks without the PSO module using the same dataset. Then, the trained networks were tested using the same dataset. Finally, we compared the performances of these networks in terms of average accuracy, precision, recall, and F1-Score, as provided in Table 4. This table shows that the performance of most of the networks improves in the presence of the PSO-based feature selection. SVM classifier-based networks yielded the best performance for most cascaded networks both with and without PSO. Additionally, we compared the average accuracy and average time requirements for the best cascaded network of each CNN with and without integrating PSO, as shown in Table 5. The time was estimated for 145 unseen test images. This table shows that PSO improved accuracy without significantly increasing the total computational time of the network. This justifies the contribution of the PSO-based feature selection to the cascaded networks and the proposed system. The SVM-based classifiers tend to take more time than the other classifiers, such as KNN and XGB. The XGB classifier showed higher computation speed than KNN and SVM, considering the number of features.
TABLE 5. The impact of PSO in the cascaded network in terms of number of features, accuracy and computation time.

C. FEASIBILITY OF PROPOSED SYSTEM FOR PRACTICAL USE
A practical system enables capturing the image quickly, such as by using smartphones, and provides rapid and robust classification results on the site of plants. In this study, we assessed the feasibility of the proposed system for practical use. For this purpose, firstly, we tested the proposed system on three different smartphones to determine its robustness. We captured two sets of 145 plant images using two other smartphones: Xiaomi Redmi Note 10 Pro (108 megapixels, f/1.9, 0.7 µm) and Samsung Galaxy S10 (12 megapixels, f/1.5-2.4, 1.4 µm). We captured the images for the same trees and then transferred the images to a server using the internet. The proposed method received the images from the server and predicted the output for each set of images captured by the smartphones. Then, the results were sent back to the client’s smartphone. Fig. 5 shows how the proposed system is currently implemented for practical use. The accuracy of the system using the Redmi Note 10 Pro and Samsung Galaxy S10 was 93.89% and 99.90%, respectively. This demonstrated the robustness of the method. This system can detect the class of plants from smartphone-captured plant images taken from a standard distance (1-2 meters).
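The per-device accuracies and batch times reported in this subsection can be combined into cross-device summaries along the following lines. This is only a sketch: the paper does not state how its ± values were computed, so the population standard deviation across the three devices is assumed here, and per-image time is taken as the batch time divided by the 145 images. Averages computed another way (for example per run rather than per device) may differ slightly from the values quoted in the text.

```python
from statistics import mean, pstdev

N_IMAGES = 145  # images classified per device

# Values as reported in this subsection.
accuracy_percent = {
    "Redmi Note 10 Pro": 93.89,
    "Samsung Galaxy S10": 99.90,
    "iPhone 12": 99.60,
}
batch_seconds = {
    "Redmi Note 10 Pro": 35.6,
    "Samsung Galaxy S10": 22.9,
    "iPhone 12": 14.9,
}

# Per-image classification time for each device.
per_image = {device: t / N_IMAGES for device, t in batch_seconds.items()}

# Cross-device mean and population standard deviation (an assumption).
acc_mean = mean(list(accuracy_percent.values()))
acc_std = pstdev(list(accuracy_percent.values()))
time_mean = mean(list(per_image.values()))
time_std = pstdev(list(per_image.values()))

print(f"accuracy: {acc_mean:.2f} +/- {acc_std:.2f} %")
print(f"time per image: {time_mean:.3f} +/- {time_std:.3f} s")
```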
Additionally, this method was tested for smartphone-captured leaf images and was found effective.
Secondly, we estimated the time requirements of the proposed system. The proposed method took 35.6 ± 2.14 and 22.9 ± 5.67 seconds for classifying 145 images captured by the Redmi Note 10 Pro and Samsung Galaxy S10 phones, respectively. The accuracy and the evaluation time were 99.60% and 14.9 ± 0.10 seconds, respectively, for the iPhone 12. The proposed method is robust regardless of the device, as demonstrated by the average accuracy and prediction time of 97.79 ± 2.76% and 0.15 ± 0.05 seconds/image for the three devices. The total turnaround time for the proposed system, from capturing the image to receiving the classification result, was less than five minutes for a single plant, regardless of the smartphone. This is practical considering that existing systems require the images to be processed after leaving the site of plants. In the future, a smartphone application will be developed and installed to perform the classification on the client device and provide results instantly.

V. DISCUSSION
Traditionally, experts identify medicinal plants meticulously to use them for drug development or treatment. Manual identification is prone to inter-observer variability and is contingent upon the availability of experts. Manual assessment lacks reliability because even one incorrect identification can result in significant health problems, financial losses, and even fatalities. If the image is captured in different settings, existing automated methods fail to detect plants. They rely on singular leaf images taken from a short distance, such as one meter. These systems fail when the image is captured using a different imaging device or smartphone. More importantly, users are required to carry the images and later process them on a computer to get the results. Thus, the existing systems lack practical usability.
In this study, we presented an automated medicinal plant classification method that outperformed previous approaches, achieving 99.60% accuracy. This method is designed to predict the class from plant images. Moreover, this system also works if leaf images are provided. The images were captured on the plant’s site and transferred over the internet to get the classification results on the client’s phone in less than five minutes. This study demonstrated the system for two different smartphones not used in the training. Regardless of the imaging devices, the system’s high accuracy and rapid speed ensured its robustness. This system was effective for different smartphones, for singular leaf and plant images, and for providing the classification results on the site in less than five minutes. This ensured the practical use of the system.
The proposed method incorporated a three-stage image pre-assessment, which includes color normalization to compensate for the color variation due to different imaging devices, sharpness evaluation to ensure that the image is not out of focus or blurry, and brightness evaluation to ensure the images are not underexposed. These pre-assessment steps boosted the robustness of the system. In this study, we incorporated PSO for feature selection, which boosted the system’s performance, as found in the experiments. Nonetheless, examining the efficacy of alternative feature selection techniques, such as chi-square, is imperative. This method was designed for only a limited number of medicinal plants, which should be increased in the future. Again, this method was not tested for images captured using other devices, such as digital cameras, or captured by other groups. This is another limitation of this study.

VI. CONCLUSION
In this paper, we proposed an automatic medicinal plant classification method for Bangladesh. This method is highly accurate and practically sound for identifying the class of the plants using smartphone-captured images on the site of plants within minutes. In the future, more plants will be integrated and a standalone mobile application will be developed for the proposed system.
[13] B. S. Anami, S. S. Nandyal, and A. Govardhan, ‘‘A combined color, texture [34] J. Zhou, ‘‘Graph neural networks: A review of methods and applications,’’
and edge features based approach for identification and classification of AI Open, vol. 1, pp. 57–81, Jan. 2020.
Indian medicinal plants,’’ Int. J. Comput. Appl., vol. 6, no. 12, pp. 45–51, [35] X. Zhu, K. Guo, H. Fang, L. Chen, S. Ren, and B. Hu, ‘‘Cross view capture
Sep. 2010. for stereo image super-resolution,’’ IEEE Trans. Multimedia, vol. 24,
[14] B. R. Pushpa, N. Megha, and K. B. Amaljith, ‘‘Comparision and pp. 3074–3086, 2022.
classification of medicinal plant leaf based on texture feature,’’ in Proc. [36] H. M. Shakhawat, T. Nakamura, F. Kimura, Y. Yagi, and M. Yamaguchi,
Int. Conf. Emerg. Technol. (INCET), Jun. 2020, pp. 1–5. ‘‘[Paper] automatic quality evaluation of whole slide images for the
[15] D. Puri, A. Kumar, and J. Virmani, ‘‘Classification of leaves of medicinal practical use of whole slide imaging scanner,’’ ITE Trans. Media Technol.
plants using laws’ texture features,’’ Int. J. Inf. Technol., vol. 14, Appl., vol. 8, no. 4, pp. 252–268, 2020.
pp. 931–942, Aug. 2019. [37] M. S. Hossain, M. G. Hanna, N. Uraoka, T. Nakamura, M. Edelweiss,
[16] D. Venkataraman and N. Mangayarkarasi, ‘‘Support vector machine based E. Brogi, M. R. Hameed, M. Yamaguchi, D. S. Ross, and Y. Yagi,
classification of medicinal plants using leaf features,’’ in Proc. Int. Conf. ‘‘Automatic quantification of HER2 gene amplification in invasive breast
Adv. Comput., Commun. Informat. (ICACCI), Sep. 2017, pp. 793–798. cancer from chromogenic in situ hybridization whole slide images,’’ J.
[17] N. Duong-Trung, L.-D. Quach, M.-H. Nguyen, and C.-N. Nguyen, Med. Imag., vol. 6, no. 4, p. 1, Nov. 2019.
‘‘A combination of transfer learning and deep learning for medicinal plant [38] J. Kennedy and R. Eberhart, ‘‘Particle swarm optimization,’’ in Proc. ICNN
classification,’’ in Proc. 4th Int. Conf. Intell. Inf. Technol., Feb. 2019, Int. Conf. Neural Netw., Nov. 1995, pp. 1942–1948.
pp. 83–90. [39] M. S. Hossain, N. Hasan, M. A. Samad, H. M. Shakhawat, J. Karmoker,
F. Ahmed, K. F. M. N. Fuad, and K. Choi, ‘‘Android ransomware detection
[18] D. B. Valdez, C. J. G. Aliac, and L. S. Feliscuzo, ‘‘Medicinal plant
from traffic analysis using metaheuristic feature selection,’’ IEEE Access,
classification using convolutional neural network and transfer learning,’’
vol. 10, pp. 128754–128763, 2022.
in Proc. IEEE Int. Conf. Artif. Intell. Eng. Technol. (IICAIET), Sep. 2022,
pp. 1–6. [40] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for
large-scale image recognition,’’ 2014, arXiv:1409.1556.
[19] R. Akter and M. I. Hosen, ‘‘CNN-based leaf image classification for
[41] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image
Bangladeshi medicinal plant recognition,’’ in Proc. Emerg. Technol.
recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),
Comput., Commun. Electron. (ETCCE), Dec. 2020, pp. 1–6.
Jun. 2016, pp. 770–778.
[20] M. Musa, M. Arman, M. Hossain, A. Thusar, N. Nisat, and A. Islam, [42] F. Chollet, ‘‘Xception: Deep learning with depthwise separable convo-
‘‘Classification of immunity booster medicinal plants using CNN: A deep lutions,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR),
learning approach,’’ in Proc. Int. Conf. Adv. Comput. Data Sci., Nashik, Jul. 2017, pp. 1800–1807.
India, Apr. 2021, pp. 244–254.
[43] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, ‘‘Densely
[21] A. P. Saikia, P. V. Hmangaihzuala, S. Datta, S. Gope, S. Deb, and connected convolutional networks,’’ in Proc. IEEE Conf. Comput. Vis.
K. R. Singh, ‘‘Medicinal plant species classification using neural network Pattern Recognit. (CVPR), Jul. 2017, pp. 2261–2269.
classifier,’’ in Proc. 6th Int. Conf. Commun. Electron. Syst. (ICCES), [44] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen,
Jul. 2021, pp. 1805–1811. ‘‘MobileNetV2: Inverted residuals and linear bottlenecks,’’ in
[22] M. R. Dileep and P. N. Pournami, ‘‘AyurLeaf: A deep learning approach Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018,
for classification of medicinal plants,’’ in Proc. TENCON IEEE Region 10 pp. 4510–4520.
Conf. (TENCON), Oct. 2019, pp. 321–325.
[23] H. K. Diwedi, A. Misra, and A. K. Tiwari, ‘‘CNN-based medicinal plant
identification and classification using optimized SVM,’’ Multimedia Tools
Appl., vol. 83, no. 11, pp. 33823–33853, Sep. 2023.
[24] A. H. Uddin, Y.-L. Chen, B. Borkatullah, M. S. Khatun, J. Ferdous,
P. Mahmud, J. Yang, C. S. Ku, and L. Y. Por, ‘‘Deep-learning-based
classification of Bangladeshi medicinal plants using neural ensemble
models,’’ Mathematics, vol. 11, no. 16, p. 3504, Aug. 2023.
MD. TAREQUL ISLAM received the B.Sc. degree from the University of Rajshahi, Bangladesh, and the M.Sc. (Engg.) degree in computer science and engineering from Mawlana Bhashani Science and Technology University (MBSTU), Bangladesh, where he is currently pursuing the Ph.D. degree. He is an Assistant Professor with the Computer Science and Engineering Department, Khwaja Yunus Ali University, Bangladesh. His research interests include computer vision, bioinformatics, the IoT, and blockchain.

WAHIDUR RAHMAN received the bachelor’s and master’s degrees in computer science from the Computer Science and Engineering Department, Mawlana Bhashani Science and Technology University. He is currently a Senior Lecturer with the Department of Computer Science and Engineering. His research interests include machine learning, the Internet of Things, and computer vision in agriculture.
MD. SHAKHAWAT HOSSAIN (Member, IEEE) received the Ph.D. degree from Tokyo Institute of Technology, Japan. Later, he was a Research Scientist with the Memorial Sloan Kettering Cancer Center, USA. He also received an offer to join the University of Oxford, U.K., as a Senior Researcher of machine learning in medical imaging. He is currently an Assistant Professor with the Computer Science and Engineering Department, Independent University, Bangladesh. His research interests include using machine learning and whole slide image analysis techniques to gain insight into the treatment of cancer patients. Recently, his work has focused on unraveling the role of the HER2 (human epidermal growth factor receptor 2) gene in the growth of different cancers, such as breast, colon, and gastric. He is a member of the Association of Pathology Informatics, USA.

RAQUEL MARTÍNEZ DIAZ is currently with Universidad Europea del Atlántico, Santander, Spain. She is also affiliated with Universidad Internacional Iberoamericana, Arecibo, PR, USA, and Universidad de La Romana, La Romana, Dominican Republic.

IMRAN ASHRAF received the Ph.D. degree in information and communication engineering from Yeungnam University, Gyeongsan-si, South Korea, in 2018, and the M.S. degree (Hons.) in computer science from Blekinge Institute of Technology, Karlskrona, Sweden, in 2010. He was a Postdoctoral Fellow with Yeungnam University. He is currently an Assistant Professor with the Information and Communication Engineering Department, Yeungnam University. His research interests include positioning using next-generation networks, communication in 5G and beyond, location-based services in wireless communication, smart sensors (LIDAR) for smart cars, and data analytics.