Applsci 13 12504
Article
Deep Learning-Based Method for Classification and Ripeness
Assessment of Fruits and Vegetables
Enoc Tapia-Mendez 1 , Irving A. Cruz-Albarran 1,2 , Saul Tovar-Arriaga 3 and Luis A. Morales-Hernandez 1, *
Abstract: Food waste is a global concern and is the focus of this research. Currently, no method in
the state of the art classifies multiple fruits and vegetables and their level of ripening. The objective
of the study is to design and develop an intelligent system based on deep learning techniques to
classify between types of fruits and vegetables, and also to evaluate the level of ripeness of some of
them. The system consists of two models using the MobileNet V2 architecture. One algorithm is for
the classification of 32 classes of fruits and vegetables, and another is for the determination of the
ripeness of 6 classes of them. The overall intelligent system is the union of the two models, predicting
first the class of fruit or vegetable and then its ripeness. The fruits and vegetables classification
model achieved 97.86% accuracy, 98% precision, 98% recall, and 98% F1-score, while the ripeness
assessment model achieved 100% accuracy, 98% precision, 99% recall, and 99% F1-score. According to
the results, the proposed system is able to classify between types of fruits and vegetables and evaluate
their ripeness. To achieve the best performance indicators, it is necessary to obtain the appropriate
hyperparameters for the artificial intelligence models, in addition to having an extensive database
with well-defined classes.
categorize fresh and rotten apples. Finally, researchers proposed a CNN model that can
accurately classify fresh and rotten fruits, including apples, bananas, and oranges, utilizing
a dataset containing 5989 images [22].
Despite the growing interest in AI for food waste, there is a lack of research on the use
of AI for the detection of fruits, vegetables, and their ripeness stage. The aim of this research
is to design and develop an intelligent system capable of identifying 32 classes of fruits and
vegetables, in addition to assessing the level of ripeness of 6 of the total number of classes
(apples, bananas, mangoes, oranges, potatoes, and tomatoes). Two stages of ripening are
considered: fresh and rotten. The architecture used for both models is MobileNet version 2,
and the transfer learning is performed from the pre-trained weights of the network. The
pre-trained weights are taken from ImageNet, while the input images have a size of
512 × 512 pixels. Additionally, the initial weights of the first layers are frozen, as they do
not need to be trained again. The system is unique in its class, as no other system with
such capabilities has been identified in the state of the art. The proposed hypothesis is
that an intelligent system can be used to develop a model capable of classifying fruits
and vegetables, as well as determining their ripeness. The proposed system has some
limitations; for example, the general methodology uses two AI models, one to recognize
thirty-two categories of fruits and vegetables, and another model to assess the ripeness
of six of the categories. On the other hand, the system is not able to detect damaged,
obstructed or poorly illuminated fruits and vegetables.
2. Theoretical Background
This section presents a theoretical framework for fundamental concepts such as deep
learning, CNNs, metrics utilized in deep learning models, and horticultural maturity stages
utilized in agricultural research. These concepts played an essential role in constructing the
current research.
efficiently down-sample the feature maps, enabling the network to concentrate on the most
pertinent information [24].
Figure 2. Basic operating scheme for fruits, vegetables, and ripeness prediction.
Deep learning models have large data processing capabilities as well as good generalization
when the model is obtained. For this case study, two pre-trained models are used, where the
first predicts the type of fruit or vegetable class, while the second predicts
the corresponding ripening stage. The reason why two models are used and not just one
where all classes could be included is that the purpose of the research is to demonstrate
that through an intelligent system, it is possible to classify and assess the level of ripeness
of fruits and vegetables.
The great advantage of using this type of configuration is that its capabilities can
be increased by easily identifying areas of opportunity through the analysis of training
metrics. Furthermore, both models can be retrained by adding classes to them, depending
on the objective; for example, if the objective is to increase the number of maturity levels
per class, only model number two, which has the objective of detecting maturity levels,
needs to be retrained; the same principle would work for model number one.
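The two-model arrangement described above can be illustrated with a minimal sketch. It assumes that the ripeness model is only consulted when the first model predicts one of the six classes that have ripeness data; this gating rule, the function names, and the stand-in classifiers are hypothetical and only serve to show the control flow.

```python
from typing import Callable, Optional, Tuple

# The six classes that have ripeness data in this study.
RIPENESS_CLASSES = {"apple", "banana", "mango", "orange", "potato", "tomato"}


def predict_produce(
    image,
    classify: Callable[[object], str],
    assess_ripeness: Callable[[object], str],
) -> Tuple[str, Optional[str]]:
    """Run model 1, then model 2 only when the predicted class has ripeness data."""
    produce = classify(image)
    if produce not in RIPENESS_CLASSES:
        return produce, None
    return produce, assess_ripeness(image)


# Hypothetical stand-ins for the two trained networks (not the real models).
def fake_classifier(image):
    return image["label"]


def fake_ripeness(image):
    state = "rotten" if image["spoiled"] else "fresh"
    return state + " " + image["label"]


print(predict_produce({"label": "banana", "spoiled": True}, fake_classifier, fake_ripeness))
print(predict_produce({"label": "kiwi", "spoiled": False}, fake_classifier, fake_ripeness))
```

Because each stage is passed in as a callable, either model can be retrained or swapped without touching the other, which is the property the text highlights.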
3.1. Conceptual Proposal of the System
The system is composed of a general algorithm that performs the classification of fruits
and vegetables, followed by an algorithm that detects the state of ripeness of the fruits
and vegetables mentioned in Table 1. As part of the methodology, Figure 3 describes the
methodology performed [25].
Figure 3. Proposed methodology for fruits, vegetables, and ripeness detection.
Figure 3 outlines the approach used to develop the AI system. The initial phase
involves collecting a dataset that is appropriate for the given task. Once the data has been
collected, it must be prepared for the training model by formatting it into a dataset. Before
starting to train, it is important to analyze the dataset to make sure that it is not biased or
limited. This will ensure that the system is trained on high-quality data, reducing the
likelihood of biased or inaccurate results. The next step is to train the model by feeding
the data into the chosen deep learning method and allowing the algorithm to learn a
model capable of predicting new data. Finally, the model is tested on a retained dataset to
evaluate its performance. This process identifies areas in the system that need
improvement. Once the model is performing well, it can be integrated into production to
begin predicting new data.
3.2. Recollect
For any AI algorithm, and especially for deep learning, it is necessary to have input
data to feed the CNN. In this step, depending on the problem, the database to be used is
searched or generated. The first dataset used was obtained from Kaggle and contains 36
types of fruits and vegetables [26], but for research purposes, it was reduced to 32 classes,
where each class contains 105 images, with a total of 3360 images. In addition, other
datasets obtained from Kaggle were used, which contain the ripening states depending
on the type of fruit or vegetable [27,28]. Table 1 shows the characteristics of the datasets
used for the ripening process.
Table 1. Ripening state datasets.
Fruit/Vegetable    Ripeness States     Number of Images
Apple              Fresh and rotten    561
Banana             Fresh and rotten    524
Mango              Fresh and rotten    1198
Orange             Fresh and rotten    428
Potato             Fresh and rotten    1200
Tomato             Fresh and rotten    1189
It should be noted that there is an imbalance between the classes due to the number of
images contained in each class. Therefore, data augmentation (synthetic data) was used to
balance the classes; the process is explained in Section 3.3. Also, the database for the
ripening states was built by combining several source datasets.
Appl. Sci. 2023, 13, 12504 7 of 17
3.3. Prepare
The database for classifying fruits and vegetables has 32 classes. For the selection of
the ripeness classes, a study was carried out to determine which of the most consumed
fruits and vegetables in the country would be suitable for ripeness detection. In addition, a
data augmentation was implemented to homogenize all the classes in order to obtain the
correct balance between them. The configuration used was a 30° clockwise rotation of the
images and a 10% zoom of the original image. This resulted in three times the number of
images per class.
3.4. Analyze
In the fruit and vegetable classification dataset, images that do not have representative
characteristics or contain some other types of information were removed, which
significantly improved the model’s performance. The analytical method used is the
80-10-10 method, where 80% of the data is used for training, 10% for testing, and 10% for
validation. That is, for classification, out of the total number of images, 2688 images were
used for training, 168 for testing, and 168 for validation. On the other hand, for ripeness,
4080 images were used for training, 255 for testing, and 255 for validation. In addition, data
augmentation is performed for each of the domains.
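As a worked check of the split sizes, the sketch below computes the subset counts for a given total and set of percentages, and then cuts a shuffled list accordingly. The helper names and the fixed seed are illustrative assumptions. With the 3360 classification images and the 5100 ripeness images (the sum of Table 1), an 80% share gives 2688 and 4080 images, matching the text; the figures of 168 and 255 quoted above equal a 5% share of those totals, whereas a 10% share would be 336 and 510, so the percentages are left as parameters.

```python
import random


def split_counts(total, train_pct, val_pct, test_pct):
    """Return (train, validation, test) sizes from integer percentages."""
    train = total * train_pct // 100
    val = total * val_pct // 100
    test = total * test_pct // 100
    return train, val, test


def split_items(items, train_pct, val_pct, test_pct, seed=0):
    """Shuffle a copy of `items` and cut it into three disjoint subsets."""
    n_train, n_val, n_test = split_counts(len(items), train_pct, val_pct, test_pct)
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


print(split_counts(3360, 80, 10, 10))  # (2688, 336, 336)
print(split_counts(3360, 80, 5, 5))    # (2688, 168, 168)
print(split_counts(5100, 80, 5, 5))    # (4080, 255, 255)
```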
To run the models, ablation tests were used to understand the contribution of each
component of the model. These components are the hyperparameters, such as the optimizer,
the number of epochs, and the architecture. All the tests are shown in the Results section.
To perform these tests, first a model was trained with a configuration, then the metrics were
analyzed to decide which hyperparameter had to be changed to obtain better performance.
After identifying the components that could be changed or removed, the model was trained
again to finally evaluate the new metrics. This process was repeated several times until the
performance was better than the previous one.
3.5. Train
The fruit and vegetable classifier was trained via transfer learning from the pre-trained
InceptionV3 and MobileNetV2 models. Hyperparameters were modified until the highest
accuracy was reached during testing. The initial full convolution layer of MobileNetV2
has 32 filters, followed by 19 residual bottleneck layers. ReLU6 is used as the nonlinearity
due to its robustness with low-precision computations. The kernel size is always 3 × 3, as
is standard for modern networks, and dropout and batch normalization are applied during
training. The expansion rate is kept constant throughout the network, except for the first
layer. According to experiments, maintaining expansion rates
between 5 and 10 results in nearly identical performance curves. Smaller networks perform
better with slightly smaller expansion rates, whereas larger networks perform slightly
better with larger expansion rates.
For all primary experiments, we utilize an expansion factor of 6, which is applied
to the size of the input tensor. As an illustration, suppose a bottleneck layer receives a
64-channel input tensor and produces a 128-channel tensor. In that case, the intermediate
expansion layer would have 384 channels, as 64 × 6 = 384 [29].
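The arithmetic of the example above can be written as a short sketch. It assumes the usual three-step structure of a MobileNetV2 bottleneck block (1 × 1 expansion, 3 × 3 depthwise convolution, 1 × 1 linear projection); the function name is hypothetical.

```python
def bottleneck_channels(in_channels: int, out_channels: int, expansion: int = 6):
    """Channel widths (input, output) of each step of one bottleneck block."""
    expanded = in_channels * expansion
    return {
        "expand_1x1": (in_channels, expanded),    # pointwise expansion
        "depthwise_3x3": (expanded, expanded),    # per-channel spatial filtering
        "project_1x1": (expanded, out_channels),  # linear projection
    }


widths = bottleneck_channels(64, 128)
print(widths["expand_1x1"])   # (64, 384)
print(widths["project_1x1"])  # (384, 128)
```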
The outputs of the general architecture shown in Figure 4 can be apple, banana, beets,
bell pepper, cabbage, carrot, cauliflower, chili pepper, corn, cucumber, eggplant, garlic,
ginger, grapes, jalapeno pepper, kiwi, lemon, lettuce, mango, onion, orange, pear, peas,
pineapple, pomegranate, potato, radish, soybean, spinach, sweet potato, tomato, watermelon,
fresh mango, fresh potato, fresh tomato, rotten mango, rotten potato, rotten tomato,
fresh apple, fresh banana, fresh orange, rotten apple, rotten banana, and rotten orange.
For the fruits and vegetables classification model, the output layers are modified in
order to determine the correct result, since when performing transfer learning training, the
last layers and the output layer must be adapted to the nature of the problem to be solved.
The configurations made resulted in adding two previous dense layers to the output with
the ReLU activation function and a dense layer as output with the number of existing classes
with the Softmax activation function, resulting in the final training model.
Figure 4. General CNN architecture used for models 1 and 2.
The model for determining the state of ripeness has a configuration similar to that of the
fruit and vegetable classification model: it keeps the transfer learning from the weights of
MobileNetV2 but modifies the last layers and the output. The antepenultimate layer is a
batch normalization layer, followed by a penultimate layer with an l2 kernel regularizer, an
l1 activation regularizer, and an l1 bias regularizer, using the ReLU activation function.
Before the output layer, a dropout layer turns off a fraction of the neurons to help train the
network. Finally, a dense output layer is presented with the number of classes, which in this
case is 12, since the 6 classes of fruits and vegetables each have 2 states of ripeness.
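The ripeness head described above could be expressed in Keras roughly as follows. This is a sketch under stated assumptions, not the configuration used in the study: the global average pooling step, the 256 hidden units, the 0.3 dropout rate, the default regularization factors implied by the string identifiers, and the choice to freeze the whole MobileNetV2 base are all illustrative, and the names `head_spec` and `build_ripeness_model` are hypothetical.

```python
def head_spec(num_classes=12, hidden_units=256, dropout_rate=0.3):
    """Layers stacked on top of the MobileNetV2 base, in order (name, kwargs)."""
    return [
        ("GlobalAveragePooling2D", {}),  # assumption: pool feature maps to a vector
        ("BatchNormalization", {}),      # antepenultimate layer
        ("Dense", {                      # penultimate layer with three regularizers
            "units": hidden_units,       # assumed width, not given in the text
            "activation": "relu",
            "kernel_regularizer": "l2",
            "activity_regularizer": "l1",
            "bias_regularizer": "l1",
        }),
        ("Dropout", {"rate": dropout_rate}),  # assumed rate
        ("Dense", {"units": num_classes, "activation": "softmax"}),
    ]


def build_ripeness_model(num_classes=12, input_size=512, weights="imagenet"):
    """Assemble the base network and the head described by `head_spec`."""
    import tensorflow as tf  # imported here so the spec can be inspected without TensorFlow

    base = tf.keras.applications.MobileNetV2(
        input_shape=(input_size, input_size, 3),
        include_top=False,
        weights=weights,
    )
    base.trainable = False  # assumption: the whole pre-trained base is frozen
    model = tf.keras.Sequential([base])
    for layer_name, kwargs in head_spec(num_classes):
        model.add(getattr(tf.keras.layers, layer_name)(**kwargs))
    return model
```

Calling `build_ripeness_model()` would return a model with a 12-way Softmax output; passing `num_classes=32` would give the shape of output needed by the classification model, although its head is described differently in the text.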
3.6. Test
This stage corresponds to the testing of the already trained models with the weights
obtained during this phase. The training, test, and validation data are all used to
demonstrate that the model was able to learn without overfitting or underfitting; in
addition, the separate validation split makes it possible to verify the robustness of the
model. For both algorithms developed, the metrics of precision, recall, F1-score, and
support were used to perform the evaluation. The equations of accuracy, precision, recall,
and F1-score are previously mentioned as (1), (2), (3), and (4).
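For reference, the four metrics can be computed from a confusion matrix as in the sketch below. It assumes the standard definitions of accuracy, precision, recall, and F1-score and uses macro-averaging over classes; the averaging mode and the toy counts are assumptions made for illustration, not values from the study.

```python
def classification_metrics(confusion):
    """Accuracy plus macro-averaged precision, recall and F1.

    `confusion[i][j]` counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # TP + FP
        actual_k = sum(confusion[k])                           # TP + FN
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy two-class matrix with made-up counts: rows are true labels, columns predictions.
toy = [[9, 1],
       [2, 8]]
metrics = classification_metrics(toy)
print(metrics)
```

The support mentioned in the text is simply the row sum of each class, that is, `sum(confusion[k])`.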
3.7. Use
It is necessary to identify if the design corresponds to an application, a prototype, a
part of a platform or if it must be executed from a specific framework. In this case, the
design corresponds to the platform that will be used in the future to build an API and a
functional physical prototype. Trained models need to be called from where they are
stored and executed with the Python programming language. The results of this part of the
methodology are presented in Section 4.
4. Results
As part of building a robust intelligent system capable of detecting fruits, vegetables,
and their stage of ripeness, the first model had a single dataset that included the 32
classes of fruits and vegetables, as well as the 12 classes of fresh and rotten stages,
obtaining a dataset of 44 classes. The performance obtained with this configuration is
shown in Table 2, which reports the number of classes (No. classes), architecture,
optimizer (Opt.), epochs, execution time (Exec. time), accuracy (Acc.), precision (P),
recall (R), and F1-score (F1).
Table 2. Model results for fruit and vegetable classification and ripeness.
No. Classes    Architecture    Opt.    Epochs    Exec. Time    Acc.      P      R      F1
44             MobileNetV2     Adam    7         40.37 min.    87.10%    89%    87%    95%
As shown in Table 2, the metrics obtained are good, but they could be improved
by making some changes to the model. According to the results obtained, the decision
taken to obtain a better performance was to generate two models. The first model contains
only the fruit and vegetable classes, and the second model contains the fresh and rotten
classes. This decision was made to allow more freedom to change the hyperparameters and
obtain better results, and to make it easier to add more classes to the models in future
work. The results of this decision are presented in Sections 4.1 and 4.2.
The hyperparameters for the deep learning model were adjusted using a grid search to
identify the best parameters for assessing horticultural maturity. Grid search is a commonly
used method for hyperparameter tuning, which involves training the model with numerous
hyperparameter values and then selecting the model that provides the optimal performance
on a held-out validation set.
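The grid search procedure described above can be sketched in a few lines. The search space and the scoring function below are hypothetical placeholders: in practice, the scoring function would train the model with the given hyperparameters and return its performance on the held-out validation set.

```python
from itertools import product


def grid_search(grid, evaluate):
    """Try every combination in `grid` and keep the one with the best score."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; the values are illustrative, not those of the study.
grid = {
    "optimizer": ["adam", "sgd"],
    "epochs": [5, 10],
    "learning_rate": [1e-3, 1e-4],
}


def fake_validation_accuracy(params):
    """Stand-in for 'train the model, then score it on the validation set'."""
    score = 0.90
    score += 0.03 if params["optimizer"] == "adam" else 0.0
    score += 0.02 if params["epochs"] == 10 else 0.0
    score += 0.01 if params["learning_rate"] == 1e-3 else 0.0
    return score


best_params, best_score = grid_search(grid, fake_validation_accuracy)
print(best_params)
```

The cost grows with the product of the number of values per hyperparameter (here 2 × 2 × 2 = 8 training runs), which is why the grid is usually kept small.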
Table 3 shows that the accuracy increased over the successive training runs because the
hyperparameters were changed according to the results of the previous run. Compared with
the first training with InceptionV3, the training time is significantly reduced, due
to the configuration presented by the model.
As can be seen in Figure 5, both the accuracy and loss curves show that there is no
overfitting or underfitting, as they keep rising and falling, respectively. At this stage, an
early stopping condition was implemented so that training stops if the accuracy and loss
do not improve, which prevents these phenomena from occurring.
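An early stopping rule of this kind can be sketched as follows. For brevity, the sketch monitors only the validation loss, whereas the text mentions both accuracy and loss; the patience of three epochs and the loss values are hypothetical.

```python
class EarlyStopping:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Made-up validation losses: improvement stalls after the third epoch.
losses = [0.90, 0.60, 0.45, 0.46, 0.47, 0.48, 0.30]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(stopped_at)  # 6
```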
Figure 5. Accuracy and loss curves of the best results for fruit and vegetable classification:
(a) accuracy; (b) loss.
4.2. Fruit and Vegetable Stage of Ripeness
For the second model, which represents the detection of the stage of ripeness, transfer
learning was also performed using the weights of MobileNetV2 and InceptionV2. In this
case, three experiments were performed because the final metrics were acceptable. Table
4 shows the results and some hyperparameters used for training.
Analyzing the results obtained, it is observed that the second training with MobileNetV2
improves accuracy only by tenths, while its execution time was much higher than that of
the training with InceptionNetV2. Also, Figure 6 shows no overfitting or underfitting.
Figure 6. Accuracy and loss curves of the best results for fruit and vegetable ripeness:
(a) accuracy; (b) loss.
Figure 7 shows the confusion matrices for both model 1 (a) and model 2 (b). For the
case of (a), a high index of good classification is shown for each of the classes of fruits and
vegetables, indicating that 26 of the classes are correctly predicted, and the rest of them are
above 83% of the correct classification of the total of the samples for each class. For the
specific case of (b), 5 classes show a classification rate of 100%, while the rest are above
93%.
Figure 7. Confusion matrix: (a) classification of fruits and vegetables; (b) stage of ripeness.
4.3. Inference
Inference is a powerful tool that allows machines to reason and learn from experience.
It is a critical process in many AI applications and is one of the key ingredients that makes
AI possible. In computer vision or deep learning, inference is used to identify objects in
images and videos. For example, a machine can use inference to identify a person in a
crowd, recognize a traffic sign, or in the case of this study, identify fruits and vegetables
and their ripeness. Figure 8 shows some of the results obtained by running inference on
data with the models already trained. In the upper part, the recognition of some fruits and
vegetables and their state of ripeness is observed, while in the lower part, only the
recognition of the fruit or vegetable is shown. This is because the CNN has been trained to
detect only the ripeness of some fruits and vegetables.
Figure 8. Results obtained with inference.
4.4. Metrics Obtained
In the case of fruit and vegetable ripeness detection, it is important to consider the
four metrics of accuracy, precision, recall, and F1-score when evaluating the performance
of the model. The metrics obtained in models 1 and 2 are shown in Table 5.
Table 5. Metrics of the best models.
Model                                      Architecture    Acc.      P      R      F1
Classification of fruits and vegetables    MobileNetV2     97.86%    98%    98%    98%
Ripeness stage                             MobileNetV2     100%      98%    99%    99%
The metrics obtained for fruit, vegetable, and ripeness classification show that the
performance of both models is good. To evaluate the performance of the model, it is
necessary to compare each of the metrics shown in Table 4. Comparing the metrics of the
fruit and vegetable classification model, it can be seen that there is no large difference
between the four metrics, which shows that the model generalized very well and is able to
perform its main function of detecting fruits and vegetables. The same holds for the
ripeness stage detection model.
5. Discussion
The results show that good training is performed, as indicated by the reported metrics
such as accuracy and loss, but there is room for improvement that can be generated from
the input data. Solutions include increasing the number of images per class, which can help
the model better understand the different variations in each category and be more resilient
to noise and outliers. It may also be useful to have a more diverse dataset consisting of
images from a wider range of sources with different backgrounds, lighting conditions, and
other factors. Another alternative would be to use data augmentation, where new training
samples can be generated by transformations such as cropping, flipping, or rotating. The
graphs shown do not show undertraining or overtraining, as the curves were generated
from the training and validation samples. The data was split 80-10-10, with 80% for training,
10% for validation, and 10% for testing. It should be noted that, because the architecture
is an image classifier, the algorithm cannot identify more than one object per inference;
this can be solved by replacing the network with an object detection architecture. In
addition, more outputs could be added, such as a wider range of ripening stages and
ripening states for the fruits and vegetables that currently lack them.
Table 6 shows comparisons related to the classification of fruits and vegetables as
well as the determination of ripeness. According to the table, there is a better performance
compared to the authors mentioned. It should be noted that the system in general consists
of two algorithms, but they were trained separately and therefore two different metrics
are reported.
Table 6. Comparative analysis of different methodologies applied for classification and ripeness of
fruits and vegetables.
As can be seen, reference [38] reports a higher accuracy than ours, 99.79%. One reason
is that our model has 32 classes and its input images have a size of 512 × 512 pixels. In
the reference mentioned above, the input images are 100 × 100 pixels in size, which helps
to increase the accuracy but probably makes the model less reliable at inference.
According to the results and comparison with other methods, the proposed system
could be applied in various fields, such as agriculture, where farmers can use this system to
identify the ideal harvesting time, which can improve yields and minimize waste. Similarly,
food processors can use this system to ensure that their products are processed at the
optimal ripening time. In the retail sector, the system has the potential to improve quality
and safety by identifying and discarding overripe produce and allowing customers to select
the best-conditioned fruits and vegetables from store shelves, thereby reducing food waste.
On the other hand, the proposed methodology could serve as a basis for other types of
applications. In that case, it would be necessary to evaluate whether the models proposed in
this work are the ones that perform best for the problem at hand, to obtain the best
hyperparameters, and to have a robust database for training, validation, and testing.
Despite the results, there are some limitations. Only some classes of fruits and vegetables
have ripening data, so the models could be improved by adding ripening states for the
remaining classes. This constraint is common in AI models, as gathering data for all classes
can be challenging and costly; as a result, the model may not accurately recognize the ripeness
of fruits and vegetables excluded from the dataset. Performance might also be improved by
tuning the hyperparameters, although the gain depends on the nature of the problem; the tuning
could be done with optimization algorithms such as genetic algorithms (GA) and swarm
intelligence (SI). In addition, the general methodology uses two AI models, one to recognize
thirty-two categories of fruits and vegetables and another to assess the ripeness of six of
those categories. One improvement would be to unify the classification of fruits and vegetables
and the identification of the ripeness stage in a single model. Furthermore, the system is not
able to handle damaged, occluded, or poorly illuminated fruits and vegetables. Finally, this
work can only detect one fruit or vegetable per inference, so a system that detects more than
one element per inference is needed. Future research will take these points into account in
order to create a more robust system.
As part of future work, real-time inference capabilities could be integrated, which would
enable the model to be used in real-world applications, including sorting and grading on farms
and quality control in stores. Additionally, incorporating sensors such as hyperspectral and
near-infrared cameras could enhance the model’s performance, since these devices can provide
additional information on the chemical composition and structural characteristics of fruits
and vegetables.
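To make the suggestion of tuning hyperparameters with a genetic algorithm more concrete, the following is a minimal sketch and not the authors' procedure. The search space, the `toy_fitness` function (a stand-in for validation accuracy that a real run would obtain by training a model), and all parameter values are hypothetical.

```python
import random

# Hypothetical search space; the paper does not list the values it explored.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": [16, 32, 64],
    "dropout": [0.1, 0.2, 0.3, 0.5],
}

def toy_fitness(params):
    """Stand-in for validation accuracy; a real run would train a model here."""
    score = 1.0
    score -= abs(params["learning_rate"] - 1e-3) * 100
    score -= abs(params["batch_size"] - 32) / 100
    score -= abs(params["dropout"] - 0.2)
    return score

def random_individual(rng):
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one of the two parents.
    return {name: rng.choice([a[name], b[name]]) for name in SEARCH_SPACE}

def mutate(individual, rng, rate=0.2):
    # With probability `rate`, resample a gene from the search space.
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }

def genetic_search(fitness, generations=10, pop_size=8, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rng))
        population = parents + children  # parents survive unchanged (elitism)
    return max(population, key=fitness), history

best, history = genetic_search(toy_fitness)
print(best, history[-1])
```

Because the parents are carried over unchanged, the best fitness never decreases between generations. In practice each fitness evaluation is a full training run, so the population size and number of generations are limited by the available compute.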
6. Conclusions
Food waste is a persistent problem that current technologies, such as AI, can help to address.
Awareness must come from everyone, and everyone should try to contribute to reducing the
current percentages of food waste worldwide. The creation of models such as the one presented
here contributes to the solution. High metrics are reported for both algorithms that make up
the overall system, which suggests that the system is robust. For the classification of fruits
and vegetables, the results show 97.86% accuracy, 98% precision, 98% recall, and 98% F1-score;
for the ripeness assessment, they show 100% accuracy, 98% precision, 99% recall, and 99%
F1-score. To the best of our knowledge, there is no other model capable of detecting 32 types
of fruits and vegetables and assessing the ripeness of some of them. The use of transfer
learning to train neural network models with many classes is advantageous because it yields
high metrics and reduces computational cost. Deep learning approaches to problems like the one
addressed here generally use a single model; in this case, the main contribution is the design
and development of a general intelligent system based on two Convolutional Neural Networks
that are trained separately and then combined in a single algorithm to generate the outputs
shown in this paper.
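The way the two separately trained models are combined can be sketched as a two-stage inference function. This is an illustration under assumptions: the class names in `RIPENESS_CLASSES`, the stub models, and the dictionary used as an "image" are hypothetical placeholders, and in practice the two callables would wrap the trained MobileNet V2 networks.

```python
# Hypothetical subset of classes with ripeness labels; the paper reports six
# such classes, but they are not listed in this section.
RIPENESS_CLASSES = {"banana", "tomato"}

def two_stage_predict(image, classify, assess_ripeness):
    """Predict the class first, then the ripeness only where it is defined."""
    label = classify(image)
    if label in RIPENESS_CLASSES:
        return label, assess_ripeness(image)
    return label, None  # no ripeness model output for this class

# Stub models standing in for the two trained networks.
def classify_stub(image):
    return image["kind"]

def ripeness_stub(image):
    return image["stage"]

print(two_stage_predict({"kind": "banana", "stage": "ripe"},
                        classify_stub, ripeness_stub))
print(two_stage_predict({"kind": "carrot", "stage": "ripe"},
                        classify_stub, ripeness_stub))
```

Keeping the two stages behind separate callables means either model can be retrained or replaced without touching the other, at the cost of running two networks for the classes that have ripeness labels.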
References
1. Kyriacou, M.C.; Rouphael, Y. Towards a new definition of quality for fresh fruits and vegetables. Sci. Hortic. 2018, 234, 463–469.
[CrossRef]
2. Moreno, J.L.; Tran, T.; Cantero-Tubilla, B.; López-López, K.; Becerra López Lavalle, L.A.; Dufour, D. Physicochemical and
physiological changes during the ripening of Banana (Musaceae) fruit grown in Colombia. Int. J. Food Sci. Technol. 2021, 56,
1171–1183. [CrossRef] [PubMed]
3. Maduwanthi, S.D.T.; Marapana, R. Biochemical changes during ripening of banana: A review. Int. J. Food Sci. Nutr. 2017, 2,
166–169.
4. Bhargava, A.; Bansal, A. Fruits and vegetables quality evaluation using computer vision: A review. J. King Saud Univ. Comput. Inf.
Sci. 2021, 33, 243–257. [CrossRef]
5. Kurtulmus, F.; Lee, W.S.; Vardar, A. Immature peach detection in colour images acquired in natural illumination conditions using
statistical classifiers and neural network. Precis. Agric. 2014, 15, 57–79. [CrossRef]
6. Gongal, A.; Amatya, S.; Karkee, M.; Zhang, Q.; Lewis, K. Sensors and systems for fruit detection and localization: A review.
Comput. Electron. Agric. 2015, 116, 8–19. [CrossRef]
7. Vignati, S.; Tugnolo, A.; Giovenzana, V.; Pampuri, A.; Casson, A.; Guidetti, R.; Beghi, R. Hyperspectral Imaging for Fresh-Cut
Fruit and Vegetable Quality Assessment: Basic Concepts and Applications. Appl. Sci. 2023, 13, 9740. [CrossRef]
8. Lorente, D.; Aleixos, N.; Gómez-Sanchis, J.; Cubero, S.; García-Navarrete, O.L.; Blasco, J. Recent Advances and Applications of
Hyperspectral Imaging for Fruit and Vegetable Quality Assessment. Food Bioprocess Technol. 2012, 5, 1121–1142. [CrossRef]
9. Song, Z.; Fu, L.; Wu, J.; Liu, Z.; Li, R.; Cui, Y. Kiwifruit detection in field images using Faster R-CNN with VGG16.
IFAC-PapersOnLine 2019, 52, 76–81. [CrossRef]
10. Wan, S.; Goudos, S. Faster R-CNN for multi-class fruit detection using a robotic vision system. Comput. Netw. 2020, 168, 107036.
[CrossRef]
11. Chu, P.; Li, Z.; Lammers, K.; Lu, R.; Liu, X. Deep learning-based apple detection using a suppression mask R-CNN. Pattern
Recognit. Lett. 2021, 147, 206–211. [CrossRef]
12. Sun, M.; Xu, L.; Luo, R.; Lu, Y.; Jia, W. GHFormer-Net: Towards more accurate small green apple/begonia fruit detection in the
nighttime. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 4421–4432. [CrossRef]
13. Przybył, K.; Gawrysiak-Witulska, M.; Bielska, P.; Rusinek, R.; Gancarz, M.; Dobrzański, B., Jr.; Siger, A. Application of Machine
Learning to Assess the Quality of Food Products—Case Study: Coffee Bean. Appl. Sci. 2023, 13, 10786. [CrossRef]
14. Cubero, S.; Aleixos, N.; Moltó, E.; Gómez-Sanchis, J.; Blasco, J. Advances in Machine Vision Applications for Automatic Inspection
and Quality Evaluation of Fruits and Vegetables. Food Bioprocess Technol. 2011, 4, 487–504. [CrossRef]
15. Ismail, N.; Malik, O.A. Real-time visual inspection system for grading fruits using computer vision and deep learning techniques.
Inf. Process. Agric. 2021, 9, 24–37. [CrossRef]
16. Fahad, L.G.; Tahir, S.F.; Rasheed, U.; Saqib, H.; Hassan, M.; Alquhayz, H. Fruits and Vegetables Freshness Categorization Using
Deep Learning. Comput. Mater. Contin. 2022, 71, 5083–5098. [CrossRef]
17. Roy, K.; Chaudhuri, S.S.; Pramanik, S. Deep learning based real-time Industrial framework for rotten and fresh fruit detection
using semantic segmentation. Microsyst. Technol. 2021, 27, 3365–3375. [CrossRef]
18. Ananthanarayana, T.; Ptucha, R.; Kelly, S.C. Deep Learning based Fruit Freshness Classification and Detection with CMOS Image
sensors and Edge processors. IS&T Int. Symp. Electron. Imaging Sci. Technol. 2020, 2020, 172-1–172-7. [CrossRef]
19. Chen, M.C.; Cheng, Y.T.; Liu, C.Y. Implementation of a Fruit Quality Classification Application Using an Artificial Intelligence
Algorithm. Sens. Mater. 2022, 34, 151–162. [CrossRef]
20. Zargham, A.; Haq, I.U.; Alshloul, T.; Riaz, S.; Husnain, G.; Assam, M.; Ghadi, Y.Y.; Mohamed, H.G. Revolutionizing Small-Scale
Retail: Introducing an Intelligent IoT-based Scale for Efficient Fruits and Vegetables Shops. Appl. Sci. 2023, 13, 8092. [CrossRef]
21. Fan, S.; Li, J.; Zhang, Y.; Tian, X.; Wang, Q.; He, X.; Zhang, C.; Huang, W. On line detection of defective apples using computer
vision system combined with deep learning methods. J. Food Eng. 2020, 286, 110102. [CrossRef]
22. Bhargava, A.; Bansal, A. Classification and Grading of Multiple Varieties of Apple Fruit. Food Anal. Methods 2021, 14, 1359–1368.
[CrossRef]
23. Zhang, Y.; Gao, J.; Zhou, H. Breeds Classification with Deep Convolutional Neural Network. In Proceedings of the 2020 12th
International Conference on Machine Learning and Computing, Shenzhen, China, 15–17 February 2020; pp. 145–151. [CrossRef]
24. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [CrossRef]
25. Fernández, M.A.A. Inteligencia Artificial Para Programadores Con Prisa, 2nd ed.; Universo de letras: Sevilla, Spain, 2021.
26. Zhang, E. Fruit Classification. Kaggle. Available online: https://www.kaggle.com/datasets/sshikamaru/fruit-recognition
(accessed on 1 March 2022).
27. Reddy, S. Fruits Fresh and Rotten for Classification. Kaggle. Available online:
https://www.kaggle.com/datasets/sriramr/fruits-fresh-and-rotten-for-classification (accessed on 1 March 2022).
28. Mukhiddinov, M. Fruits and Vegetables Dataset. Kaggle. Available online:
https://www.kaggle.com/datasets/muhriddinmuxiddinov/fruits-and-vegetables-dataset (accessed on 1 March 2022).
29. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In
Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June
2018; pp. 4510–4520. [CrossRef]
30. Unay, D.; Gosselin, B. Automatic defect segmentation of ‘Jonagold’ apples on multi-spectral images: A comparative study.
Postharvest Biol. Technol. 2006, 42, 271–279. [CrossRef]
31. Zhu, B.; Jiang, L.; Luo, Y.; Tao, Y. Gabor feature-based apple quality inspection using kernel principal component analysis. J. Food
Eng. 2007, 81, 741–749. [CrossRef]
32. Ranjit, K.N.; Raghunandan, K.S.; Naveen, C.; Chethan, H.K.; Sunil, C. Deep Features Based Approach for Fruit Disease Detection
and Classification. Int. J. Comput. Sci. Eng. 2019, 7, 810–817. [CrossRef]
33. Zhang, Y.D.; Dong, Z.; Chen, X.; Jia, W.; Du, S.; Muhammad, K.; Wang, S.H. Image based fruit category classification by 13-layer
deep convolutional neural network and data augmentation. Multimed. Tools Appl. 2019, 78, 3613–3632. [CrossRef]
34. Dubey, S.R.; Jalal, A.S. Apple disease classification using color, texture and shape features from images. Signal Image Video Process.
2016, 10, 819–826. [CrossRef]
35. Kim, D.G.; Burks, T.F.; Qin, J.; Bulanon, D.M. Classification of grapefruit peel diseases using color texture feature analysis. Int.
J. Agric. Biol. Eng. 2009, 2, 41–50. [CrossRef]
36. Arakeri, M.P.; Lakshmana. Computer Vision Based Fruit Grading System for Quality Evaluation of Tomato in Agriculture industry.
Procedia Comput. Sci. 2016, 79, 426–433. [CrossRef]
37. Jahanbakhshi, A.; Momeny, M.; Mahmoudi, M.; Zhang, Y.D. Classification of sour lemons based on apparent defects using
stochastic pooling mechanism in deep convolutional neural networks. Sci. Hortic. 2020, 263, 109133. [CrossRef]
38. Sakib, S.; Ashrafi, Z.; Siddique, M.A.B. Implementation of Fruits Recognition Classifier Using Convolutional Neural Network
Algorithm for Observation of Accuracies for Various Hidden Layers. arXiv 2019, arXiv:1904.00783.
39. Hossain, M.S.; Al-Hammadi, M.; Muhammad, G. Automatic Fruit Classification Using Deep Learning for Industrial Applications.
IEEE Trans. Ind. Inform. 2019, 15, 1027–1034. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.