1 s2.0 S2772671124002419 Main (Asp)
ARTICLE INFO

Keywords: Diabetes mellitus; Glucose; Diabetic symptoms; Type 2 diabetes mellitus; Deep learning; Diabetes management

ABSTRACT

Nowadays, Diabetes Mellitus is one of the significant health challenges that affects many people across the world. Early detection of Diabetes Mellitus will help in preventing complications such as kidney disease, nerve damage and eye damage. Over the past few years, several Machine Learning and Deep Learning techniques have been applied for the early detection of Diabetes Mellitus. This paper reviews various Machine Learning and Deep Learning techniques applied for early detection of Diabetes Mellitus. The review criteria mainly focus on five topics: the diabetes dataset, methods used, performance metrics, limitations of the work, and the overall status of diabetic research. The objective of this paper is to provide a comprehensive review of Diabetes Mellitus prediction techniques applying Machine Learning and Deep Learning that will be a helpful source for researchers in the healthcare field.
* Corresponding author.
E-mail address: [email protected] (N. Katiyar).
https://doi.org/10.1016/j.prime.2024.100661
Received 12 January 2024; Received in revised form 31 May 2024; Accepted 20 June 2024
Available online 28 June 2024
2772-6711/© 2024 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
N. Katiyar et al. e-Prime - Advances in Electrical Engineering, Electronics and Energy 9 (2024) 100661
like increased thirst and more frequent urination, as studied by Suyanto et al. [12]. In this type of diabetes, doctors recommend a monthly check-up.

3. Types of machine learning techniques

Machine Learning algorithms are categorized into four groups, as shown in Fig. 2.

1) Supervised learning, which operates on labelled data.
2) Unsupervised learning, which works with unlabeled data.
3) Semi-Supervised learning, which utilizes both labelled and unlabeled data.
4) Reinforcement learning, which operates based on feedback systems and experiences, as described by Shokrekhodaei et al. [2].

3.1. Supervised learning

Supervised learning works on labelled datasets. Records that carry the same label have similar features and characteristics. The learning procedure is supervised by the known outputs for a given set of inputs. Suppose that the input variable of the diabetic dataset is X and the output variable is Y. The algorithm learns the mapping from X to Y on the training data and then predicts the output Y for a new input X [16].
The supervised algorithms used for diabetes detection are KNN, DT and SVM. It is an effective approach in which the model is trained on labelled data, meaning the input features are associated with the diabetes outcomes. It also uses techniques to control overfitting. Additionally, supervised algorithms apply feature selection and extraction techniques to enhance model interpretability.

3.2. Unsupervised learning

Unsupervised learning uses unclassified, unlabeled data. Its main objective is to uncover hidden information in datasets. This hidden information can be obtained from dedicated training systems or by adding a classification process on top of the datasets. Unsupervised learning applies clustering techniques, which divide the data into groups of items that have similar features.
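The clustering idea described above can be sketched in a few lines. This is a minimal illustration only, assuming plain k-means on two hypothetical features (glucose and BMI); the records are synthetic, the function name `kmeans` is ours, and nothing here is taken from the reviewed studies.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: returns (centroids, labels) for a list of feature tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct records
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each record to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Synthetic (glucose mg/dL, BMI) records; the values are illustrative only.
records = [(85, 22.0), (90, 24.5), (95, 23.1), (160, 33.0), (172, 35.2), (181, 31.8)]
centroids, labels = kmeans(records, k=2)
print(labels)  # records with similar features share a cluster index
```

No labels are supplied at any point: the grouping emerges only from feature similarity, which is what separates this setting from the supervised one. In practice the features would normally be standardized first, since glucose and BMI are on different scales.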
The unsupervised algorithms used for diabetes detection are PCA, Neural Networks, Hierarchical Clustering, and K-means Clustering. Lekha et al. [5] studied diabetes detection techniques that use clustering, dimensionality reduction and generative models.

3.3. Semi-Supervised learning

Semi-supervised learning is intermediate between supervised and unsupervised learning [3-4]. It combines labelled and unlabeled data. In this algorithm, the training data is in the form of tuples. Labels are applied to the dataset so that the predicted output can be obtained more easily. If a pseudo-labeling process is applied, the output may be incorrect.
The semi-supervised algorithms used for diabetes detection rely on clustering, density-based methods and dimensionality reduction. In semi-supervised learning, diabetes detection is performed with fine-tuning, inference, monitoring and updating.

3.4. Reinforcement learning

Reinforcement learning uses feedback systems for learning. In this learning procedure, the agent learns and improves its performance by using feedback. It does not use any labelled data. It solves the problem of sequential decision-making. In this setting, the outputs and labels are not pre-defined. It works entirely by trial and error. The agent's next action depends on the outcome of its previous actions. Ghosh et al. [18] noted that the reinforcement learning procedure mainly depends on the agent, environment, action, reward and policy.
The reinforcement learning methods used for diabetes detection are Thompson sampling and posterior sampling. The main goal of reinforcement learning here is to optimize the insulin dosage for a patient with diabetes and minimize the risk of hypoglycemia (low blood sugar level).

4. A brief about the survey paper criteria

In the healthcare sector, particularly within the realm of diabetes mellitus, many works are available. The works chosen for our review are sourced from Scopus or from peer-reviewed journals. Their primary purpose lies in aiding academic researchers and scientists in the early-stage prediction of diabetes. Additionally, these research papers may serve as a foundation for the development of advanced techniques for diabetes prediction. Notably, these papers straddle two prominent domains: medical science and computer science. Chaki et al. [17] observed that a prevalent theme in diabetes detection research is the intricate interplay between Machine Learning (ML) and Deep Learning (DL) techniques. These intelligent computational methods represent cutting-edge approaches in the healthcare field, offering a wealth of precise manuscripts that harness the power of
Fig. 4. Categorization of algorithms using Machine Learning and Deep Learning techniques analysed in this manuscript.
and computational efficiency. In the context of diabetes, PCA can be employed in several ways:

1. Feature reduction: Diabetes detection and prediction are based on features (e.g., glucose, insulin, blood pressure, BMI, etc.). PCA can decrease the feature count by merging correlated features into a smaller set of uncorrelated principal components. It reduces multicollinearity issues and supports model interpretability.
2. Data Visualization: With PCA, high-dimensional diabetes data can be easily visualized in 2D or 3D by plotting the data points along the principal components.
3. Pre-processing: PCA serves as a data pre-processing technique; it is a step applied before feeding the data into machine learning models.

It performs dimensionality reduction to reduce the risk of overfitting.

1) Deep Neural Network- It relies on unsupervised learning and is used in the prediction and detection of many diseases and in many health applications. This neural network is used for measuring blood sugar levels and for early detection of diabetes, which is essential for efficient management and prevention of complications in the context of diabetes. Deep Neural Networks are used in several steps:
1. Data Collection: This is the first step in DNN, assembling informative data on patients. The dataset consists of blood glucose levels, age, Body Mass Index, blood pressure, and other health-relevant indicators. This dataset is used to detect whether a person is diabetic or not.
2. Network Architecture Model: The DNN architecture can be very complex. It may incorporate other neural architectures, such as feed-forward neural networks, convolutional neural networks (CNNs) for image-based features, or recurrent neural networks (RNNs) for sequential data in blood glucose detection.
3. Training of Model: The diabetic dataset is split into a training set and a validation set.

The DNN model is trained on the training set using an optimization algorithm like gradient descent to minimize the loss function on the diabetic dataset, as mentioned by Theis et al. [4].

1) Random Forest- Random Forest is built from decision trees of the CART (Classification and Regression Tree) form. It is a collection of many decision trees. Within each tree, every sub-tree is connected to its parent node. Alkhodari et al. [6] note that bagging and boosting are commonly used, and researchers are interested in using random forest and boosting techniques for diabetes prediction. Both Random Forest and Boosting are popular ensemble machine learning methods used for classification tasks like predicting diabetes. When a boosting technique is applied to a random forest, the process starts by building a baseline random forest model. The predictions of the random forest depend on hyperparameters such as the number of trees, maximum depth, and minimum samples per leaf.
2) Multilayer Perceptron- MLP is loosely inspired by the way the human brain handles and responds to a situation. MLP is an advanced version of ANN. A Multilayer Perceptron (MLP) is a form of artificial neural network that features several layers of interconnected nodes, also known as neurons. It usually includes an input layer, several hidden layers, and an output layer. Each neuron gets inputs from the previous layer, calculates a weighted sum of these inputs, adds a bias term, and then passes the result through an activation function. The architecture of an MLP is dense: additional hidden layers let it learn more complex features from the data, as described by Shuja et al. [14].
3) Convolutional Neural Network- CNN is a category of deep learning model that works similarly to the feed-forward neural network. In a CNN, the data is analyzed through translational and rotational methods. The convolutional layer applies the convolution operation to the input data, using a set of filters. The network learns features and patterns automatically during training [6]. This algorithm is specially designed to process and analyze visual data, such as images and videos.
4) Artificial Neural Networks- ANNs are inspired by the way humans respond to a situation. They process the data using computational techniques. The ANN fits a model to the problem at hand and can obtain high performance. The ANN algorithm maps the input data to the appropriate output classes. According to Kazerouni et al. [15] the ANN consists of an Input layer, a Hidden Layer, and an
output layer. Its architecture is a neural network with many artificial neurons, called units, arranged in a sequence of layers.
5) Recurrent Neural Network- RNN works on sequential data. In an RNN, the processing of sequential data depends on the time series. The time series runs from 1 to π. The RNN is called recurrent because it executes the same task for every element of the sequence. Zhu et al. [22] worked on such time-dependent computations, in which each output depends on the previous computations. Another way of thinking about an RNN is that it captures the information that has been calculated so far. The RNN algorithm has four stages: the first is the input stage, the second is the hidden stage, the third is the weight, and the last is the output stage.

7. Discussion

In this manuscript, machine learning and deep learning techniques are reviewed. This paper reviews supervised, semi-supervised, reinforcement, unsupervised, and deep learning articles by analysing their results, techniques and methods. The main motive behind this paper is to determine classification methods from a qualitative viewpoint that substantially facilitate future research efforts and increase awareness of the advancements in the domain of diabetes detection and prediction. In this review, a set of 55 articles is reviewed. Research on machine learning and deep learning will enable automatic diabetes detection platforms.
This review encompasses a wide range of topics, including the exploration of various databases and their unique methods, classification techniques, and the diagnosis of diabetes. We also delve into the performance metrics of the models, which serve as indicators of model efficiency: accuracy, specificity, F1-score, and recall. Additionally, we discuss the empirical-based research, which, despite its time-consuming nature, has proven to be the most effective in diabetes prediction and detection.
Ayon et al. [27] performed research on diabetes detection and prediction with the help of breath. A gas sensor is used for breath detection, with acetone as the breath marker. A multivariate relevance vector machine was trained on the gas sensor database. In the modeling, the temperature, humidity and gas sensor readings are analysed by the CGS-8 intelligent gas sensing and analysis system. Gupta et al. [23] worked on diabetes detection using ECG signals. The ECG experiment was performed on a total of 86 persons to test whether each person had diabetes or not. Of these 86 persons, 35 were found to have diabetes. The model comprises ECG recording, intrinsic time-scale decomposition (ITD), feature extraction, and classification of the features. Li et al. [24] performed research on human genes for the detection of Type-2 diabetes. The data comprise 1600 single cells, of which 949 come from Type-2 diabetes patients and 651 from normal (non-diabetic) subjects. The algorithms applied to the selected features are KNN, SVM, and RF. Applying these algorithms yielded optimized Type-2 diabetes-associated features and optimized classification.
Carlton et al. [25] describe three methods used to perform the empirical research. Qualitative interviews inform the draft PROM content. Cognitive debriefing on the interview data refines the draft PROM content. A classification system is generated from a smaller number of items from the PROM. Aguilera et al. [26] studied diabetic patients to determine whether a diabetic person is in depression or not. This study was performed on 276 adults between 18 and 75 years old. They measure the depressive symptoms on a scale of 8. The depressive-scale intervention was checked within six months, followed by a follow-up procedure. The PIDD dataset of diabetic patients was analysed using a deep neural network. The classification procedure in this research was performed using K-fold cross-validation [27]. Diabetes diagnostics employed the same PIDD database. The bootstrapping method was used to correct the dataset. The model comprised dataset selection, data pre-processing, feature extraction through PCA, and the application of the resample channel to train the model with classifiers such as KNN, Naive Bayes, and decision trees. The model was prepared to forecast the prediction result [29].
Research based on female patients in Bangladesh has been used for diabetes detection. In this research, two databases were compared using various machine learning algorithms. Pranto et al. [30] used two datasets: the first is PIDD and the second is the Kurmitola General Hospital dataset. Eight samples were used for feature extraction with four machine learning algorithms: KNN, Decision Tree, Random Forest and Naive Bayes. Another study concerns patients with diabetes and tuberculosis who reside in the urban slums of India. If a patient with tuberculosis does not get proper treatment, the person may become diabetic. The continuous supply of glucose makes the person diabetic, and the risk increases once the person becomes a diabetic patient. Random glucose sampling together with HbA1c features confirms the identification of undetected diabetes cases [31].
Khan et al. [33] focus their research on insulin management, explicitly examining the regulation and modification of insulin across four to five cell types, named alpha, beta, delta, PP (F cells), and epsilon. These cells were analysed under various parameters, assessing their impact on insulin secretion, glucose regulation, and somatostatin secretion. Among these, the beta cell exhibited superior performance.
Vyas et al. [34] explored various diseases using datasets including digestive and kidney diseases, digestive disorders, and diabetes. Many algorithms are used for diabetes detection. This paper used the ANN, Dragonfly algorithm and artificial bee algorithms. A new algorithm was implemented and named ANN-DA. The ANN-DA algorithm provides better results than the ANN algorithm. This algorithm was applied to datasets on hepatitis, diabetes, breast cancer, blood donations and heart disease [35]. Thippa Reddy et al. [36] also focused on the classification of diabetes data. They created a new algorithm for diabetic classification. This algorithm, named the firefly-BAT algorithm (FBAT), is a rule-based fuzzy logic prediction algorithm. The fuzzy logic system was designed with the help of fuzzy rules. The process consists of a rule-based system for diabetes decisions and optimal rule generation via FF-BAT. Olisah et al. [37] performed research on diabetes detection using deep learning techniques. The PIDD dataset was used for detection. The prediction process is divided into two stages: the first is data filtration and cleaning, and in the second all predictive analyses are performed. In the first stage, they take the PIDD database, remove or impute the missing values, and perform feature importance ranking and feature selection. In the second stage, the data are split for training and testing, and the basic model is analysed to give the predicted results. Another work addresses the prediction of diabetes in a lavish lifestyle. In this paper, survey-based research was conducted. The sample data was collected from the population of Jammu & Kashmir (J&K) by distributing a survey form. The data was collected from 1939 people. The dataset was named the T1IDM Lifestyle dataset. Data processing and data splitting procedures were performed on it, and K-fold cross-validation was applied to the T1IDM dataset [38].
Yahyaoui et al. [39] performed diabetes detection using machine learning techniques, taking a dataset and adding some original features. Bootstrapping methods are applied for data filtration, and a majority voting process is used to obtain results. Input data are processed using convolutional and max pooling layers, with feature mapping on dense layers, and Random Forest is used to obtain the results.
Khan et al. [45] introduced a multi-view clustering approach that leverages concept factorization to enhance clustering performance. This method surpassed existing techniques, showcasing its efficiency in handling multi-view data. Khan et al. [46] developed another framework, DECCA, based on Contractive Autoencoders. It demonstrated superior results in clustering high-dimensional document data, even on real-world datasets.
Debelee et al. [47] identified commonly available datasets for skin lesion analysis, exploring various machine-learning
techniques used for skin disease detection. The study examined the contributions and limitations of current state-of-the-art methods. Khan et al. [48] introduced a new multi-view clustering algorithm called MCNMF that utilizes manifold regularization to significantly improve clustering results compared with other algorithms on various real-world datasets. The importance of accurate detection of diabetes is demonstrated by proposing a method using machine learning techniques for this purpose, testing the process on a diabetes dataset, achieving high performance compared to previous methods, and emphasizing the significance of early detection for effective treatments [49]. Rucci et al. [50] focused on the challenges in detecting Type 2 Diabetes and Prediabetes, presenting the development and evaluation of predictive models for identifying at-risk individuals in Argentina and emphasizing the importance of early detection in individuals unaware of their condition. Global and local learning models for diabetes detection have been designed. Rufo et al. [51] developed a model integrating diverse machine learning techniques like XGBoost and Naive Bayes for global learning. The local learning model consists of KNN, SVM and RBF. They applied this model to the Pima Indian Diabetes Dataset, obtaining an accuracy of 99.5%. Overall, the paper offers a broad description of the diabetes datasets used with deep learning and machine learning techniques for the identification of diabetes. Table 4 shows the summary of related works, highlighting both the contributions and the limitations of each study.

8. Conclusion

Recognizing the pivotal role of ML techniques in the advancement of the healthcare sector, particularly in disease prediction, the manuscript investigates the utilization of various ML and DL approaches for the early detection of Diabetes. The analysis offered in this work revolves around five key factors: the dataset used for Diabetes Mellitus, machine learning-based identification methods, performance measures, limitations in current diabetes detection research, and the overall status of diabetic research.
The objective of this manuscript is to provide a detailed study on DM prediction, diabetes management procedures, and preventive techniques. This information is designed to serve as a practical resource for researchers in the field. The systematic approach and the inclusion of critical factors contribute to a more nuanced understanding of the subject matter. As the scientific community continues to engage in comprehensive review, automated DM detection and self-management emerge as vital avenues for further exploration and development in diabetes research. The insights presented herein lay the foundation for ongoing work to address the multifaceted challenges posed by Diabetes Mellitus on a global scale. Given the rapid advancement of Artificial Intelligence and the Internet of Things, we would like to further integrate these advanced technologies into data processing for the early prediction of Diabetes Mellitus.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Neha Katiyar: Writing – review & editing, Writing – original draft, Visualization, Methodology, Conceptualization. Hardeo Kumar Thakur: Supervision, Methodology. Anindya Ghatak: Supervision, Conceptualization.

Table 4
Summary of work with their contributions and limitations.

S. No | Authors | Details / Contribution | Limitations
1 | Ahmed et al. [1] | Their groundbreaking work is centered on elevating the precision and dependability of diabetes prediction through the fusion of multiple machine-learning models. The authors innovatively crafted a unified machine learning framework that amalgamates the merits of diverse predictive algorithms, thereby enhancing the prediction outcomes for diabetes. | Fuzzy logic techniques are ingeniously employed to blend two to three ML algorithms.
2 | Shokrekhodaei et al. [2] | They conducted comprehensive experiments to validate the performance of their system. They provided a detailed analysis of the system's accuracy in glucose level detection, comparing it with traditional invasive methods and demonstrating its efficacy. | The VIS-NIS sensor size must be minimized so that it becomes a wearable sensor.
3 | Sharma et al. [3] | They proposed a novel service composition model within the cloud environment that integrates various healthcare services. This model allows for the efficient coordination of medical resources and services, optimizing the healthcare delivery process for diabetes patients and improving the overall efficiency and effectiveness of medical care provision. | The sensitivity, relativity value and accuracy are low.
4 | Theis et al. [4] | They critically analyzed the integration of machine learning techniques with E-Nose technology, discussing how these combined approaches can increase the accuracy and reliability of non-invasive diabetes diagnosis. The review also explored future directions in machine learning algorithms for improving diagnostic performance and personalized healthcare solutions in diabetes management. | The approach they used for the longitudinal patient records is not applicable to the minimal time patients stayed in the hospital.
5 | Lekha et al. [5] | They developed an advanced prediction model that combines process mining with deep learning techniques to enhance the in-hospital mortality prediction for diabetes patients in the intensive care unit (ICU). | The MOSFET sensor for breath detection is not wearable. Use for clinical purposes is not possible.
Table 4 (continued)

S. No | Authors | Details / Contribution | Limitations
6 | Alkhodari et al. [6] | They contributed by developing a machine learning-based method to screen for cardiovascular autonomic neuropathy (CAN) in diabetic patients, particularly those with microvascular complications. Their method utilizes 24-hour heart rate variability (HRV) data to diagnose complications of diabetes. | The approach required careful considerations on the clinical implementations.
7 | Al Jlailatey et al. [7] | They contributed to the practical relevance of machine learning techniques for analysing and converting data from inertial sensors worn by individuals. Their work not only improved the accuracy and efficiency of gesture detection but also integrated technologies in everyday health monitoring devices. | None
8 | Fazakis et al. [8] | They created intelligent machine learning tools for predicting the long-term risk of developing type 2 diabetes. Their models can accurately assess an individual's risk, enabling early intervention and preventive healthcare measures. | The AOC curve value is very low.
9 | Siddiqui et al. [9] | They extensively reviewed non-invasive or painless blood glucose monitoring techniques from 2012 to 2016, examined the progress in the diabetes prediction area, and facilitated the connection between information technology and healthcare. | If there were fewer features, the accuracy rate could be higher. It takes only five features.
10 | Thakkar et al. [10] | The study incorporates understanding and using advanced analytical techniques to enhance the accuracy and reliability of diabetes prognosis. By dissecting and analysing these techniques, more sophisticated and effective tools for diabetes management and research are developed. | Precision and recall values are not calculated. Feature description is absent in the paper.
11 | Fiarmi et al. [11] | They contributed by comprehensively analysing diabetes data to interpret the patterns and correlations related to diabetes complications. They applied data mining techniques and the effective use of clinical data to improve the prognosis and treatment of diabetes diseases. | The model used in the paper provides lower accuracy.
12 | Suyanto et al. [12] | They use medical datasets to predict diabetes. This work provides significant results over traditional methods, making it a more reliable and effective tool for early diabetes diagnosis and prevention. | In this paper, splitting the dataset into clusters is impossible.
13 | Mamatha et al. [13] | The authors worked on big data and applied data mining in healthcare, specifically focusing on the detection of diabetes. They contributed to the potential of big data analytics in enhancing the understanding, analysis, and prediction of diabetes; they were more focused on personalized healthcare solutions. | It presents 768 samples as big data. These sample values are not considered big data.
14 | Shuja et al. [14] | They implemented the Synthetic Minority Over-sampling Technique (SMOTE) to address data imbalance, a common challenge in medical data analysis. This technique significantly improved the predictive performance of their model with the data mining process. | The model they provide for DL and ML techniques needs to be clearly defined.
15 | Kazerouni et al. [15] | They explored the potential of long non-coding RNA (lncRNA) expression for predicting Type 2 diabetes mellitus (T2DM) and detected diabetes on an RNA molecular basis. | The model used in the paper needs to be more clearly defined.
16 | Khanam et al. [16] | They contributed by performing detailed comparative analyses of different machine learning algorithms to assess their efficiency and accuracy in predicting diabetes. They provide valuable insight for better prediction and management of diabetes. | The accuracy is less than 77% on the train and test split phase.
17 | Chaki et al. [17] | They performed a systematic review to evaluate the effectiveness of machine learning (ML) and artificial intelligence (AI) techniques in the detection and self-management of diabetes mellitus and highlighted the research gaps in this field. | It takes less than no paper to do a diabetic review.
18 | Ghosh et al. [18] | They contributed by performing research on diabetes management tools. | The Data Pre-Processing techniques are used for better results.
19 | Tripathi et al. [19] | They conducted an in-depth analysis of various machine learning algorithms to determine the most effective approach for predicting early diabetes. Their research contributes to the optimization algorithms for healthcare | The accuracy must be improved.
Table 4 (continued)
S. No | Authors Details | Contribution | Limitations
(continued) | (continued) | ResNet101, significantly outperformed previous benchmarks, achieving a MAP score of 86.4% on the DFUC2020 dataset. |
51 | Khan et al. [55] | The study achieved remarkable accuracy rates of 99.35% with ANN on the PIMA dataset and 99.36% with RF on the early diabetes risk dataset. The selected features using Sequential Feature Selection (SFS) significantly improved the prediction performance of machine learning models. | The trained machine learning models vary with the input attributes and their relationship with the target class label. The results of some methods decreased due to the reduction in the number of features.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

References

[1] U. Ahmed, G.F. Issa, M.A. Khan, S. Aftab, M.F. Khan, R.A. Said, T.M. Ghazal, M. Ahmad, Prediction of diabetes empowered with fused machine learning, IEEE Access 10 (2022) 8529–8538. Jan 11.
[2] M. Shokrekhodaei, D.P. Cistola, R.C. Roberts, S. Quinones, Non-invasive glucose monitoring using optical sensor and machine learning techniques for diabetes applications, IEEE Access 9 (2021) 73029–73045. May 11.
[3] S.K. Sharma, A.T. Zamani, A. Abdelsalam, D. Muduli, A.A. Alabrah, N. Parveen, S.M. Alanazi, A diabetes monitoring system and health-medical service composition model in cloud environment, IEEE Access 11 (2023) 32804–32819. Mar 17.
[4] J. Theis, W.L. Galanter, A.D. Boyd, H. Darabi, Improving the in-hospital mortality prediction of diabetes ICU patients using a process mining/deep learning architecture, IEEE J. Biomed. Health Inform 26 (1) (2021) 388–399. Jun 28.
[5] S. Lekha, M. Suchetha, Recent advancements and future prospects on e-nose sensors technology and machine learning approaches for non-invasive diabetes diagnosis: a review, IEEE Rev. Biomed. Eng 14 (2020) 127–138. May 11.
[6] M. Alkhodari, M. Rashid, M.A. Mukit, K.I. Ahmed, R. Mostafa, S. Parveen, A.H. Khandoker, Screening cardiovascular autonomic neuropathy in diabetic patients with microvascular complications using machine learning: a 24-hour heart rate variability study, IEEE Access 9 (2021) 119171–119187. Aug 24.
[7] H. Al Jlailaty, A. Celik, M.M. Mansour, A.M. Eltawil, Machine learning-based unobtrusive intake gesture detection via wearable inertial sensors, IEEE Trans. Biomed. Eng 70 (4) (2022) 1389–1400. Oct 25.
[8] N. Fazakis, O. Kocsis, E. Dritsas, S. Alexiou, N. Fakotakis, K. Moustakas, Machine learning tools for long-term type 2 diabetes risk prediction, IEEE Access 9 (2021) 103737–103757, https://doi.org/10.1109/ACCESS.2021.3098691.
[9] S.A. Siddiqui, Y. Zhang, J. Lloret, H. Song, Z. Obradovic, Pain-free blood glucose monitoring using wearable sensors: recent advancements and future prospects, IEEE Rev. Biomed. Eng. 11 (2018) 21–35, https://doi.org/10.1109/RBME.2018.2822301.
[10] H. Thakkar, V. Shah, H. Yagnik, M. Shah, Comparative anatomization of data mining and fuzzy logic techniques used in diabetes prognosis, Clin. eHealth (2020), https://doi.org/10.1016/j.ceh.2020.11.
[11] C. Fiarni, E.M. Sipayung, S. Maemunah, Analysis and prediction of diabetes complication disease using data mining algorithm, Procedia Comput. Sci 161 (2019) 449–457. Jan 1.
[12] S. Suyanto, S. Meliana, T. Wahyuningrum, S. Khomsah, A new nearest neighbor-based framework for diabetes detection, Expert Syst. Appl. 199 (November 2021) (2022) 116857, https://doi.org/10.1016/j.eswa.2022.116857.
[13] B.G. Mamatha Bai, B.M. Nalini, J. Majumdar, Analysis and detection of diabetes using data mining techniques - a big data application in health care, in: Emerging Research in Computing, Information, Communication and Applications: ERCICA 2018, Volume 1, Springer, Singapore, 2019, pp. 443–455.
[14] M. Shuja, S. Mittal, M. Zaman, Effective prediction of type II diabetes mellitus using data mining classifiers and SMOTE, in: Advances in Computing and Intelligent Systems: Proceedings of ICACM 2019, Springer, Singapore, 2020, pp. 195–211.
[15] F. Kazerouni, A. Bayani, F. Asadi, L. Saeidi, N. Parvizi, Z. Mansoori, Type2 diabetes mellitus prediction using data mining algorithms based on the long-noncoding RNAs expression: a comparison of four data mining approaches, BMC Bioinform. 21 (1) (2020) 1–13, https://doi.org/10.1186/s12859-020-03719-8.
[16] J.J. Khanam, S.Y. Foo, A comparison of machine learning algorithms for diabetes prediction, ICT Express 7 (4) (2021) 432–439, https://doi.org/10.1016/j.icte.2021.02.004.
[17] J. Chaki, S.T. Ganesh, S.K. Cidham, S.A. Theertan, Machine learning and artificial intelligence based Diabetes Mellitus detection and self-management: a systematic review, J. King Saud Uni.-Comput. Inform. Sci 34 (6) (2022) 3204–3225. Jun 1.
[18] P. Ghosh, S. Azam, A. Karim, M. Hassan, K. Roy, M. Jonkman, A comparative study of different machine learning tools in detecting diabetes, Procedia Comput. Sci 192 (2021) 467–477. Jan 1.
[19] G. Tripathi, R. Kumar, Early prediction of diabetes mellitus using machine learning, in: 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), IEEE, 2020, pp. 1009–1014. Jun 4.
[20] Z. Ye, J. Wang, H. Hua, X. Zhou, Q. Li, Precise detection and quantitative prediction of blood glucose level with an electronic nose system, IEEE Sens. J 22 (13) (2022) 12452–12459. Jun 7.
[21] H. Zhu, C. Liu, Y. Zheng, J. Zhao, L. Li, A hybrid machine learning algorithm for detection of simulated expiratory markers of diabetic patients based on gas sensor array, IEEE Sens. J 23 (3) (2022) 2940–2947. Dec 19.
[22] T. Zhu, L. Kuang, J. Daniels, P. Herrero, K. Li, P. Georgiou, IoMT-enabled real-time blood glucose prediction with deep learning and edge computing, IEEE Internet. Thing. J 10 (5) (2022) 3706–3719. Jan 14.
[23] K. Gupta, V. Bajaj, A robust framework for automated screening of diabetic patient using ECG signals, IEEE Sens. J 22 (24) (2022) 24222–24229. Nov 9.
[24] Z. Li, X. Pan, Y.D. Cai, Identification of type 2 diabetes biomarkers from mixed single-cell sequencing data with feature selection methods, Front. Bioeng. Biotechnol 10 (2022) 890901. Jun 2.
[25] J. Carlton, P. Powell, D. Rowen, M. Broadley, F. Pouwer, J. Speight, S. Heller, M.A. Gall, M. Rosilio, C.J. Child, J. Comins, Producing a preference-based quality of life measure to quantify the impact of hypoglycaemia on people living with diabetes: a mixed-methods research protocol, Diabet. Med. 40 (3) (2023) e15007. Mar.
[26] A. Aguilera, C.A. Figueroa, R. Hernandez-Ramos, U. Sarkar, A. Cemballi, L. Gomez-Pathak, J. Miramontes, E. Yom-Tov, B. Chakraborty, X. Yan, J. Xu, mHealth app using machine learning to increase physical activity in diabetes and depression: clinical trial protocol for the DIAMANTE Study, BMJ Open 10 (8) (2020) e034723. Aug 1.
[27] S.I. Ayon, M.M. Islam, Diabetes prediction: a deep learning approach, Internat. J. Inform. Eng. Electr. Busin. 13 (2) (2019) 21. Mar 1.
[28] M. Maniruzzaman, M.J. Rahman, B. Ahammed, M.M. Abedin, Classification and prediction of diabetes disease using machine learning paradigm, Health. Inf. Sci. Syst 8 (2020) 1–4. Dec.
[29] A. Aada, S. Tiwari, Predicting diabetes in medical datasets using machine learning techniques, Int. J. Sci. Res. Eng. Trends. 5 (2) (2019) 257–267.
[30] B. Pranto, S.M. Mehnaz, E.B. Mahid, I.M. Sadman, A. Rahman, S. Momen, Evaluating machine learning methods for predicting diabetes among female patients in Bangladesh, Information 11 (8) (2020) 374. Jul 23.
[31] M.I. Gröschel, C.F. Luz, S. Batra, S. Ahuja, S. Batra, K. Kranzer, T.S. van der Werf, Random glucose sampling as screening tool for diabetes among disadvantaged tuberculosis patients residing in urban slums in India, ERJ Open Res 5 (1) (2019). Feb 1.
[32] A. Nath, S. Biradar, A. Balan, R. Dey, R. Padhi, Physiological models and control for type 1 diabetes mellitus: a brief review, IFAC-PapersOnLine 51 (1) (2018) 289–294. Jan 1.
[33] D. Khan, C.R. Moffet, P.R. Flatt, C. Kelly, Role of islet peptides in beta cell regulation and type 2 diabetes therapy, Peptides 100 (2018) 212–218. Feb 1.
[34] S. Vyas, R. Ranjan, N. Singh, A. Mathur, Review of predictive analysis techniques for analysis diabetes risk, in: 2019 Amity International Conference on Artificial Intelligence (AICAI), IEEE, 2019, pp. 626–631. Feb 4.
[35] M. Yasen, N. Al-Madi, N. Obeid, Optimizing neural networks using dragonfly algorithm for medical prediction, in: 2018 8th International Conference on Computer Science and Information Technology (CSIT), IEEE, 2018, pp. 71–76. Jul 11.
[36] G. Thippa Reddy, N. Khare, FFBAT-optimized rule based fuzzy logic classifier for diabetes, Internat. J. Eng. Res. Africa 24 (2016) 137–152. Jul 1.
[37] C.C. Olisah, L. Smith, M. Smith, Diabetes mellitus prediction and diagnosis from a data preprocessing and machine learning perspective, Comput. Methods. Programs. Biomed 220 (2022) 106773. Jun 1.
[38] S.M. Ganie, M.B. Malik, An ensemble machine learning approach for predicting type-II diabetes mellitus based on lifestyle indicators, Healthc. Anal 2 (2022) 100092. Nov 1.
[39] A. Yahyaoui, A. Jamil, J. Rasheed, M. Yesiltepe, A decision support system for diabetes prediction using machine learning and deep learning techniques, in: 2019 1st International Informatics and Software Engineering Conference (UBMYK), IEEE, 2019, pp. 1–4. Nov 6.
[40] V. Dremin, Z. Marcinkevics, E. Zherebtsov, A. Popov, A. Grabovskis, H. Kronberga, K. Geldnere, A. Doronin, I. Meglinski, A. Bykov, Skin complications of diabetes mellitus revealed by polarized hyperspectral imaging and machine learning, IEEE Trans. Med. Imag. 40 (4) (2021) 1207–1216. Jan 6.
[41] N. Nnamoko, I. Korkontzelos, Efficient treatment of outliers and class imbalance for diabetes prediction, Artif. Intell. Med 104 (2020) 101815. Apr 1.
[42] V. Kumar, G.S. Lalotra, P. Sasikala, D.S. Rajput, R. Kaluri, K. Lakshmanna, M. Shorfuzzaman, A. Alsufyani, M. Uddin, Addressing binary classification over class imbalanced clinical datasets using computationally intelligent techniques, in: Healthcare, 10, MDPI, 2022, p. 1293. Jul 13.
[43] Q. Wang, W. Cao, J. Guo, J. Ren, Y. Cheng, D.N. Davis, DMP_MI: an effective diabetes mellitus classification algorithm on imbalanced data with missing values, IEEE Access 7 (2019) 102232–102238. Jul 19.
[44] D.D. Rufo, T.G. Debelee, A. Ibenthal, W.G. Negera, Diagnosis of diabetes mellitus using gradient boosting machine (LightGBM), Diagnostics 11 (9) (2021) 1714. Sep 19.
[45] M.N. Khan, S.K. Hasnain, M. Jamil, A. Imran, Electronic Signals and Systems: Analysis, Design and Applications, River Publishers, 2022. Sep 1.
[46] B. Diallo, J. Hu, T. Li, G.A. Khan, X. Liang, Y. Zhao, Deep embedding clustering based on contractive autoencoder, Neurocomputing 433 (2021) 96–107. Apr 14.
[47] T.G. Debelee, Skin lesion classification and detection using machine learning techniques: a systematic review, Diagnostics 13 (19) (2023) 3147. Oct 7.
[48] G.A. Khan, J. Hu, T. Li, B. Diallo, H. Wang, Multi-view data clustering via non-negative matrix factorization with manifold regularization, Int. J. Mach. Learn. Cyber (2022) 1–3. Mar 1.
[49] A.U. Haq, J.P. Li, J. Khan, M.H. Memon, S. Nazir, S. Ahmad, G.A. Khan, A. Ali, Intelligent machine learning approach for effective recognition of diabetes in E-healthcare using clinical data, Sensors 20 (9) (2020) 2649. May 6.
[50] E. Rucci, G. Tittarelli, F. Ronchetti, J.F. Elgart, L. Lanzarini, J.J. Gagliardino, First experiences with the identification of people at risk for diabetes in Argentina using machine learning techniques, arXiv preprint arXiv:2403.18631. (2024). Mar 27.
[51] D.D. Rufo, T.G. Debelee, W.G. Negera, A hybrid machine learning model based on global and local learner algorithms for diabetes mellitus prediction, J. Biomim., Biomat. Biomed. Eng. 54 (2022) 65–88. Feb 10.
[52] B.F. Wee, S. Sivakumar, K.H. Lim, W.K. Wong, F.H. Juwono, Diabetes detection based on machine learning and deep learning approaches, Multimed. Tools Appl 83 (8) (2024) 24153–24185. Mar.
[53] H.R. Al-Absi, A. Pai, U. Naeem, F.K. Mohamed, S. Arya, R.A. Sbeit, M. Bashir, M.M. El Shafei, N. El Hajj, T. Alam, DiaNet v2 deep learning based method for diabetes diagnosis using retinal images, Sci. Rep 14 (1) (2024) 1595. Jan 18.
[54] R. Sarmun, M.E. Chowdhury, M. Murugappan, A. Aqel, M. Ezzuddin, S.M. Rahman, A. Khandakar, S. Akter, R. Alfkey, M.A. Hasan, Diabetic foot ulcer detection: combining deep learning models for improved localization, Cognit. Comput (2024) 1–9. Apr 1.
[55] Q.W. Khan, K. Iqbal, R. Ahmad, A. Rizwan, A.N. Khan, D. Kim, An intelligent diabetes classification and perception framework based on ensemble and deep learning method, PeerJ Comput. Sci. 10 (2024) e1914. Mar 29.

Neha Katiyar, Research Scholar, School of Computer Science Engineering and Technology, Bennett University, India, Email: [email protected], Research Area(s): Machine Learning, IoT, 6G

Dr. Hardeo Kumar Thakur, Associate Professor, School of Computer Science Engineering and Technology, Bennett University, India, Email: [email protected], Research Area(s): Data mining, Dynamic Graph Mining, Data Analytics

Dr. Anindya Ghatak, Assistant Professor, School of Computer Science Engineering and Technology, Bennett University, India, Email: anindya.ghatak@bennett.edu.in, Research Area(s): Functional Analysis, Operator Theory and Operator Algebras and its application in Quantum information theory.