This systematic review discusses recent advancements in machine learning and deep learning techniques for the early detection of Diabetes Mellitus, highlighting the importance of early diagnosis to prevent complications. The paper evaluates various methodologies, datasets, performance metrics, and limitations in diabetic research, providing a comprehensive overview for healthcare researchers. It categorizes machine learning techniques into supervised, unsupervised, semi-supervised, and reinforcement learning, emphasizing their roles in diabetes detection and management.


e-Prime - Advances in Electrical Engineering, Electronics and Energy 9 (2024) 100661

Contents lists available at ScienceDirect

e-Prime - Advances in Electrical Engineering, Electronics and Energy
journal homepage: www.elsevier.com/locate/prime

Recent advancements using machine learning & deep learning approaches for diabetes detection: a systematic review

Neha Katiyar *, Hardeo Kumar Thakur, Anindya Ghatak
Bennett University, Greater Noida, India

ARTICLE INFO

Keywords: Diabetes mellitus; Glucose; Diabetic symptoms; Type 2 diabetes mellitus; Deep learning; Diabetes management

ABSTRACT

Diabetes Mellitus is one of the significant health challenges that affects many people across the world. Early detection of Diabetes Mellitus helps prevent complications such as kidney disease, nerve damage, and eye damage. Over the past few years, several Machine Learning and Deep Learning techniques have been applied for the early detection of Diabetes Mellitus. This paper reviews the Machine Learning and Deep Learning techniques applied for early detection of Diabetes Mellitus. The review criteria focus on five topics: the diabetes dataset, the methods used, performance metrics, limitations of the work, and the overall status of diabetic research. The objective of this paper is to provide a comprehensive review of Diabetes Mellitus prediction techniques based on Machine Learning and Deep Learning that will be a helpful source for researchers in the healthcare field.

1. Introduction

In the realm of healthcare, it is crucial to anticipate illnesses and facilitate their prompt identification. Anticipating diseases becomes feasible with a comprehensive understanding of their symptomatic manifestations. Such diseases can stem from genetic predispositions or can be attributed to an individual's living conditions. Presently, there is a growing inclination towards opulent lifestyles, which can disrupt the body's hormonal balance and lead to a myriad of hormonal irregularities, including ailments like diabetes. A considerable number of individuals are grappling with such ailments at an early age, contributing to a substantial repository of electronic health data. The healthcare system strives to harness this extensive information by employing diverse data mining methodologies and analyses. This information has a meaningful influence on effectively regulating and handling the issues of diabetic persons. Al Jlailaty et al. [7] performed diabetes detection quickly with such a dataset. The dataset consists of specific parameters that can be identified quickly, such as glucose level and body mass index. These datasets are refined with data mining techniques. Data mining in healthcare is an expansive field of study encompassing a wide range of machine-learning techniques, such as Support Vector Machines (SVM), Regression, k-nearest Neighbours (KNN), and Artificial Neural Networks (ANN), among others. Consequently, machine learning methodologies have progressed in the early-stage detection and diagnosis of diabetes through automated systems.

Diabetes datasets play a crucial role in early-stage detection. Ahmed et al. [1] mention that patient records are obtained from electronic devices and paper records, with the automated records carrying timestamped events and the paper records having only "logical time" slots. Diabetes datasets help to detect diabetes in a person, and the attributes of a dataset play a crucial role in detection. Researchers and scientists use different data files to predict diabetes. An ML-based model requires relevant datasets with the essential attributes for training and validation. Choosing the appropriate features from the dataset increases the ability of the ML model to predict accurately. The machine learning model is also judged in terms of efficiency, effectiveness, and transparency to the user. Recently, many research papers have been published on diabetes detection, interpretation, prevention, and glucose management with machine learning.

1.1. Brief of diabetes mellitus

Diabetes Mellitus occurs in humans and animals. Diabetes affects 1.5 million people in a year. Thakkar et al. [10] report that about 48% of people in the world die due to diabetes mellitus. This disease occurs when a disorder of carbohydrate metabolism leaves the human body unable to produce a sufficient

* Corresponding author.
E-mail address: [email protected] (N. Katiyar).

https://doi.org/10.1016/j.prime.2024.100661
Received 12 January 2024; Received in revised form 31 May 2024; Accepted 20 June 2024
Available online 28 June 2024
2772-6711/© 2024 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

quantity of insulin. The insulin supply is not maintained in the human body, as mentioned by Siddiqui et al. [9].

Diabetes results from a malfunctioning pancreas, as mentioned by Alkhodari et al. [6]: for example, when the pancreas does not produce insulin as it should, the body is not supplied with insulin adequately. A person may be affected by diabetes for three main reasons: genetics, surroundings, and an affluent lifestyle. Lifestyle is an essential factor in whether a person becomes diabetic. If a person's ancestors were non-diabetic but the person has diabetes, this is attributed to an affluent lifestyle. The next cause is environmental impact. People are now influenced by the pressure to lose weight, and to lose weight they adopt a poor diet and frequent exercise, which can cause renal failure. Renal failure is the first step towards post-diabetes. In the modern world, people face issues such as frequent hunger, frequent urination, weight loss, and vision loss, which are signs of diabetes.

The contribution of this review paper to diabetic research is that it considers ML-based approaches to DM detection, prevention, self-management, and personalization. A review paper plays a crucial role in the study because it efficiently summarizes cutting-edge research in a specific area. ML approaches use demographic information, diabetes-related health measurements, and risk factors to build models that are easy to compute. ML models are used for long-term accuracy calculations, as given by Lekha et al. [5]; this is their main objective. The objectives of this manuscript are as follows:

1. Exploring the different types of publicly available repository datasets used in DM detection.
2. The pre-processing methods applied to diabetes detection.
3. Machine Learning-driven performance in detection and classification tasks.
4. Deep learning methods for classification, detection, and diabetes management.
5. Performance metrics for diabetes detection and diagnosis algorithms.
6. The limitations of ML models for diabetes detection and prevention.

This manuscript is organized as follows. Section 2 gives a precise description of diabetes and its types. Section 3 covers machine learning and different classification techniques. Section 4 gives a brief account of the survey paper criteria. Section 5 presents a systematic review of the datasets used by researchers. Section 6 provides an overview of ML and Deep Learning approaches in the area of Diabetes Mellitus, together with an analysis of various performance measures and limitations. Section 7 covers various discussion topics. Section 8 presents the future scope and challenges, followed by the conclusion of the paper.

Nomenclature

DT Decision Tree
SVM Support Vector Machine
DL Deep Learning
ML Machine Learning
PCA Principal Component Analysis
DM Diabetes Mellitus
ANN Artificial Neural Network
KNN K-Nearest Neighbour
NB Naive Bayes
PIDD Pima Indian Diabetes Dataset
RF Random Forest
CNN Convolutional Neural Network
ETL Extraction, Transformation and Load
ECG Electrocardiogram Signals
RNA Ribonucleic acid
TD1 Type-1 Diabetes
TD2 Type-2 Diabetes
LSTM Long short-term memory
IoT Internet of Things
RNN Recurrent Neural Networks

2. Types of diabetes

There are four types of diabetes, as shown in Fig. 1.

2.1. Type-1 diabetes

Type-1 diabetes arises from the body's own immune response. It is a chronic autoimmune disease that affects the pancreas. In Type-1 diabetes, the immune system attacks and damages the insulin-producing beta cells in the pancreas. Insulin regulates blood sugar (glucose) levels by enabling the absorption of blood sugar into cells, where it is utilized for energy production. When the beta cells are destroyed, the body is unable to produce insulin, leading to elevated blood sugar levels.

Without insulin, the glucose level rises. To manage this type of diabetes, people have to take insulin every day to maintain their glucose levels, and that keeps them alive. The problems that arise with Type-1 diabetes are heart disease, strokes, dental disease, foot problems, depression, and kidney disease, as mentioned by Zhu et al. [22]. The symptoms of this type of diabetes are excessive thirst, frequent urination, extreme hunger, unexplained weight loss, fatigue, and blurred vision.

2.2. Type-2 diabetes

Type-2 diabetes happens when the human body has a problem regulating and using sugar as fuel. The sugar is called glucose. When sugar circulates in the bloodstream for a long time, it raises the blood sugar level. A high sugar level damages the immune, circulatory, and nervous systems. In Type-2 diabetes, the pancreas still makes insulin available, but the insulin does not work correctly and sugar levels rise, as mentioned by Kazerouni et al. [15]. Type-2 diabetes causes problems like increased thirst, frequent urination, fatigue, and areas of darkened skin, usually in the armpits and neck.

2.3. Pregestational diabetes

Pregestational diabetes is Type-1 or Type-2 diabetes that already exists before pregnancy; the pregnant woman has pre-existing diabetes during the pregnancy. According to Fazakis et al. [8], pregestational diabetes increases the risk of malformations in the baby during growth in the womb. This type of diabetes can result in premature birth and operative delivery, and diabetic women often proceed to cesarean section. Women with pre-gestational diabetes tended to be overweight and older. Because the diabetes predates the pregnancy, it lasts longer and persists in women after pregnancy.

2.4. Gestational diabetes

Gestational diabetes is detected for the first time during pregnancy. It affects the glucose level of the pregnant woman as well as her body, and it can also affect the baby's health. During pregnancy, gestational diabetes can usually be controlled by eating healthy food, exercising, and taking proper medication. The symptoms of diabetes do not appear quickly; some barely noticeable signs occur,

Fig. 1. Types of Diabetes and description of research work.

like increased thirst and more frequent urination, as studied by Suyanto et al. [12]. For this type of diabetes, doctors recommend a monthly check-up.

3. Types of machine learning techniques

Machine Learning algorithms are categorized into four types, as shown in Fig. 2.

1) Supervised learning, which operates on labelled data.
2) Unsupervised learning, which works with unlabeled data.
3) Semi-Supervised learning, which utilizes both labelled and unlabeled data.
4) Reinforcement learning, which operates based on feedback systems and experience, as given by Shokrekhodaei et al. [2].

3.1. Supervised learning

Supervised learning works on labelled datasets, in which the labelled data have similar features and characteristics. The learning procedure supervises the outputs for a given set of inputs. Suppose that the input variable of the diabetic dataset is X and the output variable is Y. After the learning algorithm is applied, the model takes a new input X and predicts the corresponding output Y [16]. The supervised algorithms used for diabetes detection include KNN, DT, and SVM. It is an effective approach in which the model is trained on labelled data, meaning the input features are associated with the diabetes outcomes. It also uses techniques to control overfitting. Additionally, supervised algorithms apply feature selection and extraction techniques to enhance model interpretability.

3.2. Unsupervised learning

Unsupervised learning uses unclassified, unlabeled data. Its main objective is to uncover hidden information in datasets. This hidden information can be obtained from dedicated training procedures or by adding a classification step to the datasets. This type of learning applies clustering techniques, which divide the data into groups of items that have similar features.
Fig. 2. Types of Machine Learning Techniques.
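Among the categories in Fig. 2, the supervised setting of Section 3.1 (learning a mapping from inputs X to an output Y) can be sketched with a small k-nearest-neighbour classifier. This is an illustrative sketch only: the training rows are invented, the helper name `knn_predict` is hypothetical, and no reviewed paper is claimed to use exactly this code.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training rows."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], x),  # Euclidean distance to x
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled data: X = (glucose mg/dL, BMI), Y = 1 for diabetic, 0 otherwise.
train_X = [(85, 22.0), (90, 23.5), (88, 21.0), (160, 33.0), (170, 35.5), (165, 31.0)]
train_y = [0, 0, 0, 1, 1, 1]

low_risk = knn_predict(train_X, train_y, (92, 24.0))
high_risk = knn_predict(train_X, train_y, (168, 34.0))
print(low_risk, high_risk)
```

The labelled pairs play the role of the training data, and each call to `knn_predict` is the "new input X, predicted output Y" step described in the text.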


The unsupervised algorithms used for diabetes detection are PCA, Neural Networks, Hierarchical Clustering, and K-means Clustering. Lekha et al. [5] reported that unsupervised techniques for diabetes detection use clustering, dimensionality reduction, and generative models.

3.3. Semi-Supervised learning

Semi-supervised learning is intermediate between supervised and unsupervised learning algorithms [3-4]. It combines both labelled and unlabeled data. In this algorithm, the training data is in the form of tuples. Labels are applied to the unlabeled part of the dataset so that the predicted output is easier to obtain; if a pseudo-labeling process is applied, the output may be incorrect.

The semi-supervised algorithms used for diabetes detection are clustering, density-based methods, and dimensionality reduction. In semi-supervised learning, diabetes detection is performed with fine-tuning, inference, monitoring, and updating.

3.4. Reinforcement learning

Reinforcement learning uses feedback systems for learning. In this procedure, the Agent learns and improves its performance by using feedback. It does not use any labelled data, and it solves problems of sequential decision-making. In this type of learning, the output and label are not pre-defined; it works entirely by trial and error. The Agent's action depends on its previous actions. Ghosh et al. [18] noted that the reinforcement learning procedure mainly depends on the Agent, environment, action, reward, and policy.

The reinforcement learning methods used for diabetes detection are Thompson sampling and posterior sampling. The main goal of reinforcement learning here is to optimize the insulin dosage for a patient with diabetes and minimize the risk of diabetic hypoglycemia (low blood sugar level).

4. A brief about the survey paper criteria

In the healthcare sector, particularly within the realm of diabetes mellitus, many works are available. The works chosen for our review are sourced from Scopus or from peer-reviewed journals. Their primary purpose lies in aiding academic researchers and scientists in the early-stage prediction of diabetes. Additionally, these research papers may serve as a foundation for the development of advanced techniques for diabetes prediction. Notably, these papers straddle two prominent domains: medical science and computer science. Chaki et al. [17] identified a prevalent theme in diabetes detection research: the intricate interplay between Machine Learning (ML) and Deep Learning (DL) techniques. These intelligent computational methods represent cutting-edge approaches in the healthcare field, offering a wealth of precise manuscripts that harness the power of

Fig. 3. Paper Selection Process for Review.


these techniques.

Although a large number of articles based on these two advanced techniques have been published, the scope of our research is restricted to the last seven years (2018 to 2024). Many articles were published in this period; however, we restrict ourselves to articles in the areas of ML and DL rather than the wider area of computer science.

The next phase performed a systematic review of ML techniques and retrieved all the articles related to the area. Papers that did not address diabetes mellitus using Machine Learning were removed. In total, this paper reviews 55 papers, categorized into classification, detection, and diagnosis of disease. The final collection of papers is shown in Fig. 3.

5. Datasets used by researchers

Datasets play an essential role in obtaining results in machine learning. A dataset is a collection of data that has similar values and features. Tabular data is represented in the form of rows and columns, where a column represents a particular variable. When it comes to diabetic datasets, many sampling techniques are applied, and a large number of observations are performed.

To perform diabetic research on a dataset, the data must be normalized and passed through pre-processing stages. In this study, we have gone through various types of datasets. The Pima Indian Diabetes Dataset (PIDD) is the most popular dataset used by researchers in the study of diabetes detection and prediction. Tripathi et al. [19] mention that the PIDD contains information on 768 women and their health records. This dataset was provided and verified by the National Institute of Diabetes and Digestive and Kidney Diseases. The data were taken from women who were at least 21 years old. Diabetes datasets are available in many forms, including numerical data, alphanumerical data, and image datasets, as shown in Table 1. Each type of dataset uses many adaptive techniques, such as segmenting, thresholding, sampling, bagging, and boosting. In some diabetes research, data collection may be done through an online survey. Some researchers use filtered data, and others use unfiltered data. Some researchers remove the missing values from the data, while others use the original data and focus on machine-learning techniques. Nnamoko et al. [41] addressed the class imbalance problem. To combat it, they implemented SMOTE algorithms. Additionally, they applied data pre-processing techniques that add artificial minority instances to the training data, which improves training performance and reduces the classifier's bias towards the standard (majority) cases. Kumar et al. [42] explored class-imbalanced learning scenarios. They examined a total of 51 articles related to imbalanced data and, within their work, explored seven distinct class-balancing techniques.

For diabetes detection on imbalanced data, Wang et al. [43] adopted the naive Bayes method to normalize data with missing values. Furthermore, they used the adaptive synthetic sampling method to mitigate class imbalance issues. To validate their methodology, they applied these techniques to the Pima Indian Diabetes Dataset. Ref. [52] highlights the significance of machine learning and deep learning approaches in developing diabetes detection systems, the need for improved data availability, the potential of reconstructing detection models based on non-invasive measurements, and the importance of selecting appropriate datasets for training models to address the rising prevalence of diabetes worldwide. Al-Absi et al. [53] worked on enhancing diabetes diagnosis accuracy by developing an improved predictive model using retinal images from the Qatari population, achieving an accuracy of over 92% with the DiaNet v2 model.

Diabetes datasets are the basis of research in the field of prediction and detection of diabetes. There can be many types of data collection, such as open access, survey-based data collection, and self-created data collection, as mentioned by Ye et al. [20]. Data can be

Table 1
Diabetes dataset description with references.

| S. No. | Datasets | Dataset description with links | References/Published Year |
|---|---|---|---|
| 1 | Pima Indian Diabetes Dataset (PIDD) | The Pima Indian Diabetes Dataset consists of data from 768 women. This data is freely available on the UCI repository and Kaggle. The dataset encompasses variables such as blood glucose levels, insulin, body mass index, and more. It consists of eight independent variables, with one being categorical and the remaining seven continuous. Therefore, at a minimum, eight plots are required. To extract meaningful insights and ensure a thorough analysis, the target variable is plotted against each independent variable. | Ahmed et al. [1], Sharma et al. [3], Thakkar et al. [10], Suyanto et al. [12], Mamatha et al. [13], Khanam et al. [16], Ghosh et al. [18], Tripathi et al. [19], Ayon et al. [27], Aada et al. [29], Pranto et al. [30], Olisah et al. [37], Yahayoni et al. [39] |
| 2 | Signals Data; Wavelength Data | The dataset consists of wavelength measurements of blood tissue. The intensity is measured across the range 410 to 990 nm, and at four specific wavelengths (485, 645, 860, and 940 nm). | Shokrekhodaei et al. [2] |
| 3 | Medical Information Mart for Intensive Care (MIMIC-III) | This dataset is publicly accessible. It is a relational database that contains the data of de-identified patients. The database includes information on 38,597 adult patients who were admitted between 2001 and 2012. | Lekha et al. [5] |
| 4 | Bangladesh Institute of Health and Sciences (Diabetic Datasets) | This dataset includes 95 patients at the Bangladesh Institute of Health and Sciences between December 18, 2017 and April 26, 2018. There are 45 males and 50 females in this dataset. | Alkhodari et al. [6] |
| 5 | OREBA; FIC; CLEMSON | OREBA stands for Objectively Recognizing Eating Behavior and Associated Intake. The OREBA dataset provides multi-sensor recordings of communal intake occasions for researchers interested in the automated detection of intake gestures. The gesture study is easily performed with this dataset. The Multimedia Understanding Group provided the Food Intake Cycle (FIC) dataset to determine in-meal eating behaviour. The CLEMSON dataset consists of data from 271 participants with 488 recordings. Each (continued on next page) | Al Jlailaty et al. [7] |


Table 1 (continued)

| S. No. | Datasets | Dataset description with links | References/Published Year |
|---|---|---|---|
| 5 (continued) | OREBA; FIC; CLEMSON | consumption gesture, including hand, utensil, container, and food, is easily collected and monitored with pre-processing. | Al Jlailaty et al. [7] |
| 6 | FINDRISC; ELSA | FINDRISC is a diabetes risk assessment tool that is effective for identifying undiagnosed Type-2 Diabetes Mellitus. ELSA stands for English Longitudinal Study of Aging. The ELSA database is a standard dataset that supports stratified train and test split procedures, which preserve fixed class proportions. | Fazakis et al. [8] |
| 7 | PubMed Survey Database taken from the years 2012–2016 | The survey covers the PubMed digital library and Springer for non-invasive blood glucose monitoring from 2012 to 2016. After searching for the keywords "non-invasive blood glucose self-monitoring" and "invasive monitoring", the research data can be used in biomedical, computer science, and related fields. | Siddiqui et al. [9] |
| 8 | Sri Pamela Hospital; Kumpulan Plane Hospital; Indonesia | The dataset has 158 records and 15 attributes. The ETL process is performed on the database to obtain the eight attributes required for patient information and diabetes detection. | Fiarmi et al. [11] |
| 9 | Survey-based database from Kashmir Valley | The Kashmir Valley dataset covers 734 patients of all age groups. The data were taken from one diagnostic lab in the Kashmir Valley. The dataset has 11 attributes, including Age, Plasma Glucose Fasting, Post-Prandial Glucose Level, Body Mass Index, Diastolic Blood Pressure, waist thickness, and the diabetic history of the patient, among other attributes. | Shuja et al. [14] |
| 10 | Shohada Hospital Tehran | Data were collected from 200 unrelated Iranian individuals: 100 T2DM patients and, for comparison, 100 healthy persons matched for age and sex. This diabetic study was also based on six attributes. T2DM patients were enlisted from individuals belonging to the Diabetic Clinic at Shohada Hospital, Tehran, Iran. | Kazerouni et al. [15] |
| 11 | Hainan Medical University China datasets | The Department of Respiratory Disease gathered breath samples for the detection of diabetes. The dataset consists of approximately 210 measurements obtained from 41 participants (21 females and 20 males). A sample was collected from each participant each day. | Ye et al. [20] |
| 12 | University California Irvine gas sensor datasets | The dataset consists of 62 sensors at various concentrations. One concentration has a sample size of 20 × 7. The training set selected 44 samples for each concentration. | Zhu et al. [21] |
| 13 | OhioT1DM; ABC4D; ARISES | In this article, three datasets are used. 1) The first is the OhioT1DM dataset, which is publicly available and contains data on Type-1 diabetic patients. 2) The second is the ABC4D dataset, which consists of the data of 25 patients with Type-1 diabetes over a six-month clinical period. 3) The third is ARISES, which consists of data from 12 patients with Type-1 diabetes over 12 weeks. | Zhu et al. [22] |
| 14 | ECG signal data on 218 patients | This research data includes recordings from 86 diabetic persons, used for an automatic screening algorithm for diabetes mellitus. Selected features are then subjected to the Kruskal-Wallis test. | Gupta et al. [23] |
| 15 | RNA Sequencing Data | The RNA sequencing dataset has 1600 human pancreatic cells. In this dataset, 651 are listed as non-pancreatic, and 941 as TD1 diabetic pancreatic cells. | Li et al. [24] |
| 16 | Self-created database: research collection of data for Type-1 & Type-2 diabetes | It is survey-based research on diabetes. Researchers collected the data and named this project PROMs; it is based at a National Health Service (NHS) site in the United Kingdom (UK). Due to COVID-19 constraints, interviews are taken remotely (either via web chat or telephone) according to each person's preferred date, time, and medium. Data were collected from 1000 participants. | Carlton et al. [25] |
| 17 | Self-created dataset collection | This dataset consists of 276 persons in the age group 18 to 75 years. The clinical follow-up recorded whether the person is (continued on next page) | Aguilera et al. [26] |


Table 2
Datasets with associated download links and websites.

| Datasets | Websites & Links |
|---|---|
| Pima Indian Diabetes Dataset (PIDD) | Pima Indians Diabetes Database (kaggle.com) |
| Signals Data; Wavelength Data | A dataset of hemoglobin blood value and photoplethysmography signal for machine learning-based non-invasive hemoglobin measurement - ScienceDirect |
| Medical Information Mart for Intensive Care (MIMIC-III) | MIMIC-III, a freely accessible critical care database - PMC (nih.gov) |
| Bangladesh Institute of Health and Sciences (Diabetic Datasets) | Field Research and Practice Centre (FRPC) - Bangladesh University of Health Sciences (buhs.ac.bd) |
| OREBA; FIC; CLEMSON | 1) OREBA Dataset - Papers With Code; 2) The Food Intake Cycle (FIC) Dataset - Multimedia Understanding Group (auth.gr); 3) All Data Sets - Data Sets - Clemson University |
| FINDRISC; ELSA | 1) FINDRISC Diabetes Risk Calculator - QxMD; 2) Accessing ELSA data - ELSA (elsa-project.ac.uk) |
| PubMed Survey Database taken from the years 2012–2016 | PubMed (nih.gov) |
| Sri Pamela Hospital; Kumpulan Plane Hospital; Indonesia | Indonesian hospital database - Mendeley Data |
| Survey-based database from Kashmir Valley | Prevalence and Early Prediction of Diabetes Using Machine Learning in North Kashmir: A Case Study of District Bandipora - PMC (nih.gov) |
| Shohada Hospital Tehran | Prevalence, awareness, treatment and control of diabetes among Iranian population: results of four national cross-sectional STEPwise approach to surveillance surveys - PMC (nih.gov) |
| Hainan Medical University China | Hainan Medical University (hainmc.edu.cn) |
| University California Irvine gas sensor datasets | Gas Sensor Array Drift Dataset - UCI Machine Learning Repository |
| OhioT1DM; ABC4D; ARISES | 1) OhioT1DM Dataset; 2) Advanced Bolus Calculator for Diabetes (ABC4D) - Research groups - Imperial College London |
| ECG signal data on 218 patients | ECG signals (1000 fragments) - Mendeley Data |
| RNA Sequencing Data | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE81608 |
| National Health and Nutrition Dataset | National Health and Nutrition Examination Survey (kaggle.com) |
| Clinical data collection by OpASHA | OpASHA - Other papers (worldbank.org) |
| National Institute of Diabetes and Digestive and Kidney Diseases | Diabetes Statistics - NIDDK (nih.gov) |
| Sree Diabetes Diagnostic center Kurnool | SREE DIABETES SPECIALITY CENTER |
| PPDBA program dataset | [PDF] First Experiences with the Identification of People at Risk for Diabetes in Argentina using Machine Learning Techniques - Semantic Scholar |

Table 1 (continued)

| S. No. | Datasets | Dataset description with links | References/Published Year |
|---|---|---|---|
| 17 (continued) | Self-created dataset collection | diabetic or non-diabetic; it covers six months. | Aguilera et al. [26] |
| 18 | National Health and Nutrition Dataset | This dataset consists of 6561 people, of whom 657 are diabetic and 5904 are non-diabetic. 50% of the data is from males. This study used an open data survey based on a study of the US population. | Maniruzzaman et al. [28] |
| 19 | Clinical data collection by OpASHA | This clinical data collection took place between 2013 and 2015. Data were taken from 336 patients with diabetes, 86% of whom were diagnosed with a blood glucose level. | Gröschel et al. [31] |
| 20 | National Institute of Diabetes and Digestive and Kidney Diseases | Researchers took the diabetes dataset from the National Institute of Diabetes and Digestive and Kidney Diseases and tested patients for diabetes or non-diabetes. Of this data, 506 are training samples and 262 are testing samples, and a total of 9 attributes were used to perform the research. | Yasen et al. [35] |
| 21 | Sree Diabetes Diagnostic center Kurnool | The diabetic data was taken from the Sree diabetic center, Kurnool, in Andhra Pradesh. It was collected from September 2014 to January 2015 from the age group 18 to 77. The data has been measured in various attributes. The diabetes status is estimated on each class label. | Thipa Reddy et al. [36] |
| 22 | Self-created data collection from Jammu & Kashmir Valley | This research data was collected over two years, from 2019 to 2020, in the Jammu Kashmir valley from different departments, such as inpatient, outpatient, and emergency. The data was collected from both urban and rural regions. | Ganie et al. [38] |
24 Self-Created The experimental research Dremin et al. [40]
Small dataset on Skin data on the skin is from 20 used the glucose samples but also used the genes and image
Thickness patients. The data has the
classification.
skin information of 10 men
and ten women. The data
was taken from (11 to 54 6. State of art deep learning and machine learning approaches
years). The measurement in diabetes mellitus
of skin performed on the
dorsal surface of the foot.
Machine Learning and Deep Learning techniques always provide
better results when applied to healthcare data worldwide. Both learning
collected by respiratory sensor. Data was collected from various blood techniques are frequently used to solve real-life problems. These tech­
tissue levels and their wavelengths. Each piece of data provides infor­ niques are commonly used for diabetes detection and prediction pro­
mation according to the research and its experimental process. There are cedures. Table 3 shows the review and limitations of ML and DL
so many datasets that are used for diabetes detection and prediction. Classifiers in diabetes mellitus.
These datasets are available with links, as shown in Table 2. Some Finally, their commonly reviewed algorithm of ML and DL is used in
frequently and rarely used datasets are found in Table 1, which not only this paper, as shown in Fig. 4. Here are several widely used algorithms:
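As a concrete reference point for the classifiers and performance figures reviewed in this section, the following minimal sketch shows how one of them, KNN with Euclidean distance and majority voting, can be applied to labelled patient records and scored with the accuracy, sensitivity, and specificity measures reported in Table 3. It is an illustration only and is not taken from any of the reviewed studies: the function names (euclidean, knn_predict, binary_metrics), the two features (glucose and BMI), and every data value are hypothetical, and a real study would normally scale the features and use a far larger dataset.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall) and specificity from the confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical toy records: (glucose in mg/dL, BMI); 1 = diabetic, 0 = non-diabetic.
# These values are invented for illustration and come from no real dataset.
train_X = [(85, 22.0), (90, 24.5), (99, 23.1), (105, 27.0),
           (150, 31.2), (165, 33.5), (180, 35.0), (140, 29.8)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

test_X = [(95, 23.0), (170, 34.0), (110, 26.0), (145, 30.5), (118, 27.5)]
test_y = [0, 1, 0, 1, 1]

predictions = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
metrics = binary_metrics(test_y, predictions)

print(predictions)
print(metrics)
```

On this toy data the last query, (118, 27.5), is outvoted by two non-diabetic neighbours, so sensitivity falls below accuracy. That gap is one reason the reviewed studies are easier to compare when they report sensitivity and specificity alongside accuracy.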

N. Katiyar et al. e-Prime - Advances in Electrical Engineering, Electronics and Energy 9 (2024) 100661

Table 3
Critical Review of ML and DL Classifier in Diabetes Mellitus.
S. No. | References | Research Area | ML and DL Classifier | Performance
1 | Ahmed et al. [1] | Machine Learning | SVM; ANN | 94.8% Accuracy
2 | Shokrekhodaei et al. [2] | Machine Learning and Regression Neural Network | SVM; RNN | 99.78% Accuracy
3 | Sharma et al. [3] | Extreme Machine Learning Techniques; Amazon Cloud | ANN | 90.57% Accuracy; 72.23% Sensitivity; 75.35% Specificity
4 | Theis et al. [4] | Sensor Technology and Machine Learning | LSTM | 98% Accuracy
5 | Lekha et al. [5] | Machine Learning; Data Mining | CNN; PCA | 95% Accuracy
6 | Alkhodari et al. [6] | Machine Learning; Neuropathy | SVM; CNN | 90.51% Accuracy of SVM; 98.5% Accuracy of SVM
7 | Al Jlailate et al. [7] | Machine Learning; Deep Neural Network; Image Processing | SVM; PCA; KNN | 95% Accuracy
8 | Fazakis et al. [8] | Machine Learning; Artificial Intelligence; IoT | SVM | The AOC Curve value is 0.884%
9 | Siddiqui et al. [9] | Glucose Monitoring Sensors | SVM; ECG Signals | Pain-free monitoring of diabetic patients for a day.
10 | Thakkar et al. [10] | Fuzzy Logic Classification; Random Forest; Machine Learning | Fuzzy Neural Network; Random Forest | 96% Accuracy
11 | Fiarmi et al. [11] | Machine Learning; Data Mining; Clustering | Naive Bayes; C4.5 Classification; K-Means Clustering | 68% Accuracy from the retinopathy model.
12 | Suyanto et al. [12] | Machine Learning; Dimensionality Reduction | KNN; 5-Fold Cross Validation | 98.07% Accuracy
13 | Mamatha et al. [13] | Big Data Analytics; Machine Learning | Gaussian Naive Bayes; BIRCH Algorithm | The precision value from the model is 0.735. The recall value from the model is 0.79.
14 | Shuja et al. [14] | Deep Learning; Machine Learning; Oversampling Techniques | SVM; MLP (Multilayer Perceptron) | 94.70% Accuracy
15 | Kazerouni et al. [15] | Machine Learning | KNN; ANN; SVM | 95% Accuracy
16 | Khanam et al. [16] | Machine Learning; Regression | SVM; Logistic Regression; K-fold cross-validation; ANN | 86% Accuracy with ANN
17 | Chaki et al. [17] | Artificial Intelligence; Machine Learning; Diabetes Management Techniques | Reinforcement Learning; CNN | –
18 | Ghosh et al. [18] | Machine Learning; Cross Validation | Random Forest; SVM; Gradient Boosting | 99.35% Accuracy using SVM
19 | Tripathi et al. [19] | Machine Learning | KNN; SVM; Random Forest | 87.66% Accuracy
20 | Ye et al. [20] | Deep Learning; Machine Learning; Regression | SVM; Gradient Boosting; Logistic Regression | 90.04% Accuracy from E-Nose System
21 | Zhu et al. [21] | Machine Learning; Gas Sensor Array; Boosting Technique | SVM; XGB Boost; Gradient Boosting | 90.55% Accuracy
22 | Zhu et al. [22] | IoT; Edge Computing; Deep Learning | LSTM; RNN | The IoT-based model on glucose sensor data provides 77% accuracy
23 | Gupta et al. [23] | Machine Learning; Signal Processing | Decision Tree; ECG Signals Classification | 86.9% Accuracy
24 | Li et al. [24] | Machine Learning; RNA Sequencing | SVM; Monte Carlo Feature Selection | 94.2% Accuracy
25 | Carlton et al. [25] | Quantitative surveying; Qualitative interviews | Cognitive debriefing interviews; Refining the descriptive system | This psychological study on diabetic patients increased people interest accuracy up to 80%.
26 | Aguilera et al. [26] | Machine Learning; Application Study | Regression Learning; Bayesian Method; Thompson Sampling Methods | This study of learning methods was conducted on an m-health app, and more than 80% of people were interested in the study.
27 | Ayon et al. [27] | Deep Learning; Cross Validation Methods | Deep Neural Network; Ten-Fold Cross Validation; Five Cross Validation | In this paper the accuracy is 98.35%. F1 Score Value is 98. The Specificity value with tenfold cross validation is 98.80%
28 | Maniruzzaman et al. [28] | Machine Learning; Regression | Logistic Regression; Naive Bayes; Decision Tree; Random Forest | 90.62% Accuracy achieved with Naive Bayes. 90.62% Accuracy achieved with Random Forest Classifier is 94.5%.
29 | Aada et al. [29] | Machine Learning | Naive Bayes; Decision Tree; KNN | The accuracy on the datasets is 94.8%. The precision value is 73.28%. The recall value is 73.79%.
30 | Pranto et al. [30] | Machine Learning | Decision Tree; KNN; Random Forest; Naive Bayes | The accuracy on Kurmitola Hospital dataset is 81.2%. The accuracy on PIDD dataset is 77.9%.
31 | Gröschel et al. [31] | Tuberculosis; Data Mining techniques | Random Glucose Sampling | Person diagnoses the diabetes with accuracy of 86%.
32 | Nath et al. [32] | Artificial Pancreas System; Glucose Monitoring System | Hovorka’s Model; Dalla Man’s Model | –
33 | Khan et al. [33] | Type-2 Diabetes prediction on 5 types of cell | Regulation of Insulin on the Five types of Cells | –
34 | Vyas et al. [34] | Diabetes Risk Analysis | Swarm Algorithm; PSO Algorithm | –
35 | Yasen et al. [35] | Machine Learning; Deep learning; Data Analysis | ANN; Dragonfly Algorithm; Firefly Algorithm | The accuracy for diabetes is 79.77%. The accuracy for heart disease is 90%.
36 | Thippa Reddy et al. [36] | Machine Learning; Deep learning; Soft Computing | Firefly Algorithm; BAT Algorithm; Optimization Techniques | The accuracy obtained from the firefly algorithm is 70%.
37 | Olisah et al. [37] | Machine Learning; Deep learning; Data Handling | Random Forest; SVM; Deep Neural Network | 97.3% Accuracy achieved with these algorithms.
38 | Ganie et al. [38] | Machine Learning; Cross Validation Methods | SVM; Decision Tree; SMOTE; Logistic Regression | 99.4% Accuracy achieved with these algorithms.
39 | Yahyaoui et al. [39] | Machine Learning; Deep Learning | SVM; Random Forest | 65.38% Accuracy achieved with SVM algorithm. 83.67% Accuracy achieved with Random Forest.
40 | Dremin et al. [40] | Machine Learning; Image Processing | Fuzzy Neural Network; Random Forest | The polarization Index of Diabetic person is 95%
41 | Wee et al. [52] | Deep Learning; Machine Learning | CNN; PCA; LDA; DNN | The Deep learning model provides the accuracy of 89%.
42 | Al-Absi et al. [53] | Deep Learning; IoT | Transfer learning; VGG-11 deep learning network | This model provides impressive accuracy of 93%.
43 | Sarmun et al. [54] | Deep Learning | Weighted bounding box fusion (WBF); non-maximum suppression (NMS); Soft-NMS | This model provides the mean average precision (mAP) score of 86.4%
44 | Khan et al. [55] | Deep Learning | Sequential Feature Selection (SFS) and Mutual Information (MI) feature selection | The random forest achieved an accuracy of 99.36%.

1) SVM- The support vector machine (SVM) classifies data by mapping it into a high-dimensional feature space and separating it with a hyperplane. The hyperplane divides the samples into different classes. The decision boundary is created using kernel functions such as linear, polynomial, and sigmoid. Mamatha et al. [13] emphasised that the kernel function is an essential aspect of SVM: it determines the decision boundary that distinguishes the various classes. The SVM model is simple to apply to diabetes detection; however, the possibility of overfitting is also increased. The model can accurately classify diabetic data using training, validation, and test sets without adding unnecessary complexity to the prevention and detection of diabetes.
2) KNN- K-nearest neighbours (KNN) is an algorithm that categorises unlabelled data points according to their similarity to the data points in their close neighbourhood. When the data points are labelled, calculating the distances between them poses no difficulty; the problem arises when KNN is given unlabelled data [7]. In KNN, the distance between two data points is measured, and a voting rule is applied to classify the data. Two standard rules are used to categorise the unlabelled data points:
a) The majority voting rule.
b) The inverse distance weighting rule. The distance between the two points is measured using the Euclidean distance formula.

The class with the minimum distance value gets the vote, and the vote acts as the label assigned to the unlabelled data point.

3) Decision Tree Classifier- Maniruzzaman et al. [28] noted that a decision tree has the same structure as a tree. Just as a tree has roots, nodes, branches, and leaves, a decision tree has decision nodes, leaf nodes, and branches. The leaf (terminal) nodes give the class label for the final prediction, and each split divides the dataset into subsets of decreasing entropy. The tree starts from a root node and keeps splitting nodes until it reaches the leaves. The behaviour of the decision tree is governed by three parameters:
a) The maximum depth of the tree, denoted max_depth,
b) The minimum number of samples required for a split, denoted min_samples_split,
c) The maximum number of leaves in the tree, denoted max_leaf_nodes.
4) Principal Component Analysis- Sharma et al. [3] emphasised that principal component analysis (PCA) is an essential technique for dimensionality reduction. It takes the feature set and attribute set as input for dimensionality reduction. This reduction in dimensionality helps simplify the data and often leads to improved model performance


and computational efficiency. In the context of diabetes, PCA can be employed in several ways:
1. Feature reduction- Diabetes detection and prediction are based on features (e.g., glucose, insulin, blood pressure, BMI, etc.). PCA can decrease the feature count by merging similar features into a smaller set of uncorrelated principal components. This reduces multicollinearity issues and supports model interpretability.
2. Data Visualization- With PCA, high-dimensional diabetes data can be visualized in 2D or 3D by plotting the data points along the principal components.
3. Pre-processing- PCA serves as a data pre-processing technique; it is a step applied before feeding the data into machine learning models.

It performs the dimensionality reduction procedure to reduce the risk of overfitting.

Fig. 4. Categorization of the machine learning and deep learning algorithms analysed in this manuscript.

5) Deep Neural Network- The deep neural network (DNN) relies on unsupervised learning and is used in the prediction and detection of many diseases and in many health applications. It is used for measuring blood sugar levels and for the early detection of diabetes, which is essential for efficient management and the prevention of complications. Deep neural networks are used in several steps:
1. Data Collection- The first step is to assemble informative data on patients. The dataset consists of blood glucose levels, age, body mass index, blood pressure, and other health-relevant indicators. It is used to detect whether a person is diabetic or not.
2. Network Architecture Model- The DNN architecture is complex. It may build on other neural architectures, such as feed-forward neural networks, convolutional neural networks (CNNs) for image-based features, or recurrent neural networks (RNNs) for sequential data such as blood glucose readings.
3. Training of Model- The diabetic dataset is divided into a training set and a validation set.

The DNN model is trained on the training set using an optimization algorithm such as gradient descent to minimize the loss function on the diabetic dataset, as mentioned by Theis et al. [4].
6) Random Forest- Random Forest is an ensemble built from decision trees of the CART (Classification and Regression Tree) form; it is a collection of many decision trees. Within each tree, every sub-tree is connected to another node of the tree. Alkhodari et al. [6] describe the concepts of bagging and boosting as commonly used, and researchers are interested in using random forest and boosting techniques for diabetes prediction. Both Random Forest and Boosting are popular ensemble machine learning methods for classification tasks such as predicting diabetes. When a boosting technique is combined with a random forest, the process starts by building a baseline random forest model. The predictions of a random forest depend on settings such as the number of trees, the maximum depth, and the minimum samples per leaf.
7) Multilayer Perceptron- The multilayer perceptron (MLP) is loosely modelled on the way the human brain handles and responds to a situation. It is an advanced form of artificial neural network (ANN) that features several interconnected layers of nodes, also known as neurons. It usually includes an input layer, several hidden layers, and an output layer. Each neuron receives inputs from the previous layer, calculates a weighted sum of these inputs, adds a bias term, and then processes the result using an activation function. The architecture of an MLP is dense: it consists of several hidden layers that extract complex features from the data, as described by Shuja et al. [14].
8) Convolutional Neural Network- The convolutional neural network (CNN) is a category of deep learning model that works similarly to a feed-forward neural network. In a CNN, the data are analysed through translational and rotational methods. The convolutional layer applies the convolution operation to the input data by passing filters over it. The network trains itself automatically based on features and patterns [6]. This algorithm is specially designed to process and analyse visual data, such as images and videos.
9) Artificial Neural Networks- ANNs are loosely modelled on the way humans respond to a situation. They process data using computational techniques, and a trained ANN can achieve high performance on the problem it is applied to. The ANN algorithm maps the input data to the appropriate output classes. According to Kazerouni et al. [15], the ANN consists of an input layer, a hidden layer, and an


output layer. Its architecture is a neural network with many artificial neurons, termed units, arranged in a sequence of layers.
10) Recurrent Neural Network- The recurrent neural network (RNN) works on sequential data. In an RNN, the processing of sequential data depends on the time series, whose time steps run from 1 to π. The RNN is called recurrent because it executes the same task for every element of the sequence. Zhu et al. [22] worked on time-dependent computations, in which the output depends on earlier computations. Another way of thinking of an RNN is that it captures the information that has been calculated so far. The RNN algorithm has four stages: the input stage, the hidden stage, the weight, and the output stage.

7. Discussion

In this manuscript, machine learning and deep learning techniques are reviewed. The paper reviews articles on supervised, semi-supervised, reinforcement, unsupervised, and deep learning by analysing their results, techniques, and methods. The main motive is to identify, from a qualitative viewpoint, the classification methods that substantially facilitate future research and to increase awareness of advances in diabetes detection and prediction. In this review, a set of 55 articles is reviewed. Research in machine learning and deep learning can support automatic diabetes detection platforms.
This review encompasses a wide range of topics, including the exploration of various databases and their unique methods, classification techniques, and the diagnosis of diabetes. We also delve into the performance metrics of the models, which serve as indicators of model efficiency: accuracy, specificity, F1-score, and recall. Additionally, we discuss empirical research, which, despite its time-consuming nature, has proven to be the most effective in diabetes prediction and detection.
Ayon et al. [27] performed research on diabetes detection and prediction using breath. A gas sensor is used for breath detection, with acetone as the breath marker. A multivariate relevance vector machine was trained on the gas sensor database. In modelling, the temperature, humidity, and gas sensors perform the analysis with the CGS-8 intelligent gas sensing and analysis system. Gupta et al. [23] worked on diabetes detection using ECG signals. The ECG experiment was performed on a total of 86 persons to test whether each person had diabetes; 35 of the 86 were found to have diabetes. The model comprises ECG recording, intrinsic time-scale decomposition (ITD), features, and classification of the features. Li et al. [24] performed research on human genes for the detection of Type-2 diabetes. The data comprise 1600 single cells, 949 from Type-2 patients and 651 from non-diabetic subjects. The selected features were classified with KNN, SVM, and RF. Applying these algorithms yielded optimized Type-2 diabetes-associated features and optimized classification techniques.
Carlton et al. [25] used three methods to perform their empirical research. Qualitative interviews inform the draft PROM content. Cognitive debriefing on the interview data refines the draft PROM content. A classification system then generates a smaller number of items from the PROM. Aguilera et al. [26] studied diabetic patients to determine whether they were depressed. The study was performed on 276 adults between 18 and 75 years old. Depressive symptoms were measured on a scale of 8, and the intervention was checked within six months, followed by a follow-up procedure. The PIDD dataset of diabetic patients was analysed using a deep neural network, with classification performed using K-fold cross-validation [27]. Another diabetes diagnostic study employed the same PIDD database. The bootstrapping method was used to correct the dataset. The model comprised dataset selection, data pre-processing, feature extraction through PCA, and the application of the resample channel to train the model with classifiers such as KNN, Naive Bayes, and decision trees. The model was prepared to forecast the prediction result [29].
Research based on female patients in Bangladesh has also been used for diabetes detection. Pranto et al. [30] took two datasets, PIDD and the Kurmitola General Hospital dataset, and used them in a comparative study of various machine learning algorithms. Eight samples were used for feature extraction with four machine learning algorithms: KNN, Decision Tree, Random Forest, and Naive Bayes. Another study considered patients with diabetes and tuberculosis residing in the urban slums of India. If a patient with tuberculosis does not receive proper treatment, the person becomes a diabetic patient. The continuous supply of glucose makes the person diabetic, and the risk increases once the person becomes diabetic. Random glucose sampling of a person, together with HbA1c features, confirms the identification of undetected diabetes cases [31].
The research of Khan et al. [33] focuses on insulin management, explicitly examining the regulation and modification of insulin across four to five cell types, named alpha, beta, delta, pp (f cells), and epsilon. These cells were analysed under various parameters, assessing their impact on insulin secretion, glucose regulation, and somatostatin secretion. Among these, the beta cell exhibited superior performance. Vyas et al. [34] explored various diseases using datasets covering digestive and kidney diseases, digestive disorders, and diabetes. Many algorithms are used for diabetes detection. One paper used the ANN, the Dragonfly algorithm, and artificial bee algorithms, and implemented a new algorithm named ANN-DA. The ANN-DA algorithm provides better results than the ANN algorithm. It predicts diseases and outcomes such as hepatitis, diabetes, breast cancer, blood donations, and heart disease [35]. Thippa Reddy et al. [36] also focused on the classification of diabetes data. They created a new algorithm for diabetic classification, named the firefly-BAT algorithm (FBAT), which is a rule-based, fuzzy logic-based prediction algorithm. The fuzzy logic-based system was designed with the help of fuzzy rules. The process consists of a rule-based system for diabetes and decision, with optimal rule generation via FF-BAT. Olisah et al. [37] performed research on diabetes detection using deep learning techniques with the PIDD dataset. The prediction process is divided into two stages. In the first stage, data filtration and cleaning, they take the PIDD database, remove the missing values, apply feature importance and feature selection procedures, and carry out missing value imputation. In the second stage, the data are used for training and testing, and the basic model is analysed to give the predicted results. Ganie et al. [38] studied the prediction of diabetes in a lavish lifestyle through survey-based research. The sample data were collected from the population of Jammu & Kashmir by distributing a survey form across J&K, and responses were collected from 1939 people. The dataset was named the T1IDM Lifestyle dataset. Data processing and data splitting procedures were performed on it, and K-fold cross-validation was applied to the T1IDM dataset [38].
Yahyaoui et al. [39] performed diabetes detection using machine learning techniques by taking a dataset and adding some original features. Bootstrapping methods are applied for data filtration, and a majority voting process is used to obtain results. Input data are processed using convolutional max pooling layers and feature mapping on dense layers, with Random Forest being used to obtain the results.
Khan et al. [45] introduced a multi-view clustering approach that leverages concept factorization to enhance clustering performance. The method surpassed existing techniques, showing its efficiency in handling multi-view data. Khan et al. [46] developed another framework, DECCA, based on Contractive Autoencoders. It demonstrated superior results in clustering high-dimensional document data, even on real-world datasets.
Debelee et al. [47] identified commonly available datasets for skin lesion analysis, exploring various machine-learning


techniques used for skin disease detection. The study examined the contributions and limitations of current state-of-the-art methods. Khan et al. [48] introduced a new multi-view clustering algorithm called MCNMF that utilizes manifold regularization to significantly improve clustering results compared to other algorithms on various real-world datasets. The study in [49] demonstrates the importance of accurate diabetes detection by proposing a method based on machine learning techniques, testing it on a diabetes dataset, achieving high performance compared to previous methods, and emphasizing the significance of early detection for effective treatment. Rucci et al. [50] focused on the challenges in detecting Type 2 Diabetes and Prediabetes; they present the development and evaluation of predictive models for identifying at-risk individuals in Argentina and emphasize the importance of early detection in individuals unaware of their condition. Global and local learning models of diabetes detection have also been designed. Rufo et al. [51] developed a model by integrating diverse machine learning techniques such as XGBoost and Naive Bayes for global learning, while the local learning model consists of KNN, SVM, and RBF. They apply this model to the Pima Indian Diabetes Dataset, obtaining an accuracy of 99.5%. Overall, the paper offers an extensive description of the diabetes datasets used with deep learning and machine learning techniques for the identification of diabetes. Table 4 summarises the related works, highlighting both their contributions and limitations.

8. Conclusion

Recognizing the pivotal role of ML techniques in the advancement of the healthcare sector, particularly in disease prediction, the manuscript investigates the use of various ML and DL approaches for the early detection of diabetes. The analysis revolves around five key factors: the datasets used for Diabetes Mellitus, machine learning-based identification methods, performance measures, limitations in current diabetes detection research, and the overall status of diabetic research.
The objective of this manuscript is to provide a detailed study of DM prediction, diabetes management procedures, and preventive techniques. This information is designed to serve as a practical resource for researchers in the field. A systematic approach and the inclusion of critical factors contribute to a more nuanced understanding of the subject matter. As the scientific community continues to engage in comprehensive review, automated DM detection and self-management emerge as vital avenues for further exploration and development in diabetes research. The insights presented herein lay the foundation for ongoing work to address the multifaceted challenge posed by Diabetes Mellitus on a global scale. Given the rapid advancement of Artificial Intelligence and the Internet of Things, we would like to further integrate these technologies into data processing for the early prediction of Diabetes Mellitus.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Neha Katiyar: Writing – review & editing, Writing – original draft, Visualization, Methodology, Conceptualization. Hardeo Kumar Thakur: Supervision, Methodology. Anindya Ghatak: Supervision, Conceptualization.

Table 4
Summary of work with their contributions and limitations.
S. No | Authors Details | Contribution | Limitations
1 | Ahmed et al. [1] | Their groundbreaking work is centred on elevating the precision and dependability of diabetes prediction through the fusion of multiple machine-learning models. The authors innovatively crafted a unified machine learning framework that amalgamates the merits of diverse predictive algorithms, thereby enhancing the prediction outcomes for diabetes. | Fuzzy logic techniques are ingeniously employed to blend two to three ML algorithms.
2 | Shokrekhodaei et al. [2] | They conducted comprehensive experiments to validate the performance of their system. They provided a detailed analysis of the system’s accuracy in glucose level detection, comparing it with traditional invasive methods and demonstrating its efficacy. | The VIS-NIR sensor size must be minimized so that it becomes a wearable sensor.
3 | Sharma et al. [3] | They proposed a novel service composition model within the cloud environment that integrates various healthcare services. This model allows for the efficient coordination of medical resources and services, optimizing the healthcare delivery process for diabetes patients and improving the overall efficiency and effectiveness of medical care provision. | The sensitivity and relativity values and the accuracy are low.
4 | Theis et al. [4] | They critically analyzed the integration of machine learning techniques with E-Nose technology, discussing how these combined approaches can increase the accuracy and reliability of non-invasive diabetes diagnosis. The review also explored future directions in machine learning algorithms for improving diagnostic performance and personalized healthcare solutions in diabetes management. | The approach they used for the longitudinal patient records is not applicable to the minimal time patients stayed in the hospital.
5 | Lekha et al. [5] | They developed an advanced prediction model that combines process mining with deep learning techniques to enhance the in-hospital mortality prediction for diabetes patients in the intensive care unit (ICU). | The MOSFET sensor for breath detection is not wearable. Use for clinical purposes is not possible.

12
Table 4 (continued)

6. Alkhodari et al. [6]
Contribution: They developed a machine learning-based method to screen for cardiovascular autonomic neuropathy (CAN) in diabetic patients, particularly those with microvascular complications. Their method uses 24-hour heart rate variability (HRV) data to diagnose complications of diabetes.
Limitations: The approach requires careful consideration in clinical implementation.

7. Al Jlailaty et al. [7]
Contribution: They contributed to the practical relevance of machine learning techniques for analysing and converting data from inertial sensors worn by individuals. Their work not only improved the accuracy and efficiency of gesture detection but also integrated technologies in everyday health monitoring devices.
Limitations: None.

8. Fazakis et al. [8]
Contribution: They created intelligent machine learning tools for predicting the long-term risk of developing type 2 diabetes. Their models can accurately assess an individual's risk, enabling early intervention and preventive healthcare measures.
Limitations: The AUC value is low.

9. Siddiqui et al. [9]
Contribution: They extensively reviewed non-invasive or painless blood glucose monitoring techniques from 2012 to 2016, examined the progress in the diabetes prediction area, and facilitated the connection between information technology and healthcare.
Limitations: If there were fewer features, the accuracy rate could be higher. It uses only five features.

10. Thakkar et al. [10]
Contribution: The study incorporates understanding and using advanced analytical techniques to enhance the accuracy and reliability of diabetes prognosis. By dissecting and analysing these techniques, more sophisticated and effective tools for diabetes management and research are developed.
Limitations: Precision and recall values are not calculated. Feature description is absent in the paper.

11. Fiarni et al. [11]
Contribution: They comprehensively analysed diabetes data to interpret the patterns and correlations related to diabetes complications. They applied data mining techniques and made effective use of clinical data to improve the prognosis and treatment of diabetes diseases.
Limitations: The model used in the paper provides low accuracy.

12. Suyanto et al. [12]
Contribution: They use medical datasets to predict diabetes. This work provides significant results over traditional methods, making it a more reliable and effective tool for early diabetes diagnosis and prevention.
Limitations: In this paper, splitting the dataset into clusters is not possible.

13. Mamatha et al. [13]
Contribution: The authors worked on big data and applied data mining in healthcare, specifically focusing on the detection of diabetes. They contributed to the potential of big data analytics in enhancing the understanding, analysis, and prediction of diabetes; they were more focused on personalized healthcare solutions.
Limitations: The paper treats 768 samples as big data, but a dataset of this size is not considered big data.

14. Shuja et al. [14]
Contribution: They implemented the Synthetic Minority Over-sampling Technique (SMOTE) to address data imbalance, a common challenge in medical data analysis. This technique significantly improved the predictive performance of their model with the data mining process.
Limitations: The model they provide for DL and ML techniques needs to be clearly defined.

15. Kazerouni et al. [15]
Contribution: They explored the potential of long non-coding RNA (lncRNA) expression for predicting Type 2 diabetes mellitus (T2DM) and detected diabetes on an RNA molecular basis.
Limitations: The model used in the paper needs to be more clearly defined.

16. Khanam et al. [16]
Contribution: They performed detailed comparative analyses of different machine learning algorithms to assess their efficiency and accuracy in predicting diabetes. They provide valuable insight for better prediction and management of diabetes.
Limitations: The accuracy is below 77% in the train/test split phase.

17. Chaki et al. [17]
Contribution: They performed a systematic review to evaluate the effectiveness of machine learning (ML) and artificial intelligence (AI) techniques in the detection and self-management of diabetes mellitus and highlighted the research gaps in this field.
Limitations: Too few papers are covered in the review.

18. Ghosh et al. [18]
Contribution: They contributed by performing research on diabetes management tools.
Limitations: Data pre-processing techniques are used for better results.

19. Tripathi et al. [19]
Contribution: They conducted an in-depth analysis of various machine learning algorithms to determine the most effective approach for predicting early diabetes. Their research contributes to the optimization of algorithms for healthcare applications, improving the predictive accuracy for early intervention in diabetes.
Limitations: The accuracy must be improved.

20. Ye et al. [20]
Contribution: They designed the model of an e-nose system that could achieve high levels of accuracy and precision in predicting blood glucose levels. This advancement not only proves the system's effectiveness in real-world applications but also paves the way for further research and development in non-invasive glucose monitoring technologies.
Limitations: The experiment was performed on only 41 people; a larger cohort would be needed for better results.

21. Zhu et al. [21]
Contribution: Their work showcases the effective use of a gas sensor array in identifying the breath markers of diabetic patients. This approach provides a non-invasive, real-time method for diabetes detection, contributing to easier and more patient-friendly diabetes monitoring and management.
Limitations: The method works with ethanol gases. Its clinical implementation requires some precautions.

22. Zhu et al. [22]
Contribution: They innovatively combined the Internet of Medical Things (IoMT) with deep learning algorithms to develop a real-time blood glucose prediction system. This system represents a significant advancement in the remote and continuous monitoring of glucose levels, enhancing diabetes management through timely and accurate data analysis.
Limitations: The accuracy rate is low. Few methods and techniques are applied to the datasets.

23. Gupta et al. [23]
Contribution: Their research contributes to improving the accuracy and reliability of diabetes screening. By leveraging ECG signals commonly collected in clinical settings, the framework enhances the capability to identify at-risk individuals, thus supporting timely and appropriate medical intervention.
Limitations: The accuracy must be improved.

24. Li et al. [24]
Contribution: They used mixed single-cell sequencing data to identify biomarkers for type 2 diabetes. This innovative approach allows for a more detailed and nuanced understanding of the cellular and molecular mechanisms involved in the disease, leading to the discovery of specific biomarkers that can be used for diagnosis, treatment, and monitoring of type 2 diabetes.
Limitations: The number of cells in a group needs to be balanced. It looks like a raw data set.

25. Carlton et al. [25]
Contribution: They elaborated a new preference-based model to quantify the impact of hypoglycaemia on the quality of life of people with diabetes. This addresses a significant gap in diabetic care and measures how hypoglycaemia affects patients' lives.
Limitations: Full psychometric testing is required for this model.

26. Aguilera et al. [26]
Contribution: They created a detailed clinical trial protocol to rigorously test the DIAMANTE app's effectiveness. This contribution is vital for providing empirical evidence on how machine learning and mHealth solutions improve health outcomes, specifically by increasing physical activity in patients with diabetes and depression.
Limitations: The research takes a long time, approximately six months.

27. Ayon et al. [27]
Contribution: The comparison of their deep learning model's performance with traditional machine learning algorithms provides beneficial insights into the advantages and disadvantages of deep learning in the context of medical prediction. This comparative study helps to highlight the features of deep learning in enhancing predictive analytics models for diabetes.
Limitations: The present class of the dataset contains only 259 samples, which is few.

28. Maniruzzaman et al. [28]
Contribution: The study evaluated various machine learning techniques and their performance in diabetes detection. By assessing these models on various metrics, the most effective model for diabetes classification and prediction was determined, thereby guiding future research and clinical practice in this area.
Limitations: The model is large because it uses many algorithms.

29. Aada et al. [29]
Contribution: They provided a comprehensive analysis and comparison of the performance of these machine learning techniques in predicting diabetes.
Limitations: The accuracy obtained with the decision tree is low.

30. Pranto et al. [30]
Contribution: They contributed a thorough comparative analysis of various machine learning algorithms that accurately predicted diabetes. This analysis not only highlights the most suitable predictive models for diabetes prediction but also contributes to the broader field of medical informatics. Their research demonstrates that regional and demographic factors can influence the selection and performance of machine learning methods in healthcare applications.
Limitations: The ANN algorithm is used for this extension of the research work.

31. Gröschel et al. [31]
Contribution: They investigated the efficacy of random glucose sampling for screening diabetes in disadvantaged populations, specifically among tuberculosis patients in urban slums in India. This approach addresses the need for accessible and practical diabetes screening methods with limited resources.
Limitations: Accuracy must be improved.

32. Nath et al. [32]
Contribution: They contributed a detailed review of different physiological models designed for understanding and simulating Type 1 Diabetes Mellitus (T1DM). This contribution is valuable as it consolidates knowledge on modeling approaches that can replicate the physiological processes of T1DM, aiding in the research and development of treatment methods.
Limitations: A noise cancellation process is required for diabetic people.

33. Khan et al. [33]
Contribution: The article also explores the therapeutic potential of these islet peptides in treating type 2 diabetes. Because these peptides can be targeted or mimicked in therapeutic interventions, they contributed to the vast field of diabetes research, which is developing novel treatment strategies that could enhance beta cell function and improve glycaemic control in type 2 diabetes patients.
Limitations: This paper requires a model for separating cells into different types, such as alpha, beta, and delta cells.

34. Vyas et al. [34]
Contribution: The article critically evaluates the effectiveness, strengths, and limitations of different predictive analysis techniques in the context of diabetes risk assessment. This evaluation is crucial for understanding how these methods can be optimally applied and further developed to enhance the accuracy and reliability of diabetes risk predictions, ultimately contributing to better preventive healthcare strategies.
Limitations: Few papers are covered in the review.

35. Yasen et al. [35]
Contribution: They created an optimized neural network model applied to medical prediction tasks, demonstrating its potential in accurately diagnosing or predicting medical conditions. This contribution is significant as it provides a practical example of how advanced optimization algorithms are used in neural networks in the healthcare domain.
Limitations: Optimization techniques may be used along with this algorithm.

36. Thippa Reddy et al. [36]
Contribution: The use of the Firefly Algorithm to optimize the rule base of the fuzzy logic classifier represents an innovative approach in the field. This method not only improves the classifier's performance but also contributes to a broader understanding of how bio-inspired algorithms can be effectively applied in optimizing complex systems, particularly in the healthcare domain.
Limitations: The accuracy could be better.

37. Olisah et al. [37]
Contribution: They highlighted the critical role of data pre-processing in the machine learning pipeline for diabetes prediction and diagnosis. They explored various pre-processing techniques to enhance the quality and reliability of the data before it is used in machine learning models, showing how proper data preparation can significantly impact the accuracy of the predictive outcomes.
Limitations: Only adult data are used in this implementation. Data on children and older people must be added.

38. Ganie et al. [38]
Contribution: Their research emphasizes the significance of lifestyle indicators in predicting type-II diabetes mellitus, highlighting how factors such as diet, physical activity, and body weight are critical in developing and managing the disease. The study contributes to understanding how lifestyle data can be effectively utilized in predictive models to aid in early diagnosis and preventive healthcare strategies.
Limitations: The voting techniques they used cannot provide accurate or valid results.

39. Yahyaoui et al. [39]
Contribution: They contributed to enhancing predictive analytics in the healthcare sector by demonstrating how advanced data processing and analysis techniques can be employed to predict diabetes more effectively. Using sophisticated algorithms in their decision support system (DSS) represents a significant step forward in developing intelligent health monitoring and management tools.
Limitations: The amount of data on diabetic persons is small.

40. Dremin et al. [40]
Contribution: This contribution is significant in showcasing how machine learning can enhance the analysis and interpretation of complex imaging data, leading to more accurate and efficient diagnosis of skin-related diabetes complications.
Limitations: The data were collected from a small number of patients.

41. Rufo et al. [44]
Contribution: They developed an optimal and accurate diabetes diagnosis model based on machine learning algorithms.
Limitations: Lack of enough data on recommended indicators such as OGTT and HbA1c because of their cost. The inclusion of invasive diabetes indicators may limit the application of self-testing. Generalization capability is limited because only Ethiopia serves as a proxy for ethnicity.

42. Khan et al. [45]
Contribution: They introduced a unique multi-view clustering technique to address the challenges in clustering multi-view data, and created a concept factorization method for consensus manifold regularization models to effectively factorize the original data matrix and capture the shared geometrical structure from multi-view data.
Limitations: The suggested method leverages the manifold structure to preserve a uniform geometrical representation across various data spaces effectively.

43. Khan et al. [46]
Contribution: They propose advanced techniques to avoid uninteresting clusters by computing an accurate term in the loss function that penalizes the cluster.
Limitations: Future studies ought to evaluate more meticulously the impact of recurrent neural networks on document clustering. Integrating recurrent neural networks with CtAE could lead to enhanced clustering accuracy. No identified financial conflicts or personal relationships might have affected the findings presented in this paper.

44. Debelee et al. [47]
Contribution: They summarized the open challenges in skin disease and cancer detection and classification.
Limitations: Lack of detailed comparison with other methods. Sole reliance on a single dataset. Absence of information on the computational resources required.

45. Khan et al. [48]
Contribution: They contributed a new multi-view clustering method based on non-negative matrix factorization (NMF) and manifold regularization, introduced a new function that integrates NMF and manifold regularization, and focused on multi-view data's complementary behavior to conserve the data space's local geometrical shape.
Limitations: Non-convex optimization problem. Difficulty in calculating the global minimum due to the non-convex nature of the objective function. Missing data are handled by filling in average feature values.

46. Haq et al. [49]
Contribution: Contributions include developing a diagnosis system for accurate diabetes detection in the e-healthcare environment, enhancing prediction accuracy, proposing new approaches for effective diabetes detection, and evaluating classification performance based on different feature sets and cross-validation methods.
Limitations: The study's limitations are not explicitly stated in the paper. The paper focuses on the proposed method for detecting diabetes using ML techniques and highlights its achievements and potential future applications.

47. Rucci et al. [50]
Contribution: The study objectives are to develop and assess predictive models to identify people at risk for Type 2 Diabetes (T2D) and Prediabetes (PD), specifically in Argentina, present the background work, describe the database processing, analyse the results obtained, and present conclusions and possible future work.
Limitations: Large datasets are required to enhance quality and representativeness.

48. Wee et al. [52]
Contribution: The study objectives are to investigate and discuss the usability of ML and DL approaches in diabetes identification and classification. They highlight future research on feature selection techniques widely used to improve the results of diabetes identification.
Limitations: Flaws and disadvantages in existing datasets, such as inaccurate data or few recorded samples. Fluctuation in values of different recorded classes affecting machine learning models. Lack of a standard procedure for collecting non-invasive measurements for diabetes classification.

49. Al-Absi et al. [53]
Contribution: The study objectives include developing an advanced predictive model for diabetes diagnosis using retinal images, addressing the limitations of current diagnostic methods, utilizing a massive dataset from Qatar Biobank and Hamad Medical Corporation to enhance accuracy, and achieving high precision, sensitivity, and specificity in distinguishing diabetic patients from the control group.
Limitations: Absence of comparison with human-level intelligence in diabetes screening. Lack of public availability of datasets.

50. Sarmun et al. [54]
Contribution: Accurate and early detection of diabetic foot ulcers (DFU) through image analysis can prevent amputations and fatalities. The Weighted Bounding Box Fusion (WBF) approach, combining YOLOv8m and FRCNN-ResNet101, significantly outperformed previous benchmarks, achieving a mAP score of 86.4% on the DFUC2020 dataset.
Limitations: Lack of diversity in non-DFU terms in the dataset. Predominance of white individuals in the data.

51. Khan et al. [55]
Contribution: The study achieved remarkable accuracy rates of 99.35% with ANN on the PIMA dataset and 99.36% with RF on the early diabetes risk dataset. The features selected using Sequential Feature Selection (SFS) significantly improved the prediction performance of machine learning models.
Limitations: The performance of the trained machine learning models varies with the input attributes and their relationship with the target class label. The results of some methods decreased due to the reduction in the number of features.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

References

[1] U. Ahmed, G.F. Issa, M.A. Khan, S. Aftab, M.F. Khan, R.A. Said, T.M. Ghazal, M. Ahmad, Prediction of diabetes empowered with fused machine learning, IEEE Access 10 (2022) 8529–8538.
[2] M. Shokrekhodaei, D.P. Cistola, R.C. Roberts, S. Quinones, Non-invasive glucose monitoring using optical sensor and machine learning techniques for diabetes applications, IEEE Access 9 (2021) 73029–73045.
[3] S.K. Sharma, A.T. Zamani, A. Abdelsalam, D. Muduli, A.A. Alabrah, N. Parveen, S.M. Alanazi, A diabetes monitoring system and health-medical service composition model in cloud environment, IEEE Access 11 (2023) 32804–32819.
[4] J. Theis, W.L. Galanter, A.D. Boyd, H. Darabi, Improving the in-hospital mortality prediction of diabetes ICU patients using a process mining/deep learning architecture, IEEE J. Biomed. Health Inform. 26 (1) (2021) 388–399.
[5] S. Lekha, M. Suchetha, Recent advancements and future prospects on e-nose sensors technology and machine learning approaches for non-invasive diabetes diagnosis: a review, IEEE Rev. Biomed. Eng. 14 (2020) 127–138.
[6] M. Alkhodari, M. Rashid, M.A. Mukit, K.I. Ahmed, R. Mostafa, S. Parveen, A.H. Khandoker, Screening cardiovascular autonomic neuropathy in diabetic patients with microvascular complications using machine learning: a 24-hour heart rate variability study, IEEE Access 9 (2021) 119171–119187.
[7] H. Al Jlailaty, A. Celik, M.M. Mansour, A.M. Eltawil, Machine learning-based unobtrusive intake gesture detection via wearable inertial sensors, IEEE Trans. Biomed. Eng. 70 (4) (2022) 1389–1400.
[8] N. Fazakis, O. Kocsis, E. Dritsas, S. Alexiou, N. Fakotakis, K. Moustakas, Machine learning tools for long-term type 2 diabetes risk prediction, IEEE Access 9 (2021) 103737–103757, https://doi.org/10.1109/ACCESS.2021.3098691.
[9] S.A. Siddiqui, Y. Zhang, J. Lloret, H. Song, Z. Obradovic, Pain-free blood glucose monitoring using wearable sensors: recent advancements and future prospects, IEEE Rev. Biomed. Eng. 11 (2018) 21–35, https://doi.org/10.1109/RBME.2018.2822301.
[10] H. Thakkar, V. Shah, H. Yagnik, M. Shah, Comparative anatomization of data mining and fuzzy logic techniques used in diabetes prognosis, Clin. eHealth (2020), https://doi.org/10.1016/j.ceh.2020.11.
[11] C. Fiarni, E.M. Sipayung, S. Maemunah, Analysis and prediction of diabetes complication disease using data mining algorithm, Procedia Comput. Sci. 161 (2019) 449–457.
[12] S. Suyanto, S. Meliana, T. Wahyuningrum, S. Khomsah, A new nearest neighbor-based framework for diabetes detection, Expert Syst. Appl. 199 (November 2021) (2022) 116857, https://doi.org/10.1016/j.eswa.2022.116857.
[13] B.G. Mamatha Bai, B.M. Nalini, J. Majumdar, Analysis and detection of diabetes using data mining techniques: a big data application in health care, in: Emerging Research in Computing, Information, Communication and Applications: ERCICA 2018, Volume 1, Springer, Singapore, 2019, pp. 443–455.
[14] M. Shuja, S. Mittal, M. Zaman, Effective prediction of type II diabetes mellitus using data mining classifiers and SMOTE, in: Advances in Computing and Intelligent Systems: Proceedings of ICACM 2019, Springer, Singapore, 2020, pp. 195–211.
[15] F. Kazerouni, A. Bayani, F. Asadi, L. Saeidi, N. Parvizi, Z. Mansoori, Type2 diabetes mellitus prediction using data mining algorithms based on the long-noncoding RNAs expression: a comparison of four data mining approaches, BMC Bioinform. 21 (1) (2020) 1–13, https://doi.org/10.1186/s12859-020-03719-8.
[16] J.J. Khanam, S.Y. Foo, A comparison of machine learning algorithms for diabetes prediction, ICT Express 7 (4) (2021) 432–439, https://doi.org/10.1016/j.icte.2021.02.004.
[17] J. Chaki, S.T. Ganesh, S.K. Cidham, S.A. Theertan, Machine learning and artificial intelligence based Diabetes Mellitus detection and self-management: a systematic review, J. King Saud Univ.-Comput. Inform. Sci. 34 (6) (2022) 3204–3225.
[18] P. Ghosh, S. Azam, A. Karim, M. Hassan, K. Roy, M. Jonkman, A comparative study of different machine learning tools in detecting diabetes, Procedia Comput. Sci. 192 (2021) 467–477.
[19] G. Tripathi, R. Kumar, Early prediction of diabetes mellitus using machine learning, in: 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), IEEE, 2020, pp. 1009–1014.
[20] Z. Ye, J. Wang, H. Hua, X. Zhou, Q. Li, Precise detection and quantitative prediction of blood glucose level with an electronic nose system, IEEE Sens. J. 22 (13) (2022) 12452–12459.
[21] H. Zhu, C. Liu, Y. Zheng, J. Zhao, L. Li, A hybrid machine learning algorithm for detection of simulated expiratory markers of diabetic patients based on gas sensor array, IEEE Sens. J. 23 (3) (2022) 2940–2947.
[22] T. Zhu, L. Kuang, J. Daniels, P. Herrero, K. Li, P. Georgiou, IoMT-enabled real-time blood glucose prediction with deep learning and edge computing, IEEE Internet Things J. 10 (5) (2022) 3706–3719.
[23] K. Gupta, V. Bajaj, A robust framework for automated screening of diabetic patient using ECG signals, IEEE Sens. J. 22 (24) (2022) 24222–24229.
[24] Z. Li, X. Pan, Y.D. Cai, Identification of type 2 diabetes biomarkers from mixed single-cell sequencing data with feature selection methods, Front. Bioeng. Biotechnol. 10 (2022) 890901.
[25] J. Carlton, P. Powell, D. Rowen, M. Broadley, F. Pouwer, J. Speight, S. Heller, M.A. Gall, M. Rosilio, C.J. Child, J. Comins, Producing a preference-based quality of life measure to quantify the impact of hypoglycaemia on people living with diabetes: a mixed-methods research protocol, Diabet. Med. 40 (3) (2023) e15007.
[26] A. Aguilera, C.A. Figueroa, R. Hernandez-Ramos, U. Sarkar, A. Cemballi, L. Gomez-Pathak, J. Miramontes, E. Yom-Tov, B. Chakraborty, X. Yan, J. Xu, mHealth app using machine learning to increase physical activity in diabetes and depression: clinical trial protocol for the DIAMANTE Study, BMJ Open 10 (8) (2020) e034723.
[27] S.I. Ayon, M.M. Islam, Diabetes prediction: a deep learning approach, Internat. J. Inform. Eng. Electr. Busin. 13 (2) (2019) 21.
[28] M. Maniruzzaman, M.J. Rahman, B. Ahammed, M.M. Abedin, Classification and prediction of diabetes disease using machine learning paradigm, Health Inf. Sci. Syst. 8 (2020) 1–4.
[29] A. Aada, S. Tiwari, Predicting diabetes in medical datasets using machine learning techniques, Int. J. Sci. Res. Eng. Trends 5 (2) (2019) 257–267.
[30] B. Pranto, S.M. Mehnaz, E.B. Mahid, I.M. Sadman, A. Rahman, S. Momen, Evaluating machine learning methods for predicting diabetes among female patients in Bangladesh, Information 11 (8) (2020) 374.
[31] M.I. Gröschel, C.F. Luz, S. Batra, S. Ahuja, S. Batra, K. Kranzer, T.S. van der Werf, Random glucose sampling as screening tool for diabetes among disadvantaged tuberculosis patients residing in urban slums in India, ERJ Open Res. 5 (1) (2019).
[32] A. Nath, S. Biradar, A. Balan, R. Dey, R. Padhi, Physiological models and control for type 1 diabetes mellitus: a brief review, IFAC-PapersOnLine 51 (1) (2018) 289–294.
[33] D. Khan, C.R. Moffet, P.R. Flatt, C. Kelly, Role of islet peptides in beta cell regulation and type 2 diabetes therapy, Peptides 100 (2018) 212–218.
[34] S. Vyas, R. Ranjan, N. Singh, A. Mathur, Review of predictive analysis techniques for analysis diabetes risk, in: 2019 Amity International Conference on Artificial Intelligence (AICAI), IEEE, 2019, pp. 626–631.
[35] M. Yasen, N. Al-Madi, N. Obeid, Optimizing neural networks using dragonfly algorithm for medical prediction, in: 2018 8th International Conference on Computer Science and Information Technology (CSIT), IEEE, 2018, pp. 71–76.
[36] G. Thippa Reddy, N. Khare, FFBAT-optimized rule based fuzzy logic classifier for diabetes, Internat. J. Eng. Res. Africa 24 (2016) 137–152.
[37] C.C. Olisah, L. Smith, M. Smith, Diabetes mellitus prediction and diagnosis from a data preprocessing and machine learning perspective, Comput. Methods Programs Biomed. 220 (2022) 106773.
[38] S.M. Ganie, M.B. Malik, An ensemble machine learning approach for predicting type-II diabetes mellitus based on lifestyle indicators, Healthc. Anal. 2 (2022) 100092.
[39] A. Yahyaoui, A. Jamil, J. Rasheed, M. Yesiltepe, A decision support system for diabetes prediction using machine learning and deep learning techniques, in: 2019 1st International Informatics and Software Engineering Conference (UBMYK), IEEE, 2019, pp. 1–4.
[40] V. Dremin, Z. Marcinkevics, E. Zherebtsov, A. Popov, A. Grabovskis, H. Kronberga, K. Geldnere, A. Doronin, I. Meglinski, A. Bykov, Skin complications of diabetes mellitus revealed by polarized hyperspectral imaging and machine learning, IEEE Trans. Med. Imag. 40 (4) (2021) 1207–1216.

[41] N. Nnamoko, I. Korkontzelos, Efficient treatment of outliers and class imbalance for diabetes prediction, Artif. Intell. Med. 104 (2020) 101815.
[42] V. Kumar, G.S. Lalotra, P. Sasikala, D.S. Rajput, R. Kaluri, K. Lakshmanna, M. Shorfuzzaman, A. Alsufyani, M. Uddin, Addressing binary classification over class imbalanced clinical datasets using computationally intelligent techniques, Healthcare 10 (2022) 1293.
[43] Q. Wang, W. Cao, J. Guo, J. Ren, Y. Cheng, D.N. Davis, DMP_MI: an effective diabetes mellitus classification algorithm on imbalanced data with missing values, IEEE Access 7 (2019) 102232–102238.
[44] D.D. Rufo, T.G. Debelee, A. Ibenthal, W.G. Negera, Diagnosis of diabetes mellitus using gradient boosting machine (LightGBM), Diagnostics 11 (9) (2021) 1714.
[45] M.N. Khan, S.K. Hasnain, M. Jamil, A. Imran, Electronic Signals and Systems: Analysis, Design and Applications, River Publishers, 2022.
[46] B. Diallo, J. Hu, T. Li, G.A. Khan, X. Liang, Y. Zhao, Deep embedding clustering based on contractive autoencoder, Neurocomputing 433 (2021) 96–107.
[47] T.G. Debelee, Skin lesion classification and detection using machine learning techniques: a systematic review, Diagnostics 13 (19) (2023) 3147.
[48] G.A. Khan, J. Hu, T. Li, B. Diallo, H. Wang, Multi-view data clustering via non-negative matrix factorization with manifold regularization, Int. J. Mach. Learn. Cyber. (2022) 1–3.
[49] A.U. Haq, J.P. Li, J. Khan, M.H. Memon, S. Nazir, S. Ahmad, G.A. Khan, A. Ali, Intelligent machine learning approach for effective recognition of diabetes in E-healthcare using clinical data, Sensors 20 (9) (2020) 2649.
[50] E. Rucci, G. Tittarelli, F. Ronchetti, J.F. Elgart, L. Lanzarini, J.J. Gagliardino, First experiences with the identification of people at risk for diabetes in Argentina using machine learning techniques, arXiv preprint arXiv:2403.18631 (2024).
[51] D.D. Rufo, T.G. Debelee, W.G. Negera, A hybrid machine learning model based on global and local learner algorithms for diabetes mellitus prediction, J. Biomim., Biomat. Biomed. Eng. 54 (2022) 65–88.
[52] B.F. Wee, S. Sivakumar, K.H. Lim, W.K. Wong, F.H. Juwono, Diabetes detection based on machine learning and deep learning approaches, Multimed. Tools Appl. 83 (8) (2024) 24153–24185.
[53] H.R. Al-Absi, A. Pai, U. Naeem, F.K. Mohamed, S. Arya, R.A. Sbeit, M. Bashir, M.M. El Shafei, N. El Hajj, T. Alam, DiaNet v2 deep learning based method for diabetes diagnosis using retinal images, Sci. Rep. 14 (1) (2024) 1595.
[54] R. Sarmun, M.E. Chowdhury, M. Murugappan, A. Aqel, M. Ezzuddin, S.M. Rahman, A. Khandakar, S. Akter, R. Alfkey, M.A. Hasan, Diabetic foot ulcer detection: combining deep learning models for improved localization, Cognit. Comput. (2024) 1–9.
[55] Q.W. Khan, K. Iqbal, R. Ahmad, A. Rizwan, A.N. Khan, D. Kim, An intelligent diabetes classification and perception framework based on ensemble and deep learning method, PeerJ Comput. Sci. 10 (2024) e1914.

Neha Katiyar, Research Scholar, School of Computer Science Engineering and Technology, Bennett University, India. Email: [email protected]. Research Area(s): Machine Learning, IoT, 6G.

Dr. Hardeo Kumar Thakur, Associate Professor, School of Computer Science Engineering and Technology, Bennett University, India. Email: [email protected]. Research Area(s): Data Mining, Dynamic Graph Mining, Data Analytics.

Dr. Anindya Ghatak, Assistant Professor, School of Computer Science Engineering and Technology, Bennett University, India. Email: anindya.ghatak@bennett.edu.in. Research Area(s): Functional Analysis, Operator Theory and Operator Algebras and its application in Quantum information theory.
