Hospital Readmission Prediction Using Machine Learning Techniques
Abstract—One of the most critical problems in healthcare is predicting the likelihood of hospital readmission in the case of chronic diseases such as diabetes, in order to be able to allocate necessary resources such as beds, rooms, specialists, and medical staff for an acceptable quality of service. Unfortunately, relatively few research studies in the literature have attempted to tackle this problem; the majority of research studies are concerned with predicting the likelihood of the diseases themselves. Numerous machine learning techniques are suitable for prediction. Nevertheless, there is also a shortage of adequate comparative studies that identify the most suitable techniques for the prediction process. Towards this goal, this paper presents a comparative study among five common techniques in the literature for predicting the likelihood of hospital readmission in the case of diabetic patients. Those techniques are logistic regression (LR) analysis, multi-layer perceptron (MLP), the Naïve Bayesian (NB) classifier, decision trees, and support vector machines (SVMs). The comparative study is based on realistic data gathered from a number of hospitals in the United States. The comparative study revealed that SVM showed the best performance, while the NB classifier and LR analysis were the worst.

Keywords—Decision tree; hospital readmission; logistic regression; machine learning; multi-layer perceptron; Naïve Bayesian classifier; support vector machines

I. INTRODUCTION

Nowadays, numerous chronic diseases, such as diabetes, are widespread in the world, and the number of patients is increasing continuously. The estimated number of diabetic adults in 2014 was 422 million, versus 108 million in 1980 [1]. Such patients visit hospitals frequently, requiring continuous preparation to ensure the availability of required resources, including hospital beds, rooms, and enough medical staff, for an acceptable quality of service. Accordingly, predicting the likelihood of readmission of a given patient is of ultimate importance. In fact, readmission within one month (30 days) of discharge is considered "a high-priority healthcare quality measure", and the goal is to address this problem [2].

Machine learning, which is one of the most important branches of artificial intelligence, provides methods and techniques for learning from experience [3]. Researchers often use it for complex statistical analysis tasks [4]. It is a wide multidisciplinary domain which builds on numerous disciplines including, but not limited to, data processing, statistics, algebra, knowledge analytics, information theory, control theory, biology, cognitive science, philosophy, and computational complexity. This field plays an important role in discovering valuable knowledge from databases, which could contain records of supply maintenance, medical records, financial transactions, loan applications, etc. [5].

As indicated in Fig. 1, machine learning techniques can be broadly classified into three main categories [3]. Supervised learning techniques involve learning from training data, guided by the data scientist. There are two basic types of learning tasks: classification and regression. Classification models attempt to predict discrete classes, such as blood groups, while regression models predict numerical values [3]. In unsupervised learning, on the other hand, the system attempts to find hidden data patterns, associations among features or variables, or data trends [3], [4]. The main objective of unsupervised learning is to identify hidden structures or data distributions without being subject to supervision or the prior categorization of the training data [6]. Finally, in reinforcement learning the system attempts to learn through interactions (trial and error) with a dynamic environment. During this learning mode, the computer program is given access to a dynamic environment in which it must achieve a specific objective. It is worth noting that in this case the system does not have prior knowledge of the environment's behavior, and the only way to figure it out is through trial and error [3], [7], [8].

According to Kaelbling et al., the term healthcare informatics refers to the combination of machine learning and healthcare with the purpose of identifying patterns of interest [9]. In addition, it has the potential to establish a good relationship between patients and doctors and to minimize the increasing cost of healthcare [10]. The goal of this paper is to apply machine learning techniques, and specifically prediction techniques, to predict the likelihood of readmission of patients to hospitals. This problem has not been adequately addressed in the literature; in fact, most research efforts are oriented towards the prediction of the diseases themselves. Machine learning includes numerous analytic techniques for prediction, and the literature lacks adequate comparative studies
that assist in selecting a suitable technique for this purpose. Our research is based on a large data set collected from numerous United States hospitals [11], [12]. In short, this paper has two main contributions as follows:

• Analyzing five of the most common machine learning techniques for prediction and providing a comparative study among them.

• Addressing the problem of patient readmission to hospitals, since it has been rarely addressed by researchers.

Organization of the rest of the paper is as follows: First, we present background about the machine learning techniques considered in this research. This is followed by related work to highlight the contributions of the paper. We then present our methodology and discuss the results of the experiments. Finally, we sum up this work via a conclusion and a discussion of possible future work.

Fig. 1. Classification of Machine Learning Techniques.

II. BACKGROUND

This section discusses the five basic machine learning techniques employed in this research study.

A. Logistic Regression Analysis

Regression is a statistical notion that can be used to quantify the relationship between one variable, called the dependent variable, and a group of other variables, denoted the independent variables. Logistic regression (LR) is a non-linear regression model used to estimate the likelihood that an event will occur as a function of the independent variables [13].
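In its standard binary form (a textbook formulation, not quoted from [13]), the estimated probability of the event is

    P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}

where x_1, ..., x_n are the independent variables and the coefficients \beta_0, ..., \beta_n are fitted to the training data.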
B. Artificial Neural Network

An Artificial Neural Network (ANN) is a computational model which attempts to emulate the parallel processing nature of the human brain. An ANN is a network of strongly interconnected processing elements (neurons) which operate in parallel [14], inspired by biological nervous systems [15]. ANNs are widely used in research because they are capable of modeling non-linear systems, where the relationships among variables are either unknown or quite complicated [14]. An example of an ANN is the Multi-Layer Perceptron (MLP), which is typically formed of three layers of neurons (input layer, hidden layer, and output layer) and whose neurons use non-linear functions for data processing [16].
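For a three-layer MLP, the forward computation can be sketched in the standard textbook form (not taken from [16]) as

    h = f(W_1 x + b_1), \qquad \hat{y} = g(W_2 h + b_2)

where W_1, b_1 and W_2, b_2 are the weights and biases of the hidden and output layers, and f and g are non-linear activation functions.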
C. Naïve Bayesian Classifier

The Naïve Bayesian (NB) classifier relies on applying Bayes' theorem to estimate the most probable membership of a given event in one of a set of possible classes. It is described as being naïve, since it assumes independence among the variables used in the classification process [15], [17], [18].
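Concretely, in the standard formulation, for a feature vector x = (x_1, ..., x_n) the classifier scores each candidate class c by applying Bayes' theorem under the independence assumption,

    P(c \mid x) \propto P(c) \prod_{i=1}^{n} P(x_i \mid c),

and predicts the class with the highest posterior probability.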
D. Support Vector Machine

Support vector machines (SVMs) are supervised learning models which can be applied to both classification and regression analysis. They were proposed by Vapnik in 1995. They can perform both linear and non-linear classification tasks [5], [12], [17], [19].
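In the linear case, the generic SVM decision function (a standard formulation, not specific to the cited works) is

    f(x) = \operatorname{sign}(w \cdot x + b),

where w and b define the maximum-margin separating hyperplane; non-linear classification is obtained by replacing the dot product with a kernel function, such as the RBF kernel used later in this study.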
E. Decision Tree

Decision trees are one of the most famous techniques in machine learning. A decision tree performs classification by using attribute values to make decisions. In general, a decision tree is a group of nodes, leaves, a root, and branches [20]. Many algorithms have been proposed in the literature for implementing decision trees. One important algorithm is CART (Classification and Regression Tree), which can deal with both continuous and categorical variables [8], [21].
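With the Gini criterion used by CART (and later in this study), the impurity of a node whose samples belong to classes with proportions p_1, ..., p_k is, in the standard form,

    Gini = 1 - \sum_{i=1}^{k} p_i^2,

and candidate splits are chosen so as to maximize the reduction of this impurity.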
III. RELATED WORK

Many researchers attempted to use machine learning techniques in healthcare problems other than hospital readmission likelihood prediction. For example, Nai-Arun and Sittidech used K-Nearest Neighbor (KNN), NB, and decision trees with boosting, bagging, and ensemble learning in diabetes classification. Their experiments confirmed that the highest accuracy was obtained by applying bagging with decision trees [22]. On the other hand, Perveen et al. attempted to improve the performance of such algorithms using AdaBoost. The evaluation of the experimental outcomes showed that AdaBoost had better performance in comparison to bagging [23]. Orabi et al. [24] suggested integrating regression with randomization for predicting diabetes cases according to age, with an accuracy of 84%. Other researchers proposed building a predictive model using three machine learning techniques, namely random forests (RFs), LR, and SVMs, for predicting diabetes in Indian females, in addition to identifying the factors causing diabetes. Their comparative study concluded that RFs had the best performance among the three [25].

Relatively few research studies addressed the problem of hospital readmission likelihood prediction. For example, Strack et al. used statistical models for this purpose [12]. Other researchers focused on comparing different machine learning techniques for addressing this problem. For example, Kerexeta et al. [26] proposed two approaches. In the first, they combined supervised and unsupervised classification techniques, while in the second, they combined NB and decision trees. They showed that the former approach had a better performance than the latter in terms of readmission prediction.

To sum up, relatively few research efforts in healthcare are concerned with the problem of predicting hospital readmission likelihood. Additionally, there is a shortage of adequate comparative studies comparing the machine learning techniques used for prediction. Hence, this paper attempts to tackle those two problems by comparing five common machine learning techniques on the problem of hospital readmission likelihood prediction based on real data.
[Figure: methodology workflow (understanding data, feature selection).]
TABLE II. FEATURES IMPORTANCE

Variable     Importance (average)   Decision
R            0.029016               Confirmed
G            0.020294               Confirmed
A            0.010165               Rejected
AT           0.023854               Confirmed
DI           0.027554               Confirmed
AS           0.008961               Rejected
MS           0.019117               Confirmed
A1Cresult    0.020177               Confirmed
TH           0.062025               Confirmed
NL           0.149317               Confirmed
NP           0.046521               Confirmed
NM           0.111058               Confirmed
NO           0.030696               Confirmed
NE           0.066718               Confirmed
NI           0.104099               Confirmed
ND           0.055811               Confirmed
Change       0.023027               Confirmed

B. Constructing Machine Learning Techniques Models

In this comparative study, the selected models included one output/target with two values (True or False) regarding hospital readmission during a period of 30 days. In other words, the value of the readmission parameter is true if readmission occurs within 30 days of discharge. Otherwise, in case of no readmission, or in case readmission occurs after 30 days, its value is false. The set of drivers for the prediction comprised the selected features discussed above. The training dataset and the testing dataset were selected randomly. Additionally, 10-fold cross validation was applied, selecting 40% of the data for testing and the rest for training. The settings of the various models are discussed below.
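A minimal sketch of this evaluation setup is given below. It assumes a scikit-learn workflow (the library is not named explicitly in the paper) and uses synthetic stand-in data in place of the hospital records.

    # Minimal sketch of the setup described above (scikit-learn assumed).
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split, cross_val_score

    # Stand-in data: in the actual study, X holds the confirmed features of
    # Table II and y is the 30-day readmission label (True/False).
    X, y = make_classification(n_samples=1000, n_features=15, random_state=0)

    # Random split: 40% of the data for testing, the rest for training.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)

    # 10-fold cross validation; any of the five classifiers can be substituted here.
    model = LogisticRegression(max_iter=1000)
    fold_scores = cross_val_score(model, X_train, y_train, cv=10, scoring='accuracy')
    print(fold_scores.min(), fold_scores.max(), fold_scores.mean())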
1) Logistic regression: This model was built by importing the logistic regression module and using it to generate the classifier. Grid search was employed to find the best hyper-parameters and the corresponding optimal accuracy.
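A hedged sketch of this step follows, reusing X_train and y_train from the earlier sketch; the searched grid of C values is illustrative, since the paper does not list it.

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV

    # Illustrative grid over the inverse regularization strength C.
    param_grid = {'C': [0.01, 0.1, 1, 10, 100]}
    grid = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=10, scoring='accuracy')
    grid.fit(X_train, y_train)                  # X_train, y_train from the earlier sketch
    print(grid.best_params_, grid.best_score_)  # best hyper-parameters and accuracy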
2) Support vector machine: A kernel SVM was trained using the training set. A support vector classification (SVC) task was used. In this technique, there are several parameters such as C, kernel, and gamma, where C represents the error-term penalty parameter, kernel determines the kernel type used by the algorithm (in our case 'rbf'), and gamma is the kernel coefficient, such that a high value of gamma attempts to fit the training data completely. Grid search was employed to determine the optimal parameters and accuracy. Table III shows that the optimal value of C is 10, while the optimal value of gamma was 0.3.

TABLE III. SVM ACCURACY

C        Accuracy
0.1      0.8169582
1.00     0.9102736
10.00    0.9246298
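A sketch of the corresponding grid search, again assuming scikit-learn; only the best values found (C = 10, gamma = 0.3) are reported in the paper, so the grids below are illustrative.

    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    param_grid = {'C': [0.1, 1.0, 10.0],        # error-term penalty
                  'gamma': [0.1, 0.3, 1.0]}     # RBF kernel coefficient
    grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=10, scoring='accuracy')
    grid.fit(X_train, y_train)                  # X_train, y_train from the earlier sketch
    print(grid.best_params_)                    # best reported values: C = 10, gamma = 0.3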
3) Decision tree: This model was generated using the 'gini' function to evaluate the split quality of the tree. In our study, min_samples_split = 30 is the minimal number of samples needed for splitting an internal node, and max_depth is the maximal tree depth. Grid search was conducted, and the best value of max_depth was 15, as depicted in Table IV.

TABLE IV. DECISION TREE ACCURACY

max_depth    Accuracy
5            0.8326603
10           0.8627187
15           0.8788694
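The decision tree setup can be sketched as follows (scikit-learn assumed); the max_depth grid is taken from Table IV.

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import GridSearchCV

    # CART-style tree: 'gini' split criterion and min_samples_split fixed at 30.
    tree = DecisionTreeClassifier(criterion='gini', min_samples_split=30, random_state=0)
    grid = GridSearchCV(tree, {'max_depth': [5, 10, 15]}, cv=10, scoring='accuracy')
    grid.fit(X_train, y_train)                  # X_train, y_train from the earlier sketch
    print(grid.best_params_)                    # best max_depth reported in the paper: 15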
TABLE VII. 10-FOLD CROSS VALIDATION FOR EMPLOYED TECHNIQUES

SVM        CART       NB         LR         MLP
0.952239   0.898507   0.570149   0.641791   0.794030
0.943284   0.901493   0.644776   0.686567   0.835821
0.940299   0.886567   0.585075   0.620896   0.776119
0.895522   0.907463   0.659701   0.647761   0.820896
0.922388   0.853731   0.579104   0.600000   0.800000
0.916168   0.883234   0.622754   0.628743   0.775449
0.922156   0.865269   0.646707   0.682635   0.796407
0.946108   0.925150   0.655689   0.646707   0.790419
0.924925   0.906907   0.690691   0.630631   0.792793
0.939940   0.906907   0.657658   0.669670   0.795796

TABLE VIII. ACCURACY OF EMPLOYED TECHNIQUES

No   Model                    Min        Max        Mean
1    Support Vector Machine   0.895522   0.952239   0.930303
2    Decision Tree            0.853731   0.925150   0.893523
3    Multi-Layer Perceptron   0.775449   0.835821   0.797773
4    Naïve Bayes              0.570149   0.690691   0.631230
5    Logistic Regression      0.600000   0.686567   0.645540
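Table VIII summarizes the per-fold accuracies of Table VII by their minimum, maximum, and mean; for instance, for the SVM column:

    import numpy as np

    # Per-fold SVM accuracies copied from the first column of Table VII.
    svm_folds = np.array([0.952239, 0.943284, 0.940299, 0.895522, 0.922388,
                          0.916168, 0.922156, 0.946108, 0.924925, 0.939940])
    print(svm_folds.min(), svm_folds.max(), svm_folds.mean())
    # prints 0.895522 0.952239 0.9303029, matching row 1 of Table VIII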
VI. CONCLUSION AND FUTURE WORK

This paper presented a comparative study among five machine learning techniques, namely LR, MLP, the NB classifier, decision trees, and SVMs, for predicting the likelihood of hospital readmission of diabetic patients. The study relied on real data collected from hospitals in the United States. Based on the study, SVM provided the best performance. Nevertheless, the study will be extended to compare additional techniques, and larger datasets will be considered as well.

REFERENCES
[1] G. Roglic, "Global report on diabetes," World Health Organization, vol. 58, no. 12, pp. 1-88, 2016.
[2] D. Rubin, K. Donnell-Jackson, R. Jhingan, S. Golden, and A. Paranjape, "Early readmission among patients with diabetes: A qualitative assessment of contributing factors," J. Diabetes Complications, vol. 28, no. 6, pp. 869-873, 2014.
[3] I. Kavakiotis, O. Tsave, A. Salifoglou, N. Maglaveras, I. Vlahavas, and I. Chouvarda, "Machine Learning and Data Mining Methods in Diabetes Research," Comput. Struct. Biotechnol. J., vol. 15, pp. 104-116, 2017.
[4] P. Chowriappa, S. Dua, and Y. Todorov, "Machine Learning in Healthcare Informatics," vol. 56, pp. 1-23, 2014.
[5] T. Mitchell, Machine Learning (McGraw-Hill International Editions Computer Science Series), 1997.
[6] E. Bose and K. Radhakrishnan, "Using Unsupervised Machine Learning to Identify Subgroups among Home Health Patients with Heart Failure Using Telehealth," CIN - Comput. Informatics Nurs., vol. 36, no. 5, pp. 242-248, 2018.
[7] L. Kaelbling, A. Littman, and A. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, 1996.
[8] K. Shailaja, B. Seetharamulu, and M. Jabbar, "Machine Learning in Healthcare: A Review," 2018 Second Int. Conf. Electron. Commun. Aerosp. Technol. (ICECA), pp. 910-914, 2018.
[9] J. Davies and J. Gibbons, "Machine Learning and Software Engineering in Health Informatics," in Proc. First Int. Workshop on Realizing AI Synergies in Software Engineering, 2012, pp. 37-41.
[10] R. Bhardwaj, A. Nambiar, and D. Dutta, "A Study of Machine Learning in Healthcare," Proc. Int. Comput. Softw. Appl. Conf., vol. 2, pp. 236-241, 2017.
[11] A. Asuncion and D. Newman, "UCI Machine Learning Repository," 2007. [Online]. Available: https://archive.ics.uci.edu/ml/index.php.
[12] B. Strack et al., "Impact of HbA1c measurement on hospital readmission rates: Analysis of 70,000 clinical database patient records," Biomed Res. Int., vol. 2014, 2014.
[13] A. H. Karp, "Using logistic regression to predict customer retention," Proc. Eleventh Northeast SAS Users Group Conf., 1998. [Online]. Available: http://www.lexjansen.com/nesug/nesug98/solu/p095.pdf.
[14] F. Amato, A. López, E. M. Peña-Méndez, P. Vaňhara, A. Hampl, and J. Havel, "Artificial neural networks in medical diagnosis," J. Appl. Biomed., vol. 11, no. 2, pp. 47-58, 2013.
[15] S. F., "Machine-Learning Techniques for Customer Retention: A Comparative Study," Int. J. Adv. Comput. Sci. Appl., vol. 9, no. 2, pp. 273-281, 2018.
[16] N. Jothi, N. Rashid, and W. Husain, "Data Mining in Healthcare - A Review," Procedia Comput. Sci., vol. 72, pp. 306-313, 2015.
[17] D. Sisodia and D. Sisodia, "Prediction of Diabetes using Classification Algorithms," Procedia Comput. Sci., vol. 132, pp. 1578-1585, 2018.
[18] A. Hazra, S. Kumar, and A. Gupta, "Study and Analysis of Breast Cancer Cell Detection using Naïve Bayes, SVM and Ensemble Algorithms," Int. J. Comput. Appl., vol. 145, no. 2, pp. 39-45, 2016.
[19] E. Holzschuh, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting," Reports Prog. Phys., vol. 55, no. 7, pp. 1035-1091, 1992.
[20] R. Sharma, V. Sugumaran, H. Kumar, and M. Amarnath, "A comparative study of naive Bayes classifier and Bayes net classifier for fault diagnosis of roller bearing using sound signal," Int. J. Decis. Support Syst., vol. 1, no. 1, p. 115, 2015.
[21] S. Mandal, A. Gupta, A. Mukherjee, and A. Mukherjee, "Heart Disease Diagnosis and Prediction Using Machine Learning and Data Mining Techniques: A Review," vol. 10, no. 7, pp. 2137-2159, 2017.
[22] N. Nai-Arun and P. Sittidech, "Ensemble Learning Model for Diabetes Classification," Adv. Mater. Res., vol. 931-932, pp. 1427-1431, 2014.
[23] S. Perveen, M. Shahbaz, A. Guergachi, and K. Keshavjee, "Performance Analysis of Data Mining Classification Techniques to Predict Diabetes," Procedia Comput. Sci., vol. 82, pp. 115-121, 2016.
[24] K. Orabi, Y. Kamal, and T. Rabah, "Early Predictive System for Diabetes Mellitus Disease," vol. 1, pp. 420-427, 2016.
[25] D. Paul, "Analysing Feature Importances for Diabetes Prediction using Machine Learning," 2018 IEEE 9th Annu. Inf. Technol. Electron. Mob. Commun. Conf., pp. 924-928, 2018.
[26] J. Kerexeta, A. Artetxe, V. Escolar, A. Lozano, and N. Larburu, "Predicting 30-day Readmission in Heart Failure using Machine Learning Techniques," Proc. 11th Int. Jt. Conf. Biomed. Eng. Syst. Technol., vol. 5, pp. 308-315, 2018.
[27] S. F., "Machine-Learning Techniques for Customer Retention: A Comparative Study," Int. J. Adv. Comput. Sci. Appl., vol. 9, no. 2, pp. 273-281, 2018.
[28] K. Kira and L. A. Rendell, A Practical Approach to Feature Selection. Morgan Kaufmann Publishers, Inc., 1992.
[29] D. W. Opitz, "Feature selection for ensembles," Proc. 16th Natl. Conf. Artif. Intell. (AAAI), vol. 16, no. 3, pp. 379-384, 1999.
[30] I. Guyon and A. Elisseeff, "An Introduction to Variable and Feature Selection," J. Mach. Learn. Res., vol. 3, pp. 1157-1182, 2003.
[31] M. Sokolova and G. Lapalme, "A systematic analysis of performance measures for classification tasks," Information Processing and Management, vol. 45, pp. 427-437, 2009.