Hospital Readmission Prediction Using Machine Learning Techniques
Abstract—One of the most critical problems in healthcare is predicting the likelihood of hospital readmission in the case of chronic diseases such as diabetes, in order to be able to allocate necessary resources such as beds, rooms, specialists, and medical staff for an acceptable quality of service. Unfortunately, relatively few research studies in the literature have attempted to tackle this problem; the majority of research studies are concerned with predicting the likelihood of the diseases themselves. Numerous machine learning techniques are suitable for prediction. Nevertheless, there is also a shortage of adequate comparative studies that identify the most suitable techniques for the prediction process. Towards this goal, this paper presents a comparative study among five common techniques in the literature for predicting the likelihood of hospital readmission in the case of diabetic patients. Those techniques are logistic regression (LR) analysis, multi-layer perceptron (MLP), the Naïve Bayesian (NB) classifier, decision trees, and support vector machines (SVMs). The comparative study is based on realistic data gathered from a number of hospitals in the United States. The comparative study revealed that SVM showed the best performance, while the NB classifier and LR analysis were the worst.

Keywords—Decision tree; hospital readmission; logistic regression; machine learning; multi-layer perceptron; Naïve Bayesian classifier; support vector machines

I. INTRODUCTION

Nowadays, numerous chronic diseases, such as diabetes, are widespread in the world, and the number of patients is increasing continuously. The estimated number of diabetic adults in 2014 was 422 million, versus 108 million in 1980 [1]. Such patients visit hospitals frequently, requiring continuous preparation to ensure the availability of required resources, including hospital beds, rooms, and enough medical staff, for an acceptable quality of service. Accordingly, predicting the likelihood of readmission of a given patient is of ultimate importance. In fact, readmission within one month (30 days) of discharge is considered "a high-priority healthcare quality measure", and the goal is to address this problem [2].

Machine learning, which is one of the most important branches of artificial intelligence, provides methods and techniques for learning from experience [3]. Researchers often use it for complex statistical analysis tasks [4]. It is a wide multidisciplinary domain which builds on numerous disciplines including, but not limited to, data processing, statistics, algebra, knowledge analytics, information theory, control theory, biology, cognitive science, philosophy, and computational complexity. This field plays an important role in discovering valuable knowledge from databases, which could contain records of supply maintenance, medical records, financial transactions, loan applications, etc. [5].

As indicated in Fig. 1, machine learning techniques can be broadly classified into three main categories [3]. Supervised learning techniques involve learning from training data, guided by the data scientist. There are two basic types of learning tasks: classification and regression. Classification models attempt to predict discrete classes, such as blood groups, while regression models predict numerical values [3]. In unsupervised learning, on the other hand, the system attempts to find hidden data patterns, associations among features or variables, or data trends [3], [4]. The main objective of unsupervised learning is to identify hidden structures or data distributions without being subject to supervision or the prior categorization of the training data [6]. Finally, in reinforcement learning the system attempts to learn through interactions (trial and error) with a dynamic environment. During this learning mode, the computer program is given access to a dynamic environment in which it must achieve a specific objective. It is worth noting that in this case the system does not have prior knowledge of the environment's behavior, and the only way to figure it out is through trial and error [3], [7], [8].

According to Kaelbling et al., the term healthcare informatics refers to the combination of machine learning and healthcare with the purpose of identifying patterns of interest [9]. In addition, it has the potential to establish a good relationship between patients and doctors and to minimize the increasing cost of healthcare [10]. The goal of this paper is to apply machine learning techniques, and specifically prediction techniques, to predict the likelihood of readmission of patients to hospitals. This problem has not been adequately addressed in the literature; in fact, most research efforts are oriented towards the prediction of the diseases themselves. Machine learning includes numerous analytic techniques for prediction, and the literature lacks adequate comparative studies
that assist in selecting a suitable technique for this purpose. Our research is based on a large data set collected from numerous United States hospitals [11], [12]. In short, this paper has two main contributions as follows:

• Analyzing five of the most common machine learning techniques for prediction and providing a comparative study among them.

• Addressing the problem of patient readmission to hospitals, since it has been rarely addressed by researchers.

Organization of the rest of the paper is as follows: First, we present background about the machine learning techniques considered in this research. This is followed by related work to highlight the contributions of the paper. We then present our methodology and discuss the results of the experiments. Finally, we sum up this work via a conclusion and a discussion of possible future work.

Fig. 1. Classification of Machine Learning Techniques.

II. BACKGROUND

This section discusses the five basic machine learning techniques employed in this research study.

A. Logistic Regression Analysis

Regression is a statistical notion that can be used to quantify the relationship between one variable, called the dependent variable, and a group of other variables, denoted the independent variables. Logistic regression (LR) is a non-linear regression model used to estimate the likelihood that an event will occur as a function of the independent variables [13].
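In its standard binary form (a textbook formulation, not quoted from [13]), the estimated probability of the event is

    P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}

where x_1, ..., x_n are the independent variables and the coefficients \beta_0, ..., \beta_n are fitted to the training data.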
B. Artificial Neural Network

An Artificial Neural Network (ANN) is a computational model which attempts to emulate the parallel processing nature of the human brain. An ANN is a network of strongly interconnected processing elements (neurons) which operate in parallel [14], inspired by biological nervous systems [15]. ANNs are widely used in research because they are capable of modeling non-linear systems, where the relationships among variables are either unknown or quite complicated [14]. An example of an ANN is the Multi-Layer Perceptron (MLP), which is typically formed of three layers of neurons (input layer, hidden layer, and output layer) and whose neurons use non-linear functions for data processing [16].
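For a three-layer MLP, the forward computation can be sketched in the standard textbook form (not taken from [16]) as

    h = f(W_1 x + b_1), \qquad \hat{y} = g(W_2 h + b_2)

where W_1, b_1 and W_2, b_2 are the weights and biases of the hidden and output layers, and f and g are non-linear activation functions.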
C. Naïve Bayesian Classifier

The Naïve Bayesian (NB) classifier relies on applying Bayes' theorem to estimate the most probable membership of a given event in one of a set of possible classes. It is described as being naïve, since it assumes independence among the variables used in the classification process [15], [17], [18].
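Concretely, in the standard formulation, for a feature vector x = (x_1, ..., x_n) the classifier scores each candidate class c by applying Bayes' theorem under the independence assumption,

    P(c \mid x) \propto P(c) \prod_{i=1}^{n} P(x_i \mid c),

and predicts the class with the highest posterior probability.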
D. Support Vector Machine

Support vector machines (SVMs) are supervised learning models which can be applied to both classification and regression analysis. They were proposed by Vapnik in 1995. They can perform both linear and non-linear classification tasks [5], [12], [17], [19].
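In the linear case, the generic SVM decision function (a standard formulation, not specific to the cited works) is

    f(x) = \operatorname{sign}(w \cdot x + b),

where w and b define the maximum-margin separating hyperplane; non-linear classification is obtained by replacing the dot product with a kernel function, such as the RBF kernel used later in this study.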
E. Decision Tree

Decision trees are one of the most famous techniques in machine learning. A decision tree performs classification by using attribute values to make decisions. In general, a decision tree is a group of nodes, leaves, a root, and branches [20]. Many algorithms have been proposed in the literature for implementing decision trees. One important algorithm is CART (Classification and Regression Tree), which can deal with both continuous and categorical variables [8], [21].
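With the Gini criterion used by CART (and later in this study), the impurity of a node whose samples belong to classes with proportions p_1, ..., p_k is, in the standard form,

    Gini = 1 - \sum_{i=1}^{k} p_i^2,

and candidate splits are chosen so as to maximize the reduction of this impurity.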
III. RELATED WORK

Many researchers attempted to use machine learning techniques in healthcare problems other than hospital readmission likelihood prediction. For example, Nai-Arun and Sittidech used K-Nearest Neighbor (KNN), NB, and decision trees with boosting, bagging, and ensemble learning in diabetes classification. Their experiments confirmed that the highest accuracy was obtained by applying bagging with decision trees [22]. On the other hand, Perveen et al. attempted to improve the performance of such algorithms using AdaBoost. The evaluation of the experimental outcomes showed that AdaBoost had better performance in comparison to bagging [23]. Orabi et al. [24] suggested integrating regression with randomization for predicting diabetes cases according to age, with an accuracy of 84%. Other researchers proposed building a predictive model using three machine learning techniques, namely random forests (RFs), LR, and SVMs, for predicting diabetes in Indian females, in addition to identifying the factors causing diabetes. Their comparative study concluded that RFs had the best performance among the three [25].

Relatively few research studies addressed the problem of hospital readmission likelihood prediction. For example, Strack et al. used statistical models for this purpose [12]. Other researchers focused on comparing different machine learning techniques for addressing this problem. For example, Kerexeta et al. [26] proposed two approaches. In the first, they combined supervised and unsupervised classification techniques, while in the second, they combined NB and decision trees. They showed that the former approach had a better performance than the latter in terms of readmission prediction.

To sum up, relatively few research efforts in healthcare are concerned with the problem of predicting hospital readmission likelihood. Additionally, there is a shortage of adequate comparative studies comparing the machine learning techniques used for prediction. Hence, this paper attempts to tackle those two problems by comparing five common machine learning techniques on the problem of hospital readmission likelihood prediction based on real data.
[Figure: methodology workflow (understanding data, feature selection).]
TABLE II. FEATURES IMPORTANCE

Variable     Importance (average)   Decision
R            0.029016               Confirmed
G            0.020294               Confirmed
A            0.010165               Rejected
AT           0.023854               Confirmed
DI           0.027554               Confirmed
AS           0.008961               Rejected
MS           0.019117               Confirmed
A1Cresult    0.020177               Confirmed
TH           0.062025               Confirmed
NL           0.149317               Confirmed
NP           0.046521               Confirmed
NM           0.111058               Confirmed
NO           0.030696               Confirmed
NE           0.066718               Confirmed
NI           0.104099               Confirmed
ND           0.055811               Confirmed
Change       0.023027               Confirmed

B. Constructing Machine Learning Techniques Models

In this comparative study, the selected models included one output/target with two values (True or False) regarding hospital readmission during a period of 30 days. In other words, the value of the readmission parameter is true if readmission occurs within 30 days of discharge. Otherwise, in case of no readmission, or in case readmission occurs after 30 days, its value is false. The set of drivers for the prediction comprised the selected features discussed above. The training dataset and the testing dataset were selected randomly. Additionally, 10-fold cross validation was applied, selecting 40% of the data for testing and the rest for training. The settings of the various models are discussed below.
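A minimal sketch of this evaluation setup is given below. It assumes a scikit-learn workflow (the library is not named explicitly in the paper) and uses synthetic stand-in data in place of the hospital records.

    # Minimal sketch of the setup described above (scikit-learn assumed).
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split, cross_val_score

    # Stand-in data: in the actual study, X holds the confirmed features of
    # Table II and y is the 30-day readmission label (True/False).
    X, y = make_classification(n_samples=1000, n_features=15, random_state=0)

    # Random split: 40% of the data for testing, the rest for training.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)

    # 10-fold cross validation; any of the five classifiers can be substituted here.
    model = LogisticRegression(max_iter=1000)
    fold_scores = cross_val_score(model, X_train, y_train, cv=10, scoring='accuracy')
    print(fold_scores.min(), fold_scores.max(), fold_scores.mean())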
1) Logistic regression: This model was built by importing the logistic regression module and using it to generate the classifier. Grid search was employed to find the best hyper-parameters and the corresponding optimal accuracy.
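A hedged sketch of this step follows, reusing X_train and y_train from the earlier sketch; the searched grid of C values is illustrative, since the paper does not list it.

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV

    # Illustrative grid over the inverse regularization strength C.
    param_grid = {'C': [0.01, 0.1, 1, 10, 100]}
    grid = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=10, scoring='accuracy')
    grid.fit(X_train, y_train)                  # X_train, y_train from the earlier sketch
    print(grid.best_params_, grid.best_score_)  # best hyper-parameters and accuracy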
2) Support vector machine: A kernel SVM was trained using the training set. A support vector classification (SVC) task was used. In this technique, there are several parameters such as C, kernel, and gamma, where C represents the error-term penalty parameter, kernel determines the kernel type used by the algorithm (in our case 'rbf'), and gamma is the kernel coefficient, such that a high value of gamma attempts to fit the training data completely. Grid search was employed to determine the optimal parameters and accuracy. Table III shows that the optimal value of C is 10, while the optimal value of gamma was 0.3.

TABLE III. SVM ACCURACY

C        Accuracy
0.1      0.8169582
1.00     0.9102736
10.00    0.9246298
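A sketch of the corresponding grid search, again assuming scikit-learn; only the best values found (C = 10, gamma = 0.3) are reported in the paper, so the grids below are illustrative.

    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    param_grid = {'C': [0.1, 1.0, 10.0],        # error-term penalty
                  'gamma': [0.1, 0.3, 1.0]}     # RBF kernel coefficient
    grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=10, scoring='accuracy')
    grid.fit(X_train, y_train)                  # X_train, y_train from the earlier sketch
    print(grid.best_params_)                    # best reported values: C = 10, gamma = 0.3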
3) Decision tree: This model was generated using the 'gini' function to evaluate the split quality of the tree. In our study, min_samples_split = 30 is the minimal number of samples needed for splitting an internal node, and max_depth is the maximal tree depth. Grid search was conducted, and the best value of max_depth was 15, as depicted in Table IV.

TABLE IV. DECISION TREE ACCURACY

max_depth    Accuracy
5            0.8326603
10           0.8627187
15           0.8788694
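The decision tree setup can be sketched as follows (scikit-learn assumed); the max_depth grid is taken from Table IV.

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import GridSearchCV

    # CART-style tree: 'gini' split criterion and min_samples_split fixed at 30.
    tree = DecisionTreeClassifier(criterion='gini', min_samples_split=30, random_state=0)
    grid = GridSearchCV(tree, {'max_depth': [5, 10, 15]}, cv=10, scoring='accuracy')
    grid.fit(X_train, y_train)                  # X_train, y_train from the earlier sketch
    print(grid.best_params_)                    # best max_depth reported in the paper: 15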
TABLE VII. 10-FOLD CROSS VALIDATION FOR EMPLOYED TECHNIQUES

SVM        CART       NB         LR         MLP
0.952239   0.898507   0.570149   0.641791   0.794030
0.943284   0.901493   0.644776   0.686567   0.835821
0.940299   0.886567   0.585075   0.620896   0.776119
0.895522   0.907463   0.659701   0.647761   0.820896
0.922388   0.853731   0.579104   0.600000   0.800000
0.916168   0.883234   0.622754   0.628743   0.775449
0.922156   0.865269   0.646707   0.682635   0.796407
0.946108   0.925150   0.655689   0.646707   0.790419
0.924925   0.906907   0.690691   0.630631   0.792793
0.939940   0.906907   0.657658   0.669670   0.795796

TABLE VIII. ACCURACY OF EMPLOYED TECHNIQUES

No   Model                    Min        Max        Mean
1    Support Vector Machine   0.895522   0.952239   0.930303
2    Decision Tree            0.853731   0.925150   0.893523
3    Multi-Layer Perceptron   0.775449   0.835821   0.797773
4    Naïve Bayes              0.570149   0.690691   0.631230
5    Logistic Regression      0.600000   0.686567   0.645540
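Table VIII summarizes the per-fold accuracies of Table VII by their minimum, maximum, and mean; for instance, for the SVM column:

    import numpy as np

    # Per-fold SVM accuracies copied from the first column of Table VII.
    svm_folds = np.array([0.952239, 0.943284, 0.940299, 0.895522, 0.922388,
                          0.916168, 0.922156, 0.946108, 0.924925, 0.939940])
    print(svm_folds.min(), svm_folds.max(), svm_folds.mean())
    # prints 0.895522 0.952239 0.9303029, matching row 1 of Table VIII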
VI. CONCLUSION AND FUTURE WORK

This paper presented a comparative study among five machine learning techniques, namely LR, MLP, the NB classifier, decision trees, and SVMs, for predicting the likelihood of hospital readmission of diabetic patients. The study relied on real data collected from hospitals in the United States. Based on the study, SVM provided the best performance. Nevertheless, the study will be extended to compare additional techniques, and larger datasets will be considered as well.

REFERENCES
[1] G. Roglic, "Global report on diabetes," World Health Organization, vol. 58, no. 12, pp. 1-88, 2016.
[2] D. Rubin, K. Donnell-Jackson, R. Jhingan, S. Golden, and A. Paranjape, "Early readmission among patients with diabetes: A qualitative assessment of contributing factors," J. Diabetes Complications, vol. 28, no. 6, pp. 869-873, 2014.
[3] I. Kavakiotis, O. Tsave, A. Salifoglou, N. Maglaveras, I. Vlahavas, and I. Chouvarda, "Machine Learning and Data Mining Methods in Diabetes Research," Comput. Struct. Biotechnol. J., vol. 15, pp. 104-116, 2017.
[4] P. Chowriappa, S. Dua, and Y. Todorov, "Machine Learning in Healthcare Informatics," vol. 56, pp. 1-23, 2014.
[5] T. Mitchell, Machine Learning (McGraw-Hill International Editions Computer Science Series), 1997.
[6] E. Bose and K. Radhakrishnan, "Using Unsupervised Machine Learning to Identify Subgroups among Home Health Patients with Heart Failure Using Telehealth," CIN - Comput. Informatics Nurs., vol. 36, no. 5, pp. 242-248, 2018.
[7] L. Kaelbling, A. Littman, and A. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, 1996.
[8] K. Shailaja, B. Seetharamulu, and M. Jabbar, "Machine Learning in Healthcare: A Review," 2018 Second Int. Conf. Electron. Commun. Aerosp. Technol. (ICECA), pp. 910-914, 2018.
[9] J. Davies and J. Gibbons, "Machine Learning and Software Engineering in Health Informatics," in Proc. First Int. Workshop on Realizing AI Synergies in Software Engineering, 2012, pp. 37-41.
[10] R. Bhardwaj, A. Nambiar, and D. Dutta, "A Study of Machine Learning in Healthcare," Proc. Int. Comput. Softw. Appl. Conf., vol. 2, pp. 236-241, 2017.
[11] A. Asuncion and D. Newman, "UCI Machine Learning Repository," 2007. [Online]. Available: https://archive.ics.uci.edu/ml/index.php.
[12] B. Strack et al., "Impact of HbA1c measurement on hospital readmission rates: Analysis of 70,000 clinical database patient records," Biomed Res. Int., vol. 2014, 2014.
[13] A. H. Karp, "Using logistic regression to predict customer retention," Proc. Eleventh Northeast SAS Users Group Conf., 1998. [Online]. Available: http://www.lexjansen.com/nesug/nesug98/solu/p095.pdf.
[14] F. Amato, A. López, E. M. Peña-Méndez, P. Vaňhara, A. Hampl, and J. Havel, "Artificial neural networks in medical diagnosis," J. Appl. Biomed., vol. 11, no. 2, pp. 47-58, 2013.
[15] S. F., "Machine-Learning Techniques for Customer Retention: A Comparative Study," Int. J. Adv. Comput. Sci. Appl., vol. 9, no. 2, pp. 273-281, 2018.
[16] N. Jothi, N. Rashid, and W. Husain, "Data Mining in Healthcare - A Review," Procedia Comput. Sci., vol. 72, pp. 306-313, 2015.
[17] D. Sisodia and D. Sisodia, "Prediction of Diabetes using Classification Algorithms," Procedia Comput. Sci., vol. 132, pp. 1578-1585, 2018.
[18] A. Hazra, S. Kumar, and A. Gupta, "Study and Analysis of Breast Cancer Cell Detection using Naïve Bayes, SVM and Ensemble Algorithms," Int. J. Comput. Appl., vol. 145, no. 2, pp. 39-45, 2016.
[19] E. Holzschuh, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting," Reports Prog. Phys., vol. 55, no. 7, pp. 1035-1091, 1992.
[20] R. Sharma, V. Sugumaran, H. Kumar, and M. Amarnath, "A comparative study of naive Bayes classifier and Bayes net classifier for fault diagnosis of roller bearing using sound signal," Int. J. Decis. Support Syst., vol. 1, no. 1, p. 115, 2015.
[21] S. Mandal, A. Gupta, A. Mukherjee, and A. Mukherjee, "Heart Disease Diagnosis and Prediction Using Machine Learning and Data Mining Techniques: A Review," vol. 10, no. 7, pp. 2137-2159, 2017.
[22] N. Nai-Arun and P. Sittidech, "Ensemble Learning Model for Diabetes Classification," Adv. Mater. Res., vol. 931-932, pp. 1427-1431, 2014.
[23] S. Perveen, M. Shahbaz, A. Guergachi, and K. Keshavjee, "Performance Analysis of Data Mining Classification Techniques to Predict Diabetes," Procedia Comput. Sci., vol. 82, pp. 115-121, 2016.
[24] K. Orabi, Y. Kamal, and T. Rabah, "Early Predictive System for Diabetes Mellitus Disease," vol. 1, pp. 420-427, 2016.
[25] D. Paul, "Analysing Feature Importances for Diabetes Prediction using Machine Learning," 2018 IEEE 9th Annu. Inf. Technol. Electron. Mob. Commun. Conf., pp. 924-928, 2018.
[26] J. Kerexeta, A. Artetxe, V. Escolar, A. Lozano, and N. Larburu, "Predicting 30-day Readmission in Heart Failure using Machine Learning Techniques," Proc. 11th Int. Jt. Conf. Biomed. Eng. Syst. Technol., vol. 5, pp. 308-315, 2018.
[27] S. F., "Machine-Learning Techniques for Customer Retention: A Comparative Study," Int. J. Adv. Comput. Sci. Appl., vol. 9, no. 2, pp. 273-281, 2018.
[28] K. Kira and L. A. Rendell, A Practical Approach to Feature Selection. Morgan Kaufmann Publishers, Inc., 1992.
[29] D. W. Opitz, "Feature selection for ensembles," Proc. 16th Natl. Conf. Artif. Intell. (AAAI), vol. 16, no. 3, pp. 379-384, 1999.
[30] I. Guyon and A. Elisseeff, "An Introduction to Variable and Feature Selection," J. Mach. Learn. Res., vol. 3, pp. 1157-1182, 2003.
[31] M. Sokolova and G. Lapalme, "A systematic analysis of performance measures for classification tasks," Information Processing and Management, vol. 45, pp. 427-437, 2009.