Volume 6, Issue 6, June – 2021 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Detection of Heart Failure Using Different Machine Learning Algorithms
Raghav Sharma
Mayuri Mukewar
Anurag Navale
Asmita Manna

Abstract:- The heart is a key organ of the body, since blood circulation to the other organs depends on its efficient working. Coronary artery diseases now diminish the working ability of the heart to a large extent, resulting in heart failure in many cases. A survey conducted by the WHO reveals that around 17 million people die of various heart diseases each year, about 29.20% of all deaths worldwide. To identify heart diseases, doctors carry out several pathological procedures and medical investigations. With data mining and machine learning techniques, better insights can be drawn from existing test results and the number of pathological procedures can be reduced. A system built on data mining and machine learning algorithms can overcome the dearth of tools for classifying the data and predicting the risk state of cardiac patients. This paper presents a comparative survey of such approaches for the investigation of cardiac diseases using data mining techniques. The results of this comparative study should help researchers in this domain channel their research in the appropriate direction.

Keywords:- Comparative Study, Machine Learning, Investigation, Naive Bayes, K-Nearest Neighbor, Random Forest, Decision Table, K-Means.

I. INTRODUCTION

Heart diseases are increasing day by day. According to a WHO survey, approximately 17.9 million people around the globe die of various heart diseases each year, and approximately 80% of these deaths are due to coronary heart disease (e.g. heart attack) and cerebrovascular disease (e.g. stroke). As a result, the development of most countries is affected to some extent. Predicting heart disease beforehand and changing lifestyle accordingly would help to reduce the number of deaths. Machine learning and data mining are nowadays used to solve problems related to many diseases, and prediction is one of the important problems where machine learning techniques are widely utilized. In this work, we present a comparative study of existing data mining and ML algorithms that predict heart disease by processing existing heart patients' data. Heart diseases are not limited to coronary diseases; they also affect other internal structures connected to the heart. Data about such diseases has been gathered and studied for a long time. We consider cholesterol, lung-function test, treadmill check, etc. as parameters for predicting the risk rate of the heart.

II. LITERATURE REVIEW

The paper "Predictive Analysis of Heart Disease using K-Means and Apriori Algorithms" by Hadia Amin and colleagues [2019] proposed a technique for detecting heart failure in patients using K-Means and Apriori. Their approach showed that when the Apriori and K-Means algorithms are applied together, the disease can be predicted much better than before, which helps the doctor make the decisions needed to diagnose patients. These algorithms effectively predicted the cardiac risk stage (low, medium, or high) and assisted clinicians in accurately understanding the patient's condition and providing an appropriate diagnosis. They also aided the hospital's storage and maintenance of patient records.

The paper "Congestive heart failure detection using random forest classifier" by Zerina Masetic and colleagues demonstrates the outstanding performance of the Random Forest algorithm, showing that it is useful for detecting congestive heart failure and may provide valuable information for therapy.

The paper "Heart Disease detection using Naive Bayes Algorithm" by K. Vembandasamy and colleagues [2015] stated that data mining techniques have become one of the most important and best-known solutions for many health-related issues. Among these techniques, the Naive Bayes algorithm has given appropriate results for the prediction of heart disease. The results obtained show that the Naive Bayes algorithm provides an accuracy of 86% with minimum execution time.

The Decision Tree technique, according to Dilip Kumar Choubey [2020], relies on the entropy (randomness) of the input samples. It is a Divide and Conquer (DAC) method in which the trees are constructed from the top down. In this algorithm, the data is first preprocessed by splitting it into training and test data.
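To make the attribute selection behind the Decision Tree technique more concrete, the following is a minimal sketch that assumes entropy-based information gain as the selection measure; the surveyed paper's exact measure is not given here. The function names and the toy records are hypothetical and are not taken from any of the surveyed datasets.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting the records on one attribute."""
    total = len(labels)
    # Group the class labels by the value each record takes for the attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain (assumed measure)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy records, for illustration only.
toy_rows = [
    {"chest_pain": "yes", "smoker": "yes"},
    {"chest_pain": "yes", "smoker": "no"},
    {"chest_pain": "no", "smoker": "yes"},
    {"chest_pain": "no", "smoker": "no"},
]
toy_labels = ["disease", "disease", "healthy", "healthy"]
```

In this toy data, splitting on chest_pain separates the two classes completely, so a top-down tree builder using this measure would place it at the root.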

IJISRT21JUN960 www.ijisrt.com 1211


Edward Choi and colleagues [2017] found that deep learning models adapted to leverage temporal relations appear to improve the performance of models for detecting incident heart failure (HF) with a short observation window of 12-18 months.

Vishal Dineshkumar Soni [2020] argued that comparing different ways of predicting cardiac disease with data mining, studying the differences among established data mining algorithms, and evaluating those methods are important and worthwhile. The most often used data mining techniques for predicting risk are Naive Bayes, Random Forest, and Decision Tree.

Rajesh N, T Maneesha and colleagues [2018] stated that KNN is a supervised classifier that uses observations from within a training set to predict classification labels. KNN is one of the most often used classification algorithms in heart disease research. Implementing KNN rests on a few assumptions, the most common being that the dataset has low noise, is labelled, and contains relevant features. Processing big datasets with KNN takes a long time. This algorithm achieves an accuracy of 63.4%.

Logesh Kannan N and colleagues [2020] stated in their paper "Heart Disease Detection using Machine Learning" that machine learning and data mining have proven to be among the important and appropriate solutions for predicting the risk state of the heart of cardiac patients. Their paper uses Python programming along with machine learning techniques to predict the risk state of the heart.
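The KNN classifier described in this review can be illustrated with a minimal sketch. It assumes Euclidean distance and simple majority voting, which the surveyed papers may or may not use; the function name, the feature pairs and the labels below are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labelled training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical (age, resting blood pressure) records with made-up risk labels.
points = [(40, 120), (42, 118), (45, 125), (63, 160), (65, 155), (70, 165)]
labels = ["low", "low", "low", "high", "high", "high"]
```

Every prediction scans the whole training set, which is consistent with the remark that KNN becomes slow on big datasets. In practice the features would usually be scaled first so that no single attribute dominates the distance.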

Author: Rajesh N, T Maneesha, Shaik Hafeez, Hari Krishna (2018)
Title: Machine Learning Algorithms for Heart Disease Prediction
Method: Naive Bayes
Pros: A combination of ML methods such as Naive Bayes and K-Means can be used to achieve appropriate accuracy.
Cons: In some cases Naive Bayes will not give exact outputs, so the outputs of different ML techniques need to be considered.
Remark:
● If the input data is clean and well kept, Naive Bayes produces a more accurate answer.

Author: Hadia Amin, Abita Devi, Nida UI Amin (2019)
Title: Predictive Analysis of Heart Disease using K-Means and Apriori Algorithms
Method: K-Means and Apriori
Pros: The K-Means algorithm produces a more accurate and significant result than the Apriori algorithm's weighted correlation criteria.
Cons: Apriori alone cannot give effective results.
Remark:
● The results of the experiments suggest that the best approach to predicting heart disease is to combine the Apriori and K-Means algorithms.
● It helps the doctor make the decisions needed to diagnose patients.

Author: K. Vembandasamy, R. Sasipriya and E. Deepa (2015)
Title: Heart Diseases Detection Using Naive Bayes Algorithm
Method: Naive Bayes
Pros: A Naive Bayes model is easy to build, with no complex iterative parameter estimation, which makes it particularly helpful for determining the risk rate of heart patients.
Cons: In some cases, decision trees give less accurate results on small datasets.
Remark:
● The results obtained show that the Naive Bayes technique provides 86.4198% accuracy in the least time.

Author: Zerina Masetic, Abdulhamit Subasi (2016)
Title: Congestive heart failure detection using random forest classifier
Method: Random Forest
Pros: Machine learning approaches are applied to classify normal and congestive heart failure (CHF) cases. The random forest algorithm detects CHF with 100% accuracy.
Cons: (not stated)
Remark:
● The results of different trials were analysed using a variety of statistical metrics (sensitivity, specificity, accuracy, F-measure, and ROC curve), and the random forest method was found to provide 100% accuracy.

Author: Chithambaram T, Logesh Kannan N, Gowsalya M (2019)
Title: Heart Disease Detection Using Machine Learning
Method: K-Nearest Neighbor, Random Classifier, Correlation, SVM
Pros: To minimise the occurrence of heart disease, the authors employed a decision tree method. They implemented the Gini index method using a hyperplane and a decision tree in the SVM algorithm, which exhibits the maximum gain in characteristics and gives a strong representation of the decision tree technique.
Cons: (not stated)
Remark:
● The major aim is to determine which algorithm is most accurate in predicting future issues that the illness may bring, as well as which algorithm is most accurate in determining whether a person has heart disease or not.

Author: Keshav Srivastava, Dilip Kumar Choubey (2020)
Title: Heart Disease Prediction using Machine Learning and Data Mining
Method: Decision Trees, KNN, Naive Bayes, Random Forest, SVM
Pros: A web app is built using Flask, and these packages are used to make predictions based on the data supplied by the user. Future researchers can enhance accuracy by employing data mining techniques to retrieve hidden information from samples.
Cons: (not stated)
Remark:
● In their experiment, they used the Cleveland Heart Disease dataset from the UCI repository, pre-processed the data with missing values, and used algorithms such as Decision Tree, K-Nearest Neighbour, Support Vector Machines, and Random Forest to achieve accuracy of 79%, 87%, and 83%, respectively.
● According to the ROC curve, the AUC for Decision Tree, K-Nearest Neighbour, Support Vector Machines, and Random Forest is 71.6%, 88.5%, 90.4%, and 90.8%, respectively.
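Several of the remarks above compare classifiers by sensitivity, specificity, accuracy and F-measure. As a reminder of how these figures relate to one another, the sketch below computes them from the four counts of a binary confusion matrix. The function name and the example counts are hypothetical and do not come from any surveyed paper; the sketch assumes all denominators are non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from binary confusion-matrix counts.

    tp/fn: diseased patients predicted as diseased/healthy.
    tn/fp: healthy patients predicted as healthy/diseased.
    """
    sensitivity = tp / (tp + fn)   # share of diseased patients that are detected
    specificity = tn / (tn + fp)   # share of healthy patients that are cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    # F-measure: harmonic mean of precision and sensitivity (recall).
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Hypothetical counts for 100 patients, for illustration only.
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

A single accuracy figure can hide a poor sensitivity when healthy patients dominate the dataset, which is one reason the surveyed studies also report AUC.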

Algorithmic Survey
The following algorithms were identified as giving more accurate results in the investigation of heart disease.

Algorithms compared: K-Means, Decision Tree, Random Forest

Definition

K-Means:
● The K-Means method is an iterative technique that attempts to split a dataset into K separate clusters, with each data point belonging to exactly one cluster.
● K-Means is widely used in data mining projects. The quality of the clusters remains the same throughout execution, giving accurate output.

Decision Tree:
● Decision Trees are a supervised machine learning method in which data is continually divided according to a set of rules.
● The tree can be described by two kinds of entities: decision nodes and leaves.
● The leaves represent the final results, while the decision nodes are the points where the data is split.

Random Forest:
● Random Forest is a classifier that holds several decision trees built on distinct subsets of a dataset and averages them to increase prediction accuracy on that dataset.
● It is superior to a single decision tree because it reduces over-fitting by averaging the results.

Algorithmic steps

K-Means:
1. Specify the number of clusters, K.
2. Initialize the centroids by shuffling the dataset and then picking K data points at random, without replacement.
3. Continue iterating until the centroids do not change, i.e. the assignment of data points to clusters does not change:
● Calculate the squared distance from each data point to every centroid.
● Assign each data point to the cluster whose centroid is closest to it.
● Recalculate each cluster centroid by averaging all of the data points assigned to that cluster.

Decision Tree:
1. Begin the tree with the root node, S, which contains the whole dataset.
2. Using the Attribute Selection Measure (ASM), find the best attribute in the dataset.
3. Divide S into subsets that contain the possible values of the best attribute.
4. Create a node in the decision tree that holds the best attribute.
5. Recursively create new decision trees using the subsets of the dataset produced in step 3.
6. Continue this procedure until the nodes can no longer be classified further; the final node is called a leaf node.

Random Forest:
1. Choose K data points at random from the training set.
2. Build the decision tree for the chosen data points (subset).
3. Choose the number N of decision trees you wish to create.
4. Repeat steps 1 and 2 until N trees have been built.
5. For a new data point, find the prediction of each decision tree, then assign the new data point to the category with the most votes.

Accuracy

K-Means: Gives less accurate results.
Decision Tree: Gives less accurate results.
Random Forest: Gives more accurate results.

Dataset

K-Means: Can handle massive amounts of data, but is sensitive to noisy data and outliers.
Decision Tree: Cannot handle large or noisy data.
Random Forest: Can handle enormous amounts of data as well as noisy data.

Speed

K-Means: Faster.
Decision Tree: Faster.
Random Forest: Faster than Decision Tree and K-Means.

Pros

K-Means:
1. It is relatively simple to implement.
2. It can handle huge data sets.
3. Convergence is guaranteed.
4. The positions of the centroids can be warm-started.
5. It adapts easily to new examples.
6. It generalizes to clusters of other shapes and sizes, such as elliptical clusters.

Decision Tree:
● Decision trees need less effort for data preparation during pre-processing than other methods.
● A decision tree does not need data normalisation.
● A decision tree does not need data scaling.
● Missing values in the data have no significant impact on the tree-building process.
● A decision tree model is simple to understand and to explain to technical teams and stakeholders.

Random Forest:
● Random Forest can handle both classification and regression problems.
● It can handle big datasets with high dimensionality.
● It improves the model's accuracy and mitigates the problem of overfitting.

Cons

K-Means:
● The number of clusters (k) must be determined ahead of time.
● It cannot deal with noisy data or outliers.
● It is not suitable for detecting clusters with non-convex shapes.

Decision Tree:
● A small change in the data can result in a large change in the decision tree's structure, making it unstable.
● Compared with other algorithms, a decision tree's calculation can at times become rather complex.
● The training period for decision trees is frequently longer.
● Because of the complexity and time required, decision tree training is relatively costly.
● The Decision Tree method falls short when applied to regression and the prediction of continuous values.

Random Forest:
● Although random forest may be used for both classification and regression tasks, it is less well suited to regression tasks.
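The K-Means steps listed in the algorithmic survey can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used by any surveyed paper: it uses squared Euclidean distance, a seeded random initialisation, and keeps the previous centroid if a cluster becomes empty. The function names and the sample points are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, seed=0, max_iterations=100):
    """Cluster `points` into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Pick K distinct data points at random as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assign each data point to the cluster with the closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Stop once the clustering of data points no longer changes.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Recompute each centroid as the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # assumption: keep the old centroid for an empty cluster
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical two-dimensional sample, for illustration only.
sample = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, assignments = k_means(sample, k=2)
```

Because the result depends on the initial centroids, K must be chosen in advance and different seeds can lead to different clusterings on less clearly separated data.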
Source for database: https://data.world/informatics-edu/heart-disease-prediction

III. CONCLUSION

After the literature survey, we found that among the techniques available in machine learning and data mining, Random Forest (100% accuracy), K-Nearest Neighbor (AUC of 88.5%) and Naive Bayes (86% accuracy) have proven to be the most effective algorithms for determining the risk state of the heart of cardiac patients.

REFERENCES

[1]. Hadia Amin, Abita Devi, Nida UI Amin, "Predictive Analysis of Heart Disease using K-Means and Apriori Algorithms", Journal of Applied Science and Computations, ISSN No: 1076-5131, Aug 2019.
[2]. Zerina Masetic, Abdulhamit Subasi, "Congestive heart failure detection using random forest classifier", Computer Methods and Programs in Biomedicine, Volume 130, July 2016.
[3]. K. Vembandasamy, R. Sasipriya and E. Deepa, "Heart Disease detection using Naive Bayes Algorithm", IJISET - International Journal of Innovative Science, Engineering & Technology, Vol. 2, Issue 9, September 2015.
[4]. Dilip Kumar Choubey, Keshav Srivastava, "Detection of Heart Disease using Machine Learning Techniques", International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume 9, Issue 1, May 2020.
[5]. Edward Choi, Andy Schuetz, Walter F Stewart, Jimeng Sun, "Using recurrent neural network models for early detection of heart failure onset", Journal of the American Medical Informatics Association, Volume 24, Issue 2, March 2017.
[6]. Vishal Dineshkumar Soni, "Detection Of Heart Disease Using Machine Learning Techniques", International Journal of Scientific & Technology Research, Volume 9, Issue 08, August 2020, ISSN 2277-8616.
[7]. Rajesh N, T Maneesha, Shaik Hafeez, Hari Krishna, "Prediction of Heart Disease using Machine Learning Algorithm", International Journal of Engineering & Technology, 7 (2.32) (2018) 363-366.
[8]. Chithambaram T, Logesh Kannan N, Gowsalya M, "Heart Disease Detection using Machine Learning", DOI: https://doi.org/10.21203/rs.3.rs-97004/v1
