Mini Report 2
Submitted in partial fulfilment of the requirements for the award of the degree
of
BACHELOR OF TECHNOLOGY
in
INFORMATION TECHNOLOGY
by
Guided by
Dr Arvind Rehalia
CANDIDATE’S DECLARATION
It is hereby certified that the work which is being presented in the B. Tech Minor Project Report
entitled "HEART DISEASE PREDICTION USING ML" in partial fulfilment of the
requirements for the award of the degree of Bachelor of Technology and submitted in the
Department of Information Technology of BHARATI VIDYAPEETH’S COLLEGE OF
ENGINEERING, New Delhi (Affiliated to Guru Gobind Singh Indraprastha University,
Delhi) is an authentic record of our own work carried out during the period from September 2022
to January 2023 under the guidance of Dr. Arun Kumar Dubey.
The matter presented in this B. Tech Minor Project Report has not been submitted by us for the
award of any other degree of this or any other Institute.
This is to certify that the above statement made by the candidates is correct to the best of my
knowledge. He/She/They are permitted to appear in the External Minor Project Examination.
Abstract
This report presents the mini-project on Machine Learning assigned to seventh-semester students in partial
fulfillment of the course requirements, given by the department of computer science and engineering, KU.
Cardiovascular diseases have been the most common cause of death worldwide over the last few decades, in
developed as well as underdeveloped and developing countries. Early detection of cardiac diseases
and continuous supervision by clinicians can reduce the mortality rate. However, it is not possible to
monitor patients accurately every day in all cases, and round-the-clock consultation with a doctor
is not available, since it requires more skill, time and expertise. In this project, we have developed
and researched models that predict heart disease from the various heart-related attributes of a patient
and detect impending heart disease using machine learning techniques such as the backward elimination
algorithm, logistic regression and RFECV on a dataset publicly available on the Kaggle website, further
evaluating the results using a confusion matrix and cross-validation. The early prognosis of
cardiovascular diseases can aid in making decisions on lifestyle changes in high-risk patients and in
turn reduce complications, which can be a great milestone in the field of medicine.
Acknowledgement
We express our deep gratitude to Dr. Arvind Rehalia, Department of Information Technology
Engineering, for his valuable guidance and suggestions throughout our project work. We are
thankful to Dr. Arun Kumar Dubey and Mahesh Kumar, Project Coordinators, for their valuable
guidance.
We would also like to extend our sincere thanks to the Head of the Department, Mr. Prakhar
Priyadarshi, for his time-to-time suggestions to complete our project work. We are also thankful to
Prof. Dharmender Saini, Principal, for providing us the facilities to carry out our project work.
Table of Contents
CANDIDATE’S DECLARATION II
ABSTRACT III
ACKNOWLEDGEMENT IV
TABLE OF CONTENTS V-XX
Chapter-1
1.1 Introduction
According to the World Health Organization, every year 12 million deaths occur worldwide due to heart disease. The
burden of cardiovascular disease has been rapidly increasing all over the world over the past few years. Many studies have
been conducted in an attempt to pinpoint the most influential factors of heart disease as well as to accurately predict the overall
risk.
Heart disease is often described as a silent killer because it can lead to death without obvious symptoms. The
early diagnosis of heart disease plays a vital role in making decisions on lifestyle changes in high-risk patients and in
turn reduces complications. This project aims to predict future heart disease by analyzing patient data and
classifying, using machine-learning algorithms, whether or not a patient has heart disease.
The major challenge in heart disease is its detection. Instruments that can predict heart disease are available, but
they are either expensive or not efficient at estimating the chance of heart disease in humans. Early detection of cardiac
diseases can decrease the mortality rate and overall complications.
However, it is not possible to monitor patients accurately every day in all cases, and round-the-clock consultation
with a doctor is not available, since it requires more skill, time, and expertise. Since we have a good amount
of data in today’s world, we can use various machine learning algorithms to analyze the data for hidden patterns. These
hidden patterns can be used for health diagnosis in medical data.
1.2 Motivation
Machine learning techniques have been around for a long time and have been compared and used for analysis in many kinds of
scientific applications. The major motivation behind this research-based project was to explore the feature-selection methods, data
preparation and processing behind training models in machine learning. Even with readily available models and libraries,
the challenge we face today is that, despite the abundance of data and well-prepared models, the accuracy we see
during training, testing and actual validation shows high variance.
Hence, this project is carried out with the motivation to explore what lies behind the models and to implement a Logistic
Regression model to train on the obtained data. Furthermore, as machine learning as a whole is motivated by the goal of developing an
appropriate computer-based system and decision support that can aid early detection of heart disease, in this project
we have developed a model which classifies whether a patient will have heart disease within ten years or not based on various
features.
Hence, the early prognosis of cardiovascular diseases can aid in making decisions on lifestyle changes in high-risk
patients and in turn reduce complications, which can be a great milestone in the field of medicine.
1.3 Objectives
1.4 Related Works
With growing development in the field of medical science alongside machine learning, various experiments and
studies have been carried out in recent years, resulting in significant published papers. One paper proposes
heart disease prediction using KStar, J48, SMO, Bayes Net and Multilayer Perceptron using the WEKA software.
Based on performance across different factors, SMO (89% accuracy) and Bayes Net (87% accuracy) achieved
better performance than the KStar, Multilayer Perceptron and J48 techniques using k-fold cross-validation.
The accuracy achieved by those algorithms is still not satisfactory, so the accuracy needs to be improved
further to support better decisions in disease diagnosis.
In a study conducted using the Cleveland heart disease dataset, which contains 303 instances, using 10-fold
cross-validation, considering 13 attributes and implementing 4 different algorithms, the authors concluded that Gaussian Naïve
Bayes and Random Forest gave the maximum accuracy of 91.2 percent. Using the similar dataset from Framingham,
Massachusetts, experiments were carried out with 4 models, which were trained and tested with maximum accuracies of
K Neighbors Classifier: 87%, Support Vector Classifier: 83%, Decision Tree Classifier: 79%, and Random Forest
Classifier.
1.5 Data
What you'll want to do here is dive into the data your problem definition is based on. This may involve sourcing the data,
defining different parameters, talking to experts about it and finding out what you should expect.
The original data came from the Cleveland database in the UCI Machine Learning Repository.
However, we've downloaded it in a formatted way from Kaggle.
The original database contains 76 attributes, but here only 14 attributes will be used. Attributes (also called features)
are the variables we'll use to predict our target variable.
Attributes and features are also referred to as independent variables, and a target variable can be referred to as
a dependent variable.
Chapter-2
2.1 Evaluation
Features
Features are different parts of the data. During this step, you'll want to start finding out what you can about the
data.
A data dictionary describes the data you're dealing with. Not all datasets come with one, so this is where
you may have to do your own research or ask a subject matter expert (someone who knows about the data) for
more information.
The following are the features we'll use to predict our target variable (heart disease or no heart disease).
8. thalach - maximum heart rate achieved
9. exang - exercise induced angina (1 = yes; 0 = no)
10. oldpeak - ST depression induced by exercise relative to rest
    o looks at the stress of the heart during exercise
    o an unhealthy heart will stress more
11. slope - the slope of the peak exercise ST segment
    o 0: Upsloping: better heart rate with exercise (uncommon)
    o 1: Flatsloping: minimal change (typical healthy heart)
It's a good idea to save these to a Python dictionary or in an external file, so we can look at
them later without coming back here.
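As a sketch of that idea (the file name and the exact description strings are illustrative, not part of the original notebook), the feature notes could be stored and saved like this:

import json

# Hypothetical data dictionary for a few of the 14 features used in this project.
# Keys are column names from the Kaggle/Cleveland dataset; descriptions mirror the notes above.
feature_dict = {
    "thalach": "maximum heart rate achieved",
    "exang": "exercise induced angina (1 = yes; 0 = no)",
    "oldpeak": "ST depression induced by exercise relative to rest",
    "slope": "slope of the peak exercise ST segment",
}

# Save to an external JSON file so the dictionary can be reloaded later.
with open("feature_dictionary.json", "w") as f:
    json.dump(feature_dict, f, indent=2)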
The human heart functions throughout a person’s lifespan and is one of the most robust and
hardest working muscles in the human body.
Besides humans, most other animals also possess a heart that pumps blood throughout their
bodies. Even invertebrates such as grasshoppers possess a heart-like pumping organ, though
it does not function the same way a human heart does.
2.2 Position of the heart in the human body
The human heart is located between the lungs in the thoracic cavity, slightly towards the left of the
sternum (breastbone). It is derived from the embryonic mesodermal germ layer.
The function of the heart in any organism is to maintain a constant flow of blood throughout the
body. This replenishes oxygen and circulates nutrients among the cells and tissues.
One of the primary functions of the human heart is to pump blood throughout the body.
Blood delivers oxygen, hormones, glucose and other components to various parts of the body, including
the human heart.
The heart also ensures that adequate blood pressure is maintained in the body.
There are two types of circulation within the body, namely pulmonary circulation and systemic
circulation.
Now, the heart itself is a muscle and therefore, it needs a constant supply of oxygenated blood.
This is where another type of circulation comes into play, the coronary circulation.
Coronary circulation is an essential portion of the circulation, where oxygenated blood is supplied to
the heart. This is important as the heart is responsible for supplying blood throughout the body.
Moreover, organs like the brain need a steady flow of fresh, oxygenated blood to ensure functionality.
2.3 Structure of the human heart
The human heart is about the size of a human fist and is divided into four chambers, namely two
ventricles and two atria. The ventricles are the chambers that pump blood, and the atria are the
chambers that receive blood. Among these, the right atrium and ventricle make up the “right
heart,” and the left atrium and ventricle make up the “left heart.” The structure of the heart also
houses the biggest artery in the body – the aorta.
One of the very first structures which can be observed when the external structure of the heart is
viewed is the pericardium.
Pericardium
The human heart is situated to the left of the chest and is enclosed within a fluid-filled cavity
described as the pericardial cavity. The walls and lining of the pericardial cavity are made up of a
membrane known as the pericardium.
The pericardium is a fibrous membrane found as an external covering around the heart. It protects
the heart by producing a serous fluid, which serves to lubricate the heart and prevent friction
with the surrounding organs. Apart from lubrication, the pericardium also helps by holding
the heart in its position and by maintaining a hollow space for the heart to expand into when it is
full. The wall of the heart itself is made up of three layers—
Epicardium – Epicardium is the outermost layer of the heart. It is composed of a thin-layered membrane
that serves to lubricate and protect the outer section.
Myocardium – This is a layer of muscle tissue and it constitutes the middle layer wall of the heart. It
contributes to the thickness and is responsible for the pumping action.
Endocardium – It is the innermost layer that lines the inner heart chambers and covers the heart
valves. Furthermore, it prevents the blood from sticking to the inner walls, thereby preventing potentially
fatal blood clots
Vertebrate hearts can be classified based on the number of chambers present. For instance, most
fish have two chambers, and reptiles and amphibians have three chambers. Avian and
mammalian hearts consist of four chambers. Humans are mammals; hence, we have four
chambers, namely:
Left atrium
Right atrium
Left ventricle
Right ventricle
Atria have thinner, less muscular walls and are smaller than the ventricles. These are the blood-receiving
chambers that are fed by the large veins.
Ventricles are larger and more muscular chambers responsible for pumping and pushing blood
out into circulation. These are connected to larger arteries that deliver blood for circulation.
The right ventricle and right atrium are comparatively smaller than the left chambers. The walls
consist of fewer muscles compared to the left portion, and the size difference is based on their
functions. The blood originating from the right side flows through the pulmonary circulation, while
blood arising from the left chambers is pumped throughout the body.
Blood Vessels
In organisms with closed circulatory systems, the blood flows within vessels of varying sizes. All
vertebrates, including humans, possess this type of circulation. The external structure of the heart
has many blood vessels that form a network, with other major vessels emerging from within the
structure. The blood vessels typically comprise the following:
Veins carry deoxygenated blood to the heart via the inferior and superior vena cava, and it eventually
drains into the right atrium.
Capillaries are tiny, tube-like vessels which form a network between the arteries and veins.
Arteries are muscular-walled tubes mainly involved in carrying oxygenated blood away from the heart
to all other parts of the body. The aorta is the largest of the arteries, and it branches off into various smaller
arteries throughout the body.
2.4 Preparing the tools
At the start of any project, it's customary to see the required libraries imported in a big chunk like the one you can see
below.
However, in practice, your projects may import libraries as you go. After you've spent a couple of hours working
on your problem, you'll probably want to do some tidying up. This is where you may want to consolidate every
library you've used at the top of your notebook (like the cell below).
The libraries you use will differ from project to project. But there are a few which you'll likely take advantage
of during almost every structured data project.
Pandas has a built-in function to read .csv files called read_csv(), which takes the file pathname of your .csv file.
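A minimal sketch of such a consolidated import cell and the data-loading step might look like the following (the file name heart-disease.csv is an assumption; use the path of the file downloaded from Kaggle):

# Standard imports for a structured-data project like this one.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Models and evaluation tools from scikit-learn.
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import confusion_matrix

# Read the formatted Kaggle version of the Cleveland data into a DataFrame.
df = pd.read_csv("heart-disease.csv")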
Compare different columns to each other, compare them to the target variable. Refer back to your data dictionary and
remind yourself of what different columns mean.
Your goal is to become a subject matter expert on the dataset you're working with. So if someone asks you a question
about it, you can give them an explanation and when you start building models, you can sound check them to make
sure they're not performing too well (overfitting) or why they might be performing poorly (underfitting).
Since EDA has no real set methodology, the following is a short checklist you might want to walk through:
One of the quickest and easiest ways to check your data is with the head() function. Calling it on any DataFrame will
print the top 5 rows, while tail() prints the bottom 5. You can also pass a number to them, like head(10), to show the top 10
rows.
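For example (assuming the DataFrame df loaded above):

df.head()      # first 5 rows
df.tail()      # last 5 rows
df.head(10)    # first 10 rows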
value_counts() allows you to show how many times each of the values of a
categorical column appears.
# Let's see how many positive (1) and negative (0) samples we have in our dataframe
1    165
0    138
Name: target, dtype: int64
Since these two values are close to even, our target column can be considered balanced. An unbalanced target
column, meaning some classes have far more samples than others, can be harder to model than a balanced one. Ideally, all of your
target classes have the same number of samples.
If you'd prefer these values as percentages, value_counts() takes a parameter, normalize, which can be set to True.
# Normalized value counts
1    0.544554
0    0.455446
We can plot the target column value counts by calling the plot() function and telling it what kind of plot we'd like; in this
case, bar is good.
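A minimal sketch of those calls (assuming the DataFrame df and imports from earlier, and a target column named target; the colours are illustrative):

# Count positive (1) and negative (0) samples in the target column.
df["target"].value_counts()

# The same counts expressed as proportions.
df["target"].value_counts(normalize=True)

# Plot the counts as a bar chart.
df["target"].value_counts().plot(kind="bar", color=["salmon", "lightblue"])
plt.title("Heart Disease Frequency")
plt.show()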
Heart Disease Frequency according to Gender
If you want to compare two columns to each other, you can use the function pd.crosstab(column_1, column_2).
This is helpful if you want to start gaining an intuition about how your independent variables interact with your
dependent variable.
Let's compare our target column with the sex column.
Remember from our data dictionary, for the target column, 1 = heart disease present, 0 = no heart disease. And for sex,
1 = male, 0 = female.
sex        0      1
target
0         24    114
1         72     93
You can plot the crosstab by using the plot() function and passing it a few parameters such as kind (the type of plot
you want), figsize=(length, width) (how big you want it to be) and color=[colour_1, colour_2] (the different colours
you'd like to use).
Different metrics are represented best with different kinds of plots. In our case, a bar graph is great. We'll see examples of
more later. And with a bit of practice, you'll gain an intuition of which plot to use with different variables.
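A sketch of that comparison (again assuming df, with columns named target and sex; figure size and colours are illustrative):

# Cross-tabulate the target against sex.
pd.crosstab(df["target"], df["sex"])

# Plot the crosstab as a grouped bar chart.
pd.crosstab(df["target"], df["sex"]).plot(kind="bar",
                                          figsize=(10, 6),
                                          color=["salmon", "lightblue"])
plt.title("Heart Disease Frequency according to Sex")
plt.xlabel("0 = No Disease, 1 = Disease")
plt.legend(["Female", "Male"])
plt.show()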
2.5 Age vs Max Heart rate for Heart Disease
Let's try combining a couple of independent variables, such as age and thalach (maximum heart rate), and then
comparing them to our target variable, heart disease.
Because there are so many different values for age and thalach, we'll use a scatter plot.
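A minimal sketch of such a scatter plot, assuming columns named age, thalach and target (colours and figure size are illustrative):

# Scatter plot of age vs. maximum heart rate, split by heart disease status.
plt.figure(figsize=(10, 6))
plt.scatter(df.age[df.target == 1], df.thalach[df.target == 1], c="salmon")
plt.scatter(df.age[df.target == 0], df.thalach[df.target == 0], c="lightblue")
plt.title("Age vs. Max Heart Rate for Heart Disease")
plt.xlabel("Age")
plt.ylabel("Max heart rate (thalach)")
plt.legend(["Disease", "No disease"])
plt.show()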
Heart Disease Frequency per Chest Pain Type
Let's try another independent variable. This time, cp (chest pain).
We'll use the same process as we did before with sex.
target     0     1
cp
0        104    39
1          9    41
2         18    69
3          7    16
3. cp - chest pain type
0: Typical angina: chest pain related to decreased blood supply to the heart
1: Atypical angina: chest pain not related to heart
2: Non-anginal pain: typically esophageal spasms (non heart related)
3: Asymptomatic: chest pain not showing signs of disease
Model Comparison
Since we've saved our models' scores to a dictionary, we can plot them by first converting them to a DataFrame.
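A sketch of that step, assuming a dictionary model_scores collected from model.score() calls (the numbers below are placeholders for illustration, not this project's measured results, apart from the 85% logistic regression accuracy reported in the conclusion):

# Hypothetical scores dictionary; values would normally come from model.score(X_test, y_test).
model_scores = {"Logistic Regression": 0.85,
                "KNN": 0.75,
                "Random Forest": 0.83}

# Convert to a DataFrame and plot as a bar chart.
model_compare = pd.DataFrame(model_scores, index=["accuracy"])
model_compare.T.plot(kind="bar")
plt.title("Model Comparison")
plt.ylabel("Accuracy")
plt.show()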
Chapter-3
3.1 Machine Learning
Machine Learning is the field of study that gives computers the capability to learn without being explicitly
programmed. ML is one of the most exciting technologies that one could come across. As is
evident from the name, it gives the computer something that makes it more similar to humans: the ability to learn.
Machine learning is actively being used today, perhaps in many more places than one would expect.
A machine can learn by itself from past data and automatically improve.
From the given dataset it detects various patterns in the data.
For big organizations branding is important, and it becomes easier to target a relatable customer
base.
It is similar to data mining because it also deals with huge amounts of data.
Supervised Machine Learning is where you have input variables (x) and an output variable (Y) and you use
an algorithm to learn the mapping function from the input to the output, Y = f(X). The goal is to approximate
the mapping function so well that when you have new input data (x) you can predict the output variable (Y)
for that data.
Supervised learning problems can be further grouped into Regression and Classification problems.
Regression: Regression algorithms are used to predict a continuous numerical output. For example, a
regression algorithm could be used to predict the price of a house based on its size, location, and other
features.
Classification: Classification algorithms are used to predict a categorical output. For example, a
classification algorithm could be used to predict whether an email is spam or not.
Classification Types
There are two main classification types in machine learning:
Binary Classification
In binary classification, the goal is to classify the input into one of two classes or categories. Example – On
the basis of the given health conditions of a person, we have to determine whether the person has a certain
disease or not.
Multiclass Classification
In multi-class classification, the goal is to classify the input into one of several classes or categories. For
example, on the basis of data about different species of flowers, we have to determine which species our
observation belongs to.
Regression analysis is a statistical process for estimating the relationships between a dependent (criterion)
variable and one or more independent variables (predictors). Regression analysis is generally
used when we deal with a dataset that has the target variable in the form of continuous data. Regression
analysis explains the changes in the criterion variable in relation to changes in selected predictors.
The conditional expectation of the criterion variable is based on the predictors, i.e., the average value of the dependent
variable when the independent variables are varied. Three major uses for regression analysis are
determining the strength of predictors, forecasting an effect, and trend forecasting.
There are times when we would like to analyze the effect of different independent features on the target, or
what we call the dependent feature. This helps us make decisions that can affect the target variable in the
desired direction.
Regression analysis is heavily based on statistics and hence gives quite reliable results; for this reason,
regression models are used to find both linear and non-linear relations between the independent variables and the
dependent (target) variable.
Along with the development of the machine learning domain, regression analysis techniques have gained
popularity and developed manifold from just y = mx + c.
There are several types of regression techniques, each suited for different types of data and different types of
relationships. The main types of regression techniques are:
Polynomial Regression
This is an extension of linear regression and is used to model a non-linear relationship between the
dependent variable and independent variables. Here as well the syntax remains the same, but now in the input
variables we also include polynomial or higher-degree terms of some already existing features.
Linear regression was only able to fit a linear model to the data at hand, but with polynomial features, we can
easily fit some non-linear relationship between the target and the input features.
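A minimal sketch of this idea using scikit-learn (the single toy feature x and the quadratic relationship are illustrative, not part of this project's pipeline):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Toy data with a non-linear (quadratic) relationship plus noise.
x = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 + x.ravel() + np.random.normal(0, 0.2, 50)

# Same linear-regression syntax, but the input is expanded with degree-2 terms.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)
print(model.predict([[1.5]]))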
As the name suggests, "Random Forest is a classifier that contains a number of decision trees on various
subsets of the given dataset and takes the average to improve the predictive accuracy of that
dataset." Instead of relying on one decision tree, the random forest takes the prediction from each tree and
based on the majority votes of predictions, and it predicts the final output.
The greater number of trees in the forest leads to higher accuracy and prevents the problem of
overfitting.
The below diagram explains the working of the Random Forest algorithm:
Assumptions for Random Forest
Since the random forest combines multiple trees to predict the class of the dataset, it is possible that some decision trees
may predict the correct output, while others may not. But together, all the trees predict the correct output. Therefore,
below are two assumptions for a better Random Forest classifier:
o There should be some actual values in the feature variables of the dataset so that the classifier can predict accurate
results rather than guessed results.
o The predictions from each tree must have very low correlations.
Random Forest algorithm is a powerful tree learning technique in Machine Learning. It works by creating a number of Decision
Trees during the training phase. Each tree is constructed using a random subset of the data set to measure a random subset of
features in each partition. This randomness introduces variability among individual trees, reducing the risk of overfitting and
improving overall prediction performance.
In prediction, the algorithm aggregates the results of all trees, either by voting (for classification tasks) or by
averaging (for regression tasks).
This collaborative decision-making process, supported by multiple trees with their own insights, provides stable and
precise results. Random forests are widely used for classification and regression tasks and are known for their ability to
handle complex data, reduce overfitting, and provide reliable forecasts in different environments.
Ensemble learning models work like a group of diverse experts teaming up to make a decision: picture a group of
friends with different skills working on a project. Each friend excels in a particular area, and by combining their
strengths, they create a more robust solution than any individual could achieve alone.
The Random Forest algorithm works in several steps, which are discussed below:
Random Forest leverages the power of ensemble learning by constructing an army of Decision Trees. These trees are like
individual experts, each specializing in a particular aspect of the data. Importantly, they operate independently, minimizing the
risk of the model being overly influenced by the nuances of a single tree.
Random Feature Selection: To ensure that each decision tree in the ensemble brings a unique perspective, Random Forest
employs random feature selection. During the training of each tree, a random subset of features is chosen.
This randomness ensures that each tree focuses on different aspects of the data, fostering a diverse set of predictors within the
ensemble.
Bootstrap Aggregating or Bagging: The technique of bagging is a cornerstone of Random Forest’s training strategy which
involves creating multiple bootstrap samples from the original dataset, allowing instances to be sampled with replacement. This
results in different subsets of data for each decision tree, introducing variability in the training process and making the model
more robust.
Decision Making and Voting: When it comes to making predictions, each decision tree in the Random Forest casts its vote. For
classification tasks, the final prediction is determined by the mode (most frequent prediction) across all the trees. In regression
tasks, the average of the individual tree predictions is taken. This internal voting mechanism ensures a balanced and collective
decision-making process
Applications of Random Forest in Real-World Scenarios
Finance Wizard: Imagine Random Forest as our financial superhero, diving into the world of credit scoring. Its mission? To
determine if you’re a credit superhero or, well, not so much. With a knack for handling financial data and sidestepping overfitting
issues, it’s like having a guardian angel for robust risk assessments.
Health Detective: In healthcare, Random Forest turns into a medical Sherlock Holmes. Armed with the ability to decode medical
jargon, patient records, and test results, it’s not just predicting outcomes; it’s practically assisting doctors in solving the mysteries
of patient health.
Environmental Guardian: Out in nature, Random Forest transforms into an environmental superhero. With the power to
decipher satellite images and brave noisy data, it becomes the go-to hero for tasks like tracking land cover changes and
safeguarding against potential deforestation, standing as the protector of our green spaces.
Digital Bodyguard: In the digital realm, Random Forest becomes our vigilant guardian against online trickery. It's like a
cyber-sleuth, analyzing our digital footsteps for any hint of suspicious activity. Its ensemble approach is akin to having a
team of cyber-detectives, spotting subtle deviations that scream “fraud alert!” It's not just protecting our online
transactions; it's our digital bodyguard.
Random Forest Classification is an ensemble learning technique designed to enhance the accuracy and robustness of classification
tasks. The algorithm builds a multitude of decision trees during training and outputs the class that is the mode of the classes
predicted by the individual trees. Each decision tree in the random forest is constructed using a subset of the training data and a
random subset of features, introducing diversity among the trees and making the model more robust and less prone to overfitting.
The random forest algorithm employs a technique called bagging (Bootstrap Aggregating) to create these diverse subsets.
During the training phase, each tree is built by recursively partitioning the data based on the features. At each split, the algorithm
selects the best feature from the random subset, optimizing for information gain or Gini impurity. The process continues until a
predefined stopping criterion is met, such as reaching a maximum depth or having a minimum number of samples in each leaf
node.
Once the random forest is trained, it can make predictions: each tree “votes” for a class, and the class with the most votes
becomes the predicted class for the input data. A few practical tuning notes:
o More trees generally lead to better performance, but at the cost of computational time.
o Deeper trees can capture more complex patterns, but also risk overfitting.
o Experiment with values between 5 and 15, and consider lower values for smaller datasets.
o Gini impurity is often slightly faster than entropy, but both are generally similar in performance.
o Higher values can prevent overfitting, but values that are too high can limit model complexity.
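As a hedged sketch, these tuning knobs map roughly onto scikit-learn parameters as shown below; the specific values are illustrative defaults, not tuned results from this project, and df is the heart-disease DataFrame loaded earlier:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Split the heart-disease data into train and test sets.
X = df.drop("target", axis=1)
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = RandomForestClassifier(
    n_estimators=100,      # number of trees: more trees, better but slower
    max_depth=10,          # deeper trees capture more patterns but may overfit
    criterion="gini",      # split quality measure: "gini" or "entropy"
    min_samples_leaf=2,    # higher values help prevent overfitting
    random_state=42,
)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))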
What is Random Forest Regression?
Random Forest Regression in machine learning is an ensemble technique capable of performing both regression and classification
tasks with the use of multiple decision trees and a technique called Bootstrap Aggregation, commonly known as bagging. The
basic idea behind this is to combine multiple decision trees in determining the final output rather than relying on individual
decision trees.
Random Forest has multiple decision trees as base learning models. We randomly perform row sampling and feature sampling
from the dataset, forming sample datasets for every model. This part is called Bootstrap.
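A minimal sketch of this bootstrap-and-aggregate idea with scikit-learn's regressor (the toy data is illustrative; this project itself uses classification):

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy regression data: each tree is trained on a bootstrap sample (row sampling)
# and considers a random subset of features at each split (feature sampling).
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 3))
y = X[:, 0] ** 2 + 3 * X[:, 1] + rng.normal(0, 1, 200)

reg = RandomForestRegressor(
    n_estimators=100,     # number of bootstrapped trees
    max_features="sqrt",  # random feature subset per split
    bootstrap=True,       # sample rows with replacement
    random_state=0,
)
reg.fit(X, y)
# The final prediction is the average of the individual tree predictions.
print(reg.predict([[5.0, 2.0, 1.0]]))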
We need to approach the Random Forest regression technique like any other machine learning technique:
o Design a specific question or problem and identify the source of the required data.
o Make sure the data is in an accessible format; otherwise, convert it to the required format.
o Specify all noticeable anomalies and missing data points that may need to be handled to obtain the required data.
o Now compare the performance metrics on both the test data and the data predicted by the model.
o If it doesn't satisfy your expectations, you can try improving your model accordingly, updating your data, or using another
data modeling technique.
The working process can be explained in the following steps and diagram:
Step-1: Select random data points (subsets) from the training set.
Step-2: Build the decision trees associated with the selected data points (subsets).
Step-3: Choose the number N of decision trees that you want to build.
Step-4: Repeat the previous steps until N trees have been built.
Step-5: For new data points, find the predictions of each decision tree, and assign the new data point to the category
that wins the majority of votes.
Example: Suppose there is a dataset that contains multiple fruit images. So, this dataset is given to the Random
forest classifier. The dataset is divided into subsets and given to each decision tree. During the training phase,
each decision tree produces a prediction result, and when a new data point occurs, then based on the majority of
results, the Random Forest classifier predicts the final decision. Consider the below image:
Applications of Random Forest
There are mainly four sectors where Random Forest is mostly used:
1. Banking: Banking sector mostly uses this algorithm for the identification of loan risk.
2. Medicine: With the help of this algorithm, disease trends and risks of the disease can be identified.
3. Land Use: We can identify the areas of similar land use by this algorithm.
K-Nearest Neighbor (K-NN) Algorithm
o The K-NN algorithm assumes the similarity between the new case/data and the available cases and puts the new case into
the category that is most similar to the available categories.
o The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means
that when new data appears, it can be easily classified into a well-suited category by using the K-NN algorithm.
o The K-NN algorithm can be used for Regression as well as for Classification, but mostly it is used for
Classification problems.
o K-NN is a non-parametric algorithm, which means it does not make any assumptions about the underlying data.
o It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead it
stores the dataset, and at the time of classification, it performs an action on the dataset.
o The KNN algorithm at the training phase just stores the dataset, and when it gets new data, it classifies that data
into a category that is most similar to the new data.
o Example: Suppose we have an image of a creature that looks similar to a cat and a dog, but we want to know whether
it is a cat or a dog. For this identification, we can use the K-NN algorithm, as it works on a similarity measure.
Our K-NN model will find the features of the new image similar to those of the cat and dog images, and based on the
most similar features it will put it in either the cat or dog category.
Why do we need a K-NN Algorithm?
Suppose there are two categories, i.e., Category A and Category B, and we have a new data point x1; in which of
these categories will this data point lie? To solve this type of problem, we need a K-NN algorithm. With
the help of K-NN, we can easily identify the category or class of a particular data point. Consider the below
diagram:
Suppose we have a new data point and we need to put it in the required category. Consider the below image:
o Firstly, we will choose the number of neighbors, so we will choose k = 5.
o Next, we will calculate the Euclidean distance between the data points. The Euclidean distance is the distance
between two points, which we have already studied in geometry. For two points (x1, y1) and (x2, y2), it can be
calculated as d = √((x2 − x1)² + (y2 − y1)²).
o By calculating the Euclidean distance, we found the nearest neighbors: three nearest neighbors in category A and
two nearest neighbors in category B. Consider the below image:
o As we can see, the 3 nearest neighbors are from category A; hence this new data point must belong to category A.
o There is no particular way to determine the best value for "K", so we need to try some values to find the best out
of them. The most preferred value for K is 5.
o A very low value for K, such as K=1 or K=2, can be noisy and lead to the effects of outliers in the model.
o Large values for K are good, but they may cause some difficulties.
Python implementation of the KNN algorithm
For the Python implementation of the K-NN algorithm, we will use the same problem and dataset which we
used for Logistic Regression, but here we will improve the performance of the model. Below is the
problem description:
Problem for the K-NN Algorithm: There is a car manufacturer company that has manufactured a new SUV car.
The company wants to show ads to the users who are interested in buying that SUV. For this problem, we
have a dataset that contains information on multiple users from a social network. The dataset contains lots of
information, but we will consider Estimated Salary and Age as the independent variables and
the Purchased variable as the dependent variable. Below is the dataset:
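A hedged sketch of that K-NN workflow with scikit-learn, assuming columns named Age, EstimatedSalary and Purchased (the file name User_Data.csv is illustrative):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

# Illustrative file and column names; adjust to the actual dataset.
data = pd.read_csv("User_Data.csv")
X = data[["Age", "EstimatedSalary"]]
y = data["Purchased"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Feature scaling matters for distance-based methods like K-NN.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# k = 5 neighbors with Euclidean distance (minkowski metric, p=2).
knn = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
print(confusion_matrix(y_test, y_pred))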
o Logistic regression predicts the output of a categorical dependent variable. Therefore the outcome must be a
categorical or discrete value. It can be either Yes or No, 0 or 1, True or False, etc., but instead of giving the exact
values 0 and 1, it gives probabilistic values which lie between 0 and 1.
o Logistic Regression is much like Linear Regression, except in how they are used. Linear Regression is
used for solving regression problems, whereas Logistic Regression is used for solving classification
problems.
o In Logistic Regression, instead of fitting a regression line, we fit an "S"-shaped logistic function, which predicts
two maximum values (0 or 1).
o The curve from the logistic function indicates the likelihood of something, such as whether cells are cancerous
or not, or whether a mouse is obese or not based on its weight, etc.
o Logistic Regression is a significant machine learning algorithm because it has the ability to provide probabilities
and classify new data using continuous and discrete datasets.
o Logistic Regression can be used to classify observations using different types of data and can easily determine
the most effective variables used for the classification. The below image shows the logistic function:
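Since that image is the standard logistic (sigmoid) curve, the function it shows can be written as σ(z) = 1 / (1 + e^(−z)), which maps any real-valued input z to a probability between 0 and 1.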
Note: Logistic regression uses the concept of predictive modeling like regression; therefore, it is
called logistic regression, but it is used to classify samples, and therefore it falls under
classification algorithms.
o In logistic regression, we use the concept of a threshold value, which defines the probability of either 0 or 1.
Values above the threshold tend towards 1, and values below the threshold tend towards 0.
o The equation of a straight line can be written as y = b0 + b1x1 + b2x2 + ... + bnxn. In Logistic Regression, y can be
between 0 and 1 only, so we divide the equation by (1 − y), giving y / (1 − y).
o But we need a range from −infinity to +infinity, so we take the logarithm of the equation, which becomes
log[y / (1 − y)] = b0 + b1x1 + b2x2 + ... + bnxn.
Type of Logistic Regression:
On the basis of the categories, Logistic Regression can be classified into three types:
o Binomial: In binomial Logistic regression, there can be only two possible types of the dependent variables, such
as 0 or 1, Pass or Fail, etc.
o Multinomial: In multinomial Logistic regression, there can be 3 or more possible unordered types of the
dependent variable, such as "cat", "dog", or "sheep".
o Ordinal: In ordinal Logistic regression, there can be 3 or more possible ordered types of the dependent variable,
such as "low", "Medium", or "High".
Example: There is a dataset which contains the information of various users obtained from social
networking sites. A car-making company has recently launched a new SUV car, and the company
wants to check how many users from the dataset want to purchase the car.
For this problem, we will build a Machine Learning model using the Logistic Regression algorithm. The dataset
is shown in the image below. In this problem, we will predict the Purchased variable (dependent
variable) by using Age and Salary (independent variables).
Steps in Logistic Regression: To implement the Logistic Regression using Python, we will use the same steps
as we have done in previous topics of Regression. Below are the steps:
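A minimal sketch of those steps applied to the heart-disease data used in this project (assuming the DataFrame df and target column from earlier; the hyperparameters shown are defaults, not tuned values):

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import confusion_matrix

# Split features and target, then hold out a test set.
X = df.drop("target", axis=1)
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the logistic regression model.
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)

# Evaluate with accuracy, a confusion matrix and cross-validation, as described in the abstract.
print("Test accuracy:", log_reg.score(X_test, y_test))
print(confusion_matrix(y_test, log_reg.predict(X_test)))
print("5-fold CV accuracy:", cross_val_score(log_reg, X, y, cv=5).mean())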
Chapter-4
4.1 CONCLUSION
The early prognosis of cardiovascular diseases can aid in making decisions on lifestyle changes in high-risk patients
and in turn reduce complications, which can be a great milestone in the field of medicine. This project explored the
feature selection (i.e., backward elimination and RFECV) behind the models and successfully predicted heart disease
with 85% accuracy. The model used was Logistic Regression. For further enhancement, we can train additional models
to predict the types of cardiovascular disease and provide recommendations to the users, and also use more advanced
models.
4.2 REFERENCES
[1] A. H. M. S. U. Marjia Sultana, "Analysis of Data Mining Techniques for Heart Disease Prediction," 2018.
[3] A. B. Nassif, I. Shahin, M. Bader, A. Hassan, and N. Werghi, "COVID-19 Detection Systems Using Deep-Learning Algorithms Based on Speech and Image Data," Mathematics, 2022.
[6] S. Rehman, E. Rehman, M. Ikram, and Z. Jianglin, "Cardiovascular disease (CVD): assessment, prediction and policy implications," BMC Public Health, vol. 21, no. 1, p. 1299, 2021, doi: 10.1186/s12889-021-11334-2.
[7] O. Atef, A. B. Nassif, M. A. Talib, and Q. Nassir, "Death/Recovery Prediction for Covid-19 Patients using Machine Learning," 2020.
[8] H. Hijazi, M. Abu Talib, A. Hasasneh, A. Bou Nassif, N. Ahmed, and Q. Nasir, "Wearable Devices, Smartphones, and Interpretable Artificial Intelligence in Combating COVID-19," Sensors, vol. 21, no. 24, 2021, doi: 10.3390/s21248424.
[9] O. T. Ali, A. B. Nassif, and L. F. Capretz, "Business intelligence solutions in healthcare a case study: Transforming OLTP system to BI solution," in 2013 3rd International Conference on Communications and Information Technology, ICCIT 2013, 2013, pp. 209–214, doi: 10.1109/ICCITechnology.2013.6579551.