
Objective

Uploaded by

saidevanpittu41

Introduction

• Chronic kidney disease (CKD) means your kidneys are damaged and are not
filtering your blood the way they should.
• The primary role of the kidneys is to filter extra water and waste from
your blood to produce urine. If a person has CKD, wastes are not removed
properly and build up in the body.
• The disease is called chronic because the damage occurs gradually over a
long period.
• CKD can lead to other health problems. It has many causes, such as
diabetes, high blood pressure, and heart disease. Along with these
conditions, the risk of CKD also depends on age and gender.
• If your kidneys are not working, you may notice one or more symptoms
such as abdominal pain, back pain, diarrhoea, fever, nosebleeds, rash, or
vomiting.
• The two main causes of CKD are (i) diabetes and (ii) high blood
pressure, so controlling these two conditions helps prevent CKD.
• Usually, CKD shows no signs until the kidneys are badly damaged.

Abstract
• Objective: Develop a machine learning-based system for early detection
of chronic kidney disease (CKD) to aid timely and effective treatment,
using public CKD datasets for training.
• Data Processing: Ensure consistent input quality by addressing missing
values, scaling features, and encoding categorical data, followed by LASSO
feature selection to identify key predictors of CKD.
• Modeling Approach: Employ XGBoost as the primary classification model
due to its strong performance on imbalanced datasets, complemented by
a Linear Support Vector Machine (LSVM) for comparative analysis.
• Class Imbalance Solution: Use the Synthetic Minority Oversampling
Technique (SMOTE) to balance the dataset, improving model accuracy and
ensuring fair representation of CKD and non-CKD cases.
• Deployment and User Interface: Use Streamlit/Flask for real-time
CKD predictions through an accessible interface, giving healthcare
professionals a reliable tool for early CKD diagnosis and ultimately
contributing to improved patient outcomes.

Chronic kidney disease is one of the most critical illnesses today, and it
should be diagnosed as early as possible. Machine learning techniques have
become a reliable aid to medical diagnosis.
With the help of machine learning classifier algorithms, doctors can
detect the disease in time. From this perspective, chronic kidney disease
prediction is discussed in this article.
Literature review
Methodology
A. DATASET
The Chronic Kidney Disease dataset is used for this research work.
Many researchers have also used this dataset. It is provided by the
UC Irvine Machine Learning Repository and is available on the UCI
website. The dataset contains 400 instances and 24 attributes plus
1 target attribute. The target attribute is labelled with two classes
to represent CKD or non-CKD. The dataset was collected from various
hospitals in 2015. It also contains missing values.

B. METHODOLOGY

C. PREPROCESSING OF DATA
Data preprocessing is a technique used to convert raw data into a
clean dataset. It is a basic step before training any machine
learning classifier. It includes actions such as handling missing
values, rescaling the dataset, converting values to binary form, and
standardizing the dataset.
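
The preprocessing actions listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual pipeline: the column names (`age`, `bp`, `htn`) and the three toy records are assumptions made for the example, and mean imputation and min-max scaling are only one reasonable choice for each step.

```python
from statistics import mean

# Toy records: column names and values are illustrative assumptions,
# not rows taken from the real CKD dataset. None marks a missing value.
rows = [
    {"age": 48.0, "bp": 80.0, "htn": "yes"},
    {"age": None, "bp": 70.0, "htn": "no"},
    {"age": 62.0, "bp": None, "htn": "yes"},
]

def impute_mean(rows, col):
    """Replace missing (None) numeric values with the mean of the observed ones."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    for r in rows:
        if r[col] is None:
            r[col] = fill

def min_max_scale(rows, col):
    """Rescale a numeric column to the range [0, 1]."""
    values = [r[col] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    for r in rows:
        r[col] = (r[col] - lo) / span

def encode_binary(rows, col, positive="yes"):
    """Map a two-valued categorical column to 1 (positive) or 0."""
    for r in rows:
        r[col] = 1 if r[col] == positive else 0

for col in ("age", "bp"):
    impute_mean(rows, col)
    min_max_scale(rows, col)
encode_binary(rows, "htn")
```

In a real experiment the fill values and scaling ranges would normally be computed on the training split only and then reused on the test split, so that no information leaks from the test data.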
D. FEATURE SELECTION
Feature selection is needed before training each machine learning
classifier, because leaving unnecessary attributes in the dataset
may affect the results. A classifier combined with feature
selection tends to perform better and takes less time to
execute.
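
The abstract names LASSO as the feature selection method. As a rough sketch of how LASSO discards attributes, the example below minimises the standard LASSO objective with cyclic coordinate descent and keeps only the features whose weight is non-zero. The data, the penalty `alpha=0.1`, and the function names are assumptions chosen for illustration; a library implementation would normally be used in practice.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho, norm = 0.0, 0.0
            for i in range(n):
                # Prediction using every feature except j.
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (norm / n) if norm else 0.0
    return w

# Toy data (assumed): the target depends only on feature 0; feature 1 is noise.
X = [[1.0, 0.1], [2.0, -0.2], [3.0, 0.1], [4.0, -0.1], [5.0, 0.2]]
y = [2.0 * row[0] for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

Here the L1 penalty drives the weight of the uninformative feature to exactly zero, so only feature 0 ends up in `selected`. A larger `alpha` removes more features; a smaller one keeps more.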
E. CLASSIFICATION ALGORITHMS
Classification is an important task in supervised learning.
Classifiers learn from the training dataset and are then applied
to the testing dataset to predict the target attribute.
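
The train-then-test flow can be illustrated with a small linear SVM, the comparison model named in the abstract. This is a from-scratch sketch trained by batch subgradient descent on the regularised hinge loss, not the project's implementation; the two scaled features, the toy points, and the hyperparameters (`lam`, `lr`, `epochs`) are all assumptions.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights w and bias b on the L2-regularised hinge loss.

    Labels must be +1 (CKD) or -1 (non-CKD).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # hinge loss is active for this point
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, X):
    """Return +1 or -1 for each row depending on the side of the hyperplane."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1 for xi in X]

# Toy, already-scaled training data (assumed values).
X_train = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.9], [0.9, 0.7],
           [0.1, 0.2], [0.2, 0.1], [0.3, 0.1], [0.1, 0.3]]
y_train = [1, 1, 1, 1, -1, -1, -1, -1]

# Held-out points the classifier never saw during training.
X_test = [[0.85, 0.75], [0.15, 0.25]]

w, b = train_linear_svm(X_train, y_train)
y_pred = predict(w, b, X_test)
```

The classifier sees only `X_train` and `y_train` while learning, and `X_test` is used solely to check how well the learned boundary generalises.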
F. PERFORMANCE EVALUATION MEASURE
1) CONFUSION MATRIX DESCRIPTION
2) CLASSIFICATION ACCURACY
3) CLASSIFICATION ERROR
4) PRECISION
5) RECALL
6) F-MEASURE
7) ROC AND AUC
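
The measures listed above all follow from the four confusion matrix counts, except AUC, which needs predicted scores. The sketch below uses the standard definitions; the label and score lists are made-up values for illustration, and the AUC is computed with the pairwise-ranking formulation (the share of positive/negative pairs ranked correctly, with ties counted as half).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn

def summarise(y_true, y_pred):
    """Compute accuracy, error, precision, recall, and F-measure."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / total,
        "error": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / denom if denom else 0.0,
    }

def auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and predictions (1 = CKD, 0 = non-CKD).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
report = summarise(y_true, y_pred)

# Made-up scores for a separate four-sample example.
auc_value = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])
```

In this example there are 3 true positives, 5 true negatives, 1 false positive, and 1 false negative, so accuracy is 0.8 and precision, recall, and F-measure are each 0.75.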
G. SMOTE
The Synthetic Minority Oversampling Technique (SMOTE) is used for
oversampling the minority class. It is also known as a balancer. It
takes the whole dataset as input but works only on the minority
class, increasing the percentage of minority cases. SMOTE uses
k-nearest neighbours (KNN) to generate new instances. It does not
change the majority cases. The new examples are not simple
duplicates of existing minority cases. Instead, the algorithm takes
samples of the feature space for each target class and its nearest
neighbours and generates new examples that combine features of the
target case with features of its neighbours. This approach
increases the features available to each class and makes the
samples more general.
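
The interpolation step at the heart of SMOTE can be sketched as follows. This is a simplified variant written for illustration, not a reference implementation: it picks a random minority point, chooses one of its `k` nearest minority neighbours, and places a new point somewhere on the line segment between them. The toy points, `k=2`, and the fixed seed are assumptions.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: squared_distance(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority-class points with two features (assumed values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies between two real minority points, the new samples stay inside the region the minority class already occupies rather than duplicating existing rows. Oversampling is normally applied to the training split only, so the test set keeps its original class distribution.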
