Lung Disease Prediction System Using Data Mining Techniques

This document summarizes a research article that proposes using data mining techniques like classification and clustering to build a system for predicting lung disease. Specifically, it examines using Naive Bayes classification and a decision tree approach to analyze patient data and symptoms to detect lung diseases like cancer earlier. The system would work by having users enter symptoms and then mapping those symptoms to the training database to predict the disease state and severity level. Classification techniques like Naive Bayes and neural networks are discussed as methods that could be used to build the predictive models. The goal is to help doctors diagnose and treat lung diseases more quickly to improve patient outcomes.



Article in Journal of Advanced Research in Dynamical and Control Systems · January 2017


Jour of Adv Research in Dynamical & Control Systems, Vol. 9, No. 5, 2017

Lung Disease Prediction System Using Data Mining Techniques
S. Durga, Research Scholar, M.Phil (CS), Vels University, Chennai.
K. Kasturi, Assistant Professor, Dept of I.T, Vels University, Chennai.
Abstract--- Data mining is the analysis of very large amounts of data to extract useful information. Data mining
techniques such as association rule mining, classification and clustering are used to analyze different types of
disease. Classification is an important problem in data mining: given a database containing a collection of records,
each with a single class label, a classifier produces a concise and clear definition of each class that can be used
to classify subsequent records. Data mining plays an important role in medical systems. It is used to discover
knowledge in data and to present it in a form that humans can easily understand, and it is a cooperative effort of
humans and computers. Data mining has two primary goals: prediction and description. Prediction uses some variables
or fields in the data set to predict unknown or future values of other variables of interest. Description focuses on
finding patterns in the data that humans can interpret. Data mining is very useful for predicting diseases such as
heart disease and lung disease. Lung cancer is one of the most dangerous diseases in the world, and early detection
greatly improves the chance of a cure. Data mining plays an effective role when Naïve Bayes and Artificial Neural
Networks are applied to massive volumes of healthcare data. The healthcare industry collects huge amounts of data
which, unfortunately, are not mined to find hidden information. Naïve Bayes aims to deliver robust classifications
even when dealing with small or incomplete data sets. The aim of this paper is to detect and diagnose lung diseases
as early as possible, which will help doctors to save patients' lives. The paper describes how lung cancer can be
predicted and controlled using data mining techniques.
Keywords--- Data Mining, Lung Cancer, Naïve Bayes, Classification.

I. Introduction
Lung cancer, also known as lung carcinoma, is a malignant lung tumor characterized by uncontrolled cell
growth in tissues of the lung (1-2). If left untreated, this growth can spread beyond the lung by the process of
metastasis into nearby tissue or other parts of the body. The majority of lung cancer cases are due to tobacco
smoking. The other cases are caused by a combination of genetic factors and exposure to radon gas, asbestos,
second-hand smoke, or other forms of air pollution.
The two main types are:
• Small-cell lung carcinoma (SCLC)
• Non-small-cell lung carcinoma (NSCLC)
The symptoms of lung cancer include coughing, coughing up blood, wheezing, weakness, fever and bone pain.
Many of the symptoms, such as poor appetite and weight loss, are not specific to lung cancer. In many people, the
cancer has already spread beyond the original site by the time they have symptoms and seek medical attention. Lung
cancer can spread to the brain, bones, kidneys and other organs. About 10% of people with lung cancer do not have
symptoms at diagnosis; these cancers are found on routine chest radiography.
Treatment and long-term outcomes depend on the type of cancer, its stage, and the person's health. The common
treatments are surgery, chemotherapy and radiotherapy.
Smoking prevention and smoking cessation are effective ways of preventing the development of lung cancer.

II. Diagnosis
A chest radiograph is one of the first steps if a person reports symptoms that may suggest lung cancer. It may
reveal widening of the mediastinum, atelectasis, consolidation or pleural effusion. CT imaging is used to provide
more information about the type and extent of the disease. Bronchoscopy or CT-guided biopsy is often used to sample
the tumor for histopathology. The definitive diagnosis of lung cancer is based on histological examination of the
suspicious tissue in the context of the clinical and radiological features. CT imaging should not be used for longer
or more frequently than indicated, as extended surveillance exposes people to increased radiation.

ISSN 1943-023X 62

Worldwide in 2012, lung cancer occurred in 1.8 million people and resulted in 1.6 million deaths. It is the most
common cause of cancer-related death in men and the second most common in women, after breast cancer. The most
common age at diagnosis is 70 years.

III. Proposed System

Early prediction of the disease has been difficult in the past and remains so today. If the disease is predicted
early, the doctor has a better chance of saving the patient's life. This paper proposes to predict the disease as
early as possible based on the symptoms. Data mining techniques such as classification and clustering are helpful
for predicting the disease, and a hybrid data mining approach can be used for the prediction. The proposed system
predicts the disease state based on the symptoms.

Workflow for Disease Prediction System

The user enters the symptoms he or she is suffering from. Once the user's symptoms have been mapped against the
prior (training) database, the result is generated in terms of the disease state and the level of severity.
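
The mapping step can be illustrated with a minimal Python sketch. The paper does not specify how symptoms are matched against the database, so everything here is an assumption made for illustration: the `Record` type, the four invented training records, and the use of Jaccard overlap to pick the closest record are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One hypothetical training record: a symptom set with its known outcome."""
    symptoms: frozenset
    state: str
    level: str


# Invented records, for illustration only (not taken from the paper).
TRAINING_DB = [
    Record(frozenset({"cough", "wheezing"}), "lung disease", "low"),
    Record(frozenset({"cough", "fever", "weakness"}), "lung disease", "medium"),
    Record(frozenset({"coughing up blood", "bone pain", "weight loss"}), "lung disease", "high"),
    Record(frozenset({"fever"}), "no lung disease", "low"),
]


def jaccard(a, b):
    """Overlap between two symptom sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(symptoms, db=TRAINING_DB):
    """Map the entered symptoms to the most similar record; return (state, level)."""
    query = frozenset(s.strip().lower() for s in symptoms)
    best = max(db, key=lambda record: jaccard(query, record.symptoms))
    return best.state, best.level
```

For example, `predict(["Cough", "Wheezing"])` matches the first record exactly. A real system would replace the nearest-record lookup with a trained classifier such as the ones described in the following sections.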

IV. Classification
Classification is the process of finding a set of models (or functions) that describe and distinguish data
classes or concepts, so that the model can be used to predict the class of objects whose class label is
unknown (5,6). The derived model is based on the analysis of a set of training data (i.e., data objects whose class
label is known). The derived model may be represented in various forms, such as classification (IF-THEN) rules,
decision trees, mathematical formulae or neural networks. A decision tree is a flowchart-like tree structure, where
each node denotes a test on an attribute value, each branch represents an outcome of the test, and tree leaves
represent classes or class distributions. Decision trees can be easily converted to classification rules. A neural
network is a collection of linear threshold units that can be trained to distinguish objects of different classes.
Classification can be used for predicting the class label of data objects. In many applications, one may like to
predict some missing or
unavailable data values rather than class labels. When the predicted values are numerical, this is often
specifically referred to as prediction. Although prediction may refer to both data value prediction and class label
prediction, it is usually confined to data value prediction and is thus distinct from classification. Classification
is a data mining (machine learning) technique used to predict group membership for data instances. Popular
classification techniques include decision trees and neural networks. The Naïve Bayesian classifier is one of the
classification algorithms and is based on Bayes' theorem. A Naïve Bayesian model is easy to build, with no
complicated iterative parameter estimation, which makes it particularly useful for very large datasets. Bayes'
theorem provides a way of calculating the posterior probability, P(c|x), from P(c), P(x) and P(x|c). The Naïve Bayes
classifier assumes that the effect of the value of a predictor (x) on a given class (c) is independent of the values
of other predictors.

P(c|x) = P(x|c) P(c) / P(x)

Where,
P(c|x) is the posterior probability of the class (target) given the predictor (attribute)
P(c) is the prior probability of the class
P(x|c) is the likelihood, which is the probability of the predictor given the class
P(x) is the prior probability of the predictor
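
The quantities defined above can be turned into a small categorical Naïve Bayes classifier. The sketch below is a minimal illustration rather than the implementation used in the paper: the function names `train` and `class_posteriors` are hypothetical, the example attributes (smoker, cough) and labels are invented, and add-one (Laplace) smoothing is an added assumption so that unseen attribute values do not produce zero probabilities.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count classes and per-class attribute values from labelled training rows."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            values_seen[i].add(v)
    return class_counts, value_counts, values_seen


def class_posteriors(model, row):
    """Return P(c|x) for every class c, assuming independent predictors."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(c)
        for i, v in enumerate(row):
            # Likelihood P(x_i|c) with add-one smoothing (an assumption of this sketch).
            score *= (value_counts[(i, c)][v] + 1) / (n_c + len(values_seen[i]))
        scores[c] = score
    evidence = sum(scores.values())  # P(x), obtained by summing over all classes
    return {c: s / evidence for c, s in scores.items()}


# Invented data: each row is (smoker, cough).
rows = [("yes", "yes"), ("yes", "no"), ("no", "yes"),
        ("no", "no"), ("yes", "yes"), ("no", "no")]
labels = ["disease", "disease", "healthy", "healthy", "disease", "healthy"]
model = train(rows, labels)
result = class_posteriors(model, ("yes", "yes"))
```

Training needs only one counting pass over the data, which is what the text means by "no complicated iterative parameter estimation".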

V. Bayesian Classification
Bayesian classification is based on Bayes' theorem. Bayesian classifiers are statistical classifiers: they can
predict class membership probabilities, such as the probability that a given tuple belongs to a particular class (3).
Bayes' theorem is named after Thomas Bayes. It involves two types of probability (4):
• Posterior probability, P(H|X)
• Prior probability, P(H)
where X is a data tuple and H is some hypothesis.
According to Bayes' theorem,
P(H|X) = P(X|H) P(H) / P(X)
Bayes' theorem thus gives a way of finding a conditional probability from its converse and the unconditional
probabilities:
P(E|C) = P(C|E) P(E) / P(C) = P(C,E) / P(C)
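
A short worked example may make the theorem concrete. The numbers below are purely hypothetical and are not taken from the paper: H is "the patient has the disease" and X is "the patient reports a given symptom". The function name `bayes_posterior` is likewise an illustrative assumption. P(X) is expanded with the law of total probability.

```python
def bayes_posterior(prior_h, p_x_given_h, p_x_given_not_h):
    """Compute P(H|X) = P(X|H) P(H) / P(X) for a binary hypothesis H."""
    # P(X) = P(X|H) P(H) + P(X|not H) P(not H)
    evidence = p_x_given_h * prior_h + p_x_given_not_h * (1.0 - prior_h)
    return p_x_given_h * prior_h / evidence


# Hypothetical inputs: P(H) = 0.01, P(X|H) = 0.9, P(X|not H) = 0.1.
p = bayes_posterior(0.01, 0.9, 0.1)
print(round(p, 4))  # 0.0833
```

With these inputs P(X) = 0.9 × 0.01 + 0.1 × 0.99 = 0.108, so P(H|X) = 0.009 / 0.108 ≈ 0.083. The posterior is much larger than the 1% prior, but the hypothesis is still unlikely given this single piece of evidence.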

VI. Decision Tree

A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a
test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label (7).

The topmost node in the tree is the root node, and the nodes at the ends of the branches are the leaf nodes. The
user enters the symptoms, and the decision tree structure classifies the case as low, medium or high level based on
age.
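
The structure described above can be sketched in Python as follows. This is only an illustration: the paper does not give the age thresholds or the shape of its tree, so the cut-offs of 40 and 60 years and the names `Node`, `Leaf`, `classify` and `to_rules` are hypothetical. The `to_rules` helper shows how each root-to-leaf path yields one IF-THEN classification rule.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    label: str  # class label held by a leaf node


@dataclass
class Node:
    description: str               # human-readable form of the attribute test
    test: Callable[[dict], bool]   # the test applied at this internal node
    if_true: Union["Node", Leaf]   # branch followed when the test succeeds
    if_false: Union["Node", Leaf]  # branch followed when the test fails


# Hypothetical thresholds, chosen only to illustrate the low/medium/high split.
TREE = Node(
    "age < 40", lambda p: p["age"] < 40,
    Leaf("low"),
    Node("age < 60", lambda p: p["age"] < 60, Leaf("medium"), Leaf("high")),
)


def classify(tree, patient):
    """Walk from the root to a leaf, following the outcome of each test."""
    while isinstance(tree, Node):
        tree = tree.if_true if tree.test(patient) else tree.if_false
    return tree.label


def to_rules(tree, conditions=()):
    """Convert every root-to-leaf path into one IF-THEN rule."""
    if isinstance(tree, Leaf):
        condition = " AND ".join(conditions) or "TRUE"
        return [f"IF {condition} THEN level = {tree.label}"]
    return (to_rules(tree.if_true, conditions + (tree.description,))
            + to_rules(tree.if_false, conditions + (f"NOT ({tree.description})",)))
```

Calling `classify(TREE, {"age": 45})` follows the false branch of the root test and the true branch of the second test, ending at the "medium" leaf.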

VII. Conclusion and Future Work

Prevention of lung diseases is low in India, especially in rural areas, where the disease often goes unnoticed at
an early stage because of a lack of awareness. This paper proposes a system that can predict diseases based on the
symptoms provided by the user and help users to assess their health status, so that they can take precautions
according to the result. It could also help doctors to know the health state of the patient, which makes manual
diagnosis of the disease easier. In future work, we plan to conduct experiments on large real-time health datasets
to predict all the diseases and to compare the algorithm with other data mining algorithms. Continuous data can also
be used.

References
[1] Banu, M.N. and Gomathy, B. Disease Predicting System Using Data Mining Techniques. International
Journal of Technical Research and Applications 1 (5) (2013) 41-45.
[2] Ahmed, K., Abdullah-Al-Emran, A.A.E., Jesmin, T., Mukti, R.F., Rahman, M. and Ahmed, F. Early
detection of lung cancer risk using data mining. Asian Pacific Journal of Cancer Prevention 14 (1) (2013)
595-598.
[3] Pradhan, M. and Sahu, R.K. Predict the onset of diabetes disease using Artificial Neural Network (ANN).
International Journal of Computer Science & Emerging Technologies (E-ISSN: 2044-6004) 2 (2) (2011).
[4] Pattekari, S.A. and Parveen, A. Prediction system for heart disease using Naïve Bayes. International
Journal of Advanced Computer and Mathematical Sciences 3 (3) (2012) 290-294.
[5] Vijayarani, S. and Divya, M. An efficient algorithm for generating classification rules. International
Journal of Computer Science and Technology 2 (4) (2011).
[6] Agrawal, A. and Choudhary, A. Association rule mining based hotspot analysis on seer lung cancer data.
International Journal of Knowledge Discovery in Bioinformatics (IJKDB) 2 (2) (2011) 34-54.
[7] Freund, Y. and Mason, L. The alternating decision tree learning algorithm. In ICML, 1999, 124-133.
