Vijaya Vittala Institute of Technology

Department of Computer Science and Engineering

Internship
on

“EARLY DIABETES PREDICTION USING MACHINE LEARNING”

Submitted by

MOHAN KRISHNA K
1VJ17CS028

Internship Carried out

At
“Exposys Data Labs”
Internal Guide: Ms. Deepa Pattan, Asst. Professor, Dept. of CS&E
External Guide: Y Visnuvardhan, Chief Director
TABLE OF CONTENTS
• ABOUT THE COMPANY
• ABOUT THE DEPARTMENT
• TASK PERFORMED
• INTRODUCTION
• PROPOSED METHOD WITH ARCHITECTURE
• METHODOLOGY
• IMPLEMENTATION

• REFLECTION NOTES

• REFERENCES
ABOUT THE COMPANY
• EXPOSYS DATA LABS AIMS TO SOLVE REAL-WORLD BUSINESS PROBLEMS IN AREAS LIKE
AUTOMATION, BIG DATA AND DATA SCIENCE. OUR CORE TEAM OF EXPERTS IN VARIOUS
TECHNOLOGIES HELPS BUSINESSES TO IDENTIFY ISSUES AND OPPORTUNITIES AND TO
PROTOTYPE SOLUTIONS USING TRENDING TECHNOLOGIES LIKE AI, ML, DEEP LEARNING
AND DATA SCIENCE. WE FOLLOW A HUMAN-FOCUSSED, NOT TECHNOLOGY-DRIVEN,
APPROACH TO ACHIEVE SUCCESS IN OUR CLIENTS’ ENDEAVOURS.

• “OUR DISCOVERIES ARE BEYOND BELIEF AND IF YOU’RE WITH US, YOU’LL DISCOVER A
NEWER WAY TO THINK!”
ABOUT THE DEPARTMENT
• THE FOLLOWING IS A SUMMARY OF THE ROLES AND RESPONSIBILITIES OF THE DATA SCIENTIST AND DJANGO
DEVELOPER DEPARTMENT:
• DJANGO DEVELOPERS ARE RESPONSIBLE FOR DEVELOPING CLOUD-BASED PRODUCTS AND WORKING ON UX WITH
FRONT-END DEVELOPERS
• WORK WITH STAKEHOLDERS TO DETERMINE HOW TO USE BUSINESS DATA FOR VALUABLE BUSINESS SOLUTIONS
• SEARCH FOR WAYS TO GET NEW DATA SOURCES AND ASSESS THEIR ACCURACY
• BROWSE AND ANALYZE ENTERPRISE DATABASES TO SIMPLIFY AND IMPROVE PRODUCT DEVELOPMENT, MARKETING
TECHNIQUES, AND BUSINESS PROCESSES
• CREATE CUSTOM DATA MODELS AND ALGORITHMS
• USE PREDICTIVE MODELS TO IMPROVE CUSTOMER EXPERIENCE, AD TARGETING, REVENUE GENERATION, AND MORE
• DEVELOP THE ORGANIZATION’S A/B TESTING FRAMEWORK AND TEST MODEL QUALITY
• COORDINATE WITH VARIOUS TECHNICAL/FUNCTIONAL TEAMS TO IMPLEMENT MODELS AND MONITOR RESULTS
• DEVELOP PROCESSES, TECHNIQUES, AND TOOLS TO ANALYZE AND MONITOR MODEL PERFORMANCE WHILE
ENSURING DATA ACCURACY
TASK PERFORMED
1. INTRODUCTION
• PROBLEM STATEMENT

• THE USUAL DIAGNOSTIC PROCESS REQUIRES PATIENTS TO VISIT A DIAGNOSTIC CENTER, CONSULT THEIR
DOCTOR, AND WAIT A DAY OR MORE FOR THEIR REPORTS.

• MOREOVER, THEY HAVE TO PAY EVERY TIME THEY WANT A DIAGNOSIS REPORT.

• DIABETES MELLITUS (DM) IS DEFINED AS A GROUP OF METABOLIC DISORDERS MAINLY CAUSED BY ABNORMAL
INSULIN SECRETION AND/OR ACTION.

• MACHINE LEARNING INTRODUCTION - TOM M. MITCHELL DEFINES MACHINE LEARNING AS FOLLOWS: “A COMPUTER
PROGRAM IS SAID TO LEARN FROM EXPERIENCE E WITH RESPECT TO SOME CLASS OF TASKS T AND PERFORMANCE
MEASURE P IF ITS PERFORMANCE AT TASKS IN T, AS MEASURED BY P, IMPROVES WITH EXPERIENCE E.”

• GENERAL DEFINITION - MACHINE LEARNING IS THE TRAINING OF A MODEL FROM DATA THAT GENERALIZES A
DECISION AGAINST A PERFORMANCE MEASURE.
2.1 PROPOSED METHOD

• K-NEAREST NEIGHBOR ALGORITHM: KNN is a method used for classifying objects
based on the closest training examples in the feature space. KNN is
the most basic type of instance-based learning, or lazy learning. It assumes
all instances are points in n-dimensional space. A distance measure is needed to
determine the “closeness” of instances. KNN classifies an instance by finding its
nearest neighbors and picking the most popular class among the neighbors.
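
The nearest-neighbor vote described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the internship's implementation: it assumes Euclidean distance as the distance measure, and the function names (euclidean, knn_predict), the value k=3 and the toy data are all made up for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training points by distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    # The most common label among those neighbors is the prediction.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy two-class data (invented for illustration; not the internship dataset).
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 6.0)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (1.2, 1.4)))  # query near the class-0 cluster
print(knn_predict(train_X, train_y, (6.8, 6.4)))  # query near the class-1 cluster
```

Because KNN is a lazy learner, there is no training step here: all the work happens at prediction time, when distances to the stored examples are computed.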

Figure 2: KNN model with Two classes

2.2 Architecture of the proposed algorithm

Figure 3: Flow diagram of the proposed system

3. METHODOLOGY

• In this dataset, missing values are recorded as zeros and need to be
replaced. The zeros are first converted to NaN so that the missing values can
easily be imputed using the fillna() method. We then perform feature scaling
with MinMaxScaler(), which rescales every feature to lie between 0 and 1.
This is an important preprocessing step for many algorithms, especially
distance-based ones such as KNN, where features with larger ranges would
otherwise dominate the distance.
• In the feature correlation heatmap, we can observe that Glucose, Insulin,
Age and BMI are highly correlated with the outcome. So, we select these
features as X and the outcome as Y. The dataset is then split using
train_test_split with an 80:20 ratio.
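
The preprocessing steps above can be sketched with pandas and scikit-learn as follows. The sketch makes several assumptions that the slides do not state: the small data frame is invented, zeros are treated as missing only in the Glucose, Insulin and BMI columns, the column mean is used as the fill value, and random_state is fixed only to make the split repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Tiny invented frame standing in for the real dataset (illustrative values only).
df = pd.DataFrame({
    "Glucose": [148, 85, 0, 89, 137, 116, 78, 115, 197, 125],
    "Insulin": [0, 94, 168, 0, 88, 543, 0, 846, 175, 230],
    "BMI": [33.6, 26.6, 23.3, 28.1, 43.1, 25.6, 31.0, 0.0, 30.5, 37.6],
    "Age": [50, 31, 32, 21, 33, 30, 26, 29, 53, 54],
    "Outcome": [1, 0, 1, 0, 1, 0, 1, 0, 1, 1],
})
features = ["Glucose", "Insulin", "BMI", "Age"]

# Assumption: a zero in these measurement columns means "missing".
cols_with_missing = ["Glucose", "Insulin", "BMI"]
df[cols_with_missing] = df[cols_with_missing].replace(0, np.nan)

# Assumption: impute each missing value with its column mean.
df[cols_with_missing] = df[cols_with_missing].fillna(df[cols_with_missing].mean())

# Rescale the selected features to the range [0, 1].
scaled = MinMaxScaler().fit_transform(df[features])
X = pd.DataFrame(scaled, columns=features)
y = df["Outcome"]

# 80:20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)
```

The order here follows the slide (scale the whole dataset, then split). A common alternative is to fit the scaler on the training split only, so that no statistics from the test rows influence preprocessing.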
4. IMPLEMENTATION

• The training data is divided into classes. To classify a new data point, the algorithm finds
its nearest neighbours and assigns the label that receives the majority of their votes.
• The classification report, confusion matrix, accuracy, precision, and F1-score are
shown below:

Figure 4.1: Data cells and training classification report of the system
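
For readers without the figure, the following sketch shows how the reported quantities relate to the confusion matrix. It is written in plain Python with invented labels, so the numbers are not the system's results; in practice scikit-learn's confusion_matrix and classification_report would typically produce the same quantities. The function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))
print(classification_metrics(y_true, y_pred))
```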
Threshold value

• The threshold value of the model is 0.72, based on the ROC curve.

Fig 4.2: Diabetes classifier ROC curve
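
The slides do not say how the threshold was read off the ROC curve. One common criterion, assumed here purely for illustration, is Youden's J statistic, which picks the cut-off that maximizes TPR minus FPR. The scores and labels below are invented and the function names are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) for every distinct score used as a cut-off."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # Predict positive when the score reaches the threshold.
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p)
        fp = sum(1 for t, p in zip(y_true, predicted) if t == 0 and p)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def best_threshold(y_true, scores):
    """Pick the cut-off that maximizes Youden's J = TPR - FPR (assumed criterion)."""
    return max(roc_points(y_true, scores), key=lambda pt: pt[2] - pt[1])[0]


# Invented predicted probabilities and true labels, for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.30, 0.35, 0.50, 0.65, 0.80, 0.60, 0.90]

print(best_threshold(y_true, scores))
```

Other criteria, such as fixing a minimum sensitivity, would give a different cut-off on the same curve, so the choice depends on how false negatives and false positives are weighed.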

DIABETES EARLY PREDICTION
SYSTEM USING DJANGO

Fig 5: Diabetes prediction model deployed with the Django framework; the Django server is hosted
locally
DIABETES EARLY PREDICTION
SYSTEM USING DJANGO

Figure 6: Diabetes early prediction system model using Django framework
Reflection notes

1. Problem Solving Skill

• Handling missing data
• Removing rows that contain missing values
This method is simple but blunt: dropping a row removes the missing value, but it
also discards the non-null values in that row's other columns.
You can call dropna() on your entire DataFrame or on specific columns:

Fig 7:dropna() method
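
As a stand-in for the figure, here is a minimal sketch of both uses of dropna(). The data frame and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented example frame with two missing values.
df = pd.DataFrame({
    "Glucose": [148.0, np.nan, 183.0, 89.0],
    "BMI": [33.6, 26.6, np.nan, 28.1],
    "Age": [50, 31, 32, 21],
})

# Drop every row that has a missing value in any column.
all_complete = df.dropna()

# Drop only the rows where Glucose is missing; a missing BMI is tolerated.
glucose_complete = df.dropna(subset=["Glucose"])

print(all_complete)
print(glucose_complete)
```

The first call loses the valid BMI and Age values in the row with a missing Glucose, and the valid Glucose and Age values in the row with a missing BMI, which is the cost described above. dropna() returns a new frame and leaves df unchanged.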

• Similar to problem solving skills, there is no one way to increase your
curiosity.
2. WORK EXPERIENCE

• The internship has definitely reaffirmed my passion for Data Science, and I am
grateful that my work gained some traction for future work. The research and
development phase, the communication skills required to talk to different stakeholders, and the
curiosity and passion to solve business problems using data (just to name a few) have all
contributed to my interest in this field.
• The Data Science industry is still very young, and its job descriptions can seem
vague and ambiguous to job seekers like us. It’s perfectly normal not to possess all the
skills listed, since most job descriptions are written idealistically to match the
employer's best expectations.
REFERENCES

• [1] W. Xueli, J. Zhiyong and Y. Dahai, "An Improved KNN Algorithm Based on Kernel Methods
and Attribute Reduction," 2015 Fifth International Conference on Instrumentation and Measurement,
Computer, Communication and Control (IMCCC), 2015, pp. 567-570, doi: 10.1109/IMCCC.2015.125.
• [2] D. Shetty, K. Rit, S. Shaikh and N. Patil, "Diabetes disease prediction using data mining," 2017
International Conference on Innovations in Information, Embedded and Communication Systems
(ICIIECS), 2017, pp. 1-5, doi: 10.1109/ICIIECS.2017.8276012.
• [3] V. S. Lakshmi, V. Nithya, K. Sripriya, C. Preethi and K. Logeshwari, "Prediction of Diabetes
Patient Stage Using Ontology Based Machine Learning System," 2019 IEEE International
Conference on System, Computation, Automation and Networking (ICSCAN), 2019, pp. 1-4,
doi: 10.1109/ICSCAN.2019.8878831.
• [4] Y. Chang and H. Liu, "Semi-supervised classification algorithm based on the KNN," 2011 IEEE
3rd International Conference on Communication Software and Networks, 2011, pp. 9-12,
doi: 10.1109/ICCSN.2011.6014376.
