Machine Learning Internship Report
Technofly Solutions & Consulting was founded in 2017 by a team with 14+ years of experience in the embedded systems domain. Technofly Solutions focuses globally on automotive embedded technologies, VLSI design, corporate training and consulting. To date we have delivered more than 15 corporate trainings for companies working in embedded automotive technologies in India, and we are also involved in the development of an OBD2 (On-Board Diagnostics) product for passenger cars for clients in India.
Technical Expertise
1. Microcontroller drivers
2. Model-based software development: modeling, simulation, auto-coding and reverse engineering
3. HVAC systems
4. Seat modules
Expertise in ASIC VLSI:
1. Verilog courses
2. Functional verification
Process Quality:
Technologies:
1. Embedded C, Python, IoT (PHP front end and MySQL back end), wireless – Bluetooth, GPS, GPRS, Wi-Fi
Management:
The management team is a mixture of technical and business development expertise with 14+ years of experience in the information technology field.
At present the company is developing a GPS tracking system for two-wheelers with its associated partners, is focusing more on corporate trainings in automotive embedded systems, and provides ASIC solutions involving design and verification IPs and functional verification of designs.
Company Profile:
TechnoFly was formed by professionals with formal qualifications and industrial experience
in the fields of embedded systems, real-time software, process control and industrial
electronics. The company is professionally managed and supported by qualified experienced
specialists and consultants with experience in embedded systems – including hardware and
software.
Initially, the company developed system software tools; these included C compilers for microcontrollers and other supporting tools such as an assembler, linker, simulator and integrated development environment. Later, Single Board Computers (SBCs) were developed and are still manufactured. These hardware boards support a broad range of processors, including 8-, 16- and 32-bit processors.
Since 2015, the company has also offered design and development services. This covers the complete spectrum of activities in the product development life cycle, from idea generation and requirements gathering to prototyping, testing and manufacturing. The company has so far provided product design services for various sectors, including industrial automation, instrumentation, automotive, consumer and defense.
Services of Technofly:
When you don’t have enough time, or the right skills on hand, you can supplement your team
with expert embedded engineers from Technofly, who can tackle your projects with
confidence, take out the risk, and hit your milestones. We’ll take as much ownership as you
want us to, and make sure your project is done right, on time and on budget. Go ahead, check
our reputation for on-time, on-budget delivery. We've earned it, time and again.
We can help you cut risk on embedded systems R&D, and accelerate time to market.
Technofly is your best choice for designing and developing embedded products from
concept to delivery. Our team is well-versed in product life cycles. We build complex
software systems for real-time environments and have unique expertise and core
competencies in the following domains: Wireless, Access and IOT/Cloud.
The department is currently developing and evaluating optimal solutions for maximizing network data rate in both cooperative and non-cooperative network user scenarios involving cognitive (secondary user, SU) and non-cognitive (primary user, PU) devices. The work is mainly concentrated on:
1. Resource management (spectrum management as well as power management)
2. Efficiency analysis
The department is actively involved in acquiring projects related to the latest technologies in low-power VLSI and the wireless domain; these projects are well thought out and their implementations are carried out in detail. Projects are mainly done on the Verilog and MATLAB (MathWorks) platforms, and may also use NS2, NetSim and Xilinx platforms as per the requirements of the project in progress.
The current internship involves the study, implementation and analysis of a high-speed, energy-efficient Carry Skip Adder (CSKA) with a hybrid model for achieving high speed and reducing power consumption.
1. Study requirements: low-power VLSI design and fundamentals of digital circuits
Technofly Solutions offers services in the areas of real-time embedded systems, low-power VLSI design, verification and software engineering. Its strong team of around 30 engineers is equipped with the right tools and the right processes to deliver the best. Technofly Solutions also offers customization of its products.
Real-Time Embedded Systems and Low-Power VLSI Design Department:
1. Design Services
2. Product Realization
Design Services:
The hardware design and development follow stringent life cycle guidelines laid out at Technofly Solutions while accomplishing the following –
Design Assurance
1. Signal integrity
2. Cross-talk
3. Power supply design, with due emphasis on low-power, battery-operated applications
4. Thermal analysis
5. Clock distribution
6. Timing analysis
Design optimization
1. Cost, size
PCB design
1. Pilot production
2. PCB assembly
Software Development
1. Board bring-up
ASIC
1. Design IPs
Skill Set
Tools
FPGA Tools
1. Back-end design: Xilinx ISE 9.1.03i, Actel Libero 6.0, Altera MAX+Plus II
Simulation:
1. Xilinx ModelSim SE
2. Altera MAX+Plus II
Coverage Analysis:
1. TransEDA VN-Cover
Debugging:
1. ChipScope
Hardware Tools:
1. Spectrum analyzers
2. Signal generators
3. Logic analyzers
4. Multifunction counters
5. Development tools and in-circuit emulators for all ADI DSPs and TI DSPs
Product Realization
1. Consumer Electronics
2. Automotive
3. Space
4. Defense
5. Simulation/Emulation
Our workgroup productivity software suite, Smart Works, consists of software applications that help you plan and track your projects, manage meetings and track various issues to closure. Smart Works is affordably priced and uses a TCP/IP-based client-server architecture at its core. The Smart Works server runs on all Windows platforms (Windows 95/98/NT/2000/ME). Efforts are under way to make Smart Works available on other platforms as well.
Following are the skill sets Technofly Solutions has garnered in the area of software:
1. Programming languages: C, C++, VC++, Java, C#, ASP.NET, PHP, Lex & Yacc, Perl, Python, Assembly language and Ada
Abstract
Heart disease is a major life-threatening disease that can cause either death or serious long-term disability. However, there is a lack of effective tools to discover hidden relationships and trends in e-health data. Medical diagnosis is a complicated task that plays a vital role in saving human lives, so it needs to be executed accurately and efficiently. An appropriate and accurate computer-based automated decision support system is required to reduce the cost of clinical tests. This report provides an insight into machine learning techniques used in diagnosing various diseases, and discusses various data mining classifiers that have emerged in recent years for efficient and effective disease diagnosis.
Using data mining techniques can reduce the number of tests that are required. In order to reduce deaths from heart disease, there has to be a quick and efficient detection technique. Decision Tree is one of the effective data mining methods used. This research compares different classification algorithms, seeking better performance in heart disease diagnosis. The algorithms tested are the SVM algorithm, the K-Nearest Neighbour algorithm and the Random Forest algorithm. The dataset consists of 303 instances and 76 attributes. Subsequently, the classification algorithm with the best potential will be suggested for use on sizeable data. The goal of this study is to extract hidden patterns relevant to heart disease by applying data mining techniques, and to predict the presence of heart disease in patients, where this presence is valued from no presence to likely presence.
The process of training and prediction involves the use of specialized algorithms. We feed the training data to an algorithm, and the algorithm uses this training data to make predictions on new test data.
There are various machine learning algorithms, such as Decision Trees, Naive Bayes, Random Forest, Support Vector Machines, K-Nearest Neighbours, K-Means clustering, etc.
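A minimal sketch of this train-then-predict flow using scikit-learn (the tiny dataset below is made up purely for illustration; it is not the heart dataset used later in this report):

from sklearn.tree import DecisionTreeClassifier

# Training data: each row is a sample, each column a feature.
X_train = [[25, 120], [54, 160], [61, 150], [33, 118]]
y_train = [0, 1, 1, 0]              # labels the algorithm learns from

model = DecisionTreeClassifier()
model.fit(X_train, y_train)         # learn from the training data

X_test = [[45, 140]]                # a new, unseen sample
print(model.predict(X_test))        # predicted label for the test sample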
Machine Learning is the art (and science) of enabling machines to learn things which are not
explicitly programmed.
It involves as much mathematics as it involves computer science. Most often, people (myself included, at times) are so put off by the sheer number of mathematical equations and concepts in machine learning papers or articles that we ditch the entire article without reading it.
In this series, I will (possibly with the help of my friends, which will be duly noted in the
respective articles) talk about machine learning and deep learning math-free.
Purists might argue that learning is incomplete without the math behind it. I agree. But this is not intended to be a complete reference to machine learning concepts; this series intends to start a conversation, or encourage thought, in this direction.
Python is a widely used, general-purpose, high-level programming language. It was initially designed by Guido van Rossum in 1991 and is developed by the Python Software Foundation. It was mainly developed with an emphasis on code readability, and its syntax allows programmers to express concepts in fewer lines of code.
Python is a programming language that lets you work quickly and integrate systems more efficiently. Python was designed for readability, and has some similarities to the English language with influence from mathematics. Python uses new lines to complete a command, as opposed to other programming languages, which often use semicolons or parentheses.
The most recent major version of Python is Python 3, which we shall be using in this tutorial. Python can be used on a server to create web applications. It can be used alongside software to create workflows. It can connect to database systems and also read and modify files.
Python can be used to handle big data and perform complex mathematics. It can be
used for rapid prototyping, or for production ready software development. It works on
different platforms (Windows, Mac, Linux, Raspberry Pi, etc). It runs on an interpreter
system, meaning that code can be executed as soon as it is written. This means that
prototyping can be very quick.
2. Easy to learn: Learning Python is easy, as it is an expressive, high-level programming language, which means it is easy to understand and thus easy to learn.
3. Cross platform: Python is available and can run on various operating systems such as
Mac, Windows, Linux, Unix etc. This makes it a cross platform and portable language.
5. Large standard library: Python comes with a large standard library that has many handy modules and functions which we can use while writing code in Python.
6. Free: Python is free to download and use. This means you can download it for free and use
it in your application. See: Open Source Python License. Python is an example of a FLOSS
(Free/Libre Open Source Software), which means you can freely distribute copies of this
software, read its source code and modify it.
7. Supports exception handling: If you are new, you may wonder what an exception is. An exception is an event that can occur during program execution and can disrupt the normal flow of the program. Python supports exception handling, which means we can write less error-prone code and can handle the various scenarios that might cause an exception later on (see the sketch after this list).
8. Advanced features: Supports generators and list comprehensions. We will cover these
features later.
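A minimal illustration of exception handling in Python (a generic sketch, not tied to any particular application in this report):

try:
    result = 10 / int("0")               # int("0") is 0, so this raises ZeroDivisionError
except ZeroDivisionError:
    print("Cannot divide by zero")       # handle that specific error gracefully
except ValueError:
    print("Input was not a number")      # would catch e.g. int("abc")
finally:
    print("This runs whether or not an exception occurred")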
Applications of Python
1. Web development – Web frameworks like Django and Flask are based on Python. They help you write server-side code, which lets you manage databases, write back-end programming logic, map URLs, etc.
2. Machine learning – There are many machine learning applications written in Python. Machine learning is a way to write logic so that a machine can learn and solve a particular problem on its own. For example, product recommendation on websites like Amazon, Flipkart, eBay, etc. is a machine learning algorithm that recognises a user's interests. Face recognition and voice recognition on your phone are other examples of machine learning.
3. Data analysis – Data analysis and data visualisation in the form of charts can also be developed using Python.
4. Scripting – Scripting is writing small programs to automate simple tasks such as sending automated response emails. Such applications can also be written in the Python programming language.
7. Desktop applications – You can develop desktop applications in Python using libraries like Tkinter or Qt.
Python is increasingly being used as a scientific language. Matrix and vector manipulations are extremely important for scientific computations. Both NumPy and Pandas have emerged as essential libraries for any scientific computation, including machine learning, in Python due to their intuitive syntax and high-performance matrix computation capabilities.
In this post, we will provide an overview of the common functionality of NumPy and Pandas, and we will see the similarity of these libraries with existing toolboxes in R and MATLAB. This similarity and added flexibility have resulted in wide acceptance of Python in the scientific community lately. Topics covered are:
Overview of NumPy
Overview of Pandas
Using Matplotlib
Pandas:
Similar to NumPy, Pandas is one of the most widely used Python libraries in data science. It provides high-performance, easy-to-use data structures and data analysis tools.
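A minimal sketch of typical NumPy and Pandas usage (the column names and values below are made up purely for illustration):

>>> import numpy as np
>>> import pandas as pd
>>> a = np.array([1, 2, 3, 4])        # a NumPy array (vector)
>>> a.mean()                          # fast numeric operations on the whole array
2.5
>>> df = pd.DataFrame({'age': [63, 45, 58], 'chol': [233, 204, 283]})
>>> df['age'].max()                   # column-wise operations on a DataFrame
63
>>> df.describe()                     # summary statistics for each column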
Matplotlib:
>>> import matplotlib.pyplot as plt
>>> x = [21, 22, 23, 4, 5, 6, 77, 8, 9, 10, 31, 32, 33, 34, 35, 36, 37, 18, 49, 50, 100]
>>> num_bins = 5
>>> plt.hist(x, num_bins, facecolor='blue')
>>> plt.show()
Objectives:
The Heart Disease Prediction application is an end-user support and online consultation project. Here, we propose a web application that allows users to get instant guidance on their heart disease through an intelligent online system. The application is fed with various details and the heart disease associated with those details. The application allows users to share their heart-related issues. It then processes user-specific details to check for various illnesses that could be associated with them. Here we use some intelligent data mining techniques to guess the most accurate illness that could be associated with the patient's details. Based on the result, the user can contact a doctor accordingly for further treatment. The system also allows the user to view doctors' details. The system can be used for free heart disease consultation online.
Heart disease is the leading cause of death in the world over the past 10 years
(World Health Organization 2007). The European Public Health Alliance reported that
heart attacks, strokes and other circulatory diseases account for 41% of all deaths
(European Public Health Alliance 2010). Several different symptoms are associated with heart disease, which makes it difficult to diagnose quickly and accurately.
Methodology:
1. SVM (Support Vector Machine)
SVM is a classification method in which we plot each data item as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. For example, if we only had two features, such as the height and hair length of an individual, we would first plot these two variables in two-dimensional space, where each point has two coordinates (these coordinates are known as support vectors).
Now, we will find some line that splits the data between the two differently classified groups
of data. This will be the line such that the distances from the closest point in each of the two
groups will be farthest away.
For instance, suppose the orange frontier is closest to the blue circles, and the closest blue circle is 2 units away from that frontier. Once we have these distances for all the frontiers, we simply choose the frontier with the maximum distance from the closest support vector. Out of three candidate frontiers, we would pick the one that is farthest from its nearest support vector (say, 15 units away).
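A minimal sketch of training an SVM classifier with scikit-learn (the two-feature data below is made up for illustration and is not the heart dataset):

from sklearn.svm import SVC

# Two made-up features per sample, e.g. [height in cm, hair length in cm].
X = [[180, 2], [175, 4], [160, 35], [155, 40]]
y = [0, 0, 1, 1]                      # two classes to separate

clf = SVC(kernel='linear')            # linear SVM: finds the maximum-margin frontier
clf.fit(X, y)
print(clf.predict([[170, 30]]))       # classify a new point
print(clf.support_vectors_)           # the training points that define the margin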
2. Decision Tree
This is one of my favorite algorithms and I use it quite frequently. It is a type of supervised learning algorithm that is mostly used for classification problems. Surprisingly, it works for both categorical and continuous dependent variables. In this algorithm, we split the population into two or more homogeneous sets. This is done based on the most significant attributes/independent variables, to make the groups as distinct as possible. For more details, you can read: Decision Tree Simplified.
For example, a population can be classified into four different groups based on multiple attributes to identify 'whether they will play or not'. To split the population into such distinct groups, the algorithm uses various techniques like Gini, information gain, chi-square and entropy.
The best way to understand how a decision tree works is to play Jezzball – a classic game from Microsoft. Essentially, you have a room with moving balls and you need to create walls such that the maximum area gets cleared off without the balls.
So, every time you split the room with a wall, you are trying to create two different populations within the same room. Decision trees work in a very similar fashion, by dividing a population into groups that are as different as possible.
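A minimal sketch of a decision tree classifier in scikit-learn (the features, encoding and data are made up for illustration):

from sklearn.tree import DecisionTreeClassifier

# Made-up samples: [outlook (0 = sunny, 1 = rainy), temperature in degrees C]
X = [[0, 30], [0, 22], [1, 18], [1, 25]]
y = ['play', 'play', 'no play', 'no play']

tree = DecisionTreeClassifier(criterion='gini')   # 'entropy' is the other common criterion
tree.fit(X, y)
print(tree.predict([[0, 27]]))                    # predict the class for a new day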
3. kNN (k-Nearest Neighbours)
kNN can be used for both classification and regression problems. However, it is more widely used for classification problems in industry. k-Nearest Neighbours is a simple algorithm that stores all available cases and classifies new cases by a majority vote of their k nearest neighbours. The case is assigned to the class most common amongst its k nearest neighbours, as measured by a distance function.
These distance functions can be the Euclidean, Manhattan, Minkowski and Hamming distances. The first three are used for continuous variables and the fourth (Hamming) for categorical variables. If k = 1, then the case is simply assigned to the class of its nearest neighbour. At times, choosing k turns out to be a challenge while performing kNN modelling.
kNN can easily be mapped to our real lives. If you want to learn about a person of whom you have no information, you might like to find out about their close friends and the circles they move in, and gain access to their information that way.
It also helps to work more on the pre-processing stage (for example, outlier and noise removal) before applying kNN.
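A minimal sketch of kNN classification with scikit-learn (made-up data; the value of k and the distance metric are arbitrary illustrative choices):

from sklearn.neighbors import KNeighborsClassifier

X = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]]   # made-up feature vectors
y = [0, 0, 1, 1]

# k = 3 neighbours; metric='minkowski' with p=2 is the Euclidean distance.
knn = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)
knn.fit(X, y)
print(knn.predict([[1.2, 1.9]]))   # majority vote among the 3 nearest training points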
4. Random Forest
Random Forest is a trademarked term for an ensemble of decision trees. In a Random Forest, we have a collection of decision trees (hence the name 'forest'). To classify a new object based on its attributes, each tree gives a classification, and we say the tree 'votes' for that class. The forest chooses the classification having the most votes (over all the trees in the forest).
If the number of cases in the training set is N, then a sample of N cases is taken at random, but with replacement. This sample will be the training set for growing the tree.
If there are M input variables, a number m << M is specified such that at each node, m variables are selected at random out of the M and the best split on these m variables is used to split the node. The value of m is held constant while the forest grows.
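A minimal sketch of a random forest in scikit-learn (the number of trees, the max_features setting and the data are illustrative choices, not values taken from this report):

from sklearn.ensemble import RandomForestClassifier

X = [[63, 1, 233], [45, 0, 204], [58, 1, 283], [40, 0, 199]]   # made-up samples
y = [1, 0, 1, 0]

# 100 trees; at each split only sqrt(M) randomly chosen features are considered.
forest = RandomForestClassifier(n_estimators=100, max_features='sqrt', random_state=0)
forest.fit(X, y)
print(forest.predict([[50, 1, 240]]))   # each tree votes; the majority class wins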
Task Performed:
Dataset Description
We performed computer simulation on one dataset, a heart disease dataset available from the UCI Machine Learning Repository [10]. The dataset contains 303 samples with 13 input features and 1 output feature. The input features describe clinical and demographic attributes of the patients, and the output feature is the decision class indicating the presence or absence of heart disease. The dataset contains features expressed on nominal, ordinal, or interval scales. A list of all these features is given in the table below.
1. Age – age in years
2. Sex – sex of the patient
3. Cp – chest pain type
4. Trestbps – resting blood pressure
5. Chol – serum cholesterol
6. Fbs – fasting blood sugar
7. Restecg – resting electrocardiographic results
8. Thalach – maximum heart rate achieved
9. Exang – exercise-induced angina
10. Oldpeak – ST depression induced by exercise relative to rest
11. Slope – slope of the peak exercise ST segment
12. Ca – number of major vessels coloured by fluoroscopy
13. Thal – thalassemia test result
14. Num – diagnosis of heart disease (the output class)
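A minimal sketch of how such a dataset could be loaded and the four classifiers compared (the file name heart.csv and the column layout are assumptions made for illustration; adjust them to the actual data files used):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

# Assumed file and layout: 13 feature columns plus a 'num' target column.
df = pd.read_csv("heart.csv")
X = df.drop(columns=["num"])
y = df["num"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "accuracy:", model.score(X_test, y_test))   # fraction of correct test predictions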
Conclusion:
The project involved analysis of the heart disease patient dataset with proper data processing.
Then, 4 models were trained and tested with maximum scores as follows:
References:
[1] C. S. Dangare and S. S. Apte, "Improved study of heart disease prediction system using data mining classification techniques," International Journal of Computer Applications, vol. 47, no. 10, pp. 44–48, 2012.
[2] S. Palaniappan and R. Awang, "Intelligent heart disease prediction system using data mining techniques," pp. 108–115, 2008.
[3] Y. E. Shao, C.-D. Hou, and C.-C. Chiu, "Hybrid intelligent modelling schemes for heart disease classification," Applied Soft Computing, vol. 14, pp. 47–52, 2014.
[4] M. Shouman, T. Turner, and R. Stocker, "Using data mining techniques in heart disease diagnosis and treatment," pp. 173–177, 2012.
[6] J. Nahar, T. Imam, K. S. Tickle, and Y.-P. P. Chen, "Computational intelligence for heart disease diagnosis: a medical knowledge driven approach," Expert Systems with Applications, vol. 40, no. 1, pp. 96–104, 2013.
[7] Y. Xing, J. Wang, Z. Zhao, and Y. Gao, "Combination data mining methods with new medical data to predicting outcome of coronary heart disease," in Convergence Information Technology, 2007 International Conference on. IEEE, 2007, pp. 868–872.
[9] Y. E. Shao, C.-D. Hou, and C.-C. Chiu, "Hybrid intelligent modelling schemes for heart disease classification," Applied Soft Computing, vol. 14, pp. 47–52, 2014.