
UNIT 16

MACHINE LEARNING – PROGRAMMING USING PYTHON


Structure
16.0 Introduction
16.1 Objectives
16.2 Classification Algorithms
16.2.1 Naïve Bayes

16.2.2 K-Nearest Neighbour (K-NN)

16.2.3 Decision Trees

16.2.4 Logistic Regression

16.2.5 Support Vector Machines

16.3 Regression Algorithms


16.3.1 Linear Regression

16.3.2 Polynomial Regression

16.4 Feature Selection and Extraction


16.4.1 Principal Component Analysis

16.5 Association Rules


16.5.1 Apriori Algorithm

16.6 Clustering Algorithms


16.6.1 K-Means

16.7 Summary
16.8 Solutions/Answers
16.9 Further Readings

16.0 INTRODUCTION
In this unit we will see the implementation of various machine learning algorithms
learned in this course. To understand the code, you need an understanding of the
respective machine learning algorithms along with a working knowledge of Python
programming. The code uses various libraries of the Python programming language,
viz. Scikit-Learn, Matplotlib, NumPy etc., and you can execute it with any of the
Python programming tools. Most of the machine learning algorithms you learned in
this course are implemented here; try to execute them and analyse the results.

16.1 OBJECTIVES
After going through this unit, you should be able to:
● Understand the implementation aspects of various machine learning algorithms.
16.2 CLASSIFICATION ALGORITHMS
The starting units of this course primarily focused on various classification
algorithms, viz. the Naïve Bayes classifier, K-Nearest Neighbour (K-NN), Decision
Trees, Logistic Regression and Support Vector Machines. The theoretical aspects of
these have already been discussed in the respective units; now we will see the
implementation of the mentioned classifiers in the Python programming language.
16.2.1 Naive Bayes
It is a classification method based on Bayes' Theorem, with the assumption that the
predictors are independent of one another. In layman's words, a Naive Bayes
classifier assumes that the presence of one particular feature in a class is
unrelated to the presence of any other feature.
We have already discussed this classifier in detail in Block 3 Unit 10 of this
course; you may refer to that unit to revise the concept.
The following procedure needs to be carried out in order to classify data using
the Naive Bayes method:
• Step 1: Import the dataset and the necessary dependencies.
• Step 2: Compute the prior probability P(y) of each class.
• Step 3: Determine the likelihood of each feature for each class from a
frequency (likelihood) table.
• Step 4: Calculate the posterior probability for each class by applying the
Naive Bayes equation.
Implementation code in Python
The screenshot of the executed code is given below
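A minimal runnable sketch along these lines is given here; the Iris dataset, the
60/40 train/test split and the Gaussian variant of Naive Bayes are assumptions made
for illustration, and the accuracy printed will depend on the data and split used.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Step 1: import the dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1)   # 60/40 split (assumed)

# Steps 2-4: fitting GaussianNB estimates the class priors P(y) and the
# per-feature Gaussian likelihoods; predict() applies Bayes' rule to obtain
# the posterior probability of each class
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)

print("Gaussian Naive Bayes model accuracy(in %):",
      accuracy_score(y_test, y_pred) * 100)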

OUTPUT:

Gaussian Naive Bayes model accuracy(in %): 95.0

16.2.2 K-Nearest Neighbour (K-NN)


This classifier has already been discussed in detail in Block 3 Unit 10 of this
course; you may refer to that unit to revise the concept.
We learned that, supposing the value of K is 3, the KNN algorithm starts by
calculating the distance of a query point X from all the points. It then finds the
3 points nearest to X (those with the least distance) and assigns X the class that
is most common among them.
In the example shown below, the following steps are performed:
• Step 1: Import the k-nearest neighbour algorithm from the scikit-learn package.
• Step 2: Create the feature variables and the target variable.
• Step 3: Separate the data into training data and test data.
• Step 4: Generate a k-NN model using the chosen number of neighbours.
• Step 5: Train the model, i.e. fit the model to the training data.
• Step 6: Make a prediction.
Now, in this section, we will see how Python's Scikit-Learn library can be used
to implement the KNN algorithm.
Implementation code in Python
The screenshot of the executed code is given below
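A minimal runnable sketch of these steps is given here; the Iris dataset, the 80/20
train/test split and K = 3 are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Steps 1-3: import the algorithm, create feature/target variables and split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Step 4: generate a k-NN model with K = 3 neighbours
knn = KNeighborsClassifier(n_neighbors=3)

# Step 5: train (fit) the model on the training data
knn.fit(X_train, y_train)

# Step 6: make predictions and report the accuracy on the test data
print("Predicted classes:", knn.predict(X_test[:5]))
print("Test accuracy:", knn.score(X_test, y_test))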

16.2.3 Decision Trees
A decision tree is a type of supervised machine learning algorithm that may be
used for both regression and classification tasks. It is one of the most popular
and widely used machine learning techniques.
The decision tree method creates a node for each attribute present in the dataset,
with the attribute considered the most significant placed at the top of the tree.
To begin with, the entire training set is treated as the root. The feature values
need to be categorical; if they are continuous, they are discretized before the
model is built. Records are then distributed recursively according to their
attribute values, and a statistical measure (such as information gain) is used to
decide which attributes are placed at the root and which at the internal nodes.
This classifier has already been discussed in detail in Block 3 Unit 10 of this
course; you may refer to that unit to revise the concept.
Implementation code in Python
The screenshot of the executed code is given below
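A minimal runnable sketch is given here; the Iris dataset, the entropy criterion
and the depth limit of 3 are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)

# 'entropy' chooses the attribute with the highest information gain at each
# node; max_depth=3 is only to keep the printed tree small
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X_train, y_train)

print("Test accuracy:", clf.score(X_test, y_test))
# The attribute at the root is the one judged most significant
print(export_text(clf, feature_names=iris.feature_names))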

16.2.4 Logistic Regression

Logistic Regression (LR) is a classification algorithm that is used in Machine
Learning to predict the likelihood of a categorical dependent variable. The
dependent variable in logistic regression is a binary variable, which means that
it contains data coded as 1 (yes, success, etc.) or 0 (no, failure, etc.).
Note that the Naive Bayes model is a generative model, whereas the LR model is a
discriminative model. LR performs better than Naive Bayes when it comes to
collinearity, because Naive Bayes expects all of the features to be independent,
while LR does not. Naive Bayes works well with small datasets.
This classifier has already been discussed in detail in Block 3 Unit 10 of this
course; you may refer to that unit to revise the concept.
Implementation code in Python
The screenshot of the executed code is given below
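A minimal runnable sketch is given here; the breast cancer dataset (which has a
binary target), the 75/25 split and the feature-scaling step are illustrative
assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# The target here is binary (0/1), as logistic regression requires
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Standardising the features helps the solver converge
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))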


16.2.5 Support Vector Machine


Support Vector Machine, more usually referred to as SVM, is a supervised, linear
machine learning technique that is most frequently utilised for addressing
classification problems; Support Vector Classification is another name for it. In
addition, there is a variant of SVM known as SVR, which stands for Support Vector
Regression and applies similar concepts to regression problems. SVM also offers
the kernel method, also known as kernel SVM, which enables us to deal with
non-linearity.
The following are the steps involved in the implementation:
• Import the libraries.
• Load the dataset.
• Divide the dataset into X and y.
• Create a training set and a test set from X and y.
• Scale the features.
• Fit the SVM to the training set.
• Predict the results for the test set.
• Construct the confusion matrix.
This classifier has already been discussed in detail in Block 3 Unit 10 of this
course; you may refer to that unit to revise the concept.
Implementation code in Python
The screenshot of the executed code is given below
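A minimal runnable sketch following the steps listed above is given here; the
breast cancer dataset and the RBF kernel are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Load the dataset and divide it into X and y
X, y = load_breast_cancer(return_X_y=True)

# Create training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Scale the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Fit the SVM to the training set (the RBF kernel handles non-linearity)
svm = SVC(kernel="rbf", C=1.0, random_state=0)
svm.fit(X_train, y_train)

# Predict the test set and construct the confusion matrix
y_pred = svm.predict(X_test)
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))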



Check Your Progress - 1


1. Make suitable assumptions and modify the Python code of the following
classification algorithms:
a. K-NN
b. Decision Tree
c. Logistic Regression
d. Support Vector Machines

16.3 REGRESSION ALGORITHMS


We learned about the basic concepts of regression in the respective unit of this
course; in this unit we will implement linear regression and polynomial regression
in the Python language. Let's start with linear regression.

16.3.1 Linear Regression


The purpose of a linear regression model is to determine whether or not there is
a relationship between one or more features (also known as independent variables)
and a continuous target variable (the dependent variable). Linear regression is
referred to as uni-variate (simple) linear regression when there is only one
feature, and as multiple linear regression when there are multiple features.

Following are the stages involved in the implementation of a linear regression
model:
• Firstly, initialise the parameters.
• Given the value of the independent variable, predict the value of the
dependent variable.
• Determine the error of each prediction for each data point.
• Calculate the partial derivatives of the cost with respect to a0 and a1, and
update the parameters accordingly.
• Add up the individual costs determined for each of the data points.
This technique has already been discussed in detail in Block 3 Unit 11 of this
course; you may refer to that unit to revise the concept.
Implementation code in Python
The screenshot of the executed code is given below
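A minimal runnable sketch of these stages, implemented with gradient descent on
synthetic data, is given here; the generated data, learning rate and number of
iterations are illustrative assumptions.

import numpy as np

# Synthetic data: y is roughly 4 + 3x plus noise (assumed for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, 100)
y = 4 + 3 * x + rng.normal(0, 0.5, 100)

# Step 1: initialise the parameters a0 (intercept) and a1 (slope)
a0, a1 = 0.0, 0.0
lr = 0.05          # learning rate (assumed)
n = len(x)

for epoch in range(1000):
    # Step 2: predict the dependent variable
    y_pred = a0 + a1 * x
    # Step 3: error of each prediction
    error = y_pred - y
    # Step 4: partial derivatives of the mean squared error w.r.t. a0 and a1
    d_a0 = (2 / n) * np.sum(error)
    d_a1 = (2 / n) * np.sum(error * x)
    a0 -= lr * d_a0
    a1 -= lr * d_a1

# Step 5: total (mean squared) cost over all data points
cost = np.mean((a0 + a1 * x - y) ** 2)
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}, MSE = {cost:.3f}")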



16.3.2 Polynomial Regression


Polynomial Regression is a form of linear regression in which the relationship
between the independent variable x and the dependent variable y is modelled as an
nth degree polynomial; it can be viewed as an extension of linear regression.
Polynomial regression is used to model a nonlinear relationship between the value
of the independent variable x and the conditional mean of the dependent variable
y, represented by the notation E(y|x). Because of this non-linear relationship
between the dependent and independent variables, we simply add polynomial terms
to transform linear regression into polynomial regression.
This technique has already been discussed in detail in Block 3 Unit 11 of this
course; you may refer to that unit to revise the concept.
Implementation code in Python
The screenshot of the executed code is given below
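A minimal runnable sketch is given here; the synthetic quadratic data and the
degree-2 polynomial are illustrative assumptions.

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Synthetic data following y = 0.5x^2 - x + 2 plus noise (assumed)
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-3, 3, 80)).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 - x.ravel() + 2 + rng.normal(0, 0.3, 80)

# Add polynomial terms (x, x^2) and fit an ordinary linear regression on them
poly = PolynomialFeatures(degree=2, include_bias=False)
x_poly = poly.fit_transform(x)

model = LinearRegression()
model.fit(x_poly, y)

print("Coefficients:", model.coef_)      # expected near [-1, 0.5]
print("Intercept:", model.intercept_)    # expected near 2
print("R^2 score:", model.score(x_poly, y))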


Check Your Progress - 2


2. Make suitable assumptions and modify the Python code of the following
regression algorithms:
a. Linear regression
b. Polynomial regression

16.4 FEATURE SELECTION AND EXTRACTION


Feature selection and extraction are among the most important steps that must be
performed for machine learning to be successful. While we covered the theoretical
aspects of this process in the earlier units of this course, it is now time to
understand the implementation of the mechanisms we have learned for feature
selection and extraction. Let's begin with dimensionality reduction, which is the
process of lowering the number of random variables under consideration by
generating a set of principal variables. Dimensionality reduction may be seen as
a way to streamline the analysis process.

16.4.1 Principal Component Analysis (PCA)


This topic has already been discussed in detail in Block 4 Unit 13 of this course;
you may refer to that unit to revise the concept.
Among the various techniques, Principal Component Analysis (PCA) is the most
frequently used, and its implementation is given below:
Implementation code in Python
The screenshot of the executed code is given below
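A minimal runnable sketch is given here; the Iris dataset and the choice of two
principal components are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

# PCA is sensitive to scale, so standardise the features first
X_std = StandardScaler().fit_transform(X)

# Project the 4-dimensional data onto 2 principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)

print("Original shape:", X.shape)
print("Reduced shape:", X_pca.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)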



Check Your Progress - 3


3. Make suitable assumptions and modify the Python code of Principal Component
Analysis for dimensionality reduction.

16.5 ASSOCIATION RULES


We discussed the Apriori algorithm and the FP-Growth algorithm while studying the
topic of association rules. These algorithms are frequently used in frequent
pattern mining. Since FP-Growth builds on the Apriori algorithm, we are discussing
the implementation of the Apriori algorithm only.

16.5.1 Apriori Algorithm


The Apriori algorithm is a data mining technique used for mining frequent item
sets and the corresponding association rules. We focused on the definitions of
association rule mining and the Apriori algorithm, as well as its applications,
in the relevant unit of this course. In this section, we will construct an Apriori
model using the Python programming language and a hypothetical scenario involving
a small firm. The algorithm does have some limitations, the effects of which can
be mitigated using a variety of approaches, and it is widely used in data mining
and pattern recognition.
The model described below produces the candidate set by merging the set of
frequent items from the previous step. Subsets are then tested, and candidate item
sets that are infrequent are removed. The final frequent itemset is then obtained
by keeping the items that meet the minimum support requirement.
This algorithm has already been discussed in detail in Block 4 Unit 14 of this
course; you may refer to that unit to revise the concept.
Implementation code in Python
The screenshot of the executed code is given below
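A minimal runnable sketch of frequent-itemset mining with Apriori is given here;
the use of the third-party mlxtend library, the small hypothetical transaction
list and the minimum support value are illustrative assumptions.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

# Hypothetical transactions of a small firm (assumed data)
transactions = [
    ['milk', 'bread', 'butter'],
    ['bread', 'butter'],
    ['milk', 'bread'],
    ['milk', 'bread', 'butter', 'eggs'],
    ['bread', 'eggs'],
]

# One-hot encode the transactions into a boolean DataFrame
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_ary, columns=te.columns_)

# Candidate itemsets are grown level by level; itemsets below the minimum
# support threshold are pruned at each level
frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)
print(frequent_itemsets)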


16.6 CLUSTERING ALGORITHMS


We learned about the theoretical aspects of various clustering algorithms such as
K-Means, DBSCAN etc. in the respective unit of this course. The K-Means algorithm
is quite simple, and hence its implementation is given below:

16.6.1 K-Means - Implementation code in Python


This algorithm has already been discussed in detail in Block 4 Unit 15 of this
course; you may refer to that unit to revise the concept.
Implementation code in Python

The screenshot of the executed code is given below
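A minimal runnable sketch is given here; the synthetic blob data and k = 3 are
illustrative assumptions.

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Generate 300 points around 3 centres (assumed synthetic data)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# Fit K-Means with k = 3; n_init=10 restarts the algorithm from 10 random seeds
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("Cluster centres:\n", kmeans.cluster_centers_)
print("Inertia (within-cluster sum of squares):", kmeans.inertia_)
print("First 10 labels:", labels[:10])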



Check Your Progress - 4


4. Make suitable assumptions and modify the Python code of the K-Means algorithm.
16.7 SUMMARY
In this unit we understood the implementation of various machine learning
algorithms for classification, regression, dimensionality reduction and
clustering. The theoretical aspects of each algorithm were already discussed in
the respective units of this course.

16.8 SOLUTIONS/ANSWERS
Check Your Progress - 1
1. Make suitable assumptions and modify the Python code of the following
classification algorithms:
a. K-NN
b. Decision Tree
c. Logistic Regression
d. Support Vector Machines
Solution: Refer to Section 16.2.
Check Your Progress - 2
2. Make suitable assumptions and modify the Python code of the following
regression algorithms:
a. Linear regression
b. Polynomial regression
Solution: Refer to Section 16.3.
Check Your Progress - 3
3. Make suitable assumptions and modify the Python code of Principal Component
Analysis for dimensionality reduction.
Solution: Refer to Section 16.4.
Check Your Progress - 4
4. Make suitable assumptions and modify the Python code of the K-Means algorithm.
Solution: Refer to Section 16.6.

16.9 FURTHER READINGS


● https://www.kaggle.com/
● https://www.github.com/
● https://towardsdatascience.com
● https://machinelearningmastery.com
