Unit-16
16.7 Summary
16.9 Solutions/Answers
16.10 Further Readings
16.0 INTRODUCTION
In this unit we will see the implementation of the various machine learning
algorithms learned in this course. To understand the code, you need an
understanding of the respective machine learning algorithms as well as of
Python programming. The code uses various Python libraries, viz. Scikit-Learn,
Matplotlib, NumPy, etc., and you can execute it with any Python programming
tool. Most of the machine learning algorithms you learned in this course are
implemented here; try executing them and analysing the results.
16.1 OBJECTIVES
After going through this unit, you should be able to:
● Understand the implementation aspect of various machine learning
algorithms
Machine Learning - II
16.2 CLASSIFICATION ALGORITHMS
The starting units of this course primarily focused on the various classification
algorithms, viz. Naïve Bayes classifiers, K-Nearest Neighbour (K-NN), Decision
Trees, Logistic Regression and Support Vector Machines. The theoretical
aspects of these have already been discussed in the respective units; now we
will see the implementation of the mentioned classifiers in the Python
programming language.
16.2.1 Naive Bayes
It is a method of classification that is founded on Bayes' Theorem and makes the
assumption that the predictors are independent of one another. In layman's
terms, a Naive Bayes classifier assumes that the presence of one particular
feature in a class is unrelated to the presence of any other feature.
This classifier has already been discussed in detail in Block 3 Unit 10 of this
course; you may refer to Block 3 Unit 10 to understand the concept.
The following steps need to be carried out in order to classify data using
the Naive Bayes method:
• In the first step, import the dataset as well as any necessary
dependencies...
• In the second step, obtain the prior probability of each class, P(y).
• In the third step, determine the likelihood of each feature using
the table you just created...
• In the fourth and final step, calculate the posterior probability for each
class by applying the Naive Bayes equation.
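The steps above can be sketched from scratch with NumPy. This is only a minimal illustration and not the code shown in the screenshot: it assumes Gaussian likelihoods for continuous features, and the toy data as well as the function names `fit_gaussian_nb` and `predict_gaussian_nb` are hypothetical.

```python
import numpy as np


def fit_gaussian_nb(X, y):
    """Estimate class priors P(y) and per-feature Gaussian parameters."""
    classes = np.unique(y)
    # Prior probability of each class: fraction of training rows in that class.
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # A small constant avoids division by zero for constant features.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances


def predict_gaussian_nb(X, classes, priors, means, variances):
    """Pick the class with the highest unnormalised log-posterior per row."""
    # log P(x | y): sum of per-feature Gaussian log-densities,
    # which is where the independence assumption enters.
    diff = X[:, None, :] - means[None, :, :]
    log_likelihood = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances)[None, :, :]
        + diff ** 2 / variances[None, :, :],
        axis=2,
    )
    # log P(y | x) up to a constant: log P(y) + log P(x | y)
    log_posterior = np.log(priors)[None, :] + log_likelihood
    return classes[np.argmax(log_posterior, axis=1)]


# Hypothetical toy data: two features, two well-separated classes.
X_train = np.array([[1.0, 2.1], [1.2, 1.9], [0.8, 2.0],
                    [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

model = fit_gaussian_nb(X_train, y_train)
predictions = predict_gaussian_nb(np.array([[1.1, 2.0], [3.1, 4.0]]), *model)
print(predictions)
```

Working with logarithms avoids numerical underflow when many small probabilities are multiplied; the class with the largest log-posterior is also the class with the largest posterior.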
Implementation code in Python
The screenshot of the executed code is given below
Machine Learning – Programming Using Python
OUTPUT:
Gaussian Naive Bayes model accuracy(in %): 95.0
16.2.3 Decision Tree Implementation
A decision tree is a type of supervised machine learning algorithm that may be
used for both regression and classification tasks. It is one of the most popular
and widely used machine learning techniques.
The decision tree method creates a node for each attribute present in the
dataset, with the attribute that is considered to be the most significant
being placed at the top of the tree. At the start, the entire training set is
treated as the root. Feature values should preferably be categorical; if they
are continuous, they are discretized before the model is built. Records are
then distributed recursively according to their attribute values. A statistical
measure is used to determine which attributes should be placed at the tree's
root and which should be placed at internal nodes.
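Information gain is one common statistical measure for choosing the attribute at the root. The text does not say which measure the screenshot code relies on, so this choice is an assumption. The sketch below computes the information gain of each attribute on a small hypothetical categorical dataset; the function names and the data are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    total = len(labels)
    # Group the class labels by the value the attribute takes in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

gains = {a: information_gain(rows, labels, a) for a in ("outlook", "windy")}
root_attribute = max(gains, key=gains.get)
print(gains, root_attribute)
```

The attribute with the largest gain is placed at the root, and the same calculation is repeated on each subset of records to choose the internal nodes.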
This classifier has already been discussed in detail in Block 3 Unit 10 of this
course; you may refer to Block 3 Unit 10 to understand the concept.
Implementation code in Python
The screenshot of the executed code is given below
16.2.4 Logistic Regression
Logistic Regression (LR) is a classification algorithm that is used in Machine
Learning to predict the probability of a categorical dependent variable. The
dependent variable in logistic regression is a binary variable, which means
that it contains data recorded as either 1 (yes, success, etc.) or 0 (no,
failure, etc.).
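To make the idea of predicting the probability of a binary outcome concrete, here is a minimal sketch that fits a logistic regression model by batch gradient descent using NumPy only. It is not the code from the screenshot; the data (hours studied versus pass or fail), the learning rate, the iteration count and the function names are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Map any real value to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, learning_rate=0.1, n_iterations=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(n_iterations):
        probabilities = sigmoid(X @ weights + bias)
        error = probabilities - y  # gradient of the log-loss w.r.t. the logits
        weights -= learning_rate * (X.T @ error) / n_samples
        bias -= learning_rate * error.mean()
    return weights, bias


# Hypothetical data: hours studied, and whether the student passed (1) or not (0).
X = np.array([[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [4.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

weights, bias = fit_logistic_regression(X, y)
probabilities = sigmoid(X @ weights + bias)
predicted = (probabilities >= 0.5).astype(int)  # threshold at 0.5
print(predicted)
```

The model outputs a probability for class 1; the final class label is obtained by thresholding that probability, here at 0.5.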
Note that the Naive Bayes model is a generative model, whereas the LR model is
a discriminative model. LR copes better than Naive Bayes with collinearity,
because Naive Bayes expects all of the features to be independent, while LR
does not. Naive Bayes works well with small datasets.
This classifier has already been discussed in detail in Block 3 Unit 10 of this
course; you may refer to Block 3 Unit 10 to understand the concept.
Implementation code in Python
The screenshot of the executed code is given below
OUTPUT:
Following are the stages involved in the implementation of a linear regression
model:
• Firstly, initialise the parameters.
• Given the value of an independent variable, predict what the value of the
dependent variable will be.
• Determine the error of each prediction for each data point.
• Calculate the partial derivatives of the cost with respect to a0 and a1.
• Add up the individual costs that you have determined for each of the
data points.
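The stages listed above can be sketched as a simple gradient descent loop for the line y = a0 + a1·x. This is a minimal illustration under stated assumptions, not the code from the screenshot: the mean squared error cost, the learning rate, the iteration count, the function name and the noise-free toy data are all hypothetical choices.

```python
import numpy as np


def fit_linear_regression(x, y, learning_rate=0.05, n_iterations=5000):
    """Fit y = a0 + a1 * x by gradient descent on the mean squared error."""
    a0, a1 = 0.0, 0.0  # initialise the parameters
    n = len(x)
    for _ in range(n_iterations):
        predictions = a0 + a1 * x          # predict the dependent variable
        errors = predictions - y           # error of each prediction
        # Partial derivatives of the mean squared error w.r.t. a0 and a1.
        grad_a0 = (2.0 / n) * errors.sum()
        grad_a1 = (2.0 / n) * (errors * x).sum()
        a0 -= learning_rate * grad_a0
        a1 -= learning_rate * grad_a1
    # Add up the individual squared errors and average them to get the cost.
    cost = np.mean((a0 + a1 * x - y) ** 2)
    return a0, a1, cost


# Hypothetical noise-free data generated from y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

a0, a1, cost = fit_linear_regression(x, y)
print(a0, a1, cost)
```

Because the toy data lie exactly on a line, the fitted intercept and slope should approach 1 and 2 and the cost should approach zero; with real, noisy data the cost settles at a positive value.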
This technique has already been discussed in detail in Block 3 Unit 11 of this
course; you may refer to Block 3 Unit 11 to understand the concept.
Implementation code in Python
The screenshot of the executed code is given below
OUTPUT:
16.9 SOLUTIONS/ANSWERS
Check Your Progress - 1
1. Make suitable assumptions and modify the Python code of the following
classification algorithms:
a. K-NN
b. Decision Tree
c. Logistic Regression
d. Support Vector Machines
Solution : Refer to section 16.2
Check Your Progress - 2
2. Make suitable assumptions and modify the Python code of the following
regression algorithms:
a. Linear regression
b. Polynomial regression
Solution : Refer to section 16.3
Check Your Progress - 3
3. Make suitable assumptions and modify the Python code of Principal
Component Analysis for dimensionality reduction.
Solution : Refer to section 16.4
Check Your Progress - 4
4. Make suitable assumptions and modify the Python code of the K-Means
algorithm.
Solution : Refer to section 16.6