
AD3461 ML LAB Manual

B.Tech Artificial Intelligence and Data Science

II yr/IV Sem
EX.NO.1 Implementation of FIND-S algorithm

AIM:
To Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis
based on a given set of training data samples. Read the training data from a .CSV file.

ALGORITHM:
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
       For each attribute constraint ai in h
           If the constraint ai is satisfied by x
               Then do nothing
           Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

Training Examples:
sky airtemp humidity wind water forecast enjoysport
sunny warm normal strong warm same yes
sunny warm high strong warm same yes
rainy cold high strong warm change no
sunny warm high strong cool change yes
Program:
import csv

num_attributes = 6
a = []
print("\n The Given Training Data Set \n")
with open('C:\\New folder\\enjoysport.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        a.append(row)
        print(row)

print("\n The initial value of hypothesis: ")
hypothesis = ['0'] * num_attributes
print(hypothesis)

# Initialize the hypothesis with the first training instance
for j in range(0, num_attributes):
    hypothesis[j] = a[0][j]

print("\n Find S: Finding a Maximally Specific Hypothesis\n")
for i in range(0, len(a)):
    # Generalize the hypothesis only for positive examples
    if a[i][num_attributes] == 'yes':
        for j in range(0, num_attributes):
            if a[i][j] != hypothesis[j]:
                hypothesis[j] = '?'
            else:
                hypothesis[j] = a[i][j]
    print(" For Training instance No:{0} the hypothesis is ".format(i), hypothesis)

print("\n The Maximally Specific Hypothesis for a given Training Examples :\n")
print(hypothesis)

OUTPUT:
The Given Training Data Set
['sunny', 'warm', 'normal', 'strong', 'warm', 'same', 'yes']
['sunny', 'warm', 'high', 'strong', 'warm', 'same', 'yes']
['rainy', 'cold', 'high', 'strong', 'warm', 'change', 'no']
['sunny', 'warm', 'high', 'strong', 'cool', 'change', 'yes']
The initial value of hypothesis:
['0', '0', '0', '0', '0', '0']
Find S: Finding a Maximally Specific Hypothesis
For Training Example No:0 the hypothesis is : ['sunny', 'warm', 'normal', 'strong', 'warm', 'same']
For Training Example No:1 the hypothesis is : ['sunny', 'warm', '?', 'strong', 'warm', 'same']
For Training Example No:2 the hypothesis is : ['sunny', 'warm', '?', 'strong', 'warm', 'same']
For Training Example No:3 the hypothesis is : ['sunny', 'warm', '?', 'strong', '?', '?']
The Maximally Specific Hypothesis for a given Training Examples:
['sunny', 'warm', '?', 'strong', '?', '?']
RESULT:
Thus the Python program to implement and demonstrate the FIND-S algorithm for finding the most
specific hypothesis based on a given set of training data samples has been implemented and executed
successfully.
EX.NO:2 IMPLEMENTATION OF CANDIDATE ELIMINATION ALGORITHM

AIM:
To implement and demonstrate the Candidate-Elimination algorithm for a given set of training data
examples stored in a .CSV file and to output a description of the set of all hypotheses consistent with the
training examples.

ALGORITHM:

For each training example d, do:
    If d is a positive example
        Remove from G any hypothesis h inconsistent with d
        For each hypothesis s in S not consistent with d:
            Remove s from S
            Add to S all minimal generalizations of s consistent with d and having a generalization in G
            Remove from S any hypothesis that is more general than another hypothesis in S
    If d is a negative example
        Remove from S any hypothesis h inconsistent with d
        For each hypothesis g in G not consistent with d:
            Remove g from G
            Add to G all minimal specializations of g consistent with d and having a specialization in S
            Remove from G any hypothesis having a more general hypothesis in G

DATASET:

Sky AirTemp Humidity Wind Water Forecast EnjoySport


Sunny Warm Normal Strong Warm Same yes
Sunny Warm High Strong Warm Same yes
Rainy Cold High Strong Warm Change no
Sunny Warm High Strong Cool Change yes
PROGRAM:
import numpy as np
import pandas as pd

data = pd.read_csv("C:\\Users\\sride\\OneDrive\\Desktop\\dataset.csv")
concepts = np.array(data.iloc[:, 0:-1])
print("\nInstances are:\n", concepts)
target = np.array(data.iloc[:, -1])
print("\nTarget Values are: ", target)

def learn(concepts, target):
    specific_h = concepts[0].copy()
    print("\nInitialization of specific_h and general_h")
    print("\nSpecific Boundary: ", specific_h)
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print("\nGeneric Boundary: ", general_h)

    for i, h in enumerate(concepts):
        print("\nInstance", i + 1, "is ", h)
        if target[i] == "yes":
            print("Instance is Positive ")
            for x in range(len(specific_h)):
                # Generalize the specific boundary for attributes that differ
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'

        if target[i] == "no":
            print("Instance is Negative ")
            for x in range(len(specific_h)):
                # Specialize the general boundary against the negative instance
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'

        print("Specific Boundary after ", i + 1, "Instance is ", specific_h)
        print("Generic Boundary after ", i + 1, "Instance is ", general_h)
        print("\n")

    # Drop the unchanged (fully general) rows from the general boundary
    indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
    for i in indices:
        general_h.remove(['?', '?', '?', '?', '?', '?'])
    return specific_h, general_h

s_final, g_final = learn(concepts, target)

print("Final Specific_h: ", s_final, sep="\n")
print("Final General_h: ", g_final, sep="\n")
OUTPUT:
Instances are:
[['sunny' 'warm' 'Normal' 'Strong' 'Warm' 'Same']
['sunny' 'warm' 'High' 'Strong' 'Warm' 'Same']
['rainy' 'cold' 'High' 'Strong' 'Warm' 'Change']
['sunny' 'warm' 'High' 'Strong' 'Cool' 'Change']]

Target Values are: ['yes' 'yes' 'no' 'yes']

Initialization of specific_h and general_h

Specific Boundary: ['sunny' 'warm' 'Normal' 'Strong' 'Warm' 'Same']

Generic Boundary: [['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

Final Specific_h:
['sunny' 'warm' '?' 'Strong' '?' '?']

Final General_h:
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]
RESULT:
Thus the Candidate-Elimination algorithm for a given set of training data examples stored in a
.CSV file and a description of the set of all hypotheses consistent with the training examples has been
implemented and the output has been obtained.
EX.NO:3 ID3 ALGORITHM USING DECISION TREE

AIM:
To write a python program to demonstrate the working of the decision tree based ID3
algorithm, using an appropriate data set for building the decision tree and apply this knowledge to classify a
new sample.

ALGORITHM:
ID3(Examples, Target_attribute, Attributes)
    Examples are the training examples. Target_attribute is the attribute whose value is to be
    predicted by the tree. Attributes is a list of other attributes that may be tested by the learned
    decision tree. Returns a decision tree that correctly classifies the given Examples.

    Create a Root node for the tree
    If all Examples are positive, Return the single-node tree Root, with label = +
    If all Examples are negative, Return the single-node tree Root, with label = -
    If Attributes is empty, Return the single-node tree Root,
        with label = most common value of Target_attribute in Examples
    Otherwise Begin
        A ← the attribute from Attributes that best* classifies Examples
        The decision attribute for Root ← A
        For each possible value, vi, of A,
            Add a new tree branch below Root, corresponding to the test A = vi
            Let Examples_vi be the subset of Examples that have value vi for A
            If Examples_vi is empty
                Then below this new branch add a leaf node with
                    label = most common value of Target_attribute in Examples
            Else
                below this new branch add the subtree ID3(Examples_vi, Target_attribute, Attributes – {A})
    End
    Return Root
DESCRIPTION:
The best attribute is the one with highest information gain

ENTROPY:
Entropy measures the impurity of a collection of examples.
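For a collection S containing positive and negative examples of the target concept, the standard definition is:

    Entropy(S) = −p+ log2 p+ − p− log2 p−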

Where, p+ is the proportion of positive examples in S


p– is the proportion of negative examples in S.
INFORMATION GAIN:
Information gain, is the expected reduction in entropy caused by partitioning the examples
according to this attribute.
The information gain, Gain(S, A) of an attribute A, relative to a collection of examples S, is defined as
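    Gain(S, A) = Entropy(S) − Σ (v ∈ Values(A)) ( |Sv| / |S| ) · Entropy(Sv)

where Values(A) is the set of all possible values of attribute A, and Sv is the subset of S for which attribute A has value v.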

Dataset:
PlayTennis Dataset is saved as .csv (comma separated values) file in the current working directory otherwise
use the complete path of the dataset set in the program:

TRAINING DATASET:
Outlook Temperature Humidity Wind PlayTennis

Sunny Hot High Weak No

Sunny Hot High Strong No

Overcast Hot High Weak Yes

Rain Mild High Weak Yes

Rain Cool Normal Weak Yes

Rain Cool Normal Strong No

Overcast Cool Normal Strong Yes

Sunny Mild High Weak No

Sunny Cool Normal Weak Yes

Rain Mild Normal Weak Yes

Sunny Mild Normal Strong Yes

Overcast Mild High Strong Yes

Overcast Hot Normal Weak Yes

Rain Mild High Strong No


TEST DATASET:

Outlook Temperature Humidity Wind


rain cool normal strong
sunny mild normal strong

Program:

import math
import csv

def load_csv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    headers = dataset.pop(0)
    return dataset, headers

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def subtables(data, col, delete):
    dic = {}
    coldata = [row[col] for row in data]
    attr = list(set(coldata))

    counts = [0] * len(attr)
    r = len(data)
    c = len(data[0])
    for x in range(len(attr)):
        for y in range(r):
            if data[y][col] == attr[x]:
                counts[x] += 1

    for x in range(len(attr)):
        dic[attr[x]] = [[0 for i in range(c)] for j in range(counts[x])]
        pos = 0
        for y in range(r):
            if data[y][col] == attr[x]:
                if delete:
                    del data[y][col]
                dic[attr[x]][pos] = data[y]
                pos += 1
    return attr, dic

def entropy(S):
    attr = list(set(S))
    if len(attr) == 1:
        return 0

    counts = [0, 0]
    for i in range(2):
        counts[i] = sum([1 for x in S if attr[i] == x]) / (len(S) * 1.0)
    sums = 0
    for cnt in counts:
        sums += -1 * cnt * math.log(cnt, 2)
    return sums

def compute_gain(data, col):
    attr, dic = subtables(data, col, delete=False)

    total_size = len(data)
    entropies = [0] * len(attr)
    ratio = [0] * len(attr)

    total_entropy = entropy([row[-1] for row in data])
    for x in range(len(attr)):
        ratio[x] = len(dic[attr[x]]) / (total_size * 1.0)
        entropies[x] = entropy([row[-1] for row in dic[attr[x]]])
        total_entropy -= ratio[x] * entropies[x]
    return total_entropy

def build_tree(data, features):
    lastcol = [row[-1] for row in data]
    if (len(set(lastcol))) == 1:
        node = Node("")
        node.answer = lastcol[0]
        return node

    n = len(data[0]) - 1
    gains = [0] * n
    for col in range(n):
        gains[col] = compute_gain(data, col)
    split = gains.index(max(gains))
    node = Node(features[split])
    fea = features[:split] + features[split + 1:]

    attr, dic = subtables(data, split, delete=True)
    for x in range(len(attr)):
        child = build_tree(dic[attr[x]], fea)
        node.children.append((attr[x], child))
    return node

def print_tree(node, level):
    if node.answer != "":
        print(" " * level, node.answer)
        return
    print(" " * level, node.attribute)
    for value, n in node.children:
        print(" " * (level + 1), value)
        print_tree(n, level + 2)

def classify(node, x_test, features):
    if node.answer != "":
        print(node.answer)
        return
    pos = features.index(node.attribute)
    for value, n in node.children:
        if x_test[pos] == value:
            classify(n, x_test, features)

'''Main program'''
dataset, features = load_csv("c:\\New folder\\id3.csv")
node1 = build_tree(dataset, features)

print("The decision tree for the dataset using ID3 algorithm is")
print_tree(node1, 0)

testdata, features = load_csv("c:\\New folder\\id3_test_1.csv")
for xtest in testdata:
    print("The test instance:", xtest)
    print("The label for test instance:", end=" ")
    classify(node1, xtest, features)

OUTPUT:

The decision tree for the dataset using ID3 algorithm is
Outlook
rain
Wind
weak
yes
strong
no
sunny
Humidity
high
no
normal
yes
overcast
yes
The test instance: ['rain', 'cool', 'normal', 'strong']
The label for test instance: no
The test instance: ['sunny', 'mild', 'normal', 'strong']
The label for test instance: yes
RESULT:
Thus the Python program to demonstrate the working of the decision tree based ID3
algorithm, using an appropriate data set for building the decision tree, has been implemented and executed
successfully.
EX.NO.4 ARTIFICIAL NEURAL NETWORK USING BACKPROPAGATION ALGORITHM

AIM:
To Build an Artificial Neural Network by implementing the Backpropagation algorithm and
test the same using appropriate data sets.

ALGORITHM:

BACKPROPAGATION Algorithm

BACKPROPAGATION (training_examples, η, n_in, n_out, n_hidden)

Each training example is a pair of the form (x, t), where x is the vector of network input values
and t is the vector of target network output values.
η is the learning rate (e.g., 0.05). n_in is the number of network inputs, n_hidden the number
of units in the hidden layer, and n_out the number of output units.
The input from unit i into unit j is denoted xji, and the weight from unit i to unit j is
denoted wji.

 Create a feed-forward network with n_in inputs, n_hidden hidden units, and n_out output units.
 Initialize all network weights to small random numbers.
 Until the termination condition is met, Do
    For each (x, t) in training_examples, Do
       Propagate the input forward through the network:
          1. Input the instance x to the network and compute the output ou of every unit u in the network.
       Propagate the errors backward through the network:
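The remaining steps use the standard backpropagation error terms and weight-update rule:

          2. For each network output unit k, calculate its error term δk:
                 δk ← ok (1 − ok) (tk − ok)
          3. For each hidden unit h, calculate its error term δh:
                 δh ← oh (1 − oh) Σ (k ∈ outputs) wkh δk
          4. Update each network weight wji:
                 wji ← wji + Δwji,   where Δwji = η δj xji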

Training Examples:

Example   Sleep   Study   Expected % in Exams
1         2       9       92
2         1       5       86
3         3       6       89

Normalize the input:

Example   Sleep              Study              Expected % in Exams
1         2/3 = 0.66666667   9/9 = 1            0.92
2         1/3 = 0.33333333   5/9 = 0.55555556   0.86
3         3/3 = 1            6/9 = 0.66666667   0.89
PROGRAM:
import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)   # two inputs [sleep, study]
y = np.array(([92], [86], [89]), dtype=float)          # one output [Expected % in Exams]
X = X / np.amax(X, axis=0)   # maximum of X array longitudinally
y = y / 100

# Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of Sigmoid Function
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 5000                # Setting training iterations
lr = 0.1                    # Setting learning rate
inputlayer_neurons = 2      # number of features in data set
hiddenlayer_neurons = 3     # number of hidden layer neurons
output_neurons = 1          # number of neurons at output layer

# weight and bias initialization
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))   # weight of the link from input node to hidden node
bh = np.random.uniform(size=(1, hiddenlayer_neurons))                    # bias of the link from input node to hidden node
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))     # weight of the link from hidden node to output node
bout = np.random.uniform(size=(1, output_neurons))                       # bias of the link from hidden node to output node
# draws a random range of numbers uniformly of dim x*y

for i in range(epoch):
    # Forward Propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)

    # how much hidden layer weights contributed to error
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # dot product of next-layer error and current-layer output
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hiddenlayer) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)

OUTPUT:

Input:
[[0.66666667 1.        ]
 [0.33333333 0.55555556]
 [1.         0.66666667]]

Actual Output:
[[0.92]
 [0.86]
 [0.89]]

Predicted Output:
[[0.9064192 ]
 [0.8920576 ]
 [0.91030512]]
RESULT:

Thus the python program to build an Artificial Neural Network by implementing the Backpropagation
algorithm has been implemented and tested using appropriate data sets.
EX.NO:5 LOCALLY WEIGHTED REGRESSION ALGORITHM
DATE:

AIM:
To implement the non-parametric Locally Weighted Regression algorithm in order to fit
datapoints. Select appropriate data set for your experiment and draw graphs.
Regression:
 Regression is a technique from statistics that is used to predict values of a desired target quantity
when the target quantity is continuous.
 In regression, we seek to identify (or estimate) a continuous variable y associated with a given input vector x.
 y is called the dependent variable.
 x is called the independent variable.

Loess/Lowess Regression:
Loess regression is a nonparametric technique that uses local weighted regression to fit a
smooth curve through points in a scatter plot.

Lowess Algorithm:
 Locally weighted regression is a very powerful nonparametric model used in statistical learning.
 Given a dataset X, y, we attempt to find a model parameter β(x) that minimizes the residual sum of weighted squared errors.
 The weights are given by a kernel function (k or w) which can be chosen arbitrarily.
Algorithm
1. Read the Given data Sample to X and the curve (linear or non linear) to Y
2. Set the value for Smoothening parameter or Free parameter say τ
3. Set the bias /Point of interest set x0 which is a subset of X
4. Determine the weight matrix using :

5. Determine the value of model term parameter β using :

6. Prediction = x0*β
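The weight function and the model parameter β referred to in steps 4 and 5 are the standard locally weighted least-squares forms (these correspond to radial_kernel() and the normal-equation line in Program 5B below):

    w(i)(x0) = exp( −(x(i) − x0)² / (2τ²) )
    β(x0) = (Xᵀ W X)⁻¹ Xᵀ W y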

DATASET (5a):

YearsExperience   Salary
1.1               3
1.3               4
2.9               5
3                 6
3.2               5
4                 5
4.1               5
5.1               6
6                 9
6.8               9
7.1               9
8.7               1
9.5               1
10.5              1
PROGRAM (5A):
import numpy as np
from matplotlib import pyplot as plt
import pandas as pd

dataset = pd.read_csv('C:\\New folder\\salary_data.csv')
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 1].values

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=0)

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x_train, y_train)
y_pred = regressor.predict(x_test)

plt.scatter(x_train, y_train, color='red')
plt.plot(x_train, regressor.predict(x_train), color='blue')
plt.scatter(x_test, y_test, color='red')
plt.plot(x_train, regressor.predict(x_train), color='blue')
plt.title('Salary vs Experience (Test data)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()

OUTPUT:

PROGRAM(5B) :
import numpy as np
from bokeh.plotting import figure, show, output_notebook
from bokeh.layouts import gridplot
from bokeh.io import push_notebook

def local_regression(x0, X, Y, tau):
    # add bias term
    x0 = np.r_[1, x0]                        # Add one to avoid the loss in information
    X = np.c_[np.ones(len(X)), X]
    # fit model: normal equations with kernel
    xw = X.T * radial_kernel(x0, X, tau)     # X Transpose * W
    beta = np.linalg.pinv(xw @ X) @ xw @ Y   # @ Matrix Multiplication or Dot Product
    # predict value
    return x0 @ beta                         # @ Matrix Multiplication or Dot Product for prediction

def radial_kernel(x0, X, tau):
    # Weight or Radial Kernel Bias Function
    return np.exp(np.sum((X - x0) ** 2, axis=1) / (-2 * tau * tau))

n = 1000
# generate dataset
X = np.linspace(-3, 3, num=n)
print("The Data Set (10 Samples) X :\n", X[1:10])
Y = np.log(np.abs(X ** 2 - 1) + .5)
print("The Fitting Curve Data Set (10 Samples) Y :\n", Y[1:10])
# jitter X
X += np.random.normal(scale=.1, size=n)
print("Normalised (10 Samples) X :\n", X[1:10])

domain = np.linspace(-3, 3, num=300)
print(" Xo Domain Space (10 Samples) :\n", domain[1:10])

def plot_lwr(tau):
    # prediction through regression
    prediction = [local_regression(x0, X, Y, tau) for x0 in domain]
    plot = figure(plot_width=400, plot_height=400)
    plot.title.text = 'tau=%g' % tau
    plot.scatter(X, Y, alpha=.3)
    plot.line(domain, prediction, line_width=2, color='red')
    return plot

show(gridplot([[plot_lwr(10.), plot_lwr(1.)],
               [plot_lwr(0.1), plot_lwr(0.01)]]))

OUTPUT:
The Data Set (10 Samples) X :
[-2.99399399 -2.98798799 -2.98198198 -2.97597598 -2.96996997 -2.96396396
 -2.95795796 -2.95195195 -2.94594595]

The Fitting Curve Data Set (10 Samples) Y :
[2.13582188 2.13156806 2.12730467 2.12303166 2.11874898 2.11445659
 2.11015444 2.10584249 2.10152068]

Normalised (10 Samples) X :
[-3.00563087 -3.01510269 -3.03979575 -2.92953593 -3.03247972 -3.05165018
 -2.90745257 -2.84983092 -3.11986743]

Xo Domain Space (10 Samples) :
[-2.97993311 -2.95986622 -2.93979933 -2.91973244 -2.89966555 -2.87959866
 -2.85953177 -2.83946488 -2.81939799]
RESULT:
Thus the non-parametric Locally Weighted Regression algorithm to fit data points has been
implemented and graphs have been drawn successfully.
Ex.No.6 Naïve Bayesian Classifier

AIM:
To write a python program to implement the naïve Bayesian classifier for a sample training data set
stored as a .CSV file and to compute the accuracy of the classifier, considering a few test data sets.

DESCRIPTION:

Bayes' Theorem is stated as:

    P(h|D) = P(D|h) P(h) / P(D)
Where,
P(h|D) is the probability of hypothesis h given the data D. This is called the posterior
probability.
P(D|h) is the probability of data d given that the hypothesis h was true.
P(h) is the probability of hypothesis h being true. This is called the prior probability of h.
P(D) is the probability of the data. This is called the prior probability of D

After calculating the posterior probability for a number of different hypotheses h, we are
interested in finding the most probable hypothesis h ∈ H given the observed data D. Any such
maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis.

Using Bayes theorem to calculate the posterior probability of each candidate hypothesis, hMAP is a MAP
hypothesis provided:
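    hMAP = argmax (h ∈ H) P(h|D) = argmax (h ∈ H) P(D|h) P(h)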

(Ignoring P(D) since it is a constant)


Gaussian Naive Bayes
A Gaussian Naive Bayes algorithm is a special type of Naïve Bayes algorithm. It’s specifically
used when the features have continuous values. It’s also assumed that all the features are
following a Gaussian distribution i.e., normal distribution.

Representation for Gaussian Naive Bayes


We calculate the probabilities for input values for each class using a frequency. With real-valued inputs,
we can calculate the mean and standard deviation of input values (x) for each class to summarize the
distribution.
This means that in addition to the probabilities for each class, we must also store the mean and
standard deviations for each input variable for each class.
Gaussian Naive Bayes Model from Data
The probability density function for the normal distribution is defined by two parameters (mean
and standard deviation) and calculating the mean and standard deviation values of each input
variable (x) for each class value.
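For an attribute value x in a class whose input values have mean μ and standard deviation σ, the class-conditional probability is therefore estimated with the normal density (this is what calculateprobability() in the program below computes):

    P(x | class) = 1 / (√(2π) · σ) · exp( −(x − μ)² / (2σ²) )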
Examples:
The data set used in this program is the Pima Indians Diabetes problem.
This data set is comprised of 768 observations of medical details for Pima Indians
patents. The records describe instantaneous measurements taken from the patient such
as their age, the number of times pregnant and blood workup. All patients are women
aged 21 or older. All attributes are numeric, and their units vary from attribute to
attribute.
The attributes are Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin,
BMI, DiabeticPedigreeFunction, Age, Outcome
Each record has a class value that indicates whether the patient suffered an onset of
diabetes within 5 years of when the measurements were taken (1) or not (0)

DATASET:
num_preg glucose_conc diastolic_bp thickness insulin bmi diab_pred age diabetes
6 148 72 35 0 33.6 0.627 50 1
1 85 66 29 0 26.6 0.351 31 0
8 183 64 0 0 23.3 0.672 32 1
1 89 66 23 94 28.1 0.167 21 0
0 137 40 35 168 43.1 2.288 33 1
5 116 74 0 0 25.6 0.201 30 0
3 78 50 32 88 31 0.248 26 1
10 115 0 0 0 35.3 0.134 29 0
2 197 70 45 543 30.5 0.158 53 1
PROGRAM:
import csv
import random
import math

def loadcsv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    for i in range(len(dataset)):
        # converting strings into numbers for processing
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitdataset(dataset, splitratio):
    # 67% training size
    trainsize = int(len(dataset) * splitratio)
    trainset = []
    copy = list(dataset)
    while len(trainset) < trainsize:
        # generate indices for the dataset list randomly to pick elements for training data
        index = random.randrange(len(copy))
        trainset.append(copy.pop(index))
    return [trainset, copy]

def separatebyclass(dataset):
    separated = {}  # dictionary of classes 1 and 0
    # creates a dictionary of classes 1 and 0 where the values are
    # the instances belonging to each class
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers) / float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

def summarize(dataset):
    # creates a list of (mean, stdev) tuples, one per attribute
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]  # excluding labels +ve or -ve
    return summaries

def summarizebyclass(dataset):
    separated = separatebyclass(dataset)
    # print(separated)
    summaries = {}
    for classvalue, instances in separated.items():
        # summaries is a dic of tuples (mean, std) for each class value
        summaries[classvalue] = summarize(instances)  # summarize is used to calculate mean and std
    return summaries

def calculateprobability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def calculateclassprobabilities(summaries, inputvector):
    probabilities = {}  # probabilities contains the prob of all classes for the test data
    for classvalue, classsummaries in summaries.items():  # class and attribute information as mean and sd
        probabilities[classvalue] = 1
        for i in range(len(classsummaries)):
            mean, stdev = classsummaries[i]  # take mean and sd of every attribute for class 0 and 1 separately
            x = inputvector[i]  # test vector's ith attribute
            probabilities[classvalue] *= calculateprobability(x, mean, stdev)  # use normal dist
    return probabilities

def predict(summaries, inputvector):
    # training summaries and a test instance are passed
    probabilities = calculateclassprobabilities(summaries, inputvector)
    bestLabel, bestProb = None, -1
    for classvalue, probability in probabilities.items():  # assigns the class which has the highest prob
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classvalue
    return bestLabel

def getpredictions(summaries, testset):
    predictions = []
    for i in range(len(testset)):
        result = predict(summaries, testset[i])
        predictions.append(result)
    return predictions

def getaccuracy(testset, predictions):
    correct = 0
    for i in range(len(testset)):
        if testset[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(testset))) * 100.0

def main():
    filename = 'C:\\New folder\\naivedata1.csv'
    splitratio = 0.67
    dataset = loadcsv(filename)

    trainingset, testset = splitdataset(dataset, splitratio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingset), len(testset)))
    # prepare model
    summaries = summarizebyclass(trainingset)
    # print(summaries)
    # test model
    predictions = getpredictions(summaries, testset)  # find the predictions of test data with the training data
    accuracy = getaccuracy(testset, predictions)
    print('Accuracy of the classifier is : {0}%'.format(accuracy))

main()

OUTPUT:

Split 20 rows into train=13 and test=7 rows


Accuracy of the classifier is : 42.857142857142854%
RESULT:
Thus the python program for naïve Bayesian classifier for a sample training data set has been
implemented and accuracy of classifier has been computed successfully.
Ex.No.7 NAIVE BAYESIAN TEXT CLASSIFIER

AIM:
To classify a set of documents using the naïve Bayesian Classifier model and to calculate the
accuracy, precision, and recall for the sample data set.

ALGORITHM:
Naive Bayes algorithms for learning and classifying text
LEARN_NAIVE_BAYES_TEXT (Examples, V)
Examples is a set of text documents along with their target values. V is the set of all possible
target values. This function learns the probability terms P(wk |vj,), describing the probability
that a randomly drawn word from a document in class vj will be the English word wk. It
also learns the class prior probabilities P(vj).
1. collect all words, punctuation, and other tokens that occur in Examples
Vocabulary ← the set of all distinct words and other tokens occurring in any text
document from Examples
2. calculate the required P(vj) and P(wk|vj) probability
terms For each target value vj in V do
• docsj ← the subset of documents from Examples for which the target value is vj
• P(vj) ← | docsj | / |Examples|
• Textj ← a single document created by concatenating all members of docsj
• n ← total number of distinct word positions in Textj
• for each word wk in Vocabulary
 nk ← number of times word wk occurs in Textj
 P(wk|vj) ← ( nk + 1) / (n + | Vocabulary| )

CLASSIFY_NAIVE_BAYES_TEXT (Doc)
Return the estimated target value for the document Doc. ai denotes the word found in the
ith position within Doc.
• positions ← all word positions in Doc that contain tokens found in Vocabulary
• Return VNB, where
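      vNB = argmax (vj ∈ V) P(vj) · Π (i ∈ positions) P(ai | vj)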
DATASET:
Text documents Label
I love this sandwich pos
This is an amazing place pos
I feel very good about these beers pos
This is my best work pos
What an awesome view pos
I do not like this restaurant neg
I am tired of this stuff neg
I can't deal with this neg
He is my sworn enemy neg
My boss is horrible neg
This is an awesome place pos
I do not like the taste of this juice neg
I love to dance pos
I am sick and tired of this place neg
What a great holiday pos
That is a bad locality to stay neg
We will have good fun tomorrow pos
I went to my enemy's house today neg

PROGRAM:
import pandas as pd

msg = pd.read_csv('naivetext.csv', names=['message', 'label'])
print('The dimensions of the dataset', msg.shape)
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
y = msg.labelnum
print(X)
print(y)

# splitting the dataset into train and test data
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(X, y)
print('\n The total number of Training Data :', ytrain.shape)
print('\n The total number of Test Data :', ytest.shape)

# output of count vectoriser is a sparse matrix
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)
print('\n The words or Tokens in the text documents \n')
print(count_vect.get_feature_names())
df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names())

# Training Naive Bayes (NB) classifier on training data.
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

# printing accuracy, Confusion matrix, Precision and Recall
from sklearn import metrics
print('\n Accuracy of the classifier is', metrics.accuracy_score(ytest, predicted))
print('\n Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))
print('\n The value of Precision', metrics.precision_score(ytest, predicted))
print('\n The value of Recall', metrics.recall_score(ytest, predicted))

OUTPUT:
The dimensions of the dataset (18, 2)
0                        I love this sandwich
1                    This is an amazing place
2          I feel very good about these beers
3                        This is my best work
4                        What an awesome view
5               I do not like this restaurant
6                    I am tired of this stuff
7                      I can't deal with this
8                        He is my sworn enemy
9                         My boss is horrible
10                   This is an awesome place
11      I do not like the taste of this juice
12                            I love to dance
13          I am sick and tired of this place
14                       What a great holiday
15             That is a bad locality to stay
16             We will have good fun tomorrow
17           I went to my enemy's house today
Name: message, dtype: object
0     1
1     1
2     1
3     1
4     1
5     0
6     0
7     0
8     0
9     0
10    1
11    0
12    1
13    0
14    1
15    0
16    1
17    0
Name: labelnum, dtype: int64

The total number of Training Data : (13,)

The total number of Test Data : (5,)

The words or Tokens in the text documents

['about', 'am', 'amazing', 'an', 'and', 'awesome', 'bad', 'beers',
'best', 'boss', 'can', 'deal', 'do', 'enemy', 'feel', 'fun', 'good',
'have', 'he', 'horrible', 'house', 'is', 'like', 'locality', 'love',
'my', 'not', 'of', 'place', 'restaurant', 'sandwich', 'sick', 'stay',
'sworn', 'that', 'these', 'this', 'tired', 'to', 'today', 'tomorrow',
'very', 'view', 'we', 'went', 'what', 'will', 'with', 'work']

Accuracy of the classifier is 0.8

Confusion matrix
[[2 0]
 [1 2]]

The value of Precision 1.0

The value of Recall 0.6666666666666666

DESCRIPTION:
Confusion Matrix
True positives: data points labelled as positive that are actually positive
False positives: data points labelled as positive that are actually negative
True negatives: data points labelled as negative that are actually negative
False negatives: data points labelled as negative that are actually positive
Accuracy: how often is the classifier correct?
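In terms of the confusion-matrix counts:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)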
Example: Movie Review
Doc   Text                         Class
1     I loved the movie            +
2     I hated the movie            -
3     a great movie. good movie    +
4     poor acting                  -
5     great acting. good movie     +

Unique words:
< I, loved, the, movie, hated, a, great, good, poor, acting >

Doc   I   loved   the   movie   hated   a   great   good   poor   acting   Class
1     1   1       1     1       0       0   0       0      0      0        +
2     1   0       1     1       1       0   0       0      0      0        -
3     0   0       0     2       0       1   1       1      0      0        +
4     0   0       0     0       0       0   0       0      1      1        -
5     0   0       0     1       0       0   1       1      0      1        +
Positive documents:

Doc   I   loved   the   movie   hated   a   great   good   poor   acting   Class
1     1   1       1     1       0       0   0       0      0      0        +
3     0   0       0     2       0       1   1       1      0      0        +
5     0   0       0     1       0       0   1       1      0      1        +

P(+) = 3/5 = 0.6

P(I | +)      = (1 + 1) / (14 + 10) = 0.0833
P(loved | +)  = (1 + 1) / (14 + 10) = 0.0833
P(the | +)    = (1 + 1) / (14 + 10) = 0.0833
P(movie | +)  = (4 + 1) / (14 + 10) = 0.2083
P(hated | +)  = (0 + 1) / (14 + 10) = 0.0416
P(a | +)      = (1 + 1) / (14 + 10) = 0.0833
P(great | +)  = (2 + 1) / (14 + 10) = 0.125
P(good | +)   = (2 + 1) / (14 + 10) = 0.125
P(poor | +)   = (0 + 1) / (14 + 10) = 0.0416
P(acting | +) = (1 + 1) / (14 + 10) = 0.0833
Negative documents:

Doc   I   loved   the   movie   hated   a   great   good   poor   acting   Class
2     1   0       1     1       1       0   0       0      0      0        -
4     0   0       0     0       0       0   0       0      1      1        -

P(−) = 2/5 = 0.4

P(I | −)      = (1 + 1) / (6 + 10) = 0.125
P(loved | −)  = (0 + 1) / (6 + 10) = 0.0625
P(the | −)    = (1 + 1) / (6 + 10) = 0.125
P(movie | −)  = (1 + 1) / (6 + 10) = 0.125
P(hated | −)  = (1 + 1) / (6 + 10) = 0.125
P(a | −)      = (0 + 1) / (6 + 10) = 0.0625
P(great | −)  = (0 + 1) / (6 + 10) = 0.0625
P(good | −)   = (0 + 1) / (6 + 10) = 0.0625
P(poor | −)   = (1 + 1) / (6 + 10) = 0.125
P(acting | −) = (1 + 1) / (6 + 10) = 0.125
Let's classify the new document:
    I hated the poor acting

If Vj = + then,
    = P(+) P(I | +) P(hated | +) P(the | +) P(poor | +) P(acting | +)
    = 0.6 * 0.0833 * 0.0416 * 0.0833 * 0.0416 * 0.0833
    = 6.03 × 10⁻⁷

If Vj = − then,
    = P(−) P(I | −) P(hated | −) P(the | −) P(poor | −) P(acting | −)
    = 0.4 * 0.125 * 0.125 * 0.125 * 0.125 * 0.125
    = 1.22 × 10⁻⁵

Since 1.22 × 10⁻⁵ > 6.03 × 10⁻⁷, the new document belongs to the ( − ) class.
RESULT:
Thus the set of documents has been classified using the naïve Bayesian Classifier model and the
accuracy, precision, and recall for the sample data set has been calculated.
EX.No:8 CONSTRUCTION OF BAYESIAN NETWORK

AIM:
To Write a Python program to construct a Bayesian network considering medical data and use
this model to demonstrate the diagnosis of heart patients using standard Heart Disease Data Set.

DESCRIPTION:
A Bayesian network is a directed acyclic graph in which each edge corresponds to a conditional
dependency, and each node corresponds to a unique random variable.
Bayesian network consists of two major parts: a directed acyclic graph and a set of conditional
probability distributions
• The directed acyclic graph is a set of random variables represented by nodes.
• The conditional probability distribution of a node (random variable) is defined for every
possible outcome of the preceding causal node(s).
For illustration, consider the following example. Suppose we attempt to turn on our computer, but the
computer does not start (observation/evidence). We would like to know which of the possible causes of
computer failure is more likely. In this simplified illustration, we assume only two possible causes of
this misfortune: electricity failure and computer malfunction.
The corresponding directed acyclic graph is depicted in below figure.

Fig: Directed acyclic graph representing two independent possible causes of a computer
failure. The goal is to calculate the posterior conditional probability distribution of each of the
possible unobserved causes given the observed evidence, i.e. P [Cause | Evidence].

DATA SET:
age sex cp trestbps chol fbs restecg Thalach exang oldpeak slope ca thal heartdisease
63 1 1 145 233 1 2 150 0 2.3 3 0 6 0
67 1 4 160 286 0 2 108 1 1.5 2 3 3 2
67 1 4 120 229 0 2 129 1 2.6 2 2 7 1
37 1 3 130 250 0 0 187 0 3.5 3 0 3 0
41 0 2 130 204 0 2 172 0 1.4 1 0 3 0
56 1 2 120 236 0 0 178 0 0.8 1 0 3 0
62 0 4 140 268 0 2 160 0 3.6 3 2 3 3
57 0 4 120 354 0 0 163 1 0.6 1 0 3 0
63 1 4 130 254 0 2 147 0 1.4 2 1 7 2

Attribute Information:
1. age: age in years
2. sex: sex (1 = male; 0 = female)
3. cp: chest pain type
• Value 1: typical angina
• Value 2: atypical angina
• Value 3: non-anginal pain
• Value 4: asymptomatic
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholestoral in mg/dl
6. fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
• Value 0: normal
• Value 1: having ST-T wave abnormality (T wave inversions and/or ST
elevation or depression of > 0.05 mV)
• Value 2: showing probable or definite left ventricular hypertrophy by Estes'
criteria
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10. oldpeak = ST depression induced by exercise relative to rest
11. slope: the slope of the peak exercise ST segment
• Value 1: upsloping
• Value 2: flat
• Value 3: downsloping
12. thal: 3 = normal; 6 = fixed defect; 7 = reversable
defect 13.Heartdisease: It is integer valued from 0 (no
presence) to 4.
PROGRAM:
import numpy as np
import pandas as pd
import csv
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianModel
from pgmpy.inference import VariableElimination

# read Cleveland Heart Disease data
heartDisease = pd.read_csv('heart.csv')
heartDisease = heartDisease.replace('?', np.nan)

# display the data
print('Sample instances from the dataset are given below')
print(heartDisease.head())

# display the attribute names and datatypes
print('\n Attributes and datatypes')
print(heartDisease.dtypes)

# Create Model - Bayesian Network
model = BayesianModel([('age', 'heartdisease'), ('sex', 'heartdisease'),
                       ('exang', 'heartdisease'), ('cp', 'heartdisease'),
                       ('heartdisease', 'restecg'), ('heartdisease', 'chol')])

# Learning CPDs using Maximum Likelihood Estimators
print('\n Learning CPD using Maximum likelihood estimators')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

# Inferencing with Bayesian Network
print('\n Inferencing with Bayesian Network:')
HeartDiseasetest_infer = VariableElimination(model)

# computing the Probability of HeartDisease given restecg
print('\n 1.Probability of HeartDisease given evidence=restecg :2')
q1 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'restecg': 2})
print(q1)

# computing the Probability of HeartDisease given cp
print('\n 2.Probability of HeartDisease given evidence= cp:2 ')
q2 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'cp': 2})
print(q2)
OUTPUT:
Attributes and datatypes
age             int64
sex             int64
cp              int64
trestbps        int64
chol            int64
fbs             int64
restecg         int64
thalach         int64
exang           int64
oldpeak         float64
slope           int64
ca              int64
thal            int64
heartdisease    int64
dtype: object

Learning CPD using Maximum likelihood estimators

Inferencing with Bayesian Network:

1. Probability of HeartDisease given evidence=restecg :2
+-----------------+---------------------+
| heartdisease    | phi(heartdisease)   |
+=================+=====================+
| heartdisease(0) |              0.1209 |
+-----------------+---------------------+
| heartdisease(1) |              0.2714 |
+-----------------+---------------------+
| heartdisease(2) |              0.3363 |
+-----------------+---------------------+
| heartdisease(3) |              0.2714 |
+-----------------+---------------------+

2. Probability of HeartDisease given evidence= cp:2
+-----------------+---------------------+
| heartdisease    | phi(heartdisease)   |
+=================+=====================+
| heartdisease(0) |              0.3056 |
+-----------------+---------------------+
| heartdisease(1) |              0.2315 |
+-----------------+---------------------+
| heartdisease(2) |              0.2315 |
+-----------------+---------------------+
| heartdisease(3) |              0.2315 |
+-----------------+---------------------+
RESULT:
Thus the Python program to construct a Bayesian network considering medical data and using
this model to demonstrate the diagnosis of heart patients using standard Heart Disease Data Set has been
implemented and executed successfully.
Ex.No.9 EM ALGORITHM AND K-MEANS ALGORITHM FOR CLUSTERING

AIM:
To Apply EM algorithm to cluster a set of data stored in a .CSV file and using the
same data set for clustering using k-Means algorithm and then to Compare the results of
these two algorithms and comment on the quality of clustering.

Description:
 Clustering is an important data mining technique based on algorithms that group data of a similar
nature. Unlike classification, clustering belongs to the unsupervised type of algorithms.
 Two representatives of the clustering algorithms are the K-means algorithm and the expectation
maximization (EM) algorithm.
 EM and K-means are similar in the sense that they allow model refining of an iterative process to
find the best congestion.
 However, the two differ in how they measure closeness: the K-means algorithm uses the Euclidean
distance between two data items, while EM uses statistical methods.
 The EM algorithm is often used to provide the functions more effectively.
 Clustering means to split a large data set into a plurality of clusters of data, which share some
trait of each subset.
 It is carried out by calculating the similarity or proximity based on the distance measurement
method. The two can be divided into partial clustering and hierarchical clustering in the data.
 Hierarchical clustering can be agglomerative or divisive, i.e. bottom–up or top–down,
respectively. Agglomerative clustering begins from the individual elements and successively merges
them into a hierarchical cluster structure. The elements thus form a tree, with single elements at one
end and a single cluster containing all the elements at the other end.
Algorithm: K-means clustering
The cluster analysis procedure is analyzed to determine the properties of the data set and the target
variable. It is typically used to determine how to measure similarity distance. Basically, it functions as
follows:
 Input: The number of k and a database containing n objects.
 Output: A set of k-clusters that minimize the squared-error criterion.
 Method:
1. arbitrarily choose k objects as the initial cluster centres;
2. repeat;
3. (re)assign each object to the cluster to which the object is the most similar based on
the mean value of the objects in the cluster;
4. update the cluster mean, i.e. calculate the mean value of the object for each cluster;
5. until no change.

Algorithm: EM clustering

The concept of the EM algorithm stems from the Gaussian mixture model (GMM). The GMM method
models the density of a given set of sample data as a mixture of multiple Gaussian probability density
functions. Given a sample data set, the parameters of each Gaussian mixture component are estimated by
maximizing the log-likelihood of the data, and the EM algorithm is used to find this optimal model.
Principally, the EM clustering method uses the following
algorithm:
Input: Cluster number k, a database, stopping tolerance.
Output: A set of k-clusters with weight that maximize log-likelihood function.
1. Expectation step: For each database record x, compute the membership probability of x in each
cluster h = 1,…, k.
2. Maximization step: Update mixture model parameter (probability weight).
3. Stopping criteria: If stopping criteria are satisfied stop, else set j = j +1 and go to (1).
In the analytical methods available to achieve probability distribution parameters, in all probability the
value of the variable is given. The iterative EM algorithm uses a random variable and, eventually, is a
general method to find the optimal parameters of the hidden distribution function from the given data,
when the data are incomplete or has missing values.

Dataset: (iris_dataset.csv)

5.1 3.5 1.4 0.2 Iris−setosa


4.9 3 1.4 0.2 Iris−setosa
4.7 3.2 1.3 0.2 Iris−setosa
4.6 3.1 1.5 0.2 Iris−setosa
5 3.6 1.4 0.2 Iris−setosa
7 3.2 4.7 1.4 Iris−versicolor
6.4 3.2 4.5 1.5 Iris−versicolor
6.9 3.1 4.9 1.5 Iris−versicolor
5.5 2.3 4 1.3 Iris−versicolor
6.5 2.8 4.6 1.5 Iris−versicolor
6.7 3 5.2 2.3 Iris−virginica
6.3 2.5 5 1.9 Iris−virginica
6.5 3 5.2 2 Iris−virginica
6.2 3.4 5.4 2.3 Iris−virginica
5.9 3 5.1 1.8 Iris−virginica
PROGRAM:
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
import sklearn.metrics as metrics
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

names = ['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width', 'Class']
dataset = pd.read_csv("iris_data.csv", names=names)

X = dataset.iloc[:, :-1]
label = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}
y = [label[c] for c in dataset.iloc[:, -1]]

plt.figure(figsize=(14, 7))
colormap = np.array(['red', 'lime', 'black'])

# REAL PLOT
plt.subplot(1, 3, 1)
plt.title('Real')
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y])

# K-MEANS PLOT
model = KMeans(n_clusters=3, random_state=0).fit(X)
plt.subplot(1, 3, 2)
plt.title('KMeans')
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_])

print('The accuracy score of K-Mean: ', metrics.accuracy_score(y, model.labels_))
print('The Confusion matrix of K-Mean:\n', metrics.confusion_matrix(y, model.labels_))

# GMM PLOT
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
y_cluster_gmm = gmm.predict(X)
plt.subplot(1, 3, 3)
plt.title('GMM Classification')
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm])

print('The accuracy score of EM: ', metrics.accuracy_score(y, y_cluster_gmm))
print('The Confusion matrix of EM:\n ', metrics.confusion_matrix(y, y_cluster_gmm))

OUTPUT:
The accuracy score of K-Mean:  0.0
The Confusion matrix of K-Mean:
 [[0 5 0]
 [4 0 1]
 [5 0 0]]
The accuracy score of EM:  0.0
The Confusion matrix of EM:
 [[0 5 0]
 [0 0 5]
 [5 0 0]]
RESULT:

Thus the EM algorithm and the K-Means algorithm have been applied to cluster a set of data stored
in a .CSV file, the results of the two algorithms have been compared, and the quality of clustering has
been commented on successfully.
EX.NO.10 IMPLEMENTATION OF K-NEAREST NEIGHBOUR CLASSIFICATION

AIM:
To write a python program to implement K-Nearest Neighbour classification algorithm to
classify the iris data set and to print both correct and wrong predictions.

ALGORITHM:
K-Nearest Neighbor Algorithm
Training algorithm:

 For each training example (x, f (x)), add the example to the list training examples
Classification algorithm:
 Given a query instance xq to be classified,
 Let x1 . . .xk denote the k instances from training examples that are nearest to xq
 Return the estimate f̂(xq), computed as shown below

 Where f(xi) denotes the target value of training example xi; for a real-valued target, f̂(xq) is the mean value of the k nearest training examples.
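For a discrete-valued target, the returned value is the most common class among the k nearest neighbours; for a real-valued target it is their mean (the standard k-NN estimates):

    f̂(xq) ← argmax (v ∈ V) Σ (i = 1..k) δ(v, f(xi)),   where δ(a, b) = 1 if a = b and 0 otherwise
    f̂(xq) ← (1/k) Σ (i = 1..k) f(xi)    (real-valued case)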

DESCRIPTION

Confusion Matrix

True positives: data points labelled as positive that are actually positive
False positives: data points labelled as positive that are actually negative
True negatives: data points labelled as negative that are actually negative
False negatives: data points labelled as negative that are actually positive
Accuracy: how often is the classifier correct?
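In terms of the confusion-matrix counts:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)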

F1-Score:
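The F1-Score is the harmonic mean of precision and recall:

    Precision = TP / (TP + FP),   Recall = TP / (TP + FN)
    F1-Score = 2 · (Precision · Recall) / (Precision + Recall)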

Support: the total number of actual occurrences of the class in the dataset.

Support = TP + FN
Example:
Support _ A = TP_A + FN_A
= 30 + (20 + 10)
= 60

DATASET:
5.1 3.5 1.4 0.2 Iris−setosa
4.9 3 1.4 0.2 Iris−setosa
4.7 3.2 1.3 0.2 Iris−setosa
4.6 3.1 1.5 0.2 Iris−setosa
5 3.6 1.4 0.2 Iris−setosa
7 3.2 4.7 1.4 Iris−versicolor
6.4 3.2 4.5 1.5 Iris−versicolor
6.9 3.1 4.9 1.5 Iris−versicolor
5.5 2.3 4 1.3 Iris−versicolor
6.5 2.8 4.6 1.5 Iris−versicolor
6.7 3 5.2 2.3 Iris−virginica
6.3 2.5 5 1.9 Iris−virginica
6.5 3 5.2 2 Iris−virginica
6.2 3.4 5.4 2.3 Iris−virginica
5.9 3 5.1 1.8 Iris−virginica

Program:
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Read dataset to pandas dataframe
dataset = pd.read_csv("9-dataset.csv", names=names)
X = dataset.iloc[:, :-1]
y = dataset.iloc[:, -1]
print(X.head())

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.10)
classifier = KNeighborsClassifier(n_neighbors=5).fit(Xtrain, ytrain)
ypred = classifier.predict(Xtest)

i = 0
print("\n-------------------------------------------------------------------------")
print('%-25s %-25s %-25s' % ('Original Label', 'Predicted Label', 'Correct/Wrong'))
print("-------------------------------------------------------------------------")
for label in ytest:
    print('%-25s %-25s' % (label, ypred[i]), end="")
    if (label == ypred[i]):
        print(' %-25s' % ('Correct'))
    else:
        print(' %-25s' % ('Wrong'))
    i = i + 1
print("-------------------------------------------------------------------------")
print("\nConfusion Matrix:\n", metrics.confusion_matrix(ytest, ypred))
print("-------------------------------------------------------------------------")
print("\nClassification Report:\n", metrics.classification_report(ytest, ypred))
print("-------------------------------------------------------------------------")
print('Accuracy of the classifier is %0.2f' % metrics.accuracy_score(ytest, ypred))
print("-------------------------------------------------------------------------")

OUTPUT:

   sepal-length  sepal-width  petal-length  petal-width
0           5.1          3.5           1.4          0.2
1           4.9          3.0           1.4          0.2
2           4.7          3.2           1.3          0.2
3           4.6          3.1           1.5          0.2
4           5.0          3.6           1.4          0.2

-------------------------------------------------------------------------
Original Label            Predicted Label           Correct/Wrong
-------------------------------------------------------------------------
Iris-versicolor           Iris-versicolor           Correct
Iris-virginica            Iris-versicolor           Wrong
Iris-virginica            Iris-virginica            Correct
Iris-versicolor           Iris-versicolor           Correct
Iris-setosa               Iris-setosa               Correct
Iris-versicolor           Iris-versicolor           Correct
Iris-setosa               Iris-setosa               Correct
Iris-setosa               Iris-setosa               Correct
Iris-virginica            Iris-virginica            Correct
Iris-virginica            Iris-versicolor           Wrong
Iris-virginica            Iris-virginica            Correct
Iris-setosa               Iris-setosa               Correct
Iris-virginica            Iris-virginica            Correct
Iris-virginica            Iris-virginica            Correct
Iris-versicolor           Iris-versicolor           Correct
-------------------------------------------------------------------------

Confusion Matrix:
[[4 0 0]
 [0 4 0]
 [0 2 5]]

-------------------------------------------------------------------------

Classification Report:
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00         4
Iris-versicolor       0.67      1.00      0.80         4
 Iris-virginica       1.00      0.71      0.83         7

    avg / total       0.91      0.87      0.87        15

-------------------------------------------------------------------------
Accuracy of the classifier is 0.87
-------------------------------------------------------------------------
RESULT:
Thus the Python program to implement the k-Nearest Neighbour algorithm to classify the iris data set
has been written and executed successfully.
