ML Lab Manual
BASAVAKALYAN ENGINEERING
COLLEGE, BASAVAKALYAN
(Approved by AICTE New Delhi, Affiliated to VTU Belagavi & Recognized by
Govt. of Karnataka-ISO: 9001:2015 Certified)
NH-65, Basavakalyan, Bidar District-585327(Karnataka)
Prepared By:
Mr. Allamaprabhu Vastrad, Asst. Professor
Mrs. Sangeeta K, Instructor
Mr. Deepak G, Instructor
Machine Learning Laboratory-15CSL76
1. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis
based on a given set of training data samples. Read the training data from a .CSV file.
import csv

attributes = [['Sunny', 'Rainy'],
              ['Warm', 'Cold'],
              ['Normal', 'High'],
              ['Strong', 'Weak'],
              ['Warm', 'Cool'],
              ['Same', 'Change']]
num_attributes = len(attributes)

print("\nThe most general hypothesis: [?,?,?,?,?,?]\n")
print("\nThe most specific hypothesis: [0,0,0,0,0,0]\n")

a = []
print("\nThe given Training Data Set\n")
with open('CSVFile.csv', 'r') as csvFile:
    reader = csv.reader(csvFile)
    for row in reader:
        a.append(row)
        print(row)

print("\nThe initial value of hypothesis:")
hypothesis = ['0'] * num_attributes
print(hypothesis)

# Initialize the hypothesis with the first training example
for j in range(0, num_attributes):
    hypothesis[j] = a[0][j]

print("\nFind-S: Finding a Maximally Specific Hypothesis\n")
for i in range(0, len(a)):
    if a[i][num_attributes] == 'Yes':        # consider only positive examples
        for j in range(0, num_attributes):
            if a[i][j] != hypothesis[j]:     # generalize any attribute that disagrees
                hypothesis[j] = '?'
    print("For training example No:{}".format(i), hypothesis)

print("\nThe final hypothesis is:")
print(hypothesis)
Dataset:
Output:
2. For a given set of training data examples stored in a .CSV file, implement and demonstrate
the Candidate-Elimination algorithm to output a description of the set of all hypotheses
consistent with the training examples.
import csv

a = []
print("\nThe Given Training Data Set\n")
with open('CSVFile.csv', 'r') as csvFile:
    reader = csv.reader(csvFile)
    for row in reader:
        a.append(row)
        print(row)

num_attributes = len(a[0]) - 1

# S: most specific hypothesis, G: most general hypothesis,
# temp: version space of hypotheses built from the negative examples
S = ['0'] * num_attributes
G = ['?'] * num_attributes
temp = []

# Initialize S with the first training example
for j in range(0, num_attributes):
    S[j] = a[0][j]

for i in range(0, len(a)):
    if a[i][num_attributes] == 'Yes':          # positive example: generalize S
        for j in range(0, num_attributes):
            if a[i][j] != S[j]:
                S[j] = '?'
        for j in range(0, num_attributes):
            for k in range(len(temp) - 1, -1, -1):
                # remove a hypothesis if it no longer matches the specific hypothesis
                if temp[k][j] != '?' and temp[k][j] != S[j]:
                    del temp[k]
        if len(temp) == 0:
            print("For Training Example No :{0} the hypothesis is G{0}".format(i + 1), G)
        else:
            print("For Training Example No :{0} the hypothesis is G{0}".format(i + 1), temp)
    if a[i][num_attributes] == 'No':           # negative example: specialize G
        for j in range(0, num_attributes):
            # keep separately any attribute of S that the negative example contradicts
            if S[j] != a[i][j] and S[j] != '?':
                G[j] = S[j]
                temp.append(G)                 # store the specialization in the version space
                G = ['?'] * num_attributes
Dataset:
Sunny Warm Normal Strong Warm Same Yes
Sunny Warm High Strong Warm Same Yes
Rainy Cold High Strong Warm Change No
Sunny Warm High Strong Cool Change Yes
Output:
The Given Training Data Set
3. Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge to classify
a new sample.
import math
import numpy as np
import csv

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def read_data(filename):
    """Read the csv file and return the header and the data rows."""
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        metadata = next(datareader)
        traindata = []
        for row in datareader:
            traindata.append(row)
    return metadata, traindata

def subtables(data, col, delete):
    """Split the data into sub-tables, one for each value of the attribute in column col."""
    dict = {}
    items = np.unique(data[:, col])
    count = np.zeros(items.shape[0], dtype=np.int32)
    for x in range(items.shape[0]):
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                count[x] += 1
    # count now holds how many times each value is present in column col
    for x in range(items.shape[0]):
        dict[items[x]] = np.empty((int(count[x]), data.shape[1]), dtype="<U32")
        pos = 0
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                dict[items[x]][pos] = data[y]
                pos += 1
        if delete:
            dict[items[x]] = np.delete(dict[items[x]], col, 1)
    return items, dict

def entropy(S):
    """Calculate the entropy of the class column S."""
    items = np.unique(S)
    if items.size == 1:
        return 0
    counts = np.zeros(items.shape[0])
    sums = 0
    for x in range(items.shape[0]):
        counts[x] = sum(S == items[x]) / (S.size)
    for count in counts:
        sums += -1 * count * math.log(count, 2)
    return sums

def gain(data, col):
    """Information gain obtained by splitting the data on the attribute in column col."""
    items, dict = subtables(data, col, delete=False)
    total_size = data.shape[0]
    entropies = np.zeros(items.shape[0])
    for x in range(items.shape[0]):
        ratio = dict[items[x]].shape[0] / (total_size)
        entropies[x] = ratio * entropy(dict[items[x]][:, -1])
    total_entropy = entropy(data[:, -1])
    for x in range(entropies.shape[0]):
        total_entropy -= entropies[x]
    return total_entropy

def create_node(data, metadata):
    """Recursively build the ID3 decision tree."""
    if (np.unique(data[:, -1])).shape[0] == 1:
        node = Node("")
        node.answer = np.unique(data[:, -1])[0]
        return node
    gains = np.zeros(data.shape[1] - 1)
    for col in range(data.shape[1] - 1):
        gains[col] = gain(data, col)
    split = np.argmax(gains)                 # attribute with the highest information gain
    node = Node(metadata[split])
    metadata = np.delete(metadata, split, 0)
    items, dict = subtables(data, split, delete=True)
    for x in range(items.shape[0]):
        child = create_node(dict[items[x]], metadata)
        node.children.append((items[x], child))
    return node

def empty(size):
    """Generate the empty space needed for shaping the tree while printing."""
    s = ""
    for x in range(size):
        s += "   "
    return s

def print_tree(node, level):
    """Print the tree with one level of indentation per depth."""
    if node.answer != "":
        print(empty(level), node.answer)
        return
    print(empty(level), node.attribute)
    for value, child in node.children:
        print(empty(level + 1), value)
        print_tree(child, level + 2)

metadata, traindata = read_data("tennis.csv")
data = np.array(traindata)
node = create_node(data, metadata)
print_tree(node, 0)
Dataset:
outlook temp humidity windy play
sunny hot high Weak no
sunny hot high Strong no
overcast hot high Weak yes
rainy mild high Weak yes
rainy cool normal Weak yes
rainy cool normal Strong no
overcast cool normal Strong yes
sunny mild high Weak no
sunny cool normal Weak yes
rainy mild normal Weak yes
sunny mild normal Strong yes
overcast mild high Strong yes
overcast hot normal Weak yes
rainy mild high Strong no
Output:
outlook
   overcast
      yes
   rainy
      windy
         Strong
            no
         Weak
            yes
   sunny
      humidity
         high
            no
         normal
            yes
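The program above builds and prints the tree but does not classify a new sample, as the statement asks. A minimal sketch of that step, assuming the Node class, node and metadata from the listing and a hypothetical test row, might be:

def classify_sample(node, sample, metadata):
    # Walk down the tree until a leaf (answer) node is reached
    if node.answer != "":
        return node.answer
    col = list(metadata).index(node.attribute)
    for value, child in node.children:
        if value == sample[col]:
            return classify_sample(child, sample, metadata)
    return "unknown"   # attribute value never seen on this branch

# Hypothetical new sample: outlook=sunny, temp=cool, humidity=high, windy=Strong
print(classify_sample(node, ['sunny', 'cool', 'high', 'Strong'], metadata))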
4. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test
the same using appropriate data sets.

import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]))            # Hours Studied, Hours Slept
y = np.array(([92], [86], [89]), dtype=float)     # Test Score
y = y / 100   # scale the scores to 0-1 so they lie in the range of the sigmoid output

# Sigmoid function: maps any value into the range (0, 1)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid, used while backpropagating the error
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 10000               # setting training iterations
lr = 0.1                    # setting learning rate
inputlayer_neurons = 2      # number of features in the data set
hiddenlayer_neurons = 3     # number of hidden layer neurons
output_neurons = 1          # number of neurons in the output layer

# Weight and bias initialization
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bias_hidden = np.random.uniform(size=(1, hiddenlayer_neurons))
weight_hidden = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bias_output = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bias_hidden
    hlayer_activation = sigmoid(hinp)
    outinp1 = np.dot(hlayer_activation, weight_hidden)
    outinp = outinp1 + bias_output
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output                          # error at the output layer
    outgrad = derivatives_sigmoid(output)
    # Delta at the output layer: the error multiplied by the slope of the output activation
    d_output = EO * outgrad
    # Propagate the error back to the hidden layer through the weights between
    # the hidden and output layer (weight_hidden.T)
    EH = d_output.dot(weight_hidden.T)
    # How much the hidden layer weights contributed to the error
    hiddengrad = derivatives_sigmoid(hlayer_activation)
    d_hiddenlayer = EH * hiddengrad

    # Gradient descent weight and bias updates
    weight_hidden += hlayer_activation.T.dot(d_output) * lr
    bias_output += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    bias_hidden += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr

print("Input:\n" + str(X))
print("Actual Output:\n" + str(y))
print("Predicted Output:\n" + str(output))
Output:
Input:
[[2 9]
[1 5]
[3 6]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.8921829 ]
[0.88212774]
[0.89429156]]
5. Write a program to implement the naïve Bayesian classifier for a sample training data set
stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.
import random
import numpy as np
import csv

def read_data(filename):
    """Read the csv file and return the header and the data rows."""
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile)
        metadata = next(datareader)
        traindata = []
        for row in datareader:
            traindata.append(row)
    return metadata, traindata

def splitDataset(dataset, splitRatio):
    """Randomly split the data into a training set and a test set."""
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    testSet = list(dataset)
    while len(trainSet) < trainSize:
        index = random.randrange(len(testSet))
        trainSet.append(testSet.pop(index))
    return trainSet, testSet

def classify(data, test):
    total_size = data.shape[0]
    print("training data size=", total_size)
    print("test data size=", test.shape[0])
    countYes = 0
    countNo = 0
    probYes = 0
    probNo = 0
    print("target count probability")
    # Prior probabilities of the two classes
    for x in range(data.shape[0]):
        if data[x, data.shape[1] - 1] == 'yes':
            countYes += 1
        if data[x, data.shape[1] - 1] == 'no':
            countNo += 1
    probYes = countYes / total_size
    probNo = countNo / total_size
    print('Yes', "\t", countYes, "\t", probYes)
    print('No', "\t", countNo, "\t", probNo)

    prob0 = np.zeros((test.shape[1] - 1))
    prob1 = np.zeros((test.shape[1] - 1))
    accuracy = 0
    print("instance prediction target")
    for t in range(test.shape[0]):
        for k in range(test.shape[1] - 1):
            count1 = count0 = 0
            for j in range(data.shape[0]):
                # how many times the attribute value appeared with 'no'
                if test[t, k] == data[j, k] and data[j, data.shape[1] - 1] == 'no':
                    count0 += 1
                # how many times the attribute value appeared with 'yes'
                if test[t, k] == data[j, k] and data[j, data.shape[1] - 1] == 'yes':
                    count1 += 1
            prob0[k] = count0 / countNo     # conditional probability given 'no'
            prob1[k] = count1 / countYes    # conditional probability given 'yes'
        probno = probNo
        probyes = probYes
        for i in range(test.shape[1] - 1):
            probno = probno * prob0[i]
            probyes = probyes * prob1[i]
        if probno > probyes:
            predict = 'no'
        else:
            predict = 'yes'
        print(t + 1, "\t", predict, "\t ", test[t, test.shape[1] - 1])
        if predict == test[t, test.shape[1] - 1]:
            accuracy += 1
    final_accuracy = (accuracy / test.shape[0]) * 100
    print("accuracy", final_accuracy, "%")
    return

metadata, traindata = read_data("tennis.csv")
splitRatio = 0.6
trainingset, testset = splitDataset(traindata, splitRatio)
training = np.array(trainingset)
testing = np.array(testset)
classify(training, testing)
Dataset:
outlook temp humidity windy play
sunny hot high Weak no
sunny hot high Strong no
overcast hot high Weak yes
rainy mild high Weak yes
rainy cool normal Weak yes
rainy cool normal Strong no
overcast cool normal Strong yes
sunny mild high Weak no
sunny cool normal Weak yes
rainy mild normal Weak yes
sunny mild normal Strong yes
overcast mild high Strong yes
overcast hot normal Weak yes
rainy mild high Strong no
Output:
training data size= 8
test data size= 6
target count probability
Yes 4 0.5
No 4 0.5
instance prediction target
1 no yes
2 yes yes
3 no yes
4 yes yes
5 yes yes
6 no no
accuracy 66.66666666666666 %
6. Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier
model to perform this task. Built-in Java classes/API can be used to write the program.
Calculate the accuracy, precision, and recall for your data set.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer

msg = pd.read_csv('naivetext1.csv', names=['message', 'label'])   # names -> names of the columns
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})             # map the text labels to 1/0
X = msg.message
Y = msg.labelnum

# Split the messages into a training set and a test set
xtrain, xtest, ytrain, ytest = train_test_split(X, Y)

# Build the document-term (bag-of-words) matrices
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)
df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names())
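The listing above stops after building the document-term matrices; the training, prediction, and evaluation steps implied by the output below are not shown. A minimal sketch of those steps, assuming scikit-learn's MultinomialNB and metrics module and the variable names from the listing, might look like this:

from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

# Train a multinomial naive Bayes classifier on the training document-term matrix
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

# Accuracy, confusion matrix, recall and precision on the held-out messages
print('Accuracy metrics')
print('Accuracy of the classifier is', metrics.accuracy_score(ytest, predicted))
print('Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))
print('Recall and Precision')
print(metrics.recall_score(ytest, predicted))
print(metrics.precision_score(ytest, predicted))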
Dataset:
I love this sandwich pos
This is an amazing place pos
I feel very good about these beers pos
This is my best work pos
What an awesome view pos
I do not like this restaurant neg
I am tired of this stuff neg
I can't deal with this neg
He is my sworn enemy neg
My boss is horrible neg
This is an awesome place pos
I do not like the taste of this juice neg
I love to dance pos
I am sick and tired of this place neg
What a great holiday pos
That is a bad locality to stay neg
We will have good fun tomorrow pos
I went to my enemy's house today neg
Output:
Accuracy metrics
Accuracy of the classifier is 0.8
Confusion matrix
[[1 1]
[0 3]]
Recall and Precision
1.0
0.75
7. Write a program to construct a Bayesian network considering medical data. Use this
model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.
You can use Java/Python ML library classes/API.
import pandas as pd
data=pd.read_csv("heart_disease_data1.csv")
heart_disease=pd.DataFrame(data)
print(heart_disease)
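The listing jumps from loading the data to querying HeartDisease_infer, which is never constructed. A minimal sketch of the missing model-building step, assuming pgmpy's BayesianModel, MaximumLikelihoodEstimator and VariableElimination, and a hypothetical choice of network edges, could be:

from pgmpy.models import BayesianModel
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Hypothetical network structure over the columns of the data set;
# the lab may use a different set of edges
model = BayesianModel([('age', 'heartdisease'), ('Gender', 'heartdisease'),
                       ('Family', 'heartdisease'), ('diet', 'heartdisease'),
                       ('Lifestyle', 'heartdisease'), ('cholestrol', 'heartdisease')])

# Learn the conditional probability tables from the data
model.fit(heart_disease, estimator=MaximumLikelihoodEstimator)

# Inference object used by the query below
HeartDisease_infer = VariableElimination(model)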
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={
        'age': int(input('enter age')),
        'Gender': int(input('enter Gender')),
        'Family': int(input('enter Family history')),
        'diet': int(input('enter diet')),
        'Lifestyle': int(input('enter Lifestyle')),
        'cholestrol': int(input('enter cholestrol'))
    })
print(q['heartdisease'])
Dataset:
age Gender Family diet Lifestyle cholestrol heartdisease
0 0 1 1 3 0 1
0 1 1 1 3 0 1
1 0 0 0 2 1 1
4 0 1 1 3 2 0
3 1 1 0 0 2 0
2 0 1 1 1 0 1
4 0 1 0 2 0 1
0 0 1 1 3 0 1
3 1 1 0 0 2 0
1 1 0 0 0 2 1
4 1 0 1 2 0 1
4 0 1 1 3 2 0
2 1 0 0 0 0 0
2 0 1 1 1 0 1
3 1 1 0 0 1 0
0 0 1 0 0 2 1
1 1 0 1 2 1 1
3 1 1 1 0 1 0
4 0 1 1 3 2 0
Output:
For age enter SuperSeniorCitizen:0, SeniorCitizen:1, MiddleAged:2, Youth:3,
Teen:4
For Gender Enter Male:0, Female:1
For Family History Enter yes:1, No:0
For diet Enter High:0, Medium:1
for lifeStyle Enter Athlete:0, Active:1, Moderate:2, Sedentary:3
for cholesterol Enter High:0, BorderLine:1, Normal:2
enter age2
enter Gender0
enter Family history1
enter diet1
enter Lifestyle1
enter cholestrol1
+----------------+---------------------+
| heartdisease | phi(heartdisease) |
+================+=====================+
| heartdisease_0 | 1.0000 |
+----------------+---------------------+
| heartdisease_1 | 0.0000 |
+----------------+---------------------+
8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set
for clustering using k-Means algorithm. Compare the results of these two algorithms and
comment on the quality of clustering. You can add Java/Python ML library classes/API in the
program.
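The manual gives no listing for this experiment; a minimal sketch, assuming scikit-learn's KMeans and GaussianMixture applied to the labelled Iris measurements so both clusterings can be compared against the true classes, could be:

from sklearn import datasets, metrics
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Load a labelled data set so the clusterings can be compared with the true classes
iris = datasets.load_iris()
X, y = iris.data, iris.target

# k-Means clustering
kmeans_labels = KMeans(n_clusters=3).fit_predict(X)

# EM-based clustering with a Gaussian mixture model
gmm_labels = GaussianMixture(n_components=3).fit(X).predict(X)

# Agreement of each clustering with the true labels (higher is better)
print("k-Means adjusted Rand index :", metrics.adjusted_rand_score(y, kmeans_labels))
print("GMM (EM) adjusted Rand index:", metrics.adjusted_rand_score(y, gmm_labels))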
Output:
Observation: The EM-based clustering with a Gaussian mixture model matched the true
labels more closely than k-Means.
9. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data
set. Print both correct and wrong predictions. Java/Python ML library classes can be used for
this problem.
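No listing accompanies this program in the manual; a minimal sketch, assuming scikit-learn's KNeighborsClassifier on the built-in Iris data, could be:

from sklearn import datasets, metrics
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3)

classifier = KNeighborsClassifier(n_neighbors=5).fit(x_train, y_train)
y_pred = classifier.predict(x_test)

# Print each test sample with its prediction, marking correct and wrong predictions
for i in range(len(x_test)):
    result = "Correct" if y_pred[i] == y_test[i] else "Wrong"
    print(x_test[i], "Predicted:", iris.target_names[y_pred[i]],
          "Actual:", iris.target_names[y_test[i]], "->", result)

print("Confusion matrix is as follows")
print(metrics.confusion_matrix(y_test, y_pred))
print("Accuracy Metrics")
print(metrics.classification_report(y_test, y_pred))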
Output:
Confusion matrix is as follows
[[10 0 0]
[ 0 16 1]
[ 0 1 17]]
Accuracy Metrics
precision recall f1-score support
10. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select an appropriate data set for your experiment and draw graphs.

import numpy as np
import matplotlib.pyplot as plt

# Generate a noisy synthetic data set
x = np.linspace(-3, 3, 1000)
y = np.log(np.abs((x ** 2) - 1) + 0.5)
x += np.random.normal(scale=0.05, size=1000)
plt.scatter(x, y, alpha=0.3)

def local_regression(x0, X, Y, tau):
    # Fit a weighted linear regression around the query point x0 and predict at x0
    x0 = np.r_[1, x0]                        # add the bias term
    X = np.c_[np.ones(len(X)), X]
    xw = X.T * radial_kernel(x0, X, tau)     # weight each example by its closeness to x0
    beta = np.linalg.pinv(xw @ X) @ xw @ Y   # weighted least-squares solution
    return x0 @ beta

def radial_kernel(x0, X, tau):
    # Gaussian weights: points near x0 get weights close to 1, distant points close to 0
    return np.exp(np.sum((X - x0) ** 2, axis=1) / (-2 * tau ** 2))

def plot_lwr(tau):
    # Predict on a grid of query points and overlay the fitted curve on the data
    domain = np.linspace(-3, 3, num=300)
    prediction = [local_regression(x0, x, y, tau) for x0 in domain]
    plt.scatter(x, y, alpha=0.3)
    plt.plot(domain, prediction, color="red")
    return plt

plot_lwr(0.04)
plt.show()
Output: