ML Lab Manual
1. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on
a given set of training data samples. Read the training data from a .CSV file.
import csv
# read the training examples from the CSV file
with open('1.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)
# start with the most specific hypothesis (one '0' per attribute)
h = [['0', '0', '0', '0', '0', '0']]
for i in your_list:
    print(i)
    if i[-1] == "Yes":              # consider only positive examples
        j = 0
        for x in i:
            if x != "Yes":          # skip the class label
                if x != h[0][j] and h[0][j] == '0':
                    h[0][j] = x     # first positive example fills the attribute in
                elif x != h[0][j] and h[0][j] != '0':
                    h[0][j] = '?'   # conflicting attribute values generalise to '?'
                else:
                    pass
            j = j + 1
print("A Maximally Specific hypothesis is")
print(h)
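The six '0's in the initial hypothesis assume six attributes per example followed by a Yes/No class label, so a hypothetical 1.csv in the expected format (illustrative EnjoySport-style rows, with the label in the last column) could look like:

Sunny,Warm,Normal,Strong,Warm,Same,Yes
Sunny,Warm,High,Strong,Warm,Same,Yes
Rainy,Cold,High,Strong,Warm,Change,No
Sunny,Warm,High,Strong,Cool,Change,Yes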
2. For a given set of training data examples stored in a .CSV file, implement and demonstrate the
Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with
the training examples.
import csv
a = []
print("\n The Given Training Data Set \n")
with open('enjoysport.csv', 'r') as csvfile:     # assumed training-data file name
    for row in csv.reader(csvfile):
        a.append(row)
        print(row)

num_attributes = len(a[0]) - 1
S = a[0][:num_attributes]            # specific boundary S, seeded from the first example (assumed positive)
G = ['?'] * num_attributes           # template for general hypotheses
temp = []                            # working set of general hypotheses

for i in range(0, len(a)):
    if a[i][num_attributes] == 'Yes':                 # positive example: generalise S
        for j in range(0, num_attributes):
            if a[i][j] != S[j]:
                S[j] = '?'
        for j in range(0, num_attributes):            # drop general hypotheses inconsistent with S
            for k in range(len(temp) - 1, 0, -1):
                if temp[k][j] != '?' and temp[k][j] != S[j]:
                    del temp[k]
    if a[i][num_attributes] == 'No':                  # negative example: specialise G
        for j in range(0, num_attributes):
            if S[j] != a[i][j] and S[j] != '?':
                G[j] = S[j]
                temp.append(G)
                G = ['?'] * num_attributes

print("\n The Maximally Specific Hypothesis S is:\n", S)
print("\n The set of Maximally General Hypotheses G is:\n", temp)
3. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an
appropriate data set for building the decision tree and apply this knowledge to classify a new
sample.
import pandas as pd
import numpy as np

dataset = pd.read_csv('P3_Tennis.csv')

def entropy(target_col):
    # entropy of a column of class labels
    elements, counts = np.unique(target_col, return_counts=True)
    entropy = np.sum([(-counts[i]/np.sum(counts))*np.log2(counts[i]/np.sum(counts))
                      for i in range(len(elements))])
    return entropy

def InfoGain(data, split_attribute_name, target_name="PlayTennis"):
    # information gain = entropy of the target minus the weighted entropy after the split
    total_entropy = entropy(data[target_name])
    vals, counts = np.unique(data[split_attribute_name], return_counts=True)
    Weighted_Entropy = np.sum([(counts[i]/np.sum(counts)) *
                               entropy(data.where(data[split_attribute_name] == vals[i]).dropna()[target_name])
                               for i in range(len(vals))])
    Information_Gain = total_entropy - Weighted_Entropy
    return Information_Gain

def ID3(data, originaldata, features, target_attribute_name="PlayTennis", parent_node_class=None):
    # stopping conditions
    if len(np.unique(data[target_attribute_name])) <= 1:          # pure node: return its class
        return np.unique(data[target_attribute_name])[0]
    elif len(data) == 0:                                          # empty subset: majority class of the original data
        return np.unique(originaldata[target_attribute_name])[
            np.argmax(np.unique(originaldata[target_attribute_name], return_counts=True)[1])]
    elif len(features) == 0:                                      # no attributes left: majority class of the parent
        return parent_node_class
    else:
        parent_node_class = np.unique(data[target_attribute_name])[
            np.argmax(np.unique(data[target_attribute_name], return_counts=True)[1])]
        # choose the attribute with the highest information gain
        item_values = [InfoGain(data, feature, target_attribute_name) for feature in features]
        best_feature_index = np.argmax(item_values)
        best_feature = features[best_feature_index]
        tree = {best_feature: {}}
        features = [i for i in features if i != best_feature]
        for value in np.unique(data[best_feature]):
            sub_data = data.where(data[best_feature] == value).dropna()
            subtree = ID3(sub_data, dataset, features, target_attribute_name, parent_node_class)
            tree[best_feature][value] = subtree
        return tree

tree = ID3(dataset, dataset, dataset.columns[:-1])
print(dataset.head())
print('\nDisplay Tree\n', tree)
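The exercise also asks to classify a new sample, which the listing above does not do. A minimal sketch of a lookup over the nested-dictionary tree is given below; the attribute names and values of the sample are hypothetical and must match the columns of P3_Tennis.csv.

def classify(query, tree, default='Yes'):
    # walk down the tree until a leaf (class label) is reached
    for attribute in query:
        if attribute in tree:
            try:
                result = tree[attribute][query[attribute]]
            except KeyError:
                return default              # attribute value never seen during training
            if isinstance(result, dict):
                return classify(query, result, default)
            return result
    return default

# hypothetical new sample
sample = {'Outlook': 'Sunny', 'Temperature': 'Cool', 'Humidity': 'High', 'Wind': 'Strong'}
print('Predicted class for the new sample:', classify(sample, tree))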
4. Build an Artificial Neural Network by implementing the Back propagation algorithm and test the
same using appropriate data sets.
import math
def sigmoid(x):
    y = 1/(1 + math.exp(-x))
    return y

## define inputs and target for xor gate
x1 = [0, 0, 1, 1]   # input1
x2 = [0, 1, 0, 1]   # input2
t = [0, 1, 1, 0]    # target
## assumed initial weights and biases: w11, w21 and b1 feed hidden unit z1;
## w12, w22 and b2 feed hidden unit z2; w13, w23 and b3 feed the output unit y
w11, w12, w21, w22 = 0.2, -0.3, 0.4, 0.1
w13, w23 = -0.5, 0.2
b1, b2, b3 = 0.4, 0.2, -0.3
iteration = 0
train = True
## Training Starts
while(train):
    for i in range(len(x1)):
        ## forward pass
        z1 = sigmoid(w11*x1[i] + w21*x2[i] + b1)
        z2 = sigmoid(w12*x1[i] + w22*x2[i] + b2)
        y = sigmoid(w13*z1 + w23*z2 + b3)
        ## error computation (delta at the output unit)
        del_k = round((t[i]-y)*y*(1-y), 4)
        error = del_k
        ## deltas at the hidden units
        del_1 = round(del_k*w13*z1*(1-z1), 4)
        del_2 = round(del_k*w23*z2*(1-z2), 4)
        ## Back pass
        # weight update for output layer
        w13 = round(w13 + del_k*z1, 4)
        w23 = round(w23 + del_k*z2, 4)
        b3 = round(b3 + del_k, 4)
        # weight update for hidden layer
        w11 = round(w11 + del_1*x1[i], 4)
        w12 = round(w12 + del_2*x1[i], 4)
        b1 = round(b1 + del_1, 4)
        w21 = round(w21 + del_1*x2[i], 4)
        w22 = round(w22 + del_2*x2[i], 4)
        b2 = round(b2 + del_2, 4)
    print("Iteration: ", iteration)
    print("w11 : %5.4f w12: %5.4f w21: %5.4f w22: %5.4f w13: %5.4f w23: %5.4f "
          % (w11, w12, w21, w22, w13, w23))
    print("Error: %5.3f" % del_k)
    iteration = iteration + 1
    if(iteration == 1000):
        train = False
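The problem also asks to test the trained network. A minimal sketch, reusing the trained weights above, runs one forward pass per XOR pattern and compares the output with the target:

## testing the trained network on the XOR patterns
for i in range(len(x1)):
    z1 = sigmoid(w11*x1[i] + w21*x2[i] + b1)
    z2 = sigmoid(w12*x1[i] + w22*x2[i] + b2)
    y = sigmoid(w13*z1 + w23*z2 + b3)
    print("Input:", x1[i], x2[i], " Target:", t[i], " Output: %5.4f" % y)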
5. Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a
.csv file. Compute the accuracy of the classifier, considering a few test data sets.
import csv
import math
import random
import statistics

dataset = []
dataset_size = 0
with open('lab5.csv') as csvfile:
    lines = csv.reader(csvfile)
    for row in lines:
        dataset.append([float(attr) for attr in row])
dataset_size = len(dataset)
print('Size of dataset is : ', dataset_size)
train_size = int(0.7 * dataset_size)        # assumed 70/30 train-test split
print(train_size)
X_train = []
X_test = dataset.copy()
training_indexes = random.sample(range(dataset_size), train_size)
# Split Data
for i in training_indexes:
    X_train.append(dataset[i])
    X_test.remove(dataset[i])
X_prediction = []
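# The lines that fill X_prediction were missing from the listing; what follows is a
# minimal Gaussian naive-Bayes sketch (not the manual's original code) that assumes the
# class label is the last attribute of every row and that each class has at least two
# training rows (statistics.stdev needs two values).
def separate_by_class(data):
    # group training rows by their class label
    separated = {}
    for row in data:
        separated.setdefault(row[-1], []).append(row)
    return separated

def summarize(rows):
    # per-attribute (mean, standard deviation), ignoring the class column
    return [(statistics.mean(col), statistics.stdev(col)) for col in list(zip(*rows))[:-1]]

def gaussian_probability(x, mean, stdev):
    # value of the Gaussian density with the given mean and standard deviation at x
    if stdev == 0:
        return 1.0 if x == mean else 1e-9
    exponent = math.exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def predict(summaries, priors, row):
    # pick the class maximising prior * product of per-attribute likelihoods
    best_label, best_prob = None, -1.0
    for label, stats in summaries.items():
        prob = priors[label]
        for j, (mean, stdev) in enumerate(stats):
            prob *= gaussian_probability(row[j], mean, stdev)
        if best_label is None or prob > best_prob:
            best_label, best_prob = label, prob
    return best_label

separated = separate_by_class(X_train)
summaries = {label: summarize(rows) for label, rows in separated.items()}
priors = {label: len(rows) / len(X_train) for label, rows in separated.items()}
for row in X_test:
    X_prediction.append(predict(summaries, priors, row))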
# Find Accuracy
correct = 0
for index, key in enumerate(X_test):
    if X_test[index][-1] == X_prediction[index]:
        correct += 1
print('Accuracy of the classifier : ', correct / len(X_test))
6. Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to
perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy,
precision, and recall for your data set.
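The program below assumes naivetext1.txt is a headerless, comma-separated file with one labelled sentence per line, which is why column names are supplied to read_csv. A few illustrative (made-up) rows in that format:

I love this sandwich,pos
This is an amazing place,pos
I am tired of this stuff,neg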
import pandas as pd
dataset = pd.read_csv('naivetext1.txt', names=['text', 'tag'])
dataset.head()
#encoding: map the text labels to integers
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
dataset['tag'] = encoder.fit_transform(dataset['tag'])
#splitting
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(dataset['text'], dataset['tag'], test_size=0.2)
#vectorization: bag-of-words document-term matrices
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(X_train)
xtest_dtm = count_vect.transform(X_test)
dtm_df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names_out())   # document-term matrix, for inspection
#prediction: train the multinomial naive Bayes model on the training matrix
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm, y_train)
y_pred = clf.predict(xtest_dtm)
#output
from sklearn.metrics import precision_score, accuracy_score, recall_score
print('Precision', precision_score(y_test, y_pred))
print('Accuracy', accuracy_score(y_test, y_pred))
print('Recall', recall_score(y_test, y_pred))
7. Write a program to construct a Bayesian network considering medical data. Use this model to
demonstrate the diagnosis of heart patients using standard Heart Disease Data Set. You can use
Java/Python ML library classes/API.
import pandas as pd
col = ['Age','Gender','FamilyHist','Diet','LifeStyle','Cholesterol','HeartDisease']
data = pd.read_csv('heart_disease_data.csv', names=col)
print(data)
#encoding: convert every column to integer codes
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
for i in range(len(col)):
    data.iloc[:, i] = encoder.fit_transform(data.iloc[:, i])
#splitting data
X = data.iloc[:, 0:6]
y = data.iloc[:, -1]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
#prediction
from sklearn.naive_bayes import GaussianNB
clf = GaussianNB()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
#accuracy of the diagnosis on the held-out patients
from sklearn.metrics import accuracy_score
print('Accuracy:', accuracy_score(y_test, y_pred))
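Strictly speaking, the listing above fits a Gaussian naive Bayes classifier rather than an explicit Bayesian network. If a real network structure is required, one possible sketch using the pgmpy library is shown below; the edges chosen for the network and the evidence values in the query are assumptions for illustration, not part of the original program.

from pgmpy.models import BayesianNetwork            # called BayesianModel in older pgmpy releases
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# assumed network structure over the encoded columns
model = BayesianNetwork([('Age', 'HeartDisease'), ('Gender', 'HeartDisease'),
                         ('FamilyHist', 'HeartDisease'), ('Diet', 'Cholesterol'),
                         ('LifeStyle', 'Cholesterol'), ('Cholesterol', 'HeartDisease')])
model.fit(data, estimator=MaximumLikelihoodEstimator)    # learn the CPTs from the encoded data
infer = VariableElimination(model)
# query P(HeartDisease | evidence); the evidence codes below are placeholders
print(infer.query(variables=['HeartDisease'], evidence={'Age': 1, 'Cholesterol': 0}))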
8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering
using k-Means algorithm. Compare the results of these two algorithms and comment on the quality
of clustering. You can add Java/Python ML library classes/API in the program.
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans
# Importing the dataset
data = pd.read_csv('xclara.csv')
data.head()
X = data.iloc[:, :2].values              # assume the first two columns hold the features
# k-Means clustering
kmeans = KMeans(n_clusters=3)            # assumed number of clusters
km_labels = kmeans.fit_predict(X)
print('Graph using k-Means Algorithm')
plt.scatter(X[:, 0], X[:, 1], c=km_labels, s=10, cmap='viridis')
plt.show()
# EM clustering with a Gaussian mixture model
gmm = GaussianMixture(n_components=3)
labels = gmm.fit_predict(X)
# plot: point size reflects how confidently EM assigns each point
probs = gmm.predict_proba(X)
size = 10 * probs.max(1) ** 3
print('Graph using EM Algorithm')
#print(probs[:300].round(4))
plt.scatter(X[:, 0], X[:, 1], c=labels, s=size, cmap='viridis')
plt.show()
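To back the comparison asked for in the exercise with a number, one option (an addition to the listing above) is the silhouette score, where values closer to 1 indicate tighter, better separated clusters:

from sklearn.metrics import silhouette_score
print('Silhouette score (k-Means):', silhouette_score(X, km_labels))
print('Silhouette score (EM/GMM) :', silhouette_score(X, labels))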
9. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both
correct and wrong predictions. Java/Python ML library classes can be used for this problem.
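The classifier lines below use x_train, x_test, y_train and y_test without defining them; a minimal preamble, assuming scikit-learn's bundled copy of the Iris data set and a 70/30 split, could be:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, classification_report

iris = load_iris()                        # 150 samples, 4 features, 3 classes
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3)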
classifier=KNeighborsClassifier(n_neighbors=5)
classifier.fit(x_train,y_train)
y_pred=classifier.predict(x_test)
print('confusion matrix')
print(confusion_matrix(y_test,y_pred))
print('Accuracy metrics')
print(classification_report(y_test,y_pred))
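The exercise also asks to print the correct and wrong predictions explicitly; a short loop over the test set (assuming the split sketched above) does that:

for i in range(len(y_pred)):
    status = 'Correct' if y_pred[i] == y_test[i] else 'Wrong'
    print(status, '- predicted:', iris.target_names[y_pred[i]],
          ' actual:', iris.target_names[y_test[i]])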
10. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points.
Select appropriate data set for your experiment and draw graphs.
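The listing below uses a kernel() helper, a design matrix X, and bill/tip arrays that it never defines; a preamble that supplies them (keeping the np1 alias the code expects, and assuming a tips.csv file with 'total_bill' and 'tip' columns, e.g. the seaborn tips data) might look like this:

import numpy as np1
import pandas as pd
import matplotlib.pyplot as plt

def kernel(point, xmat, k):
    # diagonal matrix of Gaussian weights, one per training point
    m, n = np1.shape(xmat)
    weights = np1.mat(np1.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        d2 = (diff * diff.T)[0, 0]                  # squared distance to the query point
        weights[j, j] = np1.exp(d2 / (-2.0 * k ** 2))
    return weights

# load the data and build the design matrix with an intercept column
data = pd.read_csv('tips.csv')                      # assumed file and column names
bill = np1.array(data.total_bill)
tip = np1.array(data.tip)
mbill = np1.mat(bill)
mtip = np1.mat(tip)
m = np1.shape(mbill)[1]
one = np1.mat(np1.ones(m))
X = np1.hstack((one.T, mbill.T))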
def localWeight(point, xmat, ymat, k):
    wei = kernel(point, xmat, k)
    W = (X.T*(wei*X)).I*(X.T*(wei*ymat.T))     # weighted normal equation
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np1.shape(xmat)
    ypred = np1.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i]*localWeight(xmat[i], xmat, ymat, k)
    return ypred

# set the bandwidth k here
ypred = localWeightRegression(X, mtip, 2)
SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(bill, tip, color='green')
ax.plot(xsort[:, 1], ypred[SortIndex], color='red', linewidth=5)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()