ML Lab Manual
Of
Machine Learning
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Sri Indu Institute of Engineering and Technology
Sheriguda (V), Ibrahimpatnam (M), R.R. Dist. - 501510
SRI INDU INSTITUTE OF ENGINEERING AND TECHNOLOGY
(An Autonomous Institution under UGC)
Accredited by NAAC with A+ Grade, Recognized under 2(f) of UGC Act 1956
(Approved by AICTE, New Delhi and Affiliated to JNTUH, Hyderabad)
Khalsa Ibrahimpatnam, Sheriguda (V), Ibrahimpatnam (M), Ranga Reddy Dist., Telangana – 501 510
Website: https://siiet.ac.in/
INSTITUTE VISION
To become a premier institute of academic excellence by providing the world class education
that transforms individuals into high intellectuals, by evolving them as empathetic and
responsible citizens through continuous improvement.
INSTITUTE MISSION
IM1: To offer outcome-based education and enhancement of technical and practical skills.
IM2: To continuously assess the teaching-learning process through institute-industry
collaboration.
IM3: To be a centre of excellence for innovative and emerging fields in technology
development, with state-of-the-art facilities for the faculty and student fraternity.
IM4: To create an enterprising environment that ensures culture, ethics and social
responsibility among the stakeholders.
LIST OF EXPERIMENTS
Given the following data, which specify classifications for nine combinations
of VAR1 and VAR2, predict a classification for a case where VAR1=0.906
and VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3
centroids).
VAR1 VAR2 CLASS
1.713 1.586 0
0.180 1.786 1
0.353 1.240 1
0.940 1.566 0
1.486 0.759 1
1.266 1.106 0
1.540 0.419 1
0.459 1.799 1
0.773 0.186 1
Input attributes are (from left to right) income, recreation, job, status,
age-group, home-owner. Find the unconditional probability of 'golf' and the
conditional probability of 'single' given 'medRisk' in the dataset.
Machine learning is used for everything from automating mundane tasks to offering intelligent insights,
and industries in every sector try to benefit from it. You may already be using a device that relies on it,
for example a wearable fitness tracker such as Fitbit or an intelligent home assistant such as Google Home.
There are many more examples of ML in use:
• Prediction: Machine learning can be used in prediction systems. In a loan example, to compute the
probability of a default, the system first needs to classify the available data into groups.
• Image recognition: Machine learning can be used for face detection in an image. In a database of
several people, there is a separate category for each person.
• Speech recognition: This is the translation of spoken words into text. It is used in voice search
and more. Voice user interfaces include voice dialing, call routing, and appliance control. It can
also be used for simple data entry and the preparation of structured documents.
• Medical diagnosis: ML models are trained to recognize cancerous tissue.
• Financial industry: Trading companies use ML in fraud investigations and credit checks.
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning
In supervised learning, an AI system is presented with labeled data, which means that each data point
is tagged with the correct label.
The goal is to approximate the mapping function so well that, given new input
data (X), you can predict the output variable (Y) for that data.
For example, we first take some mails and mark them as ‘Spam’ or ‘Not Spam’. This labeled data is
used to train the supervised model. Once the model is trained, we test it on new mails and check
whether it predicts the right output.
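The spam example can be sketched in a few lines of plain Python. The training mails, the labels and the helper names `train` and `predict` below are made up for illustration, and the scoring rule (count how often each word appeared under each label) is a deliberately simple stand-in for a real learning algorithm.

```python
from collections import Counter

# Hypothetical labeled training data: (mail text, label) pairs.
training_data = [
    ("win a free prize now", "Spam"),
    ("free offer click now", "Spam"),
    ("meeting at ten tomorrow", "Not Spam"),
    ("please review the lab report", "Not Spam"),
]

def train(examples):
    """Count how often each word occurs under each label."""
    word_counts = {}
    for message, label in examples:
        word_counts.setdefault(label, Counter()).update(message.lower().split())
    return word_counts

def predict(word_counts, message):
    """Return the label whose training words overlap most with the message."""
    words = message.lower().split()
    scores = {label: sum(counts[w] for w in words)
              for label, counts in word_counts.items()}
    return max(scores, key=scores.get)

model = train(training_data)
print(predict(model, "claim your free prize"))   # new, unseen mail
print(predict(model, "lab meeting tomorrow"))    # new, unseen mail
```

The two steps mirror the text: `train` uses only labeled mails, and `predict` is then checked on mails the model has not seen.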
In unsupervised learning, an AI system is presented with unlabeled, uncategorized data and the system’s
algorithms act on the data without prior training. The output is dependent upon the coded algorithms.
Subjecting a system to unsupervised learning is one way of testing AI.
A reinforcement learning algorithm, or agent, learns by interacting with its environment. The agent
receives rewards by performing correctly and penalties for performing incorrectly. The agent learns
without intervention from a human by maximizing its reward and minimizing its penalty. It is a type of
dynamic programming that trains algorithms using a system of reward and punishment.
For example, suppose the agent is given two options: a path with water or a path with fire.
A reinforcement algorithm works on a reward system: if the agent takes the fire path, reward points
are subtracted and the agent learns that it should avoid the fire path. If it had chosen the water path,
the safe path, some points would have been added to its reward, and the agent would learn which path
is safe and which is not.
In essence, the agent uses the rewards it has obtained to improve its knowledge of the environment and
select the next action.
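The water/fire example can be simulated with a very small agent. This is only a sketch under stated assumptions: the reward values (+1 and -1), the exploration rate `epsilon`, the number of steps and the epsilon-greedy strategy itself are choices made here for illustration, not something the manual prescribes.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed rewards: the water path adds a point, the fire path subtracts one.
REWARDS = {"water": 1.0, "fire": -1.0}

values = {"water": 0.0, "fire": 0.0}   # the agent's estimate of each path
counts = {"water": 0, "fire": 0}       # how often each path was tried
epsilon = 0.2                          # probability of exploring at random

for step in range(200):
    if random.random() < epsilon:
        action = random.choice(list(REWARDS))    # explore
    else:
        action = max(values, key=values.get)     # exploit the best estimate
    reward = REWARDS[action]
    counts[action] += 1
    # Running average of the rewards received for this action.
    values[action] += (reward - values[action]) / counts[action]

best_path = max(values, key=values.get)
print("Estimated values:", values)
print("Path the agent prefers:", best_path)
```

No one tells the agent which path is safe; it ends up preferring the water path only because that path has returned higher rewards.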
PROGRAM 1
AIM: To find the probability that a student is absent given that today is
Friday.
DESCRIPTION:
Machine learning is a method of data analysis that automates analytical
model building. Using algorithms that iteratively learn from data, machine
learning allows computers to find hidden insights without being explicitly
programmed where to look. Naive Bayes is one of the most popular machine
learning techniques. In this program we will look at how to implement the
Naive Bayes algorithm using Python.
Before someone can understand Bayes' theorem, they need to know a couple of
related concepts first, namely conditional probability and Bayes' rule.
Let us say we have a collection of people. Some of them are singers, and each
is either male or female. If we select a person at random, what is the
probability that this person is male? What is the probability that this
person is male and a singer? Conditional probability relates the two. We can
calculate it as
P(Male | Singer) = P(Male ∩ Singer) / P(Singer), where P(Singer) > 0.
We can state Bayes' rule as follows. Let A1, A2, …, An be a set of
mutually exclusive events that together form the sample space S. Let B be
any event from the same sample space, such that P(B) > 0. Then,
P(Ak | B) = P(Ak ∩ B) / [ P(A1 ∩ B) + P(A2 ∩ B) + . . . + P(An ∩ B) ]
and, since P(Ak ∩ B) = P(Ak) P(B | Ak), this can be rewritten as
P(Ak | B) = P(Ak) P(B | Ak) / [ P(A1) P(B | A1) + P(A2) P(B | A2) + . . . + P(An) P(B | An) ]
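For the aim of this program, the conditional probability definition is enough. The sketch below assumes two input figures that are not given in this manual: a 3% chance that it is Friday and a student is absent, and a 20% chance that a school day is a Friday. The function name `conditional_probability` is likewise only illustrative.

```python
def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A ∩ B) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be greater than 0")
    return p_a_and_b / p_b

# Assumed figures (not taken from this manual):
p_friday_and_absent = 0.03   # P(Friday ∩ Absent)
p_friday = 0.20              # P(Friday)

p_absent_given_friday = conditional_probability(p_friday_and_absent, p_friday)
print("P(Absent | Friday) = {:.2f}".format(p_absent_given_friday))
```

With other input figures only the two assignments change; the division stays the same.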
===============================
Source Code :
===============================
'''
=================================
Explanation:
=================================
===> First, you need to create a table (students) in a MySQL database (SampleDB).
---> Open a command prompt and enter the MySQL prompt.
Then, at the MySQL prompt, execute the commands that create the table in the database.
===> Next, open a command prompt and install the mysql.connector package, which is used to
connect to the MySQL database from Python.
===============================
Source Code:
===============================. '''
import mysql.connector
# Create the connection object
myconn = mysql.connector.connect(host="localhost", user="root", passwd="", database="SampleDB")
# Create the cursor object
cur = myconn.cursor()
# Execute the query
cur.execute("select * from students")
# Fetch the rows from the cursor object
result = cur.fetchall()
print("Student Details are:")
# Print the result
for x in result:
    print(x)
# Commit the transaction
myconn.commit()
# Close the connection
myconn.close()
Output:
PROGRAM 3
DESCRIPTION:
================================
Explanation:
=================================
===> To run this program you need to install the sklearn module
===> Open a command prompt and install the sklearn module before running the program
-----------------------------------------------
In this program, we use the iris dataset, split into a training set (70%) and a test set (30%).
The iris dataset contains the following features: sepal length, sepal width, petal length
and petal width (all in cm).
A sample row in the iris dataset format is [5.4 3.4 1.7 0.2]
===============================
Source Code:
===============================
'''
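import random
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# Loading data
data_iris = load_iris()
# To get list of target names
label_target = data_iris.target_names
print()
print("Sample Data from Iris Dataset")
print("*"*30)
# to display the sample data from the iris dataset
for i in range(10):
    rn = random.randint(0,120)
    print(data_iris.data[rn],"===>",label_target[data_iris.target[rn]])
# to split the data into training (70%) and test (30%) sets
X_train, X_test, y_train, y_test = train_test_split(
    data_iris.data, data_iris.target, test_size=0.3, random_state=1)
# to create the classifier (the number of neighbours, 3, is an assumed value)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
# to display the score
print("The Score is :",knn.score(X_test, y_test))
# To get test data from the user, e.g. 5.4,3.4,1.7,0.2
try:
    test_data = input("Enter Test Data:").split(",")
    for i in range(len(test_data)):
        test_data[i] = float(test_data[i])
    print()
    v = knn.predict([test_data])
    print("Predicted output is:",label_target[v])
except ValueError:
    print("Please supply valid input......")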
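What the classifier does at prediction time can be shown without any library. The sketch below is a simplified illustration, not the scikit-learn implementation: it uses a handful of illustrative rows in the iris format instead of the full dataset, Euclidean distance, and an assumed k of 3. The helper name `knn_predict` is hypothetical.

```python
import math
from collections import Counter

# A few illustrative (features, label) rows in the iris format:
# sepal length, sepal width, petal length, petal width.
training_rows = [
    ([5.1, 3.5, 1.4, 0.2], "setosa"),
    ([4.9, 3.0, 1.4, 0.2], "setosa"),
    ([7.0, 3.2, 4.7, 1.4], "versicolor"),
    ([6.4, 3.2, 4.5, 1.5], "versicolor"),
    ([6.3, 3.3, 6.0, 2.5], "virginica"),
    ([5.8, 2.7, 5.1, 1.9], "virginica"),
]

def knn_predict(rows, query, k=3):
    """Classify `query` by majority vote among its k nearest rows."""
    by_distance = sorted(rows, key=lambda row: math.dist(row[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

sample = [5.4, 3.4, 1.7, 0.2]   # the sample row quoted in the explanation
print(knn_predict(training_rows, sample))
```

The sample row is closest to the two setosa rows, so they outvote the single nearest versicolor row.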
Output:
PROGRAM-4
4. Given the following data, which specify classifications for
nine combinations of VAR1 and VAR2, predict a classification for
a case where VAR1=0.906 and VAR2=0.606, using the result of
k-means clustering with 3 means (i.e., 3 centroids)
=================================
Explanation:
=================================
===> To run this program you need to install the sklearn module
===> Open a command prompt and install the sklearn module before running the program
Finally, you need to predict the class for VAR1=0.906 and VAR2=0.606
===============================
Source Code:
===============================
'''
from sklearn.cluster import KMeans
import numpy as np
X = np.array([[1.713,1.586], [0.180,1.786], [0.353,1.240],
              [0.940,1.566], [1.486,0.759], [1.266,1.106],
              [1.540,0.419], [0.459,1.799], [0.773,0.186]])
y = np.array([0,1,1,0,1,0,1,1,1])
# k-means is unsupervised, so the class labels y are not used for fitting
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("The input data is ")
print("VAR1 \t VAR2 \t CLASS")
i=0
for val in X:
    print(val[0],"\t",val[1],"\t",y[i])
    i+=1
print("="*20)
# To get test data from the user
print("The Test data to predict ")
test_data = []
VAR1 = float(input("Enter Value for VAR1 :"))
VAR2 = float(input("Enter Value for VAR2 :"))
test_data.append(VAR1)
test_data.append(VAR2)
print("="*20)
# predict() returns a cluster index, not a class, so report the
# majority class among the training points of that cluster
cluster = kmeans.predict([test_data])[0]
members = y[kmeans.labels_ == cluster]
print("The predicted Cluster is : ",cluster)
print("The predicted Class is : ",np.bincount(members).argmax())
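To see what the clustering step does internally, here is a plain-Python sketch of the usual k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is an illustration under assumptions: the initial centroids are simply the first three data points, whereas scikit-learn uses a k-means++ initialisation by default, so its clusters may differ. The class of the new case is taken as the majority class of the cluster it falls into.

```python
from collections import Counter

points = [(1.713, 1.586), (0.180, 1.786), (0.353, 1.240),
          (0.940, 1.566), (1.486, 0.759), (1.266, 1.106),
          (1.540, 0.419), (0.459, 1.799), (0.773, 0.186)]
classes = [0, 1, 1, 0, 1, 0, 1, 1, 1]

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def nearest(centroids, p):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(centroids[c], p))

def kmeans(data, k, max_iter=100):
    centroids = list(data[:k])      # assumption: start from the first k points
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(centroids, p) for p in data]
        if new_assignment == assignment:
            break                   # assignments stopped changing: converged
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(data, assignment) if a == c]
            if members:             # keep the old centroid if a cluster is empty
                centroids[c] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return centroids, assignment

centroids, assignment = kmeans(points, 3)
cluster = nearest(centroids, (0.906, 0.606))
votes = Counter(cls for cls, a in zip(classes, assignment) if a == cluster)
predicted_class = votes.most_common(1)[0][0]
print("Cluster:", cluster, "Predicted class:", predicted_class)
```

With this initialisation the new case falls into a cluster whose members are mostly class 1; a different initialisation can produce different clusters.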
Output: -
PROGRAM 5
Input attributes are (from left to right) income, recreation, job, status, age-group,
home-owner. Find the unconditional probability of 'golf' and the conditional
probability of 'single' given 'medRisk' in the dataset
=================================
Explanation:
=================================
******************************
To find the Conditional probability of single given medRisk,
---> S : single
---> MR : medRisk
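P(S ∩ MR) = The number of MedRisk with Single records / total number of Records
= 2 / 10 = 0.2
and
P(MR) = The number of MedRisk records / total number of Records
= 3 / 10 = 0.3
so
P(S | MR) = P(S ∩ MR) / P(MR) = 0.2 / 0.3 ≈ 0.667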
===============================
Source Code :
===============================
'''
total_Records=10
numGolfRecords=4
unConditionalprobGolf=numGolfRecords / total_Records
print("Unconditional probability of golf: ={}".format(unConditionalprobGolf))
#conditional probability of 'single' given 'medRisk'
numMedRiskSingle=2
numMedRisk=3
probMedRiskSingle=numMedRiskSingle/total_Records
probMedRisk=numMedRisk/total_Records
conditionalProb=(probMedRiskSingle/probMedRisk)
print("Conditional probability of single given medRisk: ={}".format(conditionalProb))
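The listing above hard-codes the counts. The same probabilities can be obtained by counting rows, as sketched below. The ten records are placeholders invented for this illustration (the manual's dataset is not reproduced here); they are arranged only so that the counts match the ones used above. The helper names `probability` and `conditional` are also hypothetical.

```python
# Placeholder records (recreation, status, risk); NOT the manual's dataset.
records = [
    ("golf", "single", "medRisk"),
    ("golf", "married", "lowRisk"),
    ("golf", "single", "highRisk"),
    ("golf", "married", "medRisk"),
    ("skiing", "single", "medRisk"),
    ("skiing", "married", "lowRisk"),
    ("football", "single", "highRisk"),
    ("football", "married", "lowRisk"),
    ("skiing", "single", "lowRisk"),
    ("football", "married", "highRisk"),
]

def probability(rows, event):
    """Unconditional probability: matching rows / all rows."""
    return sum(1 for r in rows if event(r)) / len(rows)

def conditional(rows, event, given):
    """Conditional probability: restrict to rows where `given` holds."""
    matching = [r for r in rows if given(r)]
    if not matching:
        raise ValueError("the conditioning event never occurs")
    return sum(1 for r in matching if event(r)) / len(matching)

p_golf = probability(records, lambda r: r[0] == "golf")
p_single_given_med = conditional(records,
                                 lambda r: r[1] == "single",
                                 lambda r: r[2] == "medRisk")
print("P(golf) =", p_golf)
print("P(single | medRisk) =", round(p_single_given_med, 3))
```

Restricting the rows to medRisk first and then counting single is equivalent to dividing P(S ∩ MR) by P(MR).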
Output:
PROGRAM 6:
================================
Explanation:
=================================
===> To run this program you need to install the pandas module (the code also imports numpy and matplotlib)
===> To install them, open a command prompt and run the install command for each module
===============================
Source Code:
===============================
'''
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# To read data from Age_Income.csv file
dataFrame = pd.read_csv('Age_Income.csv')
# To place data in to age and income vectors
age = dataFrame['Age']
income = dataFrame['Income']
# number of points
num = np.size(age)
# To find the mean of age and income vector
mean_age = np.mean(age)
mean_income = np.mean(income)
Age_Income.csv (Data Set)
Age,Income
25,25000
23,22000
24,26000
28,29000
34,38600
32,36500
42,41000
55,81000
45,47500
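The listing stops after computing the two means. One plausible continuation, assumed here rather than taken from the manual, is to fit a least-squares line income = intercept + slope × age to the rows above. The sketch uses plain Python lists instead of pandas so that it is self-contained; `predict_income` is a hypothetical helper name.

```python
# The nine (Age, Income) rows from Age_Income.csv
ages = [25, 23, 24, 28, 34, 32, 42, 55, 45]
incomes = [25000, 22000, 26000, 29000, 38600, 36500, 41000, 81000, 47500]

n = len(ages)
mean_age = sum(ages) / n
mean_income = sum(incomes) / n

# Least-squares estimates:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
s_xy = sum((x - mean_age) * (y - mean_income) for x, y in zip(ages, incomes))
s_xx = sum((x - mean_age) ** 2 for x in ages)
slope = s_xy / s_xx
intercept = mean_income - slope * mean_age

def predict_income(age):
    """Income predicted by the fitted line for a given age."""
    return intercept + slope * age

print("slope =", round(slope, 2), "intercept =", round(intercept, 2))
print("predicted income at age 30:", round(predict_income(30), 2))
```

A least-squares line always passes through the point (mean age, mean income), which is a quick sanity check on the two coefficients.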
Output:
PROGRAM 7:
=================================
Explanation:
=================================
===> To run this program you need to install the pandas and sklearn modules
===> To install them, open a command prompt and run the install command for each module
===============================
Source Code :
===============================
'''
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
msglbl_data = pd.read_csv('Statements_data.csv', names=['Message', 'Label'])
print("The Total instances in the Dataset: ", msglbl_data.shape[0])
msglbl_data['labelnum'] = msglbl_data.Label.map({'pos': 1, 'neg': 0})
# place the data in X and Y Vectors
X = msglbl_data["Message"]
Y = msglbl_data.labelnum
# to split the data into train set and test set
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y)
# to convert each statement into a vector of word counts
count_vect = CountVectorizer()
Xtrain_dims = count_vect.fit_transform(Xtrain)
Xtest_dims = count_vect.transform(Xtest)
df = pd.DataFrame(Xtrain_dims.toarray(),columns=count_vect.get_feature_names_out())
clf = MultinomialNB()
# to fit the train data into model
clf.fit(Xtrain_dims, Ytrain)
# to predict the test data
prediction = clf.predict(Xtest_dims)
print('******** Accuracy Metrics *********')
print('Accuracy : ', accuracy_score(Ytest, prediction))
print('Recall : ', recall_score(Ytest, prediction))
print('Precision : ',precision_score(Ytest, prediction))
print('Confusion Matrix : \n', confusion_matrix(Ytest, prediction))
print(10*"-")
# to predict the input statement
test_stmt = [input("Enter any statement to predict :")]
test_dims = count_vect.transform(test_stmt)
pred = clf.predict(test_dims)
for stmt,lbl in zip(test_stmt,pred):
    if lbl == 1:
        print("Statement is Positive")
    else:
        print("Statement is Negative")
Statements_data.csv (Data Set)
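The step that turns statements into numbers is worth seeing in isolation. The sketch below imitates the idea behind CountVectorizer in plain Python: build a vocabulary from the training statements, then represent each statement as a vector of word counts. The two statements and the helper name `to_counts` are made up for illustration, and the tokenisation is simplified (scikit-learn's default tokeniser, for example, drops one-letter tokens such as "I").

```python
from collections import Counter

# Hypothetical training statements
statements = ["I love this place", "I hate this food"]

# Vocabulary: every distinct lower-cased word, in sorted order
vocabulary = sorted({word for s in statements for word in s.lower().split()})

def to_counts(text):
    """Count how often each vocabulary word occurs in `text`."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

matrix = [to_counts(s) for s in statements]

print(vocabulary)
for row in matrix:
    print(row)
# Words outside the vocabulary (here "pizza") are simply ignored
print(to_counts("I love pizza"))
```

Each row of `matrix` plays the role of one row of `Xtrain_dims`; a new statement is converted with the same vocabulary before it is passed to the classifier.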
Output:
PROGRAM 8
Source Code:
import numpy
# Parameter initialization
genes = 2
chromosomes = 10
mattingPoolSize = 6
offspringSize = chromosomes - mattingPoolSize
lb = -5
ub = 5
populationSize = (chromosomes, genes)
generations = 3
# Population initialization
population = numpy.random.uniform(lb, ub, populationSize)
for generation in range(generations):
    print(("Generation:", generation+1))
    # Fitness is the sum of squared genes; larger values count as fitter
    fitness = numpy.sum(population*population, axis=1)
    print("\npopulation")
    print(population)
    print("\nfitness calcuation")
    print(fitness)
    # Following statement will create an empty two dimensional array to store parents
    parents = numpy.empty((mattingPoolSize, population.shape[1]))
    # A loop to extract one parent in each iteration
    for p in range(mattingPoolSize):
        # Finding index of fittest chromosome in the population
        fittestIndex = numpy.where(fitness == numpy.max(fitness))
        # Extracting index of fittest chromosome
        fittestIndex = fittestIndex[0][0]
        # Copying fittest chromosome into parents array
        parents[p, :] = population[fittestIndex, :]
        # Changing fitness of fittest chromosome to avoid reselection of that chromosome
        fitness[fittestIndex] = -1
    print("\nParents:")
    print(parents)
    # Following statement will create an empty two dimensional array to store offspring
    offspring = numpy.empty((offspringSize, population.shape[1]))
    for k in range(offspringSize):
        # Determining the crossover point
        crossoverPoint = numpy.random.randint(0,genes)
        # Index of the first parent.
        parent1Index = k%parents.shape[0]
        # Index of the second.
        parent2Index = (k+1)%parents.shape[0]
        # Extracting first half of the offspring
        offspring[k, 0:crossoverPoint] = parents[parent1Index, 0:crossoverPoint]
        # Extracting second half of the offspring
        offspring[k, crossoverPoint:] = parents[parent2Index, crossoverPoint:]
    print("\nOffspring after crossover:")
    print(offspring)
    # Implementation of random initialization mutation.
    for index in range(offspring.shape[0]):
        # With genes = 2 this always selects the second gene (index 1)
        randomIndex = numpy.random.randint(1,genes)
        randomValue = numpy.random.uniform(lb, ub)
        offspring[index, randomIndex] = offspring[index, randomIndex] + randomValue
    print("\n Offspring after Mutation")
    print(offspring)
    # Parents and offspring together form the next generation
    population[0:parents.shape[0], :] = parents
    population[parents.shape[0]:, :] = offspring
    print("\nNew Population for next generation:")
    print(population)
# After the last generation, report the fittest chromosome
fitness = numpy.sum(population*population, axis=1)
fittestIndex = numpy.where(fitness == numpy.max(fitness))
# Extracting index of fittest chromosome
fittestIndex = fittestIndex[0][0]
# Getting Best chromosome
fittestInd = population[fittestIndex, :]
bestFitness = fitness[fittestIndex]
print("\nBest Individual:")
print(fittestInd)
print("\nBest Individual's Fitness:")
print(bestFitness)
Output: -
('Generation:', 1)
population
[[ 4.80681675 2.17457345]
[ 2.68516631 -4.36671398]
[ 0.19027998 -1.92076011]
[-4.51396933 -1.89463461]
[ 0.79755849 3.43265172]
[-1.54352966 3.94293134]
[ 2.63471426 -3.51067942]
[-1.02184282 -4.64715438]
[ 1.49179561 -2.11644882]
[ 1.46421881 4.06826713]]
fitness calcuation
[27.834257 26.27830909 3.72552586 23.96555941 12.41919734 17.92919137
19.26658925 22.64020659 6.70480973 18.69473415]
Parents:
[[ 4.80681675 2.17457345]
[ 2.68516631 -4.36671398]
[-4.51396933 -1.89463461]
[-1.02184282 -4.64715438]
[ 2.63471426 -3.51067942]
[ 1.46421881 4.06826713]]
('Generation:', 2)
population
[[ 4.80681675 2.17457345]
[ 2.68516631 -4.36671398]
[-4.51396933 -1.89463461]
[-1.02184282 -4.64715438]
[ 2.63471426 -3.51067942]
[ 1.46421881 4.06826713]
[ 2.68516631 -4.32266198]
[ 2.68516631 2.13424051]
[-1.02184282 -7.79018038]
[-1.02184282 -5.94231314]]
fitness calcuation
[27.834257 26.27830909 23.96555941 22.64020659 19.26658925 18.69473415
25.89552471 11.76510065 61.73107316 36.35524821]
Parents:
[[-1.02184282 -7.79018038]
[-1.02184282 -5.94231314]
[ 4.80681675 2.17457345]
[ 2.68516631 -4.36671398]
[ 2.68516631 -4.32266198]
[-4.51396933 -1.89463461]]
=================================
Explanation:
=================================
===> To run this program you need to install the pandas module
===> To install it, open a command prompt and run its install command
===> Open a command prompt again to install the sklearn module, which also provides the
neural network classes (sklearn.neural_network) used in this program
===============================
Source Code :
===============================
'''
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
msglbl_data = pd.read_csv('Statements_data.csv', names=['Message', 'Label'])
print("The Total instances in the Dataset: ", msglbl_data.shape[0])
msglbl_data['labelnum'] = msglbl_data.Label.map({'pos': 1, 'neg': 0})
# place the data in X and Y Vectors
X = msglbl_data["Message"]
Y = msglbl_data.labelnum
# to split the data into train set and test set
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y)
# to convert each statement into a vector of word counts
count_vect = CountVectorizer()
Xtrain_dims = count_vect.fit_transform(Xtrain)
Xtest_dims = count_vect.transform(Xtest)
df = pd.DataFrame(Xtrain_dims.toarray(),columns=count_vect.get_feature_names_out())
# multi-layer perceptron with two hidden layers of 5 and 2 neurons
clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
# to fit the train data into model
clf.fit(Xtrain_dims, Ytrain)
# to predict the test data
prediction = clf.predict(Xtest_dims)
print('******** Accuracy Metrics *********')
print('Accuracy : ', accuracy_score(Ytest, prediction))
print('Recall : ', recall_score(Ytest, prediction))
print('Precision : ',precision_score(Ytest, prediction))
print('Confusion Matrix : \n', confusion_matrix(Ytest, prediction))
print(10*"-")
# to predict the input statement
test_stmt = [input("Enter any statement to predict :")]
test_dims = count_vect.transform(test_stmt)
pred = clf.predict(test_dims)
for stmt,lbl in zip(test_stmt,pred):
    if lbl == 1:
        print("Statement is Positive")
    else:
        print("Statement is Negative")
Statements_data.csv (Data Set)
Output: